Value-First AI Daily - Apr 2, 2026
Value-First AI Daily is your essential source for understanding how artificial intelligence enhances rather than replaces human capability.
Generated via AI Transcription (Gemini) • 90% confidence
[00:00] **Introduction** Chris Carolan: LinkedIn friends, Value First Nation, welcome to another episode of Value First AI Daily, your collaborative AI intelligence report. It is Thursday, December 18th, 2025. Nico, another day, another model. How are you doing, man?
[00:21] **Gemini 3 Flash Release** Nico Lafakis: Doing good, doing good. Yeah, another day, another model. Not new new, but an offshoot. We got the Gemini 3 release earlier this month, and, not to be outdone by itself, Gemini 3 Flash released yesterday. The crazy part to me, and this is something we've been talking about for a while, is that in a lot of ways it's better than Gemini 3 Pro, and the cost on it is so infinitesimal that it shames other models' costs. That's what blew me away the most: the cost-to-benefit ratio of this model is a glimpse into where things are going in terms of abundance and end-user capability at near-zero cost. We're already so close to zero-cost development; you can use these models for free to start developing software, and now it's just getting better. I don't think it clicks for people that every time you talk shit, in another month or two you won't be able to talk shit anymore. When you see something and say to yourself, well, that's kind of all right, but it's not that great, you should instead be noting, oh, that's pretty cool; I'll bet two to three months from now it's going to be awesome. That's the attitude shift you need to make: from "it's all right, but it should be able to do more than it's doing" to "those features are at an infant level, but I remember when GPT-3 came out, and now it's massively different, so I can only imagine what's going to happen to this thing."
[04:46] **Testing and Analysis** Nico Lafakis: The reason I say that is because I try not to buy into the hype whenever something releases and everybody immediately goes, game changer, greatest ever, it's the end of GPT. Instead I do my own testing and analysis to see how things work, how much better they are, and whether they're as good as everyone says. And I've got to say, Gemini 3 Flash is extremely impressive. Why is it so impressive to me? Because it's Flash. It's a tiny model. If you remember, maybe a year ago Chris and I were talking about how small language models weren't really a thing. They weren't then, but now they're starting to creep in, and the amount of power these small language models are putting out, as evidenced by Flash, is even greater than what the large language models put out.
[07:09] **Flash versus Pro: Comparing the Models** Nico Lafakis: Now, I can say that with confidence because Flash beat out Pro in coding and in logic-based puzzles, and it even beat it on Humanity's Last Exam. It's the benchmark smashing this little guy did, and the fact that it's on par with 3 Pro, with GPT-5.2, with Opus 4.5. A tiny model is on par with the largest frontier models. I don't know what Flash is trained on, but I'd guesstimate it's less than 500 billion parameters; for a light model it's probably less than 100 billion, to be honest. Most of the small language models, like Llama, are 8 or 20 billion parameters, and some of the downloadable ones are around 80 billion. So let's play nice and say this is 250 billion parameters. It's up against models that are 1.5 trillion. It's up against big models, and it's not dwarfing them, but man, the fact that it's at the same level.
[09:24] **Cost to Performance Ratio** Nico Lafakis: And again, the cost-to-performance ratio on it is insane to me. If you've played around with the API for Opus and 5.2: yes, Opus 4.5 is way, way cheaper than Opus 4, I'll give them that much, but it's still a bit more than Sonnet 4.5, and if you're doing enough computation, enough coding, my god, that burns fast. Being able to use this model at pennies on the dollar, tenths of pennies on the dollar, for the same level of performance is the point.
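To make that cost ratio concrete, here's a minimal sketch of the per-request arithmetic. The prices and model names below are illustrative assumptions, not the providers' actual rate cards; check current pricing before relying on any of it.

```python
# Rough cost-per-request comparison. Prices are illustrative placeholders
# (USD per million tokens), NOT current rate cards.
PRICING = {
    "gemini-3-flash": {"input": 0.10, "output": 0.40},    # assumed
    "gpt-5.2": {"input": 1.25, "output": 10.00},          # assumed
    "claude-opus-4.5": {"input": 5.00, "output": 25.00},  # assumed
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request, given token counts."""
    p = PRICING[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# A typical coding request: 4k tokens in, 8k tokens out.
for model in PRICING:
    print(f"{model}: ${request_cost(model, 4_000, 8_000):.4f}")
```

With these placeholder numbers, the Flash-class request comes out roughly 60x cheaper than the Opus-class one, which is the "tenths of pennies on the dollar" point being made here.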
[09:58] **Three Tier Framework** Chris Carolan: Is that the main value, the main reason this three-model framework is happening, where there's a super fast one and they name them differently by speed or size or complexity? Help us think through why it's important, why they stick to this three-tier framework per model.
[10:43] **Google's Three Tier System** Nico Lafakis: Well, with Google there is a three-tier system, but you have to be at the maximum subscription level for the third tier, something like Gemini Pro High, I think. So yes, to your point, they all have three tiers, and I understand why Gemini Pro High is gated. GPT and Anthropic don't necessarily like to gate the same way. The idea, which end users probably aren't taking into account, is that you'd use the lower-parameter models to solve lesser problems. At this point you're using genius-level models to pump out an Excel formula, even if it's Haiku. Haiku, or ChatGPT's fast tier, or Gemini's fast tier, is meant, from a programming perspective, for pushing out a demo app of something super fast, and I'm talking less than 30 seconds. Is the quality going to be amazing? With Gemini now, yes, but generally it's all right. It's enough to go off of.
[12:37] **Marketing and Smaller Models** Nico Lafakis: If we're talking about marketing and that kind of work, the smaller models also tend to stay on task better than the larger ones. The larger models can overthink or over-engineer a problem, while the smaller models look at it straight on and take things one at a time. The medium-size stuff is what it sounds like: it tries to do the best of both worlds, applying a little reasoning without overthinking the problem. But once you get into the high-level reasoning models, a lot of those should be gated, because now we're talking about researchers, days upon days of research, almost full-stack app development. In Claude's case, it can quite literally go further. The best way I can explain that: the other day I had to gather some information, and I asked Claude if it was possible. Claude couldn't do it directly, but right there in the chat window it wrote its own Python application to do it. Haiku wouldn't have done that. Haiku would have said, I can't do it, but here's the Python code for you to put it together and go do it on your own.
[14:30] **Limitations of Models** Nico Lafakis: So there's that distinction, I shouldn't even call it a limitation, that one is willing to work longer and harder than the other. If you put Claude Haiku versus Sonnet versus Opus on a particularly difficult task, that's the order in which they'd give up: Haiku first, Sonnet second, and Opus would probably go for hours before giving up on the problem. So I'd say the difference from low to high is very much about how difficult the problem is that you're trying to solve. Is it something minuscule with a quick, easy answer, or something that needs to be thought about for quite some time? In GPT's terms, you can go from a high-end model answering a single question, to high-end research, to a high-end agent, whatever it is you want to do. That's the major difference. And it's crazy to me, because Gemini 3 Flash doesn't have anywhere near as much reasoning behind it as the other models, at least not the other models at max level. So it's very interesting to see a post-trained model perform like that without that much reasoning behind it. It kind of opens the doors. The other day I said I'm pulling everything forward by two years.
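The low/medium/high split Nico describes is essentially a routing decision: match the model tier to the difficulty of the task. Here's a minimal sketch of that idea, assuming invented tier names and a hand-rolled keyword heuristic; a real router would use provider-specific model IDs and a better complexity signal.

```python
from enum import Enum

class Tier(Enum):
    FAST = "haiku-or-flash"      # quick, cheap, stays on task
    BALANCED = "sonnet-or-pro"   # some reasoning, without overthinking
    DEEP = "opus-or-pro-high"    # long-horizon research and coding

def pick_tier(task: str, needs_long_context: bool = False) -> Tier:
    """Toy heuristic mirroring the low-to-high distinction in the episode."""
    quick_markers = ("formula", "rename", "summarize", "demo")
    deep_markers = ("research", "full-stack", "architecture", "multi-day")
    text = task.lower()
    if any(m in text for m in deep_markers) or needs_long_context:
        return Tier.DEEP
    if any(m in text for m in quick_markers):
        return Tier.FAST
    return Tier.BALANCED

print(pick_tier("write an Excel formula"))           # Tier.FAST
print(pick_tier("research competitors for 3 days"))  # Tier.DEEP
```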
[17:08] **New Features Coming** Nico Lafakis: I'm glad I did, because this kind of breakthrough is just nuts. We've already gotten 5.2; I'm assuming we'll see something like GPT-5.5 for Christmas, or maybe even 6, who knows. Would they jump the gun that much just to stay ahead of the pack? There's talk of the next Sonnet; Sonnet 5 would be crazy. I saw a Reddit leak post about a 4.7 for Christmas, and I could see that. We'll see. That's the interesting, I guess positive, part about how things work in the US: these companies driving each other to compete. But I'm going to share my screen real quick, because this conversation about which level to use just doesn't... you can see my screen, yeah?
[18:52] **Collaboration in the Modern Era** Chris Carolan: Yes. It just doesn't jibe with the way I want to collaborate, right? I'd do the exact same thing; I've got the same three up on my end. There's this idea that Fast answers quickly, Thinking solves complex problems, and Pro thinks longer for advanced math and code. But I want advanced conversations where I don't have to keep reminding it of everything, where I know the context is coming with it. And if we go over to Claude, it says most capable for complex work, best for everyday tasks, fastest for quick answers. Whenever I've tried to use the faster models, I'm never one-and-done, right? As soon as it turns into a bigger conversation, I'm like, oh man, I should have used the other model. That's why it's been very hard to get out of Opus 4.5 with thinking always on: I know I'm very likely to have a continuing conversation where we're adding attachments and the context is shifting throughout. And when I've turned thinking off, even within Opus 4.5, there's a difference in the output.
[20:43] **Conversation and Models** Nico Lafakis: I can fully understand that, and I know exactly where you're coming from, but I've started, I guess, projectizing conversations. To your point, if I start with Haiku or Sonnet and the conversation becomes semi-meaningful but I'm worried about model choice, I'll push it into a project, create a project for it, then start using the other model in there, reference the previous conversation, and keep it going. Because that was one limitation I kept running into too: I don't want every problem solved with thinking, because, again, it might over-engineer the answer instead of just giving me the straight one. So I definitely flick back and forth between having thinking on and off with 4.5.
[21:54] **Updates on the Way** Nico Lafakis: Oh man. Je is going to rip this out so quickly. Meanwhile, I just got surprised by something; it'll show up when I open this screen. The speed is crazy. This is one of those cases where, one time, Claude suggested, can I make a picture? All right, give it a go. And then it's like, okay, just give me the prompt, I'll go over to Gemini and we'll get a real picture going. So, still understanding the limitations. But, not to steal the thunder, have you seen this switch? I did not see this yesterday.
[23:07] **Chat-to-Code** Nico Lafakis: Yeah, to go from chat to code. That is really nice. I've been working out of the Antigravity IDE, and I can't get past a problem: it won't connect with my Google account for some reason, so all I can use is the little IDE part and not all the fancy bells and whistles of Antigravity. Now that this is an option here, I'm going to see what it's all about at some point. But yeah, man. Speed, speed, speed in all the ways. I've got to tell you, it's very, very refreshing. So I gave the same prompt to Gemini Flash, 5.2, and Claude. Gemini is already done, and it looks really good for what it is. Claude is still going, and GPT hasn't even started yet; it's still in planning mode, serious planning mode, almost like it's never going to stop planning.
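What's happening on screen, racing the same prompt across three models, is easy to reproduce. Here's a minimal timing harness, assuming hypothetical model names and with `call_model` left as a stub you'd wire to each provider's SDK:

```python
import time

def call_model(model: str, prompt: str) -> str:
    """Stub: replace with the provider SDK call for each model."""
    raise NotImplementedError

def race(models: list[str], prompt: str) -> None:
    """Time the same prompt against each model and report wall-clock seconds."""
    for model in models:
        start = time.perf_counter()
        try:
            reply = call_model(model, prompt)
            elapsed = time.perf_counter() - start
            print(f"{model}: {elapsed:.1f}s, {len(reply.splitlines())} lines")
        except NotImplementedError:
            print(f"{model}: stub not wired up yet")

race(["gemini-3-flash", "claude-opus-4.5", "gpt-5.2"],
     "Build a single-page checklist app in plain HTML/JS.")
```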
[24:47] **Future Updates** Chris Carolan: Have you noticed the auto next steps from Claude Code? No? It started yesterday. I'll show you after this is done running, but basically it'll preload the next line in the command line, like a code-hint type thing, and you can just hit enter. I was doing that about 80% of the time yesterday.
[25:30] **Haiku and More!** Nico Lafakis: Right, because that's the thing. I'm not that great with this stuff, but say you set up a front end, just a web UI, just HTML, and then you want to go deeper than that. It will do that. And what's interesting is that before, it was asking: hey, you did this; do you want to do logins? Do you want to start setting up a back end? To your point, if that's now just the next thing, it still does that, but it preselects the option it suggests in the text above, and I'm like, yep. The first time, I was like, am I really just hitting enter here? Is that how easy this is getting? I mean, should it be more difficult than that? People keep going on about the job becoming too easy, or what's the point of learning coding. And I don't know what the point of it is either. It's weird to me, because the code is just something that gets in the way of the end product for me.
[26:55] **Learning Curve** Chris Carolan: Thanks for joining us today. Hey, Christine, what's happening? An awesome, awesome individual. Yeah, this is the learning curve, right? This is the real learning curve right now: which model to use in which situation. There's a difference between "let me just show somebody real quick what I mean" and "let me get a working version up that follows the prompt to a T." The difference between one-sentence prompt outcomes and "here's a comprehensive prompt with all of the specs; now I need you to hit them all exactly." Even now, if the instructions are pretty good, the smaller, faster models handle it; that's what they're designed to do, I think. If complexity has been removed, and they can remove complexity on their side, you don't have to think. You get out of that thinking mode, because if you think too long you'll get off track. Again, just like humans. And this is an interesting display happening right here right now.
[28:45] **Overthinking and Overengineering** Nico Lafakis: To me it's crazy, and I'm not even joking: Gemini was done a while ago, pretty much right after I hit enter, by the time I shared the screen. I wish I was joking. It took maybe 15 seconds and it did this. Now, it's not as good as Opus 4.5, but Opus 4.5 with thinking and all is what it's up against in terms of how well this came out by comparison, and that's 10, 20, maybe 100x the cost. That's the biggest thing: okay, you got this, but it cost you about 100x. And 5.2 is still going. This isn't even on thinking; it's just on auto, I didn't force it to do anything. So that's the thing with OpenAI: I applaud you guys every time you say your models are excellent at coding, and I'm not trying to say they're bad at it. It's just that they really do overthink it; they over-engineer the hell out of it. I think it has a lot to do with GPT trying to guess what you're going to want and putting that in ahead of time, giving you options you didn't even ask for. Even the code amount, I don't think it's more than what Claude came up with, but it's definitely close.
[30:49] **Code Comparisons** Nico Lafakis: Yeah, it's definitely close. So that's the other aspect: for Opus to be able to push out that much code that quickly. That was the other thing I was reading about this morning. Claude did nearly the same thing at 1,300 lines, and Gemini Flash did it in 366. Can you imagine? If I give Gemini Flash a few more instructions, I'm very sure it will get to the same level as Claude Code in under a thousand lines. And that's really the point for me: you can get this level of complexity from this cheap model in a quarter or a sixth of the time. Even in the time it took to get this from Claude, I think you could have prompted Flash to the same level, now knowing what it is. This show will end before GPT finishes. This is sad, really sad.
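For reference, the back-of-the-envelope math behind those numbers, using the figures quoted in the episode (the Claude time is an implied estimate from Nico's "quarter to a sixth" claim, not a measurement):

```python
# Figures quoted in the episode: Claude ~1,300 lines, Gemini Flash 366 lines,
# Flash finishing in roughly 15 seconds.
claude_lines, flash_lines = 1300, 366
flash_seconds = 15

print(f"Claude wrote {claude_lines / flash_lines:.1f}x more code")  # ~3.6x

# The claim is that Flash takes 1/4 to 1/6 of the time, which implies:
low, high = flash_seconds * 4, flash_seconds * 6
print(f"Implied Claude time: {low}-{high} seconds")  # 60-90 seconds
```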
[32:30] **Competition is Driving Change** Nico Lafakis: Man, if OpenAI wasn't first out of the gate, they'd be in serious trouble right now. Because they were first, they've got the user base. And this is all in the context of Google having always had the user base, so it's their game to lose. Oh, yeah. But to be on that island, I don't know, it's going to be super interesting, because OpenAI has kind of chosen not to niche down, I think. That's why Anthropic seems to be on secure footing: the way they've gone to market and focused. I remember hearing them say, we're going to focus on coding, that's where we're going to be the best, and I got sad about what could happen to the chat side, like they weren't going to keep developing the part I like. That's not true either; it's just being able to prioritize that way. It would be hard for me to pick out, other than being available to as many people as possible with frontier models, what the priority is for OpenAI. That could be the priority; it's what I've heard said. But as far as output goes, does that mean they'll never have the highest-quality output in any one specific domain?
[34:30] **OpenAI's Excelling Competencies** Nico Lafakis: I think, so far as I can tell, and it's kind of odd because Dario was building Claude for this, GPT excels when it comes to medical and regulatory stuff, like legal or medical. It's really, really good at that. It might be really good at math too; I can't say for sure. Okay, so we went up to 492 on this. And... it failed to generate, you son of a bitch. What? All that time with GPT and still nothing. So, in that time, we got basically a mimicked version of what we have in Claude, with the miniature checklists. An SOP library, which I didn't even ask for. Here we go. All right, well, we're at time. Sorry, ChatGPT. Good effort, buddy. Until tomorrow, everybody have a great day. I'll see you then. See you guys.