WEBVTT

0:00:15.120 --> 0:00:20.640
<v A>Programming throwdown Episode 187 Agentic Coding Take it away, Jason.

0:00:21.600 --> 0:01:34.630
<v B>Hey everybody. Okay, so quick intro topic here. I was talking to a lady, I won't use her name, let's call her Jane. I was talking to Jane, friend of mine, and she has been at her company, which is kind of like a hardware company. So it's not really Jane's specialty. It's a hardware company that needs some software, right. And Jane's unhappy there. And there's all sorts of reasons why Jane's unhappy. It's understandable. And she's at nine months. And so they have a one year Cliff. And so she has to kind of tough it out for three more months. And it just made me think about vesting schedules because I think that companies, you know, haven't really thought this through. There's like a cargo cult mentality where, you know, oh, this company did you know, a this type of vesting schedule. So we're just going to copy them. And I think that some bad behavior has been copied and when you start breaking it down, it never really makes sense. And so I'll start by criticizing the one year.

0:01:34.630 --> 0:01:37.190
<v A>Cliff, you want to explain what it is though?

0:01:37.190 --> 0:05:44.100
<v B>Not everybody. Oh yeah. Okay. So, okay, if you're in, let's like wind it back. If you're in, let's say sales, you get a commission, right? So you build a portfolio of companies that want, you know, some product and you get a commission. And if you're doing some service based sales and you could potentially get a commission every single year, you could have a customer that is going to use your service for the next 20 years. But because you were the salesperson that like got the deal, you will just keep getting commission for 20 years. Potentially. That's how sales works, right? And so for engineering and other, you know, they want to have an incentive structure. But, you know, it doesn't make sense for us to do commission because we're not that close to the customer. So what they'll do in our case is they will give equity. So they'll say if it's a private company or if it's, you know, yeah, not a public company, they might give you a percentage of the company. You might get options, you might get RSUs or shares. If it's a public company, you're going to get shares. So and the way this becomes an incentive is they price the shares based on your start date. So for example, if a company's stock is trading for $80 a share, they're going to tell you okay. For the next four years, you're going to get this much number of shares, you know, every month or every three months or there's some schedule, right? But let's say the price jumps up to $160 a share. Let's say the price doubles. You're still getting that number of shares. And so you can end up in a position where if the company does well, even if you don't get more equity later on, just that one grant could double in how much you're getting every month just because the company did well. And so you're kind of. It aligns your incentives with the company's incentives. Okay, so that's. That's equity. Now, you know, they're not going to give you four years of equity, right. On day one. Right. Because you would just leave, right. Like any rational person, if they said, here's four years of, you know, upfront money and there's no strings attached, well, then, yeah, you would just go to every single company collecting four years and work for a month or something. Right. So obviously. So they're going to space it out. Makes a ton of sense. Now, where I think a lot of companies go wrong is they have this one year cliff, which means that for the first 12 months, you won't get any equity. And then after 12 months, they will give you the past 12 months worth of equity right then and there. And then you'll start accruing, you know, monthly or quarterly or something like that. I think this came about. I mean, I haven't done the sort of historical work on this, but I think this came about back when a lot of these companies were very small. And so each person had to be a line item on your capital table. So when you're a small company, you have what's called a cap table, and it lists out who are your investors, basically all the people who own parts of your company. And so, you know, I guess companies wanted to sort of try people out without polluting their cap table with all these names of all these people who, you know, after a couple of months, it wasn't a good fit. That might be the inspiration. Maybe you could argue that it, you know, saves a bit of money because people who leave before a year, you know, you don't have to pay them as much. Right. Um, so that's the idea behind it in practice. It's kind of like, you know, that joke. What is it like, communism, like, works in theory, but it's never. Do you have the saying? Have you heard the saying? It's never been applied. Right.

0:05:44.260 --> 0:05:45.060
<v A>It's like something.

0:05:45.060 --> 0:06:44.730
<v B>Yeah, yeah, I've. Yeah, yeah, there's, there's some saying, yeah, we're not, we're not good at politics here. But there, there's some kind of saying where it's like, oh, communism works. That just wasn't true communism. So it's always things where like the cliff never actually is implemented this way. Every single company I've worked at, you know, there have been people who for one reason or another had to go within a year every single time we prorated their equity. I have never in my entire career seen a time where somebody, let's say, worked six months and after six months we decided they weren't a good fit or there was a reduction in force or what have you, and we didn't give that person the six months of equity. So. So it's kind of like, it's kind of like a stick, but it's never really used. And then on the flip side, you have cases like Jane here where you know, she has to kind of stick around for three more months. So that's my kind of rant on, on, on cliffs. What's your take on one year Cliffs?

0:06:46.250 --> 0:09:58.560
<v A>I mean, I guess I've worked a lot and I've never seen the fact that like, oh, in the first year you're gonna like somehow learn something about the person or like I, I mean I've never seen people just be like, oh, this isn't working. Let's let them go before the first year as like a, that way we don't have to pay them. Which I guess would be like the fear as an employee. But I've just in general never seen someone let go in under a year. Like because it's, you kind of give them a period of ramp up, whatever and by the time you kind of like get through that and you're like, oh, they're really not, you know, sticking around the process to sort of let them go at most big companies is long enough that they'll get past a year easily. So if they decide to stay, they'll make it. And so that's a good point. Hearing that between your pro rating story, it's just, I think, yeah, like you said, I feel like it's just a stamped out template and it's probably been, you know, legally tested enough that they're hesitant to, you know, do something different. It also by having the longer, you know, like the four year vesting schedule rather than just giving something for one year and then just re giving it each Year, you know, as like over the next 12 months I think is for hope that the stock will grow and sort of give you. It's not like a golden parachute, but sometimes people refer to it as like handcuffs or golden handcuffs is that the hope is a price goes up and therefore you, you, you don't want to leave because you wouldn't get an equivalent pay package somewhere else. But I will say just very recently a lot of software as a service companies had a really huge stock decline as like AI tools like we're going to talk about today are rolling out because people are nervous about future earnings. And so there's a lot of employees who are looking to basically bail or leave because huge portion of their compensation. I mean a lot of places it can exceed 50%, it's easily 30%. Like it's a very big portion of your salary. And people at these public companies just treat it as cash. So the fact that it's not in fact like there's a very happy to get into the, you know, finance game theory of it, but maybe for another time. Even the fact that they're issued as RSUs is really an accounting trick to basically say that basically Wall street agrees to treat that cost different than a salary cost. And so therefore it looks a little bit better to have our issues as a separate line item. And so if all those things you got rid of, it's just another sort of quirky way of paying people. Most companies already have a stock purchase program, right. Where you can either, you know, encouraged or get a discount to buy stock with a portion of your salary. So there's various ways they already get sort of joint ownership with the company. So yeah, I really think it's sort of time for a revamp. But it's possible to happen with these disruptions where if some companies are really like, if people are really worried about what's going to happen then shifting some compensation from you know, stock based to cash based for public companies, that's the point of being public, right. Is that these things are like mark to market. That's much harder for a startup that either doesn't have the cash or doesn't have a reliable market value.

0:10:00.240 --> 0:12:17.590
<v B>Yeah, yeah, totally right. And then the other which like kind of got popular in like a certain niche of companies is this like 10, 20, 30, 40 vesting schedule. So this, I've never worked at a place that has this. Amazon is definitely the most famous. I think Snapchat had this for a while, although I think they abandoned it. But the idea was that you would get 10. So if your grant is, let's say 100 shares, to make it simple, you get 10 of those shares your first year, you get 20 the second year, 30 the next year, and 40 the final year. And so it would kind of backload the equity. I never understood that. I always felt like, shouldn't it be the opposite? Because the first year you don't have any refreshers or anything. And so it never really made sense to me. That, that's one that just boggles me. I really don't know why they do that. I mean, people have said cytically, oh, it's because I think the average tenure at Amazon is very low. So the average tenure. Okay, kind of side rant here, but like people say the average tenure at Bang is, is four years, but it's actually much longer than that. That's only if you measure all the people who have left. Right. You can't really do it that way. You have to assign a tenure to the people who haven't left yet. You know, if you give them a tenure of infinity, well, now the overall, the average tenure is also infinity, right? If you give it zero, well then, if you don't count it, well, then you get four years. But neither of those are really appropriate. So the tenure is pretty long. You know, at Amazon people said it was, I think 18 months or something. Again, it's probably longer than that, but it's definitely a lot less than the other companies. But, but I feel like that's probably not the reason they did the October 20, 30, 40 would be because they plan on getting rid of people after two years. But I think it's kind of what you said, Patrick, that, you know, they just came up with something and then it's just too hard to change.

0:12:19.670 --> 0:14:38.310
<v A>I do have a separate thing we should move on. But the, it's like reverse survivorship bias. So there's this thing where people first start thinking about, you know, writing code to back test stock trading. So they take the s and P500, you know, top 500 companies, and you look back in time. But what you are missing out by doing that is like the stock index changes over time. So if you take companies that exist today, you're not looking at all the companies that have failed. So if you want to go 10 years ago and play Ford, you need to start with the set of companies that were in existence 10 years ago, not the set of companies in existence today. Because when you go backwards in time, you're going to lose companies that haven't yet. Started, but that's fine. But you're also going to lose companies that failed, which is hugely impactful. And so they call that survivorship bites the same thing as planes which returned from World War II. And analyzing where they had bullet holes misses the fact of, like, that's not where you should, you know, patch up the bullet holes. There's like a famous meme about this, right, because you need to look at the ones that went down, right? Like, those are the ones. But these come up like you were, you were sort of mentioning in, in the, in tenure computation, which I didn't actually realize. So unless your company is like super old to where people retired and everything, you don't like, sort of reach steady state. And then you, then you can't account for, for sort of flex. The other one I saw it is someone was talking about, you know, we, we've talked previously about starting to, to run and learning to run or whatever. You know, maybe not when we were younger. And people are talking about this is standard of qualifying for the Boston Marathon, where you have to run pretty fast in a marathon. And they were talking about the number of years it takes to qual, like, train before you qualify for the Boston Marathon. But they could only take people who, who had qualified for the Boston Marathon and saying how long they trained for. But like me, I've run for a few years. One, I'm not anywhere close to qualifying. And even if it was my goal to qualify, I don't know that I could. It just takes a lot of work to get there and I'm not particularly naturally talented for it. So I would never show up in the numbers even if I had it as a goal and trained for 10 years. So what does it mean to say on average people took two years? It's. On average people who succeeded, succeeded in two years. Right. Like, what about all the people who didn't? You have to put that number in or else, unless you know that's going to happen to you, you don't know if that's a, a valid statistic.

0:14:38.870 --> 0:14:41.550
<v B>Yeah. We call this a type 2 error or a false.

0:14:41.550 --> 0:14:42.550
<v A>Oh, it has a name. Good.

0:14:42.820 --> 0:14:57.060
<v B>Yeah, it's all the, the things that, that, you know, that went wrong that you never saw. So all the people who didn't qualify for the Boston Marathon because they sprained their ankle and you just never saw it because they never applied. Yep.

0:14:57.780 --> 0:14:58.340
<v A>All right.

0:14:59.460 --> 0:15:03.900
<v B>All right. Actually, well, we, we should at least just put a, put a bow on this. I mean.

0:15:03.900 --> 0:15:05.260
<v A>Oh, sorry. Yeah. Go for it, Amazon.

0:15:05.260 --> 0:15:38.250
<v B>Well, no, I Mean, so, okay, the question is, should you blacklist or should you just not apply at any company? Well, okay, almost every company has a cliff. You're not getting around that. Maybe, maybe they'll all listen to us and they'll get rid of their cliffs. I know OpenAI got rid of their cliff, their one year cliff about a year ago, but. Okay, so we're not gonna get away from that. But like the, the 10, 20, 30, 40, should people still go to those companies? I guess it's. I guess maybe it's just not the most important thing, you know, it's not a.

0:15:38.330 --> 0:16:04.960
<v A>That's what I was going to say. If you're comparing two companies, I think the incentives are poor for fit and change ability for the backloaded, things like 10, 20, 30, 40. So I would be, it would be a notch against. But I don't think it's a. If that's the way to break into the industry, to get into a tech company, if that's the people that are going to, you know, pay you what you feel you're worth, then like, I don't think it's a reason to not go there.

0:16:05.600 --> 0:16:07.280
<v B>Yeah, I think that's fair.

0:16:07.280 --> 0:16:12.880
<v A>So unless you think there's foul play, unless you have evidence that they're purposely churning people out within a year or two.

0:16:13.760 --> 0:16:37.150
<v B>Yeah, I think we're in the same boat. I mean, I would say maybe a little stronger. You know, for me it's definitely a yellow flag. Companies that do 10, 20, 30, 40, it is definitely a concern. Uh, but again it's. If you have a good boss, a good team, it seems like the work you're interested in, then it could easily overcome it. Um, all right, on to news. Patrick, what's your news?

0:16:37.310 --> 0:18:58.840
<v A>All right, so my news, probably most people. Well, actually, I don't know, I'm a little curious. I don't feel like it got as much coverage. I just. As it should. But it is now passed, which is the Artemis 2 mission, which was the United States NASA trying to send people not in orbit but in a loop around the moon. So they were influenced by the moon's gravity to sort of slingshot around the backside of the moon and back to Earth. And this happened as of this recording like last week. And it was a, you know, very exciting thing. But it was a little, I don't know, under underappreciated initially. I think once it was happening, the news coverage kind of picked up, but a lot of people just like weren't talking about it. It didn't have hype if you ever, you know, talk to someone who was around for the original moon landings, which are now, you know, whatever, 70 years. It was everything. It was, it's a major deal. It was like absolutely wall to wall coverage. And I feel this was just, you know, that's cool and people just kind of moved on for it. And I've been trying to come up with a theory as to, you know, sort of why and there's ones, you know, just the world is a little bit of a different place politically. Our country here in the United States is, you know, and what do you say, a bit of turmoil, I guess, internal stuff. So it's hard to get, you know, everybody to rally on any cause. But then I think that as well, I feel with the amount of launches for things like Starlink and the International Space Station, I think it's a bit just of it feels like, yeah, okay, we do this all the time. And I think people don't understand the sort of energy difference needed to go to the moon versus go to low Earth, low Earth orbit. And so anyways, I just want to give a shout out for, you know, the work of getting back to the moon. And you know, I, it's a little controversial whether or not moon base will be set up, but definitely, you know, as far as like technology demonstration and the ability to push the envelope as well as potentially unlock lots of new resources for manufacturing, it's a, it's definitely an exciting time and you know, it's going to be a little bit of a gap before the next Artemis Artemis 3. But you know, Artemis 2 was an exciting watch if you got the opportunity to tune in during it. If not, plenty of YouTube retrospectives go back and check out all that happened.

0:18:59.720 --> 0:19:43.090
<v B>So, so they, so this already is. I follow space and for some reason space doesn't, doesn't follow me either. Like I don't recommend it anything about space. But so, so the, the whole project started and ended. So the astronauts just slingshotted, the moon came back and everyone's safe and yes. Yeah. How did, yeah, I mean I should even like us, you know, non space enthusiasts like that should have showed up on if you I have the. Oh man, I'm not gonna go on a huge tangent on how people get their news. I get my news from swiping left on my Android phone and I just get the Google news that's kind of everybody gets so like really that could have been a good place for it.

0:19:43.490 --> 0:20:17.730
<v A>Yeah, I, I, maybe that is a, maybe it's more of a commentary on people's individual funnels or however you want to call like individual filters that we all have into our new systems now where they're all biased. But I guess if you do not that people below a certain age turn tune into the news. But here since they launched, I mean I live in Florida, so they launched from Florida, so we got some local news coverage of it. So we do sometimes watch the local news just to hear, you know, local events. But yeah, in general national news. It seemed a little undercovered in my opinion.

0:20:18.450 --> 0:20:20.610
<v B>Yeah, yeah, totally, totally agree.

0:20:20.770 --> 0:20:54.820
<v A>They're back. Successful. Yeah. So four astronauts from the United States and one from Canada. So three from the United States and one from Canada went into orbit a sort of half orbit around the moon they went again like sort of slingshot it around the backside. So they took a bunch of pictures. They didn't get super close. They got close enough that it was very big in their field of view, but they could kind of see the whole thing from side to side. Not super low. And so by virtue of being so far away from the moon, when they went around the backside, they actually had a record for farthest travel for a human away from Earth.

0:20:55.140 --> 0:20:55.780
<v B>Oh, wow.

0:20:55.780 --> 0:20:59.900
<v A>And so they went further than any human has gone before. If we want to start trying to

0:20:59.900 --> 0:21:22.980
<v B>change Star Trek, that's cool. You know, but to your point, when the SpaceX people caught the rocket, when it landed in the chopsticks or whatever you want to call that thing. Okay, yeah, that I saw that was all over Google News. So I do think that this wasn't able to get the kind of attention it deserved for whatever reason. Agreed.

0:21:22.980 --> 0:21:45.020
<v A>But a shout out either way, you know, I think it's something exciting and you know, I'm here for the I. There's a lot of people, young people who are have their imagination captured by these sort of like big technological feats. More so than increasingly like yeah, another smaller, you know, phone or you know, a better chatbot.

0:21:45.020 --> 0:21:45.220
<v B>Right.

0:21:45.220 --> 0:21
<v A>I mean people are going to start coming of age who just grew up with that stuff. And so I'm hopeful that some of the at least even growing up for me, you know, going deep under the ocean, going into outer space, there's always things that just feel a little special.

0:21:57.800 --> 0:25:18.940
<v B>Yeah, yeah, that makes sense. Cool. All right, my news story, first news story is the Gemma 4 release. So Gemma is a family of vision language models or multimodal models that, that are totally open source, open weight and are small enough that you can even run them on your phone. You could definitely run them on your desktop or laptop. And I actually did something kind of cool with Gemma 4 already. I got it to. I fine tuned a very small Gemma 4 model to try to correct grammar, and it actually works really, really well. I tried this in the past with really small models and I never got good behavior. So my vision was to build something into my phone where no matter what app I'm on, it would just. Maybe it would be a custom keyboard, I don't know, but it would just correct my grammar. So it'd look at like the entire, you know, content of what I was saying. And. And it would go through and fix grammatical things. And the models were never very good and they were hard to fine tune, et cetera, et cetera. But Gemma 4 actually kind of crossed that barrier where I ran through a fine tuning on my desktop and okay, real quick thing on fine tuning. So, you know, fine tuning basically just means continuing training. And so even though it's a small model, it's small because of all these tricks, but the tricks don't work at training time. So think of it as like you bake a cookie and then you put icing on it. But if you put the icing first and then bake it, or if you try to rebake a cookie with icing on it, the icing gets all hard and it gets kind of not very good. So it's kind of two different processes. And so what you need is to. It might be that to bake the cookie or in this case, to tune the model, you need like a pretty beefy setup and then you can sort of, you know, shrink it later. So. So I got my desktop together. I, you know, fine tune this model. Oh yeah. So there's. There's a whole bunch of tricks now. Low rank adaptation, all these other tricks where you can fine tune it even if you don't have, like a really beefy desktop. Mine's okay. It's got 16 gigs of VRAM, which is pretty good, but not enough to just do a pure fine tuning. Long story short, the tooling is amazing. It's come so far. You don't have to be like a super red in expert to do these things now. I literally just handed it a text file that I got off the web, a CSV that had, you know, a bunch of grammatical mistakes and then the corrected sentences, and then it went off and it just. The validation loss went from like 50% to 80% or validation, you know, accuracy. So the fine tuning definitely did something and it might actually work. So I haven't Finished it yet, but hopefully I'll have something that just corrects grammar for every app. That would be pretty neat.

0:25:19.670 --> 0:25:26.310
<v A>Oh, that's awesome. I was gonna ask how you got good training data. Cause I would give it poor training data if I just gave it my own conversations.

0:25:26.390 --> 0:25:26.950
<v B>Oh, yeah.

0:25:26.950 --> 0:25:27.910
<v A>It doesn't seem correct.

0:25:28.070 --> 0:25:56.360
<v B>I just went on online and said, hey, I need a giant data set full of grammar errors. And I found one. It's called the. I think the C200 C200M data set. But if you just go online, look up grammar error correction data set, you'll find it. It's the first result, and it's got way more grammar corrections than I'll ever use in my entire Life. It's like 200 million of them or something. Oh, wow.

0:25:57.720 --> 0:26:31.970
<v A>Yeah, I mean, I. I would love to try the fine tuning. Unfortunately, I've always. We've talked about this before. I've always lagged in playing video games. I always play older video games. I never had a good gpu. Now I kind of want a good, better GPU than the one I have for both playing video games and for playing with, like, model training. And. And I. It's a sixtortion. I like, I know, like, I could buy one, but then I see articles saying, oh, this is a, you know, $500 GPU. And I go on eBay and it's like, $800. I'm like, no, I'm not. I'm not doing that. Like, I'm sorry. It's like, it's like a principled thing. Like, I just. I can't.

0:26:32.050 --> 0:26:42.730
<v B>I. No, if I had to start over, I would. Well, Well, I use my GPU for gaming too, but. But if I didn't have the GPU, I would probably just use an EC2 instance for me.

0:26:42.730 --> 0:27:00.760
<v A>Yeah. That's what I need to do. It's just one more step, one more, you know, unbounded. Like, it's. It's. Rather than. I spend this money and I do whatever I want. You know, it's like, okay, each time I do this, if I make a mistake is. You'll pay a little more psychologically. But you're absolutely right. I mean, way cheaper to do that stuff often with rented and probably better hardware.

0:27:01.480 --> 0:27:03.560
<v B>Yeah. Than you would have at home. So. Yeah.

0:27:03.560 --> 0:27:36.840
<v A>Yeah. That's something I definitely would love to get into. And I think there's something to be said too, to the approach you're taking. And like, small models in general run a lot faster, run in more places. So even if you have the ability to run a big model or subscription having like a small model that you can use for specific tasks. And we're going to talk about, you know, potentially future, more agentic sort of work, having things offloaded to scripts or to small models to do. And just the speed, the tokens per second that you get is just so much faster. I think you're going to see more of that blended stuff.

0:27:38.130 --> 0:27:39.410
<v B>Yeah, totally agree.

0:27:41.090 --> 0:27:51.010
<v A>Turning the corner to yet one which is the news all over the place. This, this time it's okay. It's a small. You gotta click the link in the show notes or just search it. Hatress.

0:27:51.250 --> 0:27:51.690
<v B>I'm.

0:27:51.690 --> 0:28:04.690
<v A>I hate this game so bad. I guess that's why it's named Hatress. First of all, I watched the. I don't know if it's summoning salt. Whoever does the, you know, Tetris like infinity scores and the broken levels.

0:28:04.930 --> 0:28:05.730
<v B>Speedrunners.

0:28:05.950 --> 0:29:27.260
<v A>Well, I love these. I watch them. I'm like, I'm gonna play some Tetris. I suck. Like, oh, this is the worst. It's so bad. Like I, you know that I'm like, you know, Tetris strategies. Anyways, so I've been playing, I think it's a Game Boy advance homebrew game, Apotris on a little Retroid handheld that I have. So if you're interested, there's like sort of a modern interpretation. So it has like some niceties and, and a lot of customization. So I've been playing that a little bit, but still suck at it. So Hatris though, instead of. So okay, people don't want Tetris is there's all the different shape pieces. There's various ways of selecting them. So sometimes it can be literally just whatever the next piece is is random. Sometimes it's selected from what's called like a draw bag. So they'll put all the pieces in a bag metaphorically and then draw one out at a time. So you kind of. The longest you have to wait for a certain piece is bounded. That's what more sort of modern interpretations do. Hatris instead says, I'm going to run a little like, you know, program heuristic machine learning thing, whatever the computer. And it's going to pick the worst next piece. So you start off getting one of the, I don't know what they're called, like the Z pieces. And you kind of just keep getting Z pieces which are kind of hard to put together.

0:29:27.420 --> 0:29:27.900
<v B>Yeah.

0:29:27.900 --> 0:29:44.660
<v A>But then unless you force it into an option where there's two ways for you to score A line. It's just going to give you whatever piece won't arrange to fit in the notch that you have left. So you have to, like, carefully plant. And to make it worse, it lets you go as slow as you want.

0:29:45.060 --> 0:29:45.380
<v B>Right.

0:29:45.380 --> 0:30:12.970
<v A>So you feel like this has got to be easy. But I literally. It's like getting one point, and then I'm, like, flipping the table, excited, you know, the whole thing going off. It's amazing. But, yeah, zero is where I was started. And maybe I'm just bad. So you can be like, oh, man, that Patrick guy. Like, it clearly just sucks. Like, I. I'm way good at this. Go play it. Maybe you'll score a lot of points, but it is enormously frustrating. I can only play for, like, two or three times, and then I'm like, not.

0:30:12.970 --> 0:30:16.970
<v B>I. I'm done. I can't. I can't. I've got to try it. Hilarious.

0:30:17.370 --> 0:30:21.050
<v A>If you get more than zero, don't email me, because I don't want to know.

0:30:24.500 --> 0:30:28.780
<v B>Oh, man, this is so good. Yeah, I'll have to try a report back. All right. All right.

0:30:28.780 --> 0:30:32.740
<v A>He's going to be doing it while I'm going. A little monologue and give him time to play. Just. Yeah.

0:30:32.740 --> 0:30:35.700
<v B>At the end of our episode, it's like, oh, I got 10.

0:30:36.580 --> 0:30:38.500
<v A>Oh, dude. Yeah, yeah, I'll hang up.

0:30:40.500 --> 0:30:48.820
<v B>Oh, man. All right. My second news is dripwarts School of Drip. So this is.

0:30:49.570 --> 0:30:52.690
<v A>I don't have to click the YouTube link. I already. I already know what it is.

0:30:52.930 --> 0:30:53.730
<v B>Do you really?

0:30:53.890 --> 0:30:55.050
<v A>Yes. Okay.

0:30:55.050 --> 0:31:12.690
<v B>So I play board games every Monday with a group of guys, very fun group. And one of them showed me this. This has got to be the most viral AI video that that's been created. I mean, I don't think anything has even come close to this in terms of virality.

0:31:13.330 --> 0:31:29.910
<v A>I disagree. I think there's been viral videos that aren't clearly AI Their AI that just. It just. This is. This is the most overtly AI thing to go viral. Like, it's clear that it's not bad AI. Like, it's. But it's clear that it's AI. Like, it's not real.

0:31:30.070 --> 0:31:30.390
<v B>Right.

0:31:30.390 --> 0:31:32.910
<v A>I think there are ones that have been purported to be real that were

0:31:32.910 --> 0:31:36.830
<v B>just viral and like, AI. Oh, okay. Yeah, that's true. Yeah.

0:31:36.830 --> 0:31:38.150
<v A>And I don't have evidence to be.

0:31:38.150 --> 0:32:41.190
<v B>Okay. So, yeah, what I mean specifically is, like, this is one where the whole point is it's AI. No producer would ever make this. And we all know that. And so. And it's. It's it's just hilarious. Amazing premise is, you know, Harry Potter, but if, if it's kind of like gangster style but also like, like, you know, high fashion, you know, all mixed together. Very, very funny. I just burst out laughing almost immediately. This is great. I mean, I feel like there. I'm actually surprised that OpenAI killed Sora because I do think that there is a play there. You know, there's, you know, really like opening it up and letting the whole world think about what are funny things that we could mash together and sharing that. That seems like super powerful. I mean, Maybe that's what TikTok is going to become. Is that

0:32:43.910 --> 0:32:48.910
<v A>okay? But you. Oh, I don't know. I don't know how much conspiracy theory, tinfoil hat we want to go about. Why?

0:32:48.910 --> 0:32:49.870
<v B>Yeah, go, go all over.

0:32:49.870 --> 0:32:59.590
<v A>Okay. So no, no, I also thought the same thing because Sora was like trying to do a thing. Like my kids kind of knew what it was, which is always like a pretty big thing for, for tech.

0:32:59.590 --> 0:32:59.990
<v B>Yeah.

0:33:00.790 --> 0:33:50.870
<v A>So we don't have it in the news of the episode now because like I think it's still in flight. But Anthropic has been teasing this mythos, right. Their Next sort of LLM version after Opus 4:6, I guess. And it's supposedly, you know, earth shattering, which to be fair, they all claim to be before they come out. So I'm making no statement there, but the rumor is that it was sort of like 10x more training than the last one, but that there's some sort of do you know, like the thing that happens with Grokking. So when a machine learning model trains and it hits some sort of like asymptote and it seems like it's just sort of stabilized in performance, but actually under the hood it's sort of like self organizing and then it, it's able to like sort of reach a new level after it sort of gets through this barrier of not actually getting better loss.

0:33:51.110 --> 0:33:51.470
<v B>Right.

0:33:51.470 --> 0:34:03.360
<v A>So your loss, your validations aren't getting better, but the model is sort of actually organizing itself to. To the point where then it sort of like unlocks additional capacity and then trains further. That's Patrick's non mathematical.

0:34:03.360 --> 0:34:10.040
<v B>Yeah, it's like defragmenting your hard drive kind of and then you can. It can run fast enough for you to do something else. Right.

0:34:10.040 --> 0:34:23.640
<v A>And so supposedly the rumors go that Anthropic's mythos took like whatever ten or a hundred x what the last one did, which was already insanity, but that it, it, it was a Huge unlock.

0:34:23.640 --> 0:34:23.840
<v B>Right.

0:34:23.840 --> 0:34:48.680
<v A>So there's all these scaling rules and so it would break the sort of projected sort of regressions fit to the sort of like how much training versus performance you get. And supposedly, if that's true, right, then OpenAI needed to free up the SORA services so that they could use the additional hardware to get the training budget they needed to basically not get leapfrogged.

0:34:50.120 --> 0:34:53.240
<v B>Wow. Yeah. I mean, that would be remarkable if true.

0:34:53.700 --> 0:35:09.140
<v A>I mean, to be fair, it's like a rumor about a rumor that is like, it could have just been. It's not making money. But like, I, it feels there had to have been some reason to sort of like deprecate it and not just wait for the new version to be better.

0:35:10.180 --> 0:35:59.190
<v B>Yeah, I mean, I, I, well, okay, I'll tell you my. I don't know if this is a conspiracy theory or just a theory. My theory was that, you know, OpenAI's brand recognition is suffering a lot. You know, like Sam Altman's name recognition is suffering a lot. Did you see the Onion interview of Sam Altman? No. Oh, my God. It's hilarious. I mean, obviously completely fake. It's just, you only see a transcript. There's no video or anything. But very, very funny. But. And I felt like SORA is kind of a big brand risk. Like even this Harry Potter thing. Absolutely hilarious. But, but I don't think the brands, you know, Balenciaga, I think Louis Vuitton or literally by name, I don't think they're happy with that. So I, I felt like they were trying to just limit their, their exposure.

0:35:59.190 --> 0:36:01.590
<v A>But they had a big partnership with Disney.

0:36:01.590 --> 0:36:01.870
<v B>Right?

0:36:01.870 --> 0:36:06.070
<v A>Like, like a billion dollar, like potential value.

0:36:06.790 --> 0:36:22.440
<v B>Yeah, yeah. And it's all just gone. It's kind of wild. I mean, I, yeah, I would chalk it up to the same thing, like a brand thing, even for Disney. But, but maybe, maybe it really is the compute. We'll have to wait and see when this new mythos comes out. What's going on there.

0:36:24.680 --> 0:36:28.200
<v A>All right, Talking about science fiction. It's time.

0:36:28.600 --> 0:36:29.080
<v B>Nice.

0:36:29.480 --> 0:36:45.480
<v A>For Book of the Show. We're going to take turns this time. So I'm going to do a Book of the show and Jason's going to do Tool of the Show. So my Book of the Show. Late. Better late than never. They say Project Hail Mary. So if you've well timed with Artemis, which I think was coincidental.

0:36:45.800 --> 0:36:51.760
<v B>Real quick, not to interject too much, but I am just starting this book, so you can't try not to spoil it.

0:36:51.760 --> 0:36:59.400
<v A>Yeah, yeah, no, I'LL be good. I'll be good. I always undershoot, I think. But it's also, there's movies, there's trailers. It's very hard to avoid.

0:36:59.720 --> 0:37:00.320
<v B>That's true.

0:37:00.320 --> 0:37:16.840
<v A>There's like levels of spoilers, but I'll be even better than, I think the trailers. I think trailers are really good. Revealed too much. Um, but Project Hail Mary is a book about, you know, science fiction, about a, you know, person out trying to, you know, save a dying Earth.

0:37:16.840 --> 0:37:17.200
<v B>Right.

0:37:17.600 --> 0:37:28.160
<v A>And there's lots that goes into that. It's a bit of. If you've, you know, kind of gone through Andy Weir's other book. Oh, why is the name escaping me now? The one.

0:37:28.160 --> 0:37:29.040
<v B>The Martian. Right?

0:37:29.040 --> 0:37:29.760
<v A>Oh, the Martian.

0:37:29.760 --> 0:37:30.160
<v B>Thank you.

0:37:30.160 --> 0:38:15.180
<v A>I was going to say a different sci fi Mars book. I was like, no, that's wrong. Thank you. The Martian. It's a little bit the same, right? It's like sort of tech science grounded. He does a pretty good job about that. But then also like this sort of hopeful, like, you know, you want to root for the good guy, you know, and not have like, sort of the same chaotic, you know, black, mysterious, dark. Like, you know, is this good or bad? You know, it's just like you want to root for someone and so it's kind of in the same, same theming. And I will say I was encouraged the book for a very long time do like Martian Man. I don't know. Like, the book was good. It's a short read compared to most of my recommendations. Most of the books that I read, very easy read. I went on a vacation recently and actually like first two days of the vacation, I basically read the whole book.

0:38:15.420 --> 0:38:15.980
<v B>Oh, wow.

0:38:16.300 --> 0:38:22.940
<v A>We did a very long flight, you know, leaving the country, going to a different continent, so.

0:38:22.940 --> 0:38:23.540
<v B>Oh, wow.

0:38:23.540 --> 0:39:32.620
<v A>It was a very long flight. So to be fair, okay. It was still a lot of reading. Book was good. Book was like, you know, pretty good for what it is. Like, you know, not super deep but, you know, well grounded. Had some, some issues, but, you know, just around the edges. I'm not super critical of science stuff generally. Other than just being like, yeah, right, that's. Come on. That's, that's. Nobody knows such different disciplines to this level. You know, that's just unrealistic. But other than that, you know, plausible, I guess. But I, you know, the movie, people love the movie, but I feel like I wasn't as keen on the movie as I was on the book. The book was definitely better than the movie, but the book was not as good as was Hyped up to me, but it was so short. It's got to be worth it for. Just, like, sometimes you need the casual stuff, right? Like, you can love the deep grindy. This is really making me think. But then, you know, sometimes just taking that. That sort of like, I'm that way about Marvel movies. A lot of people rag on Marvel movies, superhero movies. I actually like them. I just. I want to go in and watch something I don't have to think so hard about. Like, you know, I don't want it to be, you know, some deep, brooding, you know, at the end, was he in the dream or was he awake or, you know, okay, I don't.

0:39:32.620 --> 0:39:33.940
<v B>I don't. Whatever. I don't want to know.

0:39:33.940 --> 0:39:35.860
<v A>Like, just tell me what happened.

0:39:37.860 --> 0:39:43.660
<v B>Yeah, I'm right there with you. I mean, you know, I go into a movie. I saw the Super Mario movie with the boys.

0:39:43.660 --> 0:39:45.300
<v A>Oh, I want to see this. Yeah, yeah.

0:39:45.300 --> 0:39:53.550
<v B>They. They loved it. And you just. You have to go in with the right expectations. You're not going in to expect something really high concept.

0:39:54.030 --> 0:40:03.710
<v A>Yeah. So Project Helmet is like a near, near future. Very grounded, not super, you know, far flung, but definitely, definitely worth a read. It's pretty easy read.

0:40:03.790 --> 0:40:04.110
<v B>I.

0:40:04.110 --> 0:40:14.350
<v A>You know, on a scale of such books, I guess, and reasonably short, so. I've heard the audiobook is also really good. I normally listen to audiobooks. I actually read this one on my Kindle, so.

0:40:14.350 --> 0:40:14.750
<v B>Oh.

0:40:15.160 --> 0:40:18.440
<v A>Um, I don't know if the audiobook is. Is as good as it's cracked up, to be honest.

0:40:18.440 --> 0:40:24.280
<v B>I'm digging the audiobook. I mean, I've only just started, but voice is fine. So. Yeah, I think it's a good voice.

0:40:24.280 --> 0:40:40.920
<v A>All right. Shout out for the audiobook. But, yeah, most people probably know this now because the movie's out. My daughter's reading the book after the movie and having a good time as well. So I don't think, you know, it's one of those. They're. They're just different. They're the same story. It's pretty true to each other, but they're still pretty different in the, you know, depth of content they have.

0:40:41.540 --> 0:40:52.260
<v B>So you shared your project Hail Mary literature with your daughter? And I shared my dripwarts School of Drip with my son. Which one of us is the better father?

0:40:53.060 --> 0:41:07.440
<v A>I. I showed my kids that video too. Just to be clear. Just typically there is some vulgarity. So if you, like, just be mindful if you. If that's something you're concerned about. There is a. There are a few minor Profane.

0:41:07.600 --> 0:41:15.400
<v B>There's definitely. Yeah, there's probably some F bombs and stuff like that. To be fair, I only showed it to my. To my 12 year old, so I don't think I would show it to a six year old.

0:41:15.400 --> 0:41
<v A>But just watch it first. That's what we're saying.

0:41:18.080 --> 0:42:16.690
<v B>Yeah, totally. Definitely a viewer. What is discretion advised or something. Cool. All right. Yeah. And if you do read the book and get it from the library, instead of paying for expensive movie tickets, you could turn around and give that extra money to us by following us on Patreon. We do really appreciate all of our patrons. All the money just sits in an account where we use it to help out the show. We try and get more folks interested in the show, especially folks who are starting their career. And so all of us collectively really appreciate your donations. Okay, so tool of the show. Patrick's going to skip this time. Okay, Patrick, this is going to either going to blow your mind or you already know about it. I wanted to play Final Fantasy 6. That's the one with Edgar and Sabin and Terra. Basically. It's. It's actually Final Fantasy 3 in the US but they call it.

0:42:17.570 --> 0:42:18.610
<v A>Okay, I know which one this is.

0:42:18.690 --> 0:42:48.310
<v B>So it's with, you know, Tara has like got a connection with the ESPers, and it turns out I'm not gonna spoil it, but basically love the game, love the story. Haven't played it in probably 20 years. So I thought more than 20 years. So. So I thought I want to play it, but, like, I know I'm just going to beat it. I mean, if I could beat it at 12, I'm sure I could beat it at 40. Whatever. So I was like, how can I play this game and get the story and experience but, like, still be challenged? Right.

0:42:48.550 --> 0:42:49.070
<v A>Okay.

0:42:49.070 --> 0:45
<v B>Oh, man. So I found out that people make. You know, I've always done like ROM hacks for translation, so I could play like English versions of Japanese games, but there's a whole ROM hacking community just for making games either harder or more interesting or both or what have you. And so I played Ogre Battle. It's not Ogre Battle hard type. There's an Ogre Battle mod which I can look up, but basically made the game harder but also balanced all the units and everything. There's separately an Ogre Battle hard type. But that was just very frustrating. Like, at some point, what I don't want to do is to play the same game but have to grind for like a hundred hours, you know, like, that's not fun. What I want is just like a harder experience, but taking roughly the same amount of time. You just have to be more strategic. Right. So I finished that ogre battle mod, and then I found this amazing Final Fantasy 6 mod. It's called Tea Edition, and it adds just an insane amount of content. For example, there's achievements. There's like all these extra side quests. There's just a ton of content. There's new bosses. There's actually like. There's like different mechanics. So, for example, you know, generally in Final Fantasy, you almost never. You almost never like, cast certain spells. Like, you know, they're kind of in the game for continuity. But like, when do you really ever, like, poison your enemy? It's pretty rare, but. But now it's like there's different bosses that have certain weaknesses, and that kind of encourages you to use the whole gambit of spells. Phenomenal. I mean, I'm about halfway through it. I was worried that, you know, when you play a ROM hack, you know, there's always the risk that the ROM hack, the hack designer just isn't as thorough as the original game designer. And like, you'll get maybe 40 hours into the game and now you're just stuck. Like, you. You just. You can't make progress. And so you're. You wasted your time. But this TEA Edition is super popular. It's got a thriving community. It's been around forever. Tons of people have beaten it. And so you know that it's like you're going to get to the end, but it's super, super fun. I've been playing it on my phone and it's. It's a blast.

0:45:20.380 --> 0:45:25.540
<v A>Oh, that's awesome. I have never played Final Fantasy. I always called it Final Fantasy 3,

0:45:25.540 --> 0:45:30.460
<v B>but 6, I guess. Yeah. Wait, so you never played 3?

0:45:31.020 --> 0:45:31.420
<v A>No.

0:45:31.900 --> 0:45:33.380
<v B>Oh, man. Oh, yeah.

0:45:33.380 --> 0:45:35.980
<v A>7. I did 7, but never 3.

0:45:36.700 --> 0:45:47.490
<v B>Yeah, I actually stopped at 3 because I never had a PlayStation. Oh, I missed out. I actually. I played 7 a year ago, but, you know, my childhood, I missed out on seven.

0:45:47.730 --> 0:46:09.410
<v A>Is it worth, like. I guess so. When. Sometimes when you play the old games, like, if you don't have nostalgia, they're really tough to play for, like quality of life reasons. Just like lots of random stupid stuff you gotta do or just like, you know, they tried to make the game longer. I, you know, I don't know, stuff like that. Like, is this game still playable or do you gotta have nostalgia for it or is this gonna be like, nah, there's better games to play.

0:46:10.850 --> 0:47:13.300
<v B>It's really hard to say. Okay, I would say the story is phenomenal and probably worth playing, even if it's the first time. Yeah, the story is very, very good. I feel like the pace is. Is good. I'm debating whether if you go straight to T Edition, that might be difficult because you are kind of expected to. Yeah, I would play the regular edition. T Edition would really be for people who want to play it a second time, but I would definitely go back and play. If you're going to play the regular version. There's some Android ports or iOS ports that are probably better than playing it on emulator, but I think it's a great game, very solid. It shocked me the first time I played it where I thought I was, you know, at the end of the game. And it turns out you're only at the halfway point. Kind of like if you ever played a link to the past.

0:47:13.620 --> 0:47:14.060
<v A>Yes.

0:47:14.060 --> 0:47:22.500
<v B>Super Nintendo. Yeah. Remember when, like you fight Ganon and then like he throws you into the dark world or whatever they call that, the other world.

0:47:22.500 --> 0:47:25.140
<v A>Do we have to give spoiler alerts for like 30 year old guests?

0:47:28.340 --> 0:47:43.140
<v B>It's kind of like that where I. Although in that game it was more obvious because, like, okay, clearly there aren't just three things to do in the whole world right here. It's less obvious. Right. I literally thought, well, the game's over and I was only halfway done. Ah, nice.

0:47:43.780 --> 0:47:51.700
<v A>Yeah. I'm ashamed to admit I also have only ever made it, like through the very beginning of Chrono Trigger. Everyone's like, it's the most amazing RPG ever.

0:47:51.780 --> 0:48:19.100
<v B>But yeah, I'm. I'm bad with RPGs. Yeah, Chrono Trigger was very good. I think the, the. I could see people getting stuck in Chrono Trigger because there's some parts where you just have to kind of persevere to make the other parts kind of worth it versus this, I would say Final Fantasy 3. There's constantly something interesting going on. All right, all right, Well, I am

0:48:19.100 --> 0:48:23.140
<v A>trying to go back and play more of these, so I will. I'll add it to the list. We'll see if I get to it.

0:48:23.220 --> 0:49:36.040
<v B>You should totally play the ROM hack. It's weird it says that. It doesn't. There's glitches on emulators, but I think that was back in the past, just quick, quick history of emulators. So, you know, the machine has its own instruction set and Patrick probably knows this way better than I do, so I'm going to try my best. You know, it has like, you know, move things over here or this other type of instruction or whatever, like at really low level. And your computer doesn't have all the same instructions. Right. Your phone probably has different instructions in your desktop. And so it like it has sort of translate the instructions from, you know what it would give a real Nintendo to what it has to give your computer. And back in the day, like they couldn't just translate it perfectly because it became really expensive. Like there might be an instruction that was super fast on Nintendo, but it'd be really slow on your computer. And so, and so when they made these hacks, I guess the hacks only worked on the Nintendo without, without glitches. But now the emulators are like absolutely perfect because the machines are just so fast. So. So there were these warnings about, oh, if you're playing on emulator, the sound won't work. But everything worked perfectly.

0:49:36.360 --> 0:49:47.440
<v A>I play almost all old games with the sound off because I find the like repetitive chip tune thing, like only listenable in very small doses. So I basically play with everything on mute.

0:49:47.440 --> 0:49:49.510
<v B>So. So. Ah, yeah, me too actually. Yeah.

0:49:49.510 --> 0:49:51.270
<v A>But even on my Switch, that's like

0:49:51.270 --> 0:49:56.230
<v B>in my case it's because I'm like in the car or, you know, I'm in the passenger seat or something where I don't.

0:49:56.230 --> 0:49:57.110
<v A>Thank you for clarifying.

0:49:58.390 --> 0:50:03.989
<v B>Yeah, I just don't want to bother people usually. Yeah.

0:50:04.790 --> 0:50:10.070
<v A>All right, it is time for the agents to code to the rest of our podcast.

0:50:10.070 --> 0:55:46.410
<v B>Yeah, I mean, why are we even here? Like, can't I just press a button on generate script? Oh man, I mean, what a, what a crazy time we're living in. So quick, quick history lesson. You know, IntelliSense has been around forever. I don't know if folks remember the word intellisense, but. And I don't know how it worked. I mean, I can take a guess. I mean, I guess that, you know, it ingests all of your code on your. On your local computer and then it, you know, builds up some kind of statistical models and then when you start typing, you know, or maybe it doesn't even need statistics, maybe it's totally deterministic. But you start typing, you know, my object.cac and it just fills in, you know, cache, cache, triangle or something like that because it knows that that function exists and there aren't any other functions that start with cac. And so, so it'll just pop that up next to your cursor and you could hit tab and it'll just auto complete that function name for you. So that's been around forever. And that's been great. I had like an Emacs extension at some point a long time ago that did this. So even in the terminal you could do it and everything that's been around forever. And then, you know, the large language models started to get really good. And, and so, you know, there was, they. They started doing autocomplete on steroids where you could use GitHub Copilot and you could start typing, you know, for. And it would just autocomplete, you know, the entire for loop and the contents of the for loop or something. If it was pretty easy to infer from, from other parts of the code base. So that was really cool. And then cursor came out. And so cursor had this neat feature where it would autocomplete at your. Starting at your cursor, but then it would jump around. So for example, you know, in most languages you have some type of import. So in Typescript it's called import. In Python it's called import C. It's pound include. Right? And so let's say as part of the autocomplete, it ended up needing to use a library that you weren't yet importing. You could tab to complete that code and then it would give you a little notification type thing that there's more work to do somewhere else in the file. So you could type tab again and. And it would jump to the top of the file and show you what it wants to autocomplete over there. And then you could type tab a third time and you would, it would do the imports as well. And so, you know, this kind of continued to the point where I think at some point cursor had multi file like you could just kind of keep hitting tab and it would jump around your code base doing various things. And so this is all great. And so then, so people were using that I was using it is very exciting. And then Claude code came around and that was a huge game changer. So this is where instead of just extending an idea that you had partially written, you could kind of ask it in English to do something with your code and it would go off and do it. And that is. Was pretty wild. I remember when that first came out. I post, actually I put my. My most popular post on LinkedIn was basically talking about how awesome Claude code is and a whole bunch of people like ripping on me. This one guy, I'll never forget it. If you're out there listening to us, you're a jerk. Stop listening to us. But this one guy was like, you should be better than this. Like, do better, you know, like, like, you know, like Claude code and these things are. He called it a stochastic parrot, which I found out later is like a common pejorative for LLMs. He's like, you should do better than this. You know, you have a programming podcast and you're trusting these stochastic parrots and just insulting my character and everything. That guy was a jerk. But, but, but it was a very controversial post. It was one of these things that, like, I didn't do it on purpose, but it got, it got, I think, like one and a half million views. And there's definitely, like, pro, you know, proponents and, and antagonists. But I was right, in hindsight. I told everyone early on that this Quadco thing's amazing. Actually, what I'd said that I think was so provocative was I said, I have not written a single line of machine code in my whole life. I run a hundred percent of my code through the compiler, and so I'm already not writing pure code. So if I, if I start running 100% of my code through Claude and I just tell it in English what to do, it's really not changing anything. And that's kind of where I still stand. I mean, I use it constantly now, you know, even today, there are times I have to go in the code and we'll talk about all of that. But in general, kind of where I'm coming from is I'm a big fan and I think it's pretty amazing, pretty amazing times we're living in.

0:55:47.930 --> 0:57:00.550
<v A>So maybe to like, you know, expand a little on the switch from those early sort of like, you kind of explained how cursor was in the early days to, you know, maybe what, like Claude code or Codex or the Gemini, you know, solutions are today. I think we've also seen a lot of iterations, like around the edges. So things like what the mcps, you know, around retrieval, augmented graphs, rags around, like, you know, basically, in my opinion, these are kind of like around how to let the LLM, which is just next token prediction, right? How to allow it to do tool calling, which is something we talked about in the podcast a long, long, long time ago, where we said, hey, these chat things would be really cool if they could reach out and do web searches or connect to Wolfram Alpha or. Okay, well, anyways, turns out people are already working on that. We just didn't know. So everybody had the same good idea, good. But then, you know, so basically, how to interact with tools, and there's been a lot of iterations through, you know, how that works to, to, you know, I don't say it's settled today, but to where it is today. And then the other one is around sort of managing the context window.

0:57:00.550 --> 0:57:00.830
<v B>Right?

0:57:00.830 --> 0:59:00.010
<v A>The hey, how much of your code base of your problem space of your conversation, how much of that can be held and for the context. I think it's been interesting that the models themselves have gotten bigger. But one of the things that isn't super obvious is that even if you will see something like Gemini can handle a million tokens as context, that doesn't mean it's as efficient, both in terms of like how fast it runs, but also in how the quality is at a million tokens versus 10,000 tokens. And so a lot of these models have degraded performance as they reach up to those million tokens. It is unintuitive because it's not like a hashmap or a list or an array that just grows and you get some weird cache effects. But generally, you know, they're kind of well understood. There's these unintuitive mechanisms. So aggressively managing what's loaded into the context and things like compaction is super important. And I think it's been very interesting to see, you know, something like taking your whole code base, understanding how to use tools to search in your code base, how to learn, like which pieces to extract and load up and sort of assume. And I will say it's still not perfect. You still see. Sometimes I see it even using, you know, a tool like cloud code where it'll. We talk about hallucination, it'll hallucinate API calls, but it's in a loop. Now that's the agentic part where it'll try to compile itself and realize like, wait, why did I call, you know, feature or you know, object.foo. foo isn't. Doesn't exist, it's dot bar. And you know, it'll figure it out. But for whatever reason, like this, you know, hallucinating of foo just assumed that, you know, you have a list class, therefore you have an insert. And maybe you don't have an insert because of, you know, some esoteric reason. And so you still see around the edges. But a lot of that is because it's trying to keep the context down in the first pass, which causes own problems. If it loads up all your code base, if it tries to understand all of this, the speed can. Can really, you know, get bogged down.

0:59:01.050 --> 1:04:55.480
<v B>Yeah, I mean, the other thing that, that I think you touched on, but just double click on it is the, the, the actual intelligence of the model goes down as you put more information in it. And so this creates a weird trap where someone will start with a completely blank slate and say, build me a website for E commerce. And it can do that, but a lot of that is based on its sort of innate knowledge from looking at many E commerce sites on GitHub and things like that. And so it'll build something and you'll feel like it really understood the nature of what you built. But a lot of it might be kind of copied, right? And then as the context grows and you end up with more and more bespoke information about your particular, you know, product that you're selling, then it needs to keep more and more information in its short term memory. And as it does that, the intelligence starts to go down. And so you see this trap, um, and so it's, it's just something to be aware of that when you're working with bigger projects, you shouldn't expect the same level of intelligence that you have, you know, at a smaller scale. But, but yeah, the loop thing and the tool calling, super important. Okay, so we'll explain a little bit how this works under the hood. So in the beginning it takes your question and the AI can do one of several things, right? It can answer your question just by emitting some text or it can call a tool and there's several tools that it can choose from. And so this technology has been around for, you know, is older than Claude code, to Patrick's point. So when it's done calling a tool, the tool information is added to the context. So just to back a bit. So if you ask it, you know, what's the distance from here to the moon? Because I want to slingshot some astronauts, right? So it'll start giving you that answer. But as it's emitting those tokens, it's also using what it said to generate the next token. So it's possible, I mean it might be hard to make it do this, but it's possible for it to generate half of an answer and then to actually say, wait a minute, stop, I'm heading in the wrong direction. And it to generate a, a totally different answer. Like mathematically that's plausible. It might be hard to craft a question that would cause that every time, but it can happen. So, so similarly, when it calls a tool the tool output, it's as if it said those tokens. So in the sense, like it's now part of the context and it's used to generate the next thing that it says. So the model can either answer you or it can call a tool. When the tool comes back, that information is, let's say, part of the context along with your question. Then it can call another tool or it can answer you and whether you know however many tools you want to allow it to call. And all of that is up to the discretion of the developer. Right? But at some point it's done calling tools, it gives you an answer, and that's it. So the idea with Agentic was what if a tool was itself like another question? Like, what if we made this recursive? And so there's two basic ways you can make this recursive. One is where the tool call is actually another question that starts a session within a session. Another way is to say, well, a tool call can actually give me back a list of more tools to call. Both of those are implemented in Claude code. So you might say something like, I want to remove all the lint errors in my code base. And it'll come back and say, okay, I'll run a tool call and we'll run the pyrite and we'll look at the errors that are involved that might come back with, you know, 10,000 errors. And the model will actually say, okay, you have a ton of errors here. You know, we're not going to just fix this in one diff. A diff, by the way, or a patch is also another tool call. But we're not going to fix 10,000 errors in one patch. So I'm going to create a bunch of subtasks, and those subtasks are like isolated questions that I'm going to ask myself. So the model will ask itself, like in another process or another. Another context, you know, fix all the lint errors where the capitalization is wrong in the variable. Just focus on that. So now the model is really like the initial model is now an orchestrator that's orchestrating all these sub questions. And the thing Patrick was saying is the nice thing is if anything is wrong, that's okay, because the expectation is to be eventually correct. So the model, you know, as long as it has a way to verify it can come back and try to compile the code and say, oh, I made a mistake. I'm going to call another tool call to fix it, et cetera, et cetera. And this continues until there's some kind of stopping criteria, which again is created by the developers. At that point, the model hands the reins back over to you.

1:04:57.320 --> 1:07:49.230
<v A>There's a lot of, I think maybe unintuitive to outsider like interplay as you're describing between what you. When we talked about fine tuning before, but targeting these coding benchmarks and coding applications and code is like a way of doing sort of long term planning, which has been something difficult for LLMs to do. But also the interplay between the harness of Claude code and like or you know, Codex or any of them and the underlying model, right? How, how do you tell it what tool calls to use? You have like a tool call language or do you just let it use a command line? Do you use mcps right, like internal for stuff? How do you sort of like craft the system prompt? How do you craft like the each turn, right, like how, how far to let the model think before the next. Like there's this interplay between how the model was trained and how you prompt it, guide it, harness it, skeletonize it, structure and even ask separate questions about like hey, how do I, like how would I plan to do this? Or how do I decompose this task? So how do you decide, you know, whether to kind of have it do more of a one shot like here's what I'm trying to do, just do it and then saying, let me first ask for it to decompose it into tasks. Then I'm going to say for the first task like output the task as JSON and then for each thing in the JSON array, prompt it again. Right. Like you get all this interplay between how you invoke the LLM and how the LLM works. And then we. I don't think we've yet seen, to be honest. You know, we might talk about in a future episode, but things like openclaw or like the various. More like computer use things, which is the LLM's really starting to train on this use case, right. Sort of actually getting to where they themselves are, you know, making sure that during training they're understanding and working with this tooling better. And so for now it's a lot of. I don't want to say hackery, that's not the right word but like a lot of sort of humans iterating or having LLMs iterate the LLM harness for the LLM. Oh my gosh, it's just LLMs but. But you know, I think there's a lot of nuance and you'll see people talk about how Codex works, how you know, Claude code works, how the Gemini thing. But there's also like open code, right? Which is a version of Claude Code but, but with open source and you know, bring your own backend as a more like supported sort of method methodology and I think all of them have nuanced difference. In fact just last week cloud code had its source code leaked and people were kind of deep diving and seeing oh hey, there's like, you know, monitoring of how upset the user is. There's like all this stuff in it you kind of wouldn't assume. So I don't think there's a settled out harness approach yet. And then the question is like how much does the harness need to match the individual model is, is unclear because every model is, is sort of slightly different too.

1:07:50.590 --> 1:09:23.320
<v B>Yeah, yeah, great points. Yeah, I think it's still pretty early days. One sort of thing that really surprised me is how well it works on things that aren't even coding related. You know, there's recently Google released a set of skills so we didn't really talk about this, but a skill is basically a set of tools with really detailed prompts behind each of the tools. So you know, there might be a skill where it's about reading and writing Google Docs. And so you'll get a set of tools that let you sort of do the mechanics, you know, add to a Google Doc, read a Google Doc, insert characters into a doc, etc. But then you also get this really detailed markdown of what is the true sort of platonic nature. What is the, what is the nature of a, of a Google Doc? Like what actually is it? And yeah, and so you can, you can plug that Google G suite skill into or I think they're calling it Google Workspace skill into any of these coding agents and then they have that power. So I think that this is going to be pretty disruptive to a lot of industries. You know, I'm thinking like finance, medical, legal, et cetera. In the same way that it's disruptive to coding, it's a very generic system.

1:09:23.560 --> 1:09:23.960
<v A>So

1:09:27.160 --> 1:11:54.080
<v B>the way that you know, Claude code worked that's pretty different from any of its predecessors was that the tools that it had were extremely generic. You know, people were at the time building very specific tools like here's a tool to access, you know, my bespoke database. But I'm not going to give you just open SQL access because you might just drop all the tables in my database. So I'm going to give you like this tool adds a customer record and this tool deletes a customer record like very, very specific tools. And Claude code came around and said well, going to give you a tool called Bash, which can do anything. And we're going to give you a tool called Patch, that can just patch any file anywhere and a read file tool and read a chunk of a file tool, et cetera, which is super, super generic. And you know, in the beginning it's kind of scary, right? And so that's why by default, most of the tools kind of ask for your permission if you're going to patch a file, et cetera. Um, but then over time, I feel people have just gotten more and more comfortable where. I think most people probably run with like a, A workspace, YOLO kind of mode where, where, hey, as long as you're in my code base, you can read and write whatever you want and you don't have to prompt me all the time. And so people are getting more comfortable with it. I was using it recently to create notes where basically I wanted a midterm kind of help guide for my students. And so I said, hey, here's the textbook and here's the midterm. Come up with sort of a help guide. And kind of what Patrick was saying in the beginning, the help guide was okay, but not great. And I realized I was kind of asking it to do a lot know. It's a big textbook, it's a big midterm. And I said, hey, break the textbook up into chapters and give me a study guide for each chapter. And that kind of forced it to, you know, use many tasks and each task having only to read a chapter. You know, as we talked about, the model gets dumber if you give it more information. So because each subtask only had to think about a chapter, I actually got a much better summary just from changing the prompt.

1:11 --> 1:12:04.760
<v A>Yeah, I mean, I guess that's like a good transition into our next thing. Like a set of maybe like learnings guidelines. I do agree with the like, general

1:12:04.760 --> 1:12:08.040
<v B>use and the bright future, but I

1:12:08.040 --> 1:12:48.510
<v A>mean, I guess like the first one, and we kind of talked about this with like, hallucinating, with like needing to break down, you know, sort of plans. We're going to talk about some of these in a minute is at least if you're working in code, but even if you're not, like, even if you're working, you know, I've been playing around. Maybe we'll talk about in a future episode, but like using a lot of markdown files and something like Obsidian to track stuff and to track thinking and doing basic file manipulation. But use git even if you're not going to like push it up to GitHub. Even it doesn't matter. Like, and I will say like, a lot of these tools have skills natively to do like git work trees, which I will not lie, I have never used a git work tree in my life. I don't know how it works.

1:12:48.510 --> 1:12:49.950
<v B>I don't know how it works either.

1:12:49.950 --> 1:12:56.150
<v A>Okay, good. I'm not the only. But, but like sophisticated ways of using git, they know how to do it. You just have to tell them you

1:12:56.150 --> 1:12:57.510
<v B>want it done right.

1:12:57.510 --> 1:13:42.950
<v A>And then anytime it's sort of like working, check it in. If you make a change, check it in. Like you can always go back and then you can always tell the LLM. Like if you want to try rolling forward, which we'll talk about in a, in a future tip, you can say things like, hey, like this last change didn't work right? And then it will be like, oh, okay, yeah, like let me revert it or let me, you know, look at the diff. Right. And so it has access. So git is like something we talk about kind of, you know, basic technical person software engineer skill. But definitely here. And again, I use this for personal workflows without any remote repository. It's just a local thing, doesn't matter. It's just to have like a history that is like archived of my instructions of, you know, what I was doing of my code.

1:13:43.110 --> 1:13:54.070
<v B>Yeah. And the agent is amazing at writing detailed git commit messages too. So you just tell it, hey, do a git commit here of all the untracked files and with a nice message.

1:13:55.349 --> 1:16:10.850
<v A>And if you do that in your like session it will often include notes from the session, like things you were trying to do that wouldn't necessarily be inferable from the code. Um, and then I guess that'll go to my next thing is before you start, you know, attempting, I guess like the term is sort of broadly vibe coding where like to Jason's point, you try not to actually write code. Although that term is I think a little derogatory. But like whatever. And people have. We can talk about the ethics or morality of it some other time. But you know, it just is here. So. Okay, but I think there's like the similar term is one shotting which is like, hey, make me a website. And you just like let it go. You don't do any. That may be a way. I don't know that it works amazingly today. But what I will say is all the tools I've used have like a planning, you know, sort of flow. And in the planning flow the, you know, Tool is not writing anything to disk, it's not making any changes. It is simply attempting to break down your request into what it's going to do to actually go ahead and extract most of the needed context. And it used to be that it was just telling it like, think harder and like make a task list. And then it would go to the task list. More recently I've seen the tools get better and basically what they'll do is, you know, hey, I want to add this feature, I want to build this thing. It will go like, do the research, put snippets of code and then you can review it of varying lengths and spend time going two, three, four times. The better you get that, the better like the output will be. And you can ask questions like why did you do that if it's in a domain you don't know about? Or like just force it to reconsider like, is this the only option? There's like lots of things you can do and there's skills that will also help to. Although sometimes those skills go out of date as the harnesses roll forward in terms of like, do you need to do those extra steps or not? But then once you sort of get through the planning, a lot of them will then clear the context and keep only the like. I have minimized the description of the work to be done, including function calls, pertinent snippets and all of that. And what you'll find is the execution runs much faster and is less likely to run out of context while doing the thing you described in your planning.

1:16:11.730 --> 1:18:43.630
<v B>Yeah, I mean, so we should talk about running out of context. So, so you know, as we talked about earlier, your model has a context window, which means the model can only take in so many tokens at a time. It's not a recurrent model model, so it doesn't sort of accumulate anything. And so every time it, you know, reads in a question or a tool call result, it reads in the entire context and then makes a decision. And so because of that, a context has to have a limit. When the model hits the limit or gets close to it, it does what's called compacting and it sucks. And, and, and as of today, it hasn't really gotten much better. So basically the way compacting works is pretty simple. A, the context is made up of a list of messages. So a message might be something you asked it. A message could be part of its response. Because you know, now you have this multi question answer session. Right. A message could also be a result from a tool, a Message could be a request to call a tool. These are all messages, this whole ledger of things that have happened. Right. And so in compaction, you know, it's going to preserve some of the things that are really important. So when you told it something and it responded, there's probably enough important information that those will get kept in their entirety. But when a tool responds with like an entire PDF and and it read the whole thing, it's going to try to summarize that. So it's going to summarize that whole PDF down to maybe a paragraph or two. And so compaction is all about sort of summarization and then continue to summarize until you have 90% of your context window open again. The problem with the summarization is that details often really matter. And so when you compact, often you're in this really weird state. I've seen situations where I ask a question, it compacts and then it actually answers like the previous question I asked. Again, that's just probably maybe a bug, I don't even know. But there's weirdness. So you really want to avoid compaction and task decomposition is the best way to do it.

1:18:45.500 --> 1:19:26.340
<v A>Yes, is another pointer notes but yes, you know, decomposition is a skill that I think both like for the purposes of what we're talking about like getting the. The mind but also more generally just like separation of concerns like object oriented things. Like things do not go out the window. But in a time when the LLM I will say currently tends to generate a lot of spaghetti code. It's not pure spaghetti code but it also doesn't spend cycles is the wrong way to reason about like it's not true but whatever. It doesn't spend tokens trying to think through like you know, finding duplicate. It's just happy to copy and paste stuff.

1:19:26.340 --> 1:19:26.580
<v B>Right.

1:19:26.580 --> 1:19:49.390
<v A>And not think like oh I should move this out to a common predecessor function or pre pro. Like you'll a lot of that stuff won't happen. It will also happily do super inefficient things. And the more things for a unit I don't like it is trying to do the worse that becomes. So really letting it be step by step even more than maybe it's step by step is. Is very important.

1:19:50.910 --> 1:23:22.520
<v B>Yeah, totally kind of related to this. There's now the file is a little different. It hasn't standardized yet but for most of these the file is called agents MD and agents has to be all caps. This is like a magic file. So whenever you start Claude code in a directory it's Going to look for an Agents MD file and it's actually going to recurse back through the file tree and look for all of them and add them all up. So here's where you can put things like anytime you change the code, you should significant change to the code. You should run unit tests and here's how you run them. Or don't use the system version of Python because we have a virtual environment. Use the virtual environment Python. These are all instructions that you put in your Agents MD file and that way you don't have to say it every single time you have a question. So think of whatever you put there as being sort of prepended to any question that you could ask. And that's been extremely important. You know, if you want to have a code that doesn't have all sorts of lint errors and just, you know, really kind of cruddy design, then that's where you kind of enforce all of that. Over time, I've been making this more and more complicated. I have one now where, you know, if a file gets too big, it breaks it up and it's sort of instructed to, you know, break the file up semantically. Group the file, the code semantically. Originally I said, hey, if a file gets too big, break it up. And I was ending up with like CubePart 1 py cube underscore part 2 py. So then I had to go and add, you know, no, okay, when you break it up, it has to be semantically, it has to be meaning behind each of the file names. So these are all the kind of things you'll, you'll do as you're iterating through. But you know, you could end up with something that, granted it's going to burn more tokens, but, but it kind of keeps a healthy ecosystem so that if you do end up having to go back and look at that code, you could do it. I had an issue recently where there was a config parameter. I wanted to configure the size of the output video. And so I said, hey, the size of the output video is too small. Add a parameter and by default make it 8 so that the video is 8 times as large. So it's like, sure, done. And I do a run and the video is the same size. So I said, hey, you know, the, the video is, you know, is only, you know, this many pixels by this many. It should have been bigger. And, and the, it was actually arguing with me. It was like, well, you know, the domain might have gotten smaller. And so I actually Made the video bigger, but it was now smaller and so it canceled out. I was like, whoa, wait a minute, wait a minute. Run all the, you know, lint tests and everything and then come back and it's like, oh yeah, you know, I, I didn't pass this variable through and blah, blah, blah. So like having good code hygiene is important even for the AI. And so even if you're doing a side project, just keep around an agents MD file and dump it into every project you do so that your AI will have good code hygiene. Otherwise it will make mistakes just like a person.

1:23:23.320 --> 1:23:58.570
<v A>I think maybe one day this will be different. But just like your analogy earlier where we pass all our code through a compiler and maybe now we pass all of our instructions through, you know, an agent to code all of our stuff. I do think some classic debugging stuff is kind of like just like those skills that people have learned and I don't know, it'll be tougher to learn them in the future perhaps. But I feel like are still like you're like how you're saying when you see a problem, it's like, well what, how do I get you to agree with me that why does that come. That comes because you are a manager and you've had to argue with, you know, individual contributors under you before. Like, why don't you write a test?

1:23:58.810 --> 1:23:59.130
<v B>Right?

1:23:59.130 --> 1:24:48.470
<v A>If you just don't go look for the bug, there's gonna be like, I didn't see one. Like, did you really try? Like, let's write a test. Write a test. Change of. Oh, oh, yeah, yeah, you're right. Okay. You know, like those skills are still, you know, understanding where the most likely source of problems are, especially in your domain and in your experience. Right. Like I, I feel at least for the time being are going to continue to be, you know, sort of very useful. And then also there's different ways of building a system, but certain ones sort of stack together nicely and so can we can like the LLM training ever really be equally good across like all the different ways of developing? Right. Like imagine one person wants to develop in an agile way and one person in a water flow method and one per like, is it really going to be equally good at all? It feels unlikely.

1:24:48.710 --> 1:24:49.270
<v B>Yeah.

1:24:49.510 --> 1:26:05.940
<v A>And so depending on which. But you may still want to stick in your lane or your company has policy or whatever. Right. Like I, it's very interesting if, if it's just like your code sucks, I rewrote all of it. Well, hang on, how am I gonna like get buy off for that, like, how am I gonna get acceptance testing right? You know, this is all very difficult, difficult questions. It does lead to your, you know, sort of debugging example. One of the things that I hear a lot of back and forth. I try a little bit of both, but it's just something to be aware of on different levels. There's like rolling forward with the issue and then just restarting. And that restart can be like we talked about, reverting to an earlier git commit. It can be, which some tools support better than others. It can be clearing your session or closing it out and starting it again, which erases your history, basically, unless you, you know, force resume it. Because sometimes something in the context window is tripping it up or polluting it and it's just stuck in some weird loop and literally just exiting and starting again. Sounds stupid. And asking the same question now we haven't talked about it. It is still kind of slow. Like it sucks. Like it generally takes a long time. Even if you're asking a simple question, which is not how it is with humans. Like if you were just having a debate with a human, like, you know, they, you ask them something simple, they're

1:26:05.940 --> 1:26:07.060
<v B>going to respond very quickly.

1:26:07.710 --> 1:26:40.960
<v A>But you do need to be aware of the difference between sort of adjusting your questions, trying to fix bugs, and just like nuking it and starting back and saying like, let's try this again, but I'm going to state it a little different and being aware that even if you copy and pasted the same prompt, you're not actually going to get the same answer because there is randomness in the LLM that's inserted on purpose through things like temperature and other things. So I don't, I haven't tested it, but I would not be survivor to assume that if you paste in the same question twice from the same base state, you aren't guaranteed to get the same answer.

1:26:40.960 --> 1:26:41.720
<v B>Oh, totally.

1:26:41.880 --> 1:26:49.240
<v A>Okay, I'm like not crazy. I know that's true in the chatbots, but I haven't tried it in Claude code or Codex or anything.

1:26:49.480 --> 1:27:18.700
<v B>Even if you set the temperature to zero and you fix the random seed and all of that, you're still not going to get a deterministic answer because your question is batched with other people's questions. And the, the low level CUDA operators will give slightly different answers depending on what data is around you. And so, so yeah, you, there's no way, zero way you can guarantee determinism.

1:27:19.419 --> 1:27:20.620
<v A>Ah t. I learned.

1:27:20.700 --> 1:27:26.060
<v B>Yeah, unless you're running on your own Hardware with a batch size of one in a very efficient way.

1:27:27.260 --> 1:27:29.020
<v A>And you're waiting a day for every question.

1:27:29.900 --> 1:27:56.350
<v B>Yeah, yeah, exactly. Okay, so just want to wrap up on this. I get asked this a lot. Is software engineering dead? How is this going to work? Will I have a job after we graduate? I teach at UT now, and so tomorrow I have office hours. And I guarantee you tomorrow someone's going to ask me this. I know this because they emailed me today saying they're going to ask me.

1:27:56.990 --> 1:27:57.790
<v A>That's not fair.

1:27:58.480 --> 1:28:33.760
<v B>But I get this every single week. And so, you know, my. I guess I'll give my overall take. And I'd love to hear your take, Patrick. I think that, you know, using a systematic way to solve problems, that is always going to be in demand, right? The demand for doing that has done nothing but go up. You know, I'd like to think that my great, great ancestors were also engineers, like building catapults and stuff. Maybe they weren't. Maybe they were tyrants. I don't know. Maybe they were.

1:28:33.840 --> 1:28:35.520
<v A>Probably just died of dysentery, dude.

1:28:36.160 --> 1:28:41.440
<v B>But before they died of dysentery, they were making some pretty kickass trebuchets. That's what I'd like to believe.

1:28:41.760 --> 1:28:42.160
<v A>But.

1:28:42.160 --> 1:29:54.290
<v B>But we actually work more hours than they did, right? Like we work more hours than medieval peasants. And so clearly the demand for, or the demand for like systematic thinking and turning, you know, abstract problems into like really concrete problems, that's not going away. What, what is going away is like the very rote, you know, like I have this project in Java and I need to port it to C and that's going to take two years. Like that's going away. And so, and so, you know, maybe the supply of software engineering jobs actually does. Does take a permanent hit. But that was never really the point. The point wasn't really to write software. The point was to build something cool. And if that, as long as we stay true to that, then there's always going to be a ton of opportunity. And if, if, if, you know, we have thousands of years of years of history showing that, you know, it's just going to get more busy for everybody. So. Oof.

1:29:54.850 --> 1:30:55.820
<v A>Yeah, I, my. We sort of talked about it, I think, like curiosity building. I mean, I think these are things you see throughout history, like certain people had, and the way they applied it was maybe different. The opportunities for applying it were maybe different. But I mean, I think of all the things in the world that you can build having maybe call it style, like having the persnicketiness to keep through working and, and enforcing that. It doesn't work the first time or didn't match what you expected. Like the question would be, I guess we were talking about Sora earlier. Someone was saying, you know, oh, Sora will kill Hollywood. Right. Like, everybody will just watch their own, own movie. I don't actually think that's true. I think for me, the most likely outcome is a new maybe order of magnitude, maybe two orders of magnitude. Literally a hundred more creators and movies get made, but there are still ones that just like, like you were talking about drip boards. I wouldn't have had that idea, but I enjoyed it.

1:30:56.060 --> 1:30:56.540
<v B>Yeah.

1:30:56.700 --> 1:32:23.580
<v A>And so therefore, like, somebody had taste like someone. I call it whatever you want. I don't know. They had ambition to like, prompt it to do that thing that I never would have. And I think the same will be true in software. Deciding what to build. Now. It may not be that. It may not be. I don't even say like, fair in the same way that you just go to college, like, get a good job and you're guaranteed. In the same way that like, you know, you can hear the criticisms about, you know, 40 years ago you did A, B and C and you like, you know, worked X years and then retired and it was a good like. And that may not be true anymore. May or may not. I'm not here to say. I'm not trying to get on politics. It's like there's no guarantee, like you were saying, you know, medieval peasants, it's completely different than us today. You know, like, things change. And so my take on software engineering is the skill set feels valuable. Understanding how systems work, how they get built, how to debug stuff. Right. Like this feels useful, but maybe there's less of a discrimination between learning it as a mechanical engineer, an electrical engineer, a software engineer. Maybe a lot like the lines blur because you can reach further outside your domain. But maybe there's a big difference between how those people interact with tools like Claude code. Less difference between them than working with an ide. But then maybe someone who's a journalism student works with cloud code as well, but in a fundamentally sort of different interaction model.

1:32:23.580 --> 1:32:24.140
<v B>Right, right.

1:32:24.220 --> 1:32:43.780
<v A>They're building something very different. They're building smaller things. They're building what you know. And maybe they don't want to buy the thing you're building, but maybe you're building for an end product. Or we, we. I think there's probably going to be seismic shifts, but it's very difficult to predict a specific Shift. Yeah, that was a lot of words to say.

1:32:43.780 --> 1:34:51.810
<v B>Very little. Boom, done. Throw the gauntlet down. Yeah, no, I think that makes a lot of sense. I mean, I think that, you know, for teachers, for all these folks to use Claude code for their specific task. Oh, here, here's another thing. You know, we seem to have forgotten the importance of data and particularly data around user experience. So like your people say, oh, I'll just make Salesforce myself for my company and I'll save my company a hundred K a year. It's like, yeah, but Salesforce isn't just like a database and some front end code and backend code. It's also like a decade plus of user research. Like, oh, you know, we, we allowed everyone to just add and delete people in Salesforce and then some rogue employee just got pissed and wiped the whole Salesforce database for one of our clients. And we learned not to do that anymore. We learned, oh, we actually need rbac. We need role based access control so that Joe Schmo, who you know is like entry level, who's an intern for the summer, you know, when his internship is over that the last day, he can't just go and download like all the personal information out of Salesforce, right? So like that's a lesson they learned. And there's probably, there's definitely like, you know, they probably learned like 10 lessons a day for 10 years. They learned like tons of, tons of lessons. And the AI is not going to have all those lessons because it's not going to be in the GitHub comments. You know, it's like Joe Schmo wiped the database for ExxonMobil on his last intern day and that's why this code is here. Like that's not there. So I don't think SaaS companies are dead. I don't think it, I think I know companies are like, we're going to cancel our Salesforce contract and in house it. I expect that to be a complete and utter disaster. All sorts of weird security problems over the next 12 months and ultimately for SaaS to make a comeback. And I'm not even in SaaS, but, but I have no horse in the race. But that's where I see it going.

1:34:52.210 --> 1:35:28.580
<v A>I also feel like it's a focus thing. Like do you. Salesforce is expensive, but I think people are maybe too worried about some of the expenses. It's like if you can have focus and just outperform it, right? Like, is that really what you want people spending time on? Like, even if it's easy I don't know. I mean, I don't know how much it costs, but it's like, maybe it is for a certain size company or people, like, doing very low margin work, but growing your margins is probably like finding a way to be more effective in some new area rather than just, like, reduce costs. There's like, always that balance, that tension in business.

1:35:28.980 --> 1:35:29.420
<v B>Yeah.

1:35:29.420 --> 1:35:41.260
<v A>And you're right. I feel like it's overblown, that suddenly it'll just make sense for everyone to do it. Some people will, like, the total addressable market will probably go down when there's lots of companies who are just paying too much per seat.

1:35:41.260 --> 1:35:41.460
<v B>Right.

1:35:41.460 --> 1:35:50.660
<v A>Like, it's. It's the total amount they spend. They can employ a small group of engineers to do this, but then in a bunch, it just. It's. It takes away focus.

1:35:51.300 --> 1:36:25.050
<v B>Right, right, right. Yeah. Because when you say re Implement Salesforce, that's not good enough. You know what I mean? Like, you can't just go to an engineer and say, do that because Salesforce is enormous. You probably don't need all of it. And even if you did need all of it, how would you faithfully reproduce it? So chances are they say, redo Salesforce and you end up with some janky thing that can't handle load and just doesn't behave the way you expect. And guess what? Has no documentation either. Or if it does, it's AI generated and it hasn't been looked at and reviewed by a person.

1:36:25.530 --> 1:36:33.610
<v A>This is broken. Talk to Claude. Talk to. Talk to chatgpt. There we go. Open. Open. AI. Ring, ring, ring. Sam Altman, can you fix me?

1:36:35.850 --> 1:37:19.940
<v B>Oh, my gosh. So, yeah, I mean, I'm not a betting person, but I would probably try to buy the dip on SaaS. I don't. Again, I don't know where it is. I wouldn't try to try to convince anyone that they could time the market, but just feels like. Feels like SaaS is a little underrated at the moment. All right, so, yeah, we will definitely cover open claw. Someone requested that. There's been a lot of requests for things adjacent to this. We wanted to really set the foundation, talk about the history behind this, and build the sort of first floor of this tower that we're building on this topic. So hope you all appreciated it. And yeah, we'll catch y' all later.

1:37:19.940 --> 1:37:21.020
<v A>All right, see you next time.

1:37:38.150 --> 1:37:59.750
<v B>Music by Eric Barndaler Programming throwdown is distributed under a Creative Commons attribution Share alike 2.0 license. You're free to share, copy, distribute, transmit the work to remix, adapt the work. But you must provide attribution to Patrick and I and share alike in kind.

