944

October 8th, 2025

Is Coinbase Really Writing Half Their Code With AI?


Transcript

Wes Bos

Welcome to Syntax. We got a really interesting one for you today. A couple weeks ago, Coinbase came out and said that they are writing 40% of their code with AI. And I thought that was really interesting, because you hear these stories of, like, yeah, you have these brain dead vibe coders pooping out an app in an afternoon, and then you hear stories like, Google says 80% of their code or something. And then when I talk to people at Google, they say, yeah, we're just converting Angular apps to a more modern version or something like that. You know? So I'm always so curious when companies come out and say, we are writing x percent of our code with AI. I just wanna be like, but what are you using? Like, what tools are you using? What are you using it for? What kind of code are you writing? You know? I wanna get into the nitty gritty. So Coinbase, not a small company, rather large company. You know? Lots at stake.

Wes Bos

Lots of it financial. They've come out and said that they're writing 40%, hoping to get to 50%, of their code with AI, and I immediately was like, we gotta get somebody on. And it turns out... do you remember, Kyle? We went out for beers many years ago in Toronto?

Guest 1

I think it was Toronto JS.

Guest 1

It was quite a while ago. Yeah.

Wes Bos

Yeah. So I went to message you, and I was like, oh, I think I know this guy already. So welcome, Kyle. Thanks so much for coming on.

Guest 1

Oh, absolutely. Yeah. Thanks for having me. And, yeah, I remember that time fondly. Ever since then, I think I've followed you on Instagram for, lately, home improvement tips and, you know, general DIY inspiration. So, yeah, you've definitely moved through the sphere of influence in my personal feed.

Wes Bos

Oh, that's great. But was it you that told me the story about Walmart and toilet paper?

Guest 1

It was. Yeah. Yeah.

Wes Bos

Yeah. So this guy is responsible for the spooky story episodes that we do every single October, where we tell horror stories and whatnot. He told me this one. I'm not sure if we can get into it, but I appreciate you telling us that, because it jogged my memory, and I was like, we gotta do a Syntax episode on this. It's really good. Yeah.

Guest 1

A good episode about testing in production would be a good theme there.

Wes Bos

Cool. Well, who are you, and what do you do at Coinbase?

Guest 1

So my name is Kyle. I've been at Coinbase for about four or five years now. And, currently, I'm leading a developer experience team that is, I guess, tasked with this sphere of AI enablement. So how do we enable engineers across Coinbase to use AI, to leverage AI for their coding tasks, for their developer productivity tasks, kind of all along the sphere of the work before the work, the work during the work, and then, you know, ensuring quality all throughout. The time at Coinbase has been filled with different experiences as the company has changed in size and shape. And currently, quality is one of our leading initiatives that we've been so focused on. So as we're exploring AI, we're doing so with very strong guardrails for quality. And so, yeah, I'm really excited to talk about how we're approaching AI enablement, as well as safety, and how that ultimately impacts developer productivity down the road.

Scott Tolinski

Yeah. When you say quality, could you give the audience an idea of, like, how do you even quantify quality?

Guest 1

Yeah. So at a large company, quality has many different shapes and sizes. We are always laser focused on our customers, so trying to measure from the customer perspective what quality looks like. And, obviously, as the company grows, you can feel further away from the customer.

Guest 1

You can have a lot of noise throughout the system. So there's obviously some high level metrics around incident severity and kind of outer loop impacts. But then on the inside impacts, or excuse me, the inner loop impacts, we're looking at how developers are able to test, able to go through their checklists of what is the bar that we're holding. We have service scorecards. We have all sorts of internal measures that we give to developers to stay at a certain high bar.

Guest 1

But what quality really boils down to is: can a developer use a tool frictionlessly, painlessly? Like, we all love our tools, and, ultimately, I think that's where the quality starts. If you're using a quality tool, you're gonna have quality output. That's kinda my thesis. And so as we're enabling AI and enabling other developer productivity wins along the way, we've done a lot of work around our internal build systems and how we are building apps at scale. We want the tools to be really high quality.

Guest 1

And then, ultimately, we want really fast iteration speeds. We want quick deployments.

Guest 1

All the observability inputs that we need in order to gauge the success of our systems.

Wes Bos

Okay. So let's dig into this, because people are probably dying to hear. You say 40% of code is written with AI. So are your developers just pooping out a whole bunch of code and then shipping it? Probably not, given the stakes of what Coinbase is. Right?

Guest 1

Yeah. When we're talking about lines of code, specifically in that metric that I think everyone's pretty aware of, we're talking about lines added or deleted. So one of the first pieces of critique that we get on this is: you shouldn't just be adding code to a code base. As you add more code, the complexity increases, quality usually diminishes. These are all pretty correlated metrics.

Guest 1

So, you know, lines added, deleted.

Guest 1

As developers are using their AI tools... look, I kinda wanna move back and talk about the tenets of AI for developers. The key tenets that I have in my mind: AI should always be attributed. We are essentially asking these systems to produce code on developers' behalf, so developers are still taking responsibility for the code that is produced.

Guest 1

So strong attribution is kind of a key tenet.

Guest 1

And along those lines, AI should never be making decisions. AI should never be the final piece between a feature and the customer. So everything that these lines of code are producing is merely a suggestion.

Guest 1

And this falls into the review steps that we all know as software engineers, that we interact with every day. So code is produced. It goes into review. And this is actually where I think one of the key problems around AI and velocity is going to emerge down the road: code review. You need really strong code review these days when it comes to flagging potential risks, having good quality rules through your lint-type tooling, the kind of quality build system. And then, ultimately, having the right approvers to say, yes, this is the right code, we're gonna ship it to production, and then lean on all the strong parts of the deployment flow. So I wouldn't say that the code that we're writing is going straight to production.

Guest 1

We have a lot of really high quality guardrails in place that ensure that the right feature is landing in front of customers.

Wes Bos

So what does that actually look like? Let's say a dev in your company gets a ticket to add a feature to your dashboard.

Wes Bos

What does the start-to-end look like for the developer that is writing the code, maybe even the PM before that figuring out what to do, all the way through to who's reviewing it, testing it, security, etcetera?

Guest 1

The different ways that AI is used, I think, indicate how comfortable developers are with AI. So we kind of introduce tools along the way. So you talk about going from ticket to code.

Guest 1

We, in some cases, have product managers that are creating really robust tickets, and perhaps they're leveraging AI to do that. So they're combining a bunch of different docs together, perhaps using some sort of inputs to create a really high quality ticket.

Guest 1

And in some cases, we have workflows where that ticket can be worked on by Claude, can be worked on by an agent, essentially picked up right away, and that agent will work in the background, push a branch to GitHub, at which point it lands in kind of the traditional review flow. I'd say that's the future that we're aiming for: it'd be great if, you know, 60% of our work, ticket wise, could be done by a background agent. And where the attribution piece comes in here is, there's an engineer that's responsible for maintaining these background agents for their PMs. There's PMs who feel compelled enough to just author code. And this has always been allowed in our systems: if you're a product manager and you want GitHub access, you wanna end up in the review flow and kinda triage that all the way through, we let that happen. So this is where the lines are blurry, but they've also always existed. And that's probably the case that we're aiming for. I'd say the day to day piece for developers is using Cursor, using Claude Code, using any of the other agents that we allow within Coinbase. They might be using it in tab completion mode, which is probably the most fine grained control of the code that you're writing. And then there's agent mode, obviously, which is fully agentic, and there's different levels of enablement there. We can either turn on YOLO mode and let the agent develop however it wants.
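
A minimal sketch of that ticket-to-branch loop, assuming the Claude Code CLI's headless (-p) mode and a local checkout; the branch naming, flags, and ticket shape here are illustrative, not Coinbase's actual setup:

```ts
// Rough sketch: a background agent picks up a ticket, edits the working
// tree, and pushes a branch so the work lands in the normal review flow.
import { execFileSync } from "node:child_process";

function runTicketAgent(ticketId: string, ticketBody: string): void {
  const branch = `agent/${ticketId.toLowerCase()}`;
  execFileSync("git", ["checkout", "-b", branch]);

  // Headless run; the tool allowlist limits the agent's blast radius.
  execFileSync(
    "claude",
    ["-p", `Implement this ticket:\n${ticketBody}`, "--allowedTools", "Edit", "Bash(git:*)"],
    { stdio: "inherit" },
  );

  execFileSync("git", ["add", "-A"]);
  execFileSync("git", ["commit", "-m", `${ticketId}: agent draft`]);
  // Pushing the branch drops the change into the traditional PR review flow.
  execFileSync("git", ["push", "origin", branch]);
}
```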

Guest 1

But, yeah, I'm gonna go back to my original point.

Guest 1

All of this code gets to this review step. And at that point, it's kind of the same step throughout: strong developer review, strong testing, quality bars. So how the code gets there is enabled in different ways. But, ultimately, there's the traditional steps that the code has to go through before it gets to production.

Scott Tolinski

Yeah, I'm curious. I mean, you mentioned a lot of different types of tools here. Is everybody on the same set of tools? And are those mandated from up high, in terms of, like, you're using Cursor, you're using VS Code, you're using Claude Code, etcetera?

Guest 1

Yeah.

Guest 1

So Cursor is a really high quality tool. We were early to use Cursor, middle of last year, and it's just so powerful, because of what I talked about earlier where it's blurring the lines for what is an engineer. You know, we have product managers. We have data science. We have all sorts of users throughout the company who are going to Cursor as kind of their first place to say, hey, I have some code in a GitHub repository, and I wanna chat with it. I don't really know exactly what I'm doing, but I just wanna do some data visualization, get some understanding.

Guest 1

I just wanna learn a little bit more. So Cursor is kind of the all-integrated solution that just about 100% of our engineering, plus a good handful of our PMs and other job functions, are using. Claude Code, we have some custom tooling that enables it. We have other tools that we're exploring internally, such as Goose, Codex, other kind of open source alternatives.

Guest 1

And, really, what we don't wanna happen is for us to go too deep on a tool, and then, we all know how the AI story is unraveling, there's a new tool tomorrow. Right. Yeah. So how do we stay on top of that? We kind of have these good primitives that let us pull in the models with the right sorts of context.

Guest 1

All developers are encouraged to use these. I wouldn't say that there's strong mandatory enablement, but, yeah, I think everyone wants to use them. And our MCP culture kinda backs it up too. We have so many MCPs that teams have created these custom agents with. So they might be running an agent inside of a Docker environment that calls different MCPs, and those are now passed around from team to team, saying, oh, I automated this really unique on call situation. And now that's just kind of exploding, as this kind of sourdough-starter bespoke configuration.
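
A minimal sketch of what one of these internal MCPs can look like, using the TypeScript MCP SDK; the on-call tool and its data are invented for illustration:

```ts
// Minimal MCP server sketch. A real version would call PagerDuty,
// Datadog, or other internal services instead of returning a stub.
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({ name: "oncall-helper", version: "0.1.0" });

server.tool(
  "recent_incidents",
  "List recent incidents for a service",
  { service: z.string() },
  async ({ service }) => {
    // Stubbed data; the agent only sees what this tool returns.
    const incidents = [{ service, title: "elevated 5xx rate", severity: "SEV-3" }];
    return { content: [{ type: "text", text: JSON.stringify(incidents) }] };
  },
);

await server.connect(new StdioServerTransport());
```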

Wes Bos

That's really interesting to me, because I'm always curious about, like, how do large companies, who probably have a lot of domain specific stuff inside of their company that is only for their company... I'm like, how do you make AI work for that? So you're saying you're making all these MCP servers that help. Do you have any examples of stuff you could share?

Guest 1

Yeah.

Guest 1

So we have probably 30 or 40 developer facing MCPs today, from all the tools that we use. Cool. You know, things like Datadog, PagerDuty, all of the kind of internal suite of tools that we use.

Guest 1

And it's wild, just the amount of creativity that we're seeing. And our kind of approach is enablement. It's not prescriptive,

Guest 1

you have to use this tool in a specific way. So we have all these agents that I was mentioning, but then we also have LangChain and LangGraph internally that teams are starting to use to write super robust, fully agentic flows. You know, this is kind of the catalyst effect that's happening: every time we add a new MCP integration, someone runs to it and says, I've been waiting for this, it unlocks this bespoke flow that I've been doing. And, yeah, the excitement around these tools is really, really new.

Guest 1

I don't think we've ever had so many developers clamoring for, you know, Python in production, in the sense of producing a production internal instance. But, yeah, we've never had so many devs that are just running towards new domains, because they're saying, cool, I can work with LangChain and LangGraph because I use Cursor, because I can now iterate in a new language that I wasn't fully comfortable with before.
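
A rough sketch of that kind of LangGraph flow in TypeScript, assuming the prebuilt ReAct agent from @langchain/langgraph; the alert tool is stubbed and the model name is illustrative:

```ts
// Sketch of a tiny agentic flow: one stubbed tool, one model, one question.
import { createReactAgent } from "@langchain/langgraph/prebuilt";
import { ChatAnthropic } from "@langchain/anthropic";
import { tool } from "@langchain/core/tools";
import { z } from "zod";

const getAlerts = tool(
  // In a real flow this would hit an internal alerts API or MCP.
  async ({ service }: { service: string }) =>
    JSON.stringify([{ service, alert: "p95 latency high" }]),
  {
    name: "get_alerts",
    description: "Fetch open alerts for a service (stubbed here).",
    schema: z.object({ service: z.string() }),
  },
);

const agent = createReactAgent({
  llm: new ChatAnthropic({ model: "claude-3-5-sonnet-latest" }),
  tools: [getAlerts],
});

const result = await agent.invoke({
  messages: [{ role: "user", content: "Triage the alerts for the payments service." }],
});
console.log(result.messages.at(-1)?.content);
```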

Scott Tolinski

Yeah. That's amazing. I'm curious on your take on this. I've noticed that when I'm switching tools, they interact with agents in a different way, and I'm having to translate across tools. Do you find that MCP is a more solid surface to keep those tools in, rather than using and adapting custom agents for whatever tool you're using on any given day?

Guest 1

MCP is not a perfect abstraction. And it's such an interesting world, where I think you've seen so many vendors saying, we have an MCP available now, and it's kind of this overused term. It's just a protocol. It's just a way that you're interacting with a service remotely in a very generalized way. But then, yeah, you see agents like Claude Code that natively call GitHub, for example.

Guest 1

And that is a much tighter way for agents to work, to call an API directly rather than proxy through an MCP that it has to figure out what to do with. So, hey, I think as developers are transferring through tools, there might be very specialized use cases where it makes sense to go deep on a tool integration and, you know, use the native API instead of an MCP.

Guest 1

But, broadly, I think a lot of our tasks are generic enough that MCPs are letting us prototype really fast, letting us use something instantly in a really easy format. But if something gains more adoption, if, say, we produce an agent that does really unique bug triaging, for example, we might move away from an MCP and move towards a direct API call, just to get that even finer level of control.

Guest 1

It's kind of the one-two-automate philosophy that we've used at Coinbase: we're gonna prototype everywhere. Cool, someone else likes it, and now we're gonna make it into a production thing. And that, in general, I think, is how DevEx, my team, kinda works. We see what a team is doing in one part of the company, that might be a more ambitious part of the company, or a less regulated part of the company. Man, it's just exploding over there. It's doing great. All these hundreds of developers are now enabled. How can we bring that to other parts of the company, to fit their compliance profile in a very safe way? And so, yeah, trying to uplevel the innovation to other parts of the company is kind of a key part of working at this company at scale.

Wes Bos

And what kind of code are you primarily writing with AI? Like, is this React components and CSS, or is this back end code? And, like, I don't even know what the stack of Coinbase is.

Guest 1

Yeah. So we're primarily React, TypeScript on the front end, and very heavily Go on the back end. Traditionally, we were kind of a Ruby shop and have grown into a Go company over time. There's been a huge push early on to grow test adoption. And this is probably where a giant bulk of our AI code was written early on: I wanna use Cursor to go and bring in the test coverage that I need. So we started, not mandating, but putting bars in place, saying, okay, if you wanna use Cursor, if you wanna use all these AI tools, you're gonna have to meet these certain criteria of test coverage, deployment frequency, scorecards, etcetera.

Guest 1

And so teams early on would say, okay, we're below the bar, we're gonna write a bunch of tests. And it's one thing to have high test coverage, but it's another thing to have them be very accurate tests. And we've created certain tools for this, providing the right sorts of context, mocks, etcetera, to create really good tests. So, yeah, early on, a lot of test coverage. I think where we're seeing the bulk of the company write AI generated code today is probably 60% TypeScript.

Guest 1

I think it's just a much more fungible place that we're able to iterate on pretty quickly.

Guest 1

And, yeah, it traditionally isn't quite as sensitive or security-critical as some of the more back end systems.
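
For a sense of the test backfill described above, a minimal sketch with vitest; formatBalance is a hypothetical helper, not a real Coinbase module:

```ts
// The kind of test AI tooling gets asked to backfill: small, behavioral,
// and checked against real expectations rather than just raising coverage.
import { describe, expect, it } from "vitest";
import { formatBalance } from "./formatBalance";

describe("formatBalance", () => {
  it("renders minor units with two decimals and a symbol", () => {
    expect(formatBalance(1500n, { decimals: 2, symbol: "USDC" })).toBe("15.00 USDC");
  });

  it("handles zero without a sign", () => {
    expect(formatBalance(0n, { decimals: 2, symbol: "USDC" })).toBe("0.00 USDC");
  });
});
```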

Wes Bos

And is that, like, internal tooling, or is that customer facing? Like, is that coinbase.com?

Guest 1

Definitely both. Yeah. Definitely both. Coinbase.com is a pretty highly attractive repo for AI today.

Wes Bos

And how much Go does it write? You say it's mostly TypeScript, but does it still dip into Go as well?

Guest 1

Oh, yeah. A lot of our back end engineers are using Cursor. You know, Go is a static language, and it's actually a pretty simple language.

Guest 1

So it's a great use case for AI. Look, again, these are core systems. And in order for these teams to be using AI, we need to keep the quality part really high. And, yeah, there's so many different ways to do that, but we're not just letting people vibe code and push to production in core systems by any means.

Scott Tolinski

Do you find that your AI tools handle some aspects of the code base better than others? Like, I know you've mentioned languages specifically, but even as far as languages go?

Guest 1

Yeah, there's different tools that have done well. So we've heard Solidity devs, some of our kind of blockchain systems engineers, were not having great luck with Cursor and however Cursor is enhanced on their context. So they started using Claude Code, and whatever Claude Code's training data or sampling is from, it's just a lot higher of a bar for them. It's trained on more Solidity data. So there's different tools that I think map to different code bases pretty well. I think the very generic parts of developers' jobs, setting up a new route, setting up a new screen, the copy and paste of what we're already doing... I wanna create a new route, essentially.

Guest 1

Very good at that. It's very good at pattern matching and pattern learning.

Guest 1

More specialized, new architecture is not necessarily the place that I would leverage AI. I think pushing it into a place where there's not existing parts of the code base to look at, maybe the best practice that it's proposing is not the best practice that you as an engineer agree with. So letting it write net new is not necessarily the best use case for it.

Wes Bos

Interesting. Because we get that question all the time, and maybe you could answer this. Someone says, we have an absolutely massive code base. We have strict ways that things need to be done. If someone comes to you and says, hey, we have this code base that's 15 years old, there's ways that we do things, and we do have some tooling around it...

Wes Bos

How do you get AI to write code that looks and acts like the code that we have already written? What would you tell someone to do when they fire up Cursor? What tools do you need in order to support that?

Guest 1

Onboarding new tools to existing systems, and kind of downloading all the cultural context that you've created, is an ongoing task. We've had developers have good success with, you know, the CLAUDE.md files or the Cursor rules.

Guest 1

Internally, we've created code conventions, kind of AI code conventions, that are applied generically to different tools.

Guest 1

And, you know, historically, this was something that we've always done.

Guest 1

Two, three years ago, we'd have weekly, monthly meetings of senior staff engineers getting together saying, what is the best way to do tests? What is the best way to use state management? And we'd kinda put it on paper, document it, set it in stone, and preach that throughout the rest of the company.

Guest 1

And now we're able to codify that quite well with AI. So it takes time, and there's some tools out there that are doing it quite well in terms of learning about your code conventions over time.

Guest 1

Greptile is one review tool that we've started experimenting with at Coinbase, where it actually looks at review comments, and it starts to build an internal registry of things that it's reviewed on, or things that other developers have called out in review.

Guest 1

And so that's where I think the long tail problem is going to be: around capturing context in a really nice way. I think the CLAUDE.md file, I've seen a lot of engineers start with that. They say, hey, I got Claude into a codebase. It committed a file.

Guest 1

Maybe it gets updated in the future. Doc writing is something that different teams have always had different levels of success with. How can we automate that with AI, to then turn those into AI rules? It's a conversation and tooling that we're building. But, yeah, for teams that are just getting started, there's different levels of success with different AI tools. I think Cursor has its own way of indexing and creating a map of your code base, whereas Claude Code just greps everywhere, and it kind of manually crawls your code base. So experiment with the right tool for the size and shape of your code base is my suggestion.

Wes Bos

I'm really excited about the new Go based TypeScript compiler, because I think that, given how fast it is, it's going to make it much easier for the prompts to kinda spider through the pieces of the code base that it needs. So I'm curious to see how that plays out.

Scott Tolinski

Yeah. Integrating that with an agent would be really sweet. I think it's just a matter of time with that. And if you want to see all of the errors in your application, you'll want to check out Sentry at sentry.io/syntax.

Scott Tolinski

You don't want a production application out there that you have no visibility into, in case something is blowing up and you might not even know it. So head on over to sentry.io/syntax.

Scott Tolinski

Again, we've been using this tool for a long time, and it totally rules. Alright.

Wes Bos

So now that Coinbase is writing 50% of its code with AI, do you just fire half of your staff?

Guest 1

Definitely not. Yeah.

Guest 1

So let's talk about velocity. You know, there's always a demand for more velocity.

Guest 1

Everyone says, we wanna create features faster, we wanna iterate faster, we wanna be hyper competitive. What we hope to see... and this is where the metrics are very hard today. So, yeah, we can talk a little bit about metrics here, where code coverage is probably one of the only finite output metrics that we really have nailed down today. We can say AI generated lines of code is going up as AI usage is going up.

Guest 1

On the other side, we have a lot of input metrics. So we have input metrics around what number of AI tokens engineers are using. And we see that growing at an adoption rate right now. It'll probably get to a resting rate, until some sort of tool comes along that consumes twice the amount of tokens. What we really hope to see is developer velocity increase in a very long tail way.

Guest 1

And right now, there's a lot of issues in that developer velocity story. There's issues around: I, as a developer, am able to produce three PRs a day.

Guest 1

Awesome. But reviewing those three PRs a day is still just as painful as ever, or testing those is just as painful as ever. Or I'm staggering out the deploys, because the way that we promote code hasn't really been modernized in a while. So there's all these existing pipeline problems where, as you put more code into the wood chipper, you're gonna create this backlog downstream.

Guest 1

You're going to see these systems not scale well. And that's, I think, what we're really hitting internally right now. The other side of it is, engineers are able to focus on more long term work. They're able to focus on things that they weren't able to before, because I had to produce a feature by the end of the week, but I also had to go and ensure that our SLO reporting was right, that our customer complaints were addressed, that the triage of our bug tickets happened correctly. So there's all these work-around-the-work tasks that people are now able to spend more and more time doing, which, hypothetically, should increase the quality over time.

Wes Bos

Yeah. If you're always putting out fires like that, there's no time to do the deep work, and then the product will suffer.

Guest 1

Exactly. Yeah. So, long story short, more code written by AI doesn't necessarily translate to fewer engineers in the long run. You know, perhaps it does scale that at some point. Perhaps teams won't be constantly hiring forever, because, obviously, with organization building, there reaches a certain sweet spot where more people just don't solve more problems better.

Wes Bos

Yeah. And you mentioned some stuff around code review as well. That's a bottleneck right now. And I see it in so many of these open source projects, especially some of the open source projects where it requires a very specific skill set. Like, I was looking at the OBS WebSocket implementation in C the other day. There's one guy that understands how that works. And it's really frustrating, because you see all these pull requests of people being like, I don't know C, but I just typed the problem into a box, and maybe this works, let me know. And now there's this huge backlog, and we've got one guy who knows what he's doing. So is AI helpful in that code review space as well?

Guest 1

Absolutely.

Guest 1

So, yeah, the example of the one engineer who is responsible for an entire open source library: this is everywhere.

Guest 1

When we adopted Expo early on at Coinbase, I remember there was, like, one engineer who maintained a V8 fork for Android.

Guest 1

So shout out to Kudo, if you are listening, because, yeah, he's keeping so many code bases alive. But, yeah, the code review step is probably more important than ever. This is where there's more demand, and there's probably so much opportunity here too.

Guest 1

And what code review has traditionally been, especially in open source and kind of very async teams, is: okay, on Monday someone submitted a PR. Awesome. I left some comments. I repeated that on Wednesday, left a couple more comments.

Guest 1

We were gonna get a good merge on Thursday, but then, you know, CI broke, so we're still triaging it. And that's pain.

Guest 1

And so what if we could automate 30 to 40% of that? What if we could say, okay, that first round of comments, the low hanging fruit, just have AI run through that with these code conventions that we've built, with this knowledge graph that we've developed. How do we take out the initial round of review? Cool. What if we get to a point where CI and integration is a little flaky, something's breaking in the PR? Why not just have a Claude Code and GitHub Actions bot triage that? Say, hey, it looks like a test was changed on master.

Guest 1

Here's a piece of code to fix that, and now your branch is back up to date. So, kind of the branch health. And so if we can fix some of these pain problems around code review, as well as scale the idea that everyone's gonna receive a really high quality review by default, that's probably one of the biggest opportunities as more code is written and quality bars don't wanna be diminished.
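
A minimal sketch of the branch-health half of that idea, assuming Octokit and a GitHub token in the environment; the AI triage step is left out, this only automates the "update branch" part:

```ts
// Keep open PRs current with their base branch so review isn't blocked
// on stale merges. Repo names and token handling are placeholders.
import { Octokit } from "@octokit/rest";

const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });

async function refreshStalePrs(owner: string, repo: string): Promise<void> {
  const { data: prs } = await octokit.pulls.list({ owner, repo, state: "open" });
  for (const pr of prs) {
    // mergeable_state is only populated on the single-PR endpoint.
    const { data: full } = await octokit.pulls.get({ owner, repo, pull_number: pr.number });
    if (full.mergeable_state === "behind") {
      // Same as clicking "Update branch": merge the base into the PR branch.
      await octokit.pulls.updateBranch({ owner, repo, pull_number: pr.number });
      console.log(`updated ${repo}#${pr.number}`);
    }
  }
}
```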

Scott Tolinski

Yeah. Man, it's so fascinating to hear you talk about these systems at such a scale, considering it feels like a giant version of the same things that I'm doing on my own. It's so affirming, in a way, to know that these are the systems that work at this type of scale.

Scott Tolinski

And it's also fascinating how you guys are essentially reducing latency there. You're reducing that latency to iterate, to fix, all that stuff. And this is kind of a broad question, but what's the one tool in your arsenal right now that gets, like, the most bang for the buck?

Guest 1

I mean, yeah, Claude Code is just such an amazing tool. And I'm gonna go back to that, because it's a great primitive. And the way that we've brought it to Coinbase is, we have our own internal list of models that we've allowed, and they're provided through a LiteLLM proxy, and we're able to configure that with our tools.

Guest 1

And so Claude Code is very portable. We're able to use it in GitHub Actions, like I mentioned. We're able to use it in containerized environments and scripted environments.
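
A minimal sketch of what calling models through such a proxy can look like, assuming LiteLLM's OpenAI-compatible endpoint; the base URL, env var, and model alias are made up:

```ts
// Routing a script or bot through an internal OpenAI-compatible proxy,
// so the org controls which provider models are actually reachable.
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://llm-proxy.internal.example.com/v1", // LiteLLM proxy endpoint
  apiKey: process.env.INTERNAL_LLM_KEY,
});

const res = await client.chat.completions.create({
  // An alias the proxy maps to an allowed provider model.
  model: "claude-sonnet",
  messages: [{ role: "user", content: "Summarize this stack trace: ..." }],
});

console.log(res.choices[0].message.content);
```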

Guest 1

And you're able to provide so many different pieces of context to it, and it's this kind of puzzle piece that just fits in so many different systems. You know? We wanna build more advanced triaging bots: hey, our Sentry instance picked up an error, and Claude Code picked it up, tried to work on it, and here's a PR for it. I think those types of systems interactions have a great future in front of us, and reduce the latency like we're talking about. So a new bug is detected.

Guest 1

Instantly, we're able to string together these automations.

Guest 1

And Claude Code is probably one of the best maintained, most feature-full pieces of software that's out right now. I haven't had the same experience with Codex or, I guess, other offerings on the market.

Guest 1

But, that said, we kind of need tools that are specialized to models. I think Claude fits that really nicely.

Guest 1

And I think perhaps the entire developer ecosystem right now is just banking on Anthropic being a long term competitive partner here.

Guest 1

We hope that, hey, they just keep creating a great tool that lets us write code. But what if we get to a day where that support runs out, and we don't have a tool that's built quite as ergonomically around a model? A little bit of a risk. You know, we kinda felt that with GPT-5 coming out: hey, what's gonna be the right tool here? Cursor's working on this problem as well. They wanna be more than an IDE. They're offering a full suite of different integrations with their Cursor agent.

Wes Bos

Man.

Wes Bos

And what about cost for all of this stuff as well? Like, are you sensitive to how much this costs, or are you all just on the, like, what, $29 a user per month or whatever? Or are you straight up API, spending thousands of dollars on a feature?

Guest 1

Yeah. So costs are an interesting one. You know, a broad statement right now: relative to the cost of an engineer's time, model costs are just... it's not quite there. So we wanna let engineers use the model that they think is the right tool for the job. This is, I think, still a gap in the tooling: why am I, as an engineer, having to think about not only the problem I'm solving, but what model do I send it to? What tool do I use? What context window? These are all little problems I think are gonna be ironed out over time.

Guest 1

I think there's Amp in the space that's doing a really nice job here, having an oracle that kind of decides what is the best model to use for a task. And, yeah, cost optimization is not something that I'm necessarily concerned too deeply about today. I think clamping down on Opus overusage in certain environments is definitely something to keep an eye on. But with Cursor, we've been able to work out some pretty good cost controls and some good model management features.
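
A toy sketch of that oracle-style routing idea; the task shapes, thresholds, and model aliases are invented for illustration:

```ts
// Pick a model by task shape instead of making the engineer decide.
type Task = { kind: "completion" | "refactor" | "architecture"; contextTokens: number };

function pickModel(task: Task): string {
  if (task.kind === "completion") return "small-fast-model";     // cheap tab completion
  if (task.contextTokens > 100_000) return "long-context-model"; // big-repo questions
  if (task.kind === "architecture") return "frontier-model";     // Opus-class reasoning
  return "mid-tier-model";                                       // everyday edits
}

// pickModel({ kind: "refactor", contextTokens: 8_000 }) → "mid-tier-model"
```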

Wes Bos

That's good. And what about security as well? Is it fine just to be sending Coinbase's code to these random models? You guys feel comfortable with that?

Guest 1

Yeah. I mean, I think every org should be comfortable with the zero data retention model.

Guest 1

You know, every model provider is doing this. And I think where it gets a little fuzzy is different integrations having indexing, having some sort of database persistence of, I guess, context that is provided to them. That's the messy part that we have to go through our due diligence with every time we work with a different tool.

Wes Bos

Yeah. I didn't think that you hadn't thought about it, but I'm sure any time you talk about, especially, something that has money, you know, essentially a bank, people are... Yeah.

Guest 1

Yeah. I mean, just to, like, double click on that point a little bit more: I've been so impressed by the level of diligence with our security teams.

Guest 1

And, I think, the willingness to move at a speed that lets us keep up with this ecosystem. This is a very fast moving ecosystem.

Guest 1

We've worked out a framework of where risks are. A year or two ago, you know, AI was very spooky.

Guest 1

It was: we don't want it to train on our data.

Guest 1

We don't know where the data is going. We need data controls to make sure the right types of data are going to AI. And, yeah, I think now we have a really mature framework that limits any sort of exposure.

Guest 1

Yeah. The teams just ultimately need to get comfortable with certain model providers. We have a whitelist internally of, essentially, what models we've deemed appropriate, based on some kind of analysis, and we don't use any models that are, I guess, excluded from that list.

Scott Tolinski

So I would imagine you guys are tuning these systems constantly.

Scott Tolinski

Like, how often are the knobs being tweaked on these systems that you're implementing?

Guest 1

I think it really depends on the system and the technology. So, we don't do much fine tuning. I don't think developer productivity is a place that necessarily warrants fine tuning. I think there's other parts of the business where it makes a lot more sense: some of our compliance controls, and ML teams, the work that they're doing. On the kind of fine tuning, tweaking, turning-the-knobs side, I think managing context windows is one of our daily tasks. Are you providing a single query that you want really finite results for, or are you providing just a giant piece of context for the model to iterate through? So that's one challenge. And then some of our more bespoke systems, so the LangChain, LangGraph agents: you can mess with temperature controls and kind of different interpretations. You're kinda creating a persona at that point, and that's where a lot of the tuning comes in. Have you guys created any agents with, I guess, LangChain or Mastra, any of the kind of agentic frameworks?

Scott Tolinski

No, I haven't. I've been mostly just, like, augmenting mine to death as I go. Just my Claude agents or my opencode agents. I'm tweaking them constantly, but I haven't been using any

Guest 1

outside tools other than those. I was just gonna say, it's kind of a fun prototype space to be in, though, right, with Claude Code? Oh, yeah. It's this thing that you can just manipulate in different ways. You can add bash scripts around it. It's endless. And it's actually become a problem for me, because I'm just constantly messing with it instead of building with it. So, yeah.

Wes Bos

Yeah. That's the hard part.

Wes Bos

And kinda on that point, I wanted to talk about using AI as part of the application. This whole time we've been talking about using AI to write code.

Wes Bos

And I feel like we don't necessarily talk about putting AI in the application itself. And you said one of your tenets was AI should never make a decision, which means that, yeah, you probably shouldn't put that as part of some of your flows. But at Coinbase, what does it look like for people who say, instead of having to build some sort of algorithm for this, or a function, can we just throw it to an AI and come back with the result?

Guest 1

Yeah. That's outside of my scope, primarily. Broadly, the suggestion has been, and we have a lot of controls around this: if you plan on there being some sort of AI touching any part of the customer journey, there's a very long process to go through to enable that, because we want it to be highly accurate, for it to be essentially scaled in the right way. And so that's a very different problem set. I actually think one really interesting area is one that I haven't seen us leverage yet.

Guest 1

You can do the on-device SDK. So you can use Apple Intelligence through an Expo SDK now. And I don't have any good use cases quite yet, and I don't know how you do things like observability. You know, I have so many questions about it. I'd love to try to use it in a personal project or something.

Wes Bos

Man, the one example I always give people... I'm big on the on-device, or the smaller model that is good for one thing. But just one simple example is, we should be done with date pickers. We should be able to just type in "the last three months." And anytime I talk about this, people are like, well, there's a library that's pretty good at regexing it. Like, no. Hot garbage. I built one, and I wanna say "the last game that Michael Jordan played" until "three weeks before somebody specific was born." You know? I wanna be able to type in literally anything I want, and for it to grab the information, figure that out, and bring those into Unix timestamps for me.
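
A minimal sketch of that "type anything, get Unix timestamps" idea, assuming an OpenAI-compatible endpoint; the model name and prompt are illustrative, and a small on-device model could fill the same role:

```ts
// Turn a free-form phrase into a { start, end } range of Unix timestamps.
import OpenAI from "openai";

const client = new OpenAI();

async function parseDateRange(phrase: string): Promise<{ start: number; end: number }> {
  const res = await client.chat.completions.create({
    model: "gpt-4o-mini",
    response_format: { type: "json_object" },
    messages: [
      {
        role: "system",
        content:
          `Today is ${new Date().toISOString()}. Convert the user's phrase into ` +
          `JSON: {"start": <unix seconds>, "end": <unix seconds>}. JSON only.`,
      },
      { role: "user", content: phrase },
    ],
  });
  return JSON.parse(res.choices[0].message.content ?? "{}");
}

// parseDateRange("the last three months") → { start: ..., end: ... }
```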

Scott Tolinski

And for me, it's spelling, Wes. Google is still the best spell checker, because of whatever machine learning they're doing in spelling. Right?

Guest 1

Like, Apple spelling, you could be one letter off, and it's like, I don't know, no suggestion. Yeah. You gotta copy paste it into Google to find the real spelling. Right. Yes. Come on. So I think all problems boil down to forms again. It's like, if we're doing forms online, are you saying all these forms should just be AI clicks now?

Scott Tolinski

Yeah, the whole thing about the future of UI in general. Like, okay, we have all these bespoke UIs. I love bespoke UIs, but you always have to learn them. You have to figure them out. When, ultimately, you're trying to accomplish a task, what does that look like? Right? I don't know. There's too many heady things there when you get into too much of everything becomes a chatbot. You know?

Guest 1

I think there's an opportunity here to create an open source library that's just, boom, a single form that says, hey, tell me about yourself. Yes.

Guest 1

How many days after the Super Bowl were you born? Yeah. Boom.

Guest 1

And that's your validation.

Guest 1

Send it off to a back end server.

Wes Bos

Oh, that'd be great.

Wes Bos

Anything we haven't touched upon yet that you'd like to talk about?

Guest 1

You know, the developer journey is really the biggest piece of what we're focused on. Lines of code is exciting. Like I mentioned, it's an output metric today. All the input metrics kind of go into how someone does their job day to day. We're in this crazy time where we don't wanna put too much of a microscope on that, or try to massage that story too much. And I think AI is simply a tool at the end of the day. It's not going to replace what developers are making decisions on. It's not a magic bullet that's gonna solve every problem universally with the right ergonomics or success output.

Guest 1

There's been some cool studies online, like the METR study. I don't know if you guys saw that one, but they sampled 15 or so open source developers on how AI impacted them. And it ultimately produced some negative results, saying, yeah, it didn't make us faster. In fact, it made us slower. And digging into that study a little bit was really interesting, because you learn, okay, there were some senior-staff-ish level engineers who were just mandated to use AI for every task, and that actually slowed them down. So it's not something that I think we can dictate to every engineer across our company, across any company, and say, you have to use AI 100% of the time. It's just not going to be right. And, ultimately, what I hope for in the next three, four, five years, especially as we're hiring more entry level engineers, is that we have AI natives. I wanna see folks who come to Coinbase saying, this is the model that I prefer to use, these are the tools and kind of the workflow that I really like to work with.

Guest 1

How can Coinbase fit that for me? Yeah. Like, I think we're gonna get to a place where there's more AI natives, I guess, than there are people who are just poking at it in different ways.

Wes Bos

And I saw something the other day where someone said, if you wanna get hired, you're gonna come to a job interview with your backpack full of these 20 agents that you built, and they're gonna wanna hire you because you're bringing the guns. Do you think we will get to a spot where we hire somebody because they've got the sauce?

Guest 1

I mean, yeah, haven't we always kind of done that in some ways? Oh, you know, show me your GitHub profile. We obviously wanna bring on people who have the experience to solve the problem however they need to, in a really high quality way. I mean, on that backpack analogy: most jobs I've gone to, I've transported around my bash profile, or all the kind of little dotfiles that you come with, and those were my productivity hacks. I think now, I wouldn't be surprised if we start hiring engineers who say, hey, I have this agent that I use. Can we onboard them too?

Scott Tolinski

Yeah, it's so funny, Wes, because that's such a good way to say that: can I onboard this agent? But I remember when I got my job at Ford, one of the reasons that interview went so well is, when they came to me, I was like, I have this whole Sass, CSS toolkit, and they were like, we want that. Yeah. That's true. You know? Yeah.

Wes Bos

Same thing, man.

Scott Tolinski

Alright. So now's the part of the show where we get into sick picks and shameless plugs. Sick pick being something that you're just enjoying in life right now; shameless plug, anything you wanna plug. Kyle, do you have a sick pick for us today?

Guest 1

I guess, things that I'm enjoying in life right now. So I'm currently on my four by four sabbatical, and I'm just doing a ton of woodworking around the house. Nice. Oh, right on. And, yeah, it's going great. So I don't really have any sick picks here. Maybe the Ultra Shelf people, who are just creating these amazing floating shelf brackets.

Guest 1

I don't know. But, yeah, I think if you're an engineer, someone who develops in front of a screen all day, find some sort of thing to do with your hands is kind of my pitch to people. I've been doing pottery, woodworking, trying to do things very away from screens in my free time, which is just such a dichotomy, coming back on and now moving at the speed of light with agents. But it really helps to, I think, solve problems in a different way. So, would recommend that. And then, excuse me, what was the second piece here? Just anything you'd like to plug. Yeah.

Guest 1

Shameless plug.

Guest 1

Yeah. You know, I don't really have any plugs right now. I'm currently really loving the developer ecosystem, seeing what Cursor and those teams are doing. I think the plug I have is probably just to follow along with what Coinbase is doing with our agentic systems.

Guest 1

We're actually doing a lot in kind of the cryptocurrency agentic protocol sphere right now, and there's some really cool work being done around the x402 protocol with Cloudflare. I think this is going to unlock kind of a future of agentic payments, and, you know, how agents transact with it.

Guest 1

So, yeah, I don't wanna necessarily shill it to everybody, but it seems like a really interesting space to start thinking about: as you're building perhaps ecommerce or any sort of checkout steps, what if an agent were to interact with that, and how do they do that? How do you transact value between two individuals? Yeah, that's probably my plug for the day. Sick. Love it.
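
A loose sketch of the x402 flow as publicly described, where a server answers 402 with payment requirements and the client retries with proof of payment; the header name and payload handling here are assumptions, not a drop-in client:

```ts
// Treat HTTP 402 as "pay, then retry."
async function fetchWithPayment(
  url: string,
  pay: (requirements: unknown) => Promise<string>, // signs/settles a payment, returns proof
): Promise<Response> {
  let res = await fetch(url);
  if (res.status === 402) {
    const requirements = await res.json(); // amount, asset, pay-to address, etc.
    const proof = await pay(requirements);
    res = await fetch(url, { headers: { "X-PAYMENT": proof } });
  }
  return res;
}
```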

Wes Bos

Awesome. Well, thank you so much for coming on. I appreciate it. This is such a fantastic interview. You're incredibly well spoken.

Wes Bos

I feel like we have been missing this context, this inside scoop, in the industry. So I appreciate that you're able to come on here. And props to Coinbase as well, because every time we ask these companies, hey, can you come on and talk about it, they're like, sorry, legal killed it.

Guest 1

Yeah, I appreciate that. Love to share what we're building in the open.

Guest 1

I've chatted with other eng leaders from different companies, and, you know, it's such an interesting time right now to be building. So, yeah, glad I could share and kinda give a peek at what we're doing.

Scott Tolinski

Yeah. Beautiful. Thank you so much, Kyle. Massively, massively interesting and valuable here.

Guest 1

Absolutely.
