I built an enterprise SaaS platform in two weeks. Here's the part most people miss.
There’s a popular narrative in tech right now: AI is the great equalizer. Every SaaS company will be disrupted. Experience doesn’t matter anymore because anyone can “vibe code” a product into existence over a weekend.
I’ve spent 15 years building SaaS products, and I’ve obviously been following the latest in AI closely, as well as using it in our products. But I hadn’t built an “AI-native” product from scratch, so I wanted to test the claims and see what’s actually feasible, and what isn’t.
So I ran an experiment. I set out to build a complete, production-grade SaaS platform from scratch, just me with Claude Code as my primary collaborator, to see how far I could get. Not a prototype: a platform I’d actually be comfortable putting in front of paying customers.
It took just over two weeks, and I’d like to share what I’ve learned so far.
The experiment
The idea was simple: build a real enterprise-grade SaaS platform with all the architectural decisions done right, infrastructure in place for handling scale, and obviously AI-native processes and workflows in place from day one. Multi-tenant isolation, RBAC, SSO, billing, design system, notifications, observability, Kubernetes deployment, AI agents, MCP compatibility, vector search — the full stack, implemented with proven, robust technologies and no vendor lock-in. No shortcuts I’d regret in six months.
As mentioned, I used Claude Code as my implementation partner. I drew on my 15 years in SaaS to direct the architecture, vision, and tech stack, and I made every key structural decision, though I did consult and brainstorm these with Opus 4.6 and other LLMs.
Claude Code wrote essentially all of the code. I initially reviewed everything it generated, but over the two weeks, as my QA process improved and I learned where to trust Claude within the scope of the project, I shifted to reviewing only the code in areas that matter.
The final count: 186 features, 1,600+ automated tests, a full CI/CD pipeline, production Kubernetes deployment, built-in automated agentic engineering workflows, a full design system, and a range of AI features, including agentic workflows and MCP support. All integrated, all tested, all documented.
Agentic engineering, not vibe coding
There’s a term for the “let AI do everything and ship whatever comes out” approach: vibe coding. The term is now commonly used for all AI-driven software engineering, which actually isn’t the original intent of the phrase. Vibe coding is essentially about building stuff for yourself: prototypes, weekend projects, and demos. You only care about what you see in the UI, and that’s obviously fine for those use cases. It’s not something I’d recommend for software that real businesses depend on.
What I did was something different. Simon Willison calls it agentic engineering — using AI as a capable collaborator and agent that operates within an engineering discipline you define and enforce. The distinction matters.
Vibe coding says: “build me a webhook system.” You get something that works well enough as a proof of concept or maybe an internal tool.
Agentic engineering says: “build me tenant-scoped webhook delivery with HMAC-SHA256 signing, retry with exponential backoff, delivery logging, and a pruning job — here’s the reference pattern for tenant-scoped models, here’s the RBAC policy it needs, here are the test categories I expect, and here’s the Definition of Done.” You get something that works in production.
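To make that contrast concrete, here’s a minimal sketch of what the “engineered” version of the webhook spec implies. This is my own illustration with hypothetical names, not the author’s actual implementation; it assumes the signature travels in an `X-Webhook-Signature` header and that a per-endpoint secret is already provisioned.

```python
import hashlib
import hmac
import time


def sign_payload(secret: str, payload: bytes) -> str:
    """Compute an HMAC-SHA256 signature over the raw request body."""
    return hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()


def deliver_with_retry(send, payload: bytes, secret: str,
                       max_attempts: int = 5, base_delay: float = 0.5):
    """Attempt delivery with exponential backoff (0.5s, 1s, 2s, ...).

    `send(payload, headers)` is any callable that raises on failure.
    Returns a delivery log, which a real system would persist and prune.
    """
    headers = {"X-Webhook-Signature": f"sha256={sign_payload(secret, payload)}"}
    log = []
    for attempt in range(1, max_attempts + 1):
        try:
            send(payload, headers)
            log.append(("delivered", attempt))
            return log
        except Exception:
            log.append(("failed", attempt))
            if attempt < max_attempts:
                time.sleep(base_delay * (2 ** (attempt - 1)))
    return log
```

The point isn’t the twenty lines themselves; it’s that the spec pins down signing scheme, retry policy, and logging up front, so the AI has no room to improvise on the parts that matter.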
A typical development cycle looked like this: I’d describe what I wanted, along with architectural and quality constraints. Claude would produce a plan, which I’d review and iterate on until I was satisfied it was well thought out and matched what I needed for the project. Then I’d let Claude go at it: build it, test it, and create a PR. It always built something that pretty much worked. Sometimes it was close to perfect, sometimes less so. I’d then test and/or code review substantial new features and catch the things that only show up after you’ve built the wrong version before, or that just didn’t meet the quality bar of my project (e.g. for usability or simplicity).
What actually surprised me was the quality of the code being produced. Last fall, when I last tried something similar, it didn’t work most of the time, and the quality was still sub-par. Now it almost always works, and the quality is generally good, sometimes even excellent. The pace of development is incredible, and AI already catches a lot of issues I wouldn’t have thought of and helps me make better decisions in many areas.
However, it still makes a lot of mistakes. Just like software engineers, even senior ones, would. Sometimes it over-engineers and introduces unnecessary complexity. Sometimes it’s lazy and goes for hacky solutions. Sometimes it forgets project conventions or critical requirements like testing or documentation.
Experience tells you which parts of the code will become a liability, and what good architectural decisions look like in the context of your business and its requirements. Vibe coding skips that step. Agentic engineering addresses it head-on.
The difference between the two isn’t implementation speed. Both are fast. It’s what you have six months later. Vibe coding gives you something you’ll rewrite. Agentic engineering gives you something you’ll be happy to build on for years to come.
What does “AI-native software” actually mean?
When people say “AI-native software,” they usually mean they’ve added an LLM-powered feature, maybe even as one of the core capabilities of the product. However, I think that’s a bit narrow. AI gives us the opportunity to fundamentally rethink software engineering. That’s why I like to think of it as three distinct layers:
Features built for AI
This is the obvious one, as described above. In my project, I set out to build a platform that ships with an agent framework, tool calling, conversation threading, and streaming responses. Every API endpoint can be automatically exposed as an MCP tool. There’s also a knowledge base with RAG search, plus vector embeddings for all key objects to power a hybrid semantic + keyword search. These aren’t add-ons. They’re structural.
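The post doesn’t say how the semantic and keyword results get merged, but one common way to combine two ranked lists is reciprocal rank fusion (RRF). A sketch of that technique, as an assumption rather than a description of the actual platform:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one combined ranking.

    Each document scores sum(1 / (k + rank)) across the lists it appears in,
    so items ranked well by both the vector search and the keyword search
    float to the top. k=60 is the conventional damping constant.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

RRF is attractive here precisely because it needs no score normalization between the two retrievers, only their rank orders.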
Architecture that AI can work with
This is subtler. I set out to structure the codebase so that an AI collaborator can navigate it, understand the patterns, and make changes that fit. The AI picks the naming conventions, which ensures they’re intuitive to it. Architecture docs live alongside the code. Conventions are explicit and consistent. “Everything as Code” is now the way to go: infrastructure, design system, API contracts, type schemas, and so on.
When Claude needs to add a new CRUD endpoint, there’s a reference implementation to follow. Not because a human will read it, but because the AI will. In practice, this means a CLAUDE.md file with 500+ lines of project conventions, architecture decision records in the .agents/ directory, and a Definition of Done that is machine-readable and enforced for every feature. The AI doesn’t guess at conventions, it reads them. With the proper structures in place, it actually does a great job following them. I would argue a much better job than 95% of engineers I’ve seen.
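The post doesn’t show what the machine-readable Definition of Done looks like, so here’s one hypothetical way to make it enforceable: store it as a markdown checklist per feature and have a CI step fail while any item is unchecked. Format and names are my assumptions.

```python
import re

# Hypothetical format: the Definition of Done is a markdown checklist,
# one "- [x]" (met) or "- [ ]" (unmet) item per line.
DOD_ITEM = re.compile(r"^- \[(x| )\] (.+)$")


def unmet_items(dod_text: str) -> list[str]:
    """Return the DoD items that are still unchecked, for a CI gate to report."""
    return [
        m.group(2)
        for line in dod_text.splitlines()
        if (m := DOD_ITEM.match(line.strip())) and m.group(1) == " "
    ]

# A CI gate would fail the build whenever unmet_items(...) is non-empty,
# printing the outstanding items so the agent (or human) knows what's left.
```

Because the checklist is plain text in the repo, the AI can both read it as an instruction and tick items off as it completes them, and the gate stays trivially auditable.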
Process built around AI collaboration
This is the layer most teams skip entirely. The development workflow assumes AI as a participant throughout the process, not just a code generator. Custom Claude Code skills automate migrations, commits, PR creation, and component scaffolding. Subagents run code review, security review, UX review, and documentation review in parallel after every significant change. CI gates enforce quality before anything gets merged. The AI doesn’t just write code — it reviews, tests, documents, and deploys it. This helps you keep moving fast while still producing great quality.
You can also rethink product-adjacent processes. For example, I now have Claude Code automatically perform a thorough root cause analysis of every error notification that occurs in production. The outcome is a detailed, documented, and well-formatted issue/bug report with actionable, reliable implementation suggestions, both for the fix itself and for improving testing so that similar regressions can be avoided in the future. I then review the issues and let Claude fix the ones worth fixing. The result is a significantly tighter feedback loop than was ever possible before.
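A loop like that can be wired up with surprisingly little glue. This sketch assumes error notifications arrive as JSON and uses Claude Code’s non-interactive print mode (`claude -p`); the prompt wording and report destination are my own placeholders, not the author’s setup.

```python
import json
import subprocess


def build_rca_prompt(error: dict) -> str:
    """Turn a production error notification into a root-cause-analysis prompt."""
    return (
        "Do a thorough root cause analysis of this production error. "
        "Produce a well-formatted issue report with an actionable fix "
        "suggestion and a test that would have caught the regression.\n\n"
        f"```json\n{json.dumps(error, indent=2)}\n```"
    )


def run_rca(error: dict) -> str:
    """Run Claude Code headless on the prompt and return the issue report."""
    result = subprocess.run(
        ["claude", "-p", build_rca_prompt(error)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout  # e.g. file this into your issue tracker for review
```

The human stays in the loop exactly where the post says: you review the generated issues and decide which fixes are worth letting Claude implement.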
By rethinking the entire R&D process, you can dramatically streamline it, accelerating your velocity and throughput while maintaining or even improving quality.
The uncomfortable question
So, does experience matter less in the age of AI? Will AI disrupt every SaaS company?
Yes and no.
On one hand, creating software is now dramatically easier, faster, and cheaper than ever before. This means that ambitious teams can, and will, challenge incumbents with far more experience. It will without a doubt disrupt and displace the ones that won’t or can’t adapt. The industry will also face pricing pressure and need to rethink its approach to value capture.
On the other hand, AI didn’t make my 15 years of experience obsolete. It made those years dramatically more leveraged. Every architectural opinion I’ve formed, every success and failure I’ve internalized, every “never do that again” moment is more valuable than before. AI lets me express and enforce these much more effectively than before, and implement the outcomes in two weeks instead of two years, without needing a large team.
Similarly, the incumbents that leverage their current strengths of existing distribution, customer base, and domain expertise, while embracing the change are probably going to be formidable companies in the future – even if their org chart and business model may look a bit different from today.
Key takeaways
But here’s the thing that really dawned on me: Code is no longer the bottleneck. Yet, the same principles still matter for crafting quality software. You need to know what code to write, and even more so, what not to write. You need to define what good looks like. You need to have and communicate the vision clearly. The AI would happily generate unnecessary complexity, sloppy UIs, and the occasional hacky solution just like any other engineering team would.
My job was to say “no”, “not good enough”, “let’s think this through”. Repeatedly. The hardest part of building software fast is resisting the urge to build too much, to build the wrong things, or to set the quality bar too low for the things that matter.
An inexperienced developer with AI will produce a lot of code, quickly. The infamous 10x developer with AI will produce the right code, quickly. The gap between those two outcomes is still enormous, and I’m not convinced it’s shrinking. The 10x developer is now a 100x developer.
What’s next
If there’s enough interest in the topic, I might dig into the specific lessons from this and future experiments in subsequent posts. For example: what actually matters when building production-ready software with AI, which workflows and practices made the biggest difference for me, what changed compared to traditional development, and what I got wrong along the way.
I’ve always found writing a great way to crystallize my thinking, and sharing what I’ve learned also gives others an opportunity to share their thoughts so that we can reflect on the lessons learned together. So, let me know!
In the meantime, if you want to try out what I built, you can register here and give it a go. I’d love to hear your feedback.