Every AI-curious company eventually faces the build vs buy decision. Do you build custom AI for your specific problem, or buy an off-the-shelf SaaS that promises to solve it? The answer is never universal — it depends on the problem, your scale, your differentiation strategy, and how your team is set up. But the frameworks to make the decision are straightforward, and the common mistakes are predictable.

This post is the framework we walk clients through when they ask us this question — and sometimes the framework leads us to recommend they not hire us. Good framing saves everyone time.

Four tiers

Off-the-shelf SaaS → configured SaaS → custom on foundation APIs → full fine-tune/pre-train. Most projects that think they need tier 4 actually belong at tier 3.

The four tiers of AI sourcing

Framing this as a binary choice is the first mistake. There are four tiers, each with different economics, timelines, and risk profiles.

Tier 1: Off-the-shelf SaaS (pure buy)

You subscribe to a vendor's AI product. Zero engineering investment, monthly fee. Examples: Intercom Fin for support, Gong for sales intelligence, Jasper for marketing copy. Time to value: days. Customization: minimal.

Tier 2: Configured SaaS

Same as above, but with meaningful configuration, integration, and data ingestion work. You pay for the SaaS plus consulting to make it fit your process. Example: integrating Salesforce Einstein across a complex sales org. Time to value: weeks to months. Customization: moderate (within the vendor's guardrails).

Tier 3: Custom built on foundation APIs (hybrid)

You build your own application logic using commercial LLM APIs (OpenAI, Anthropic, Google). You own the UX, the prompts, the retrieval layer, the integration. You don't train models. Time to value: 2-4 months for a first ship. Customization: very high.

Tier 4: Full custom (pure build)

You fine-tune or pre-train models, own infrastructure end-to-end, and treat AI as core to your product moat. Time to value: 6+ months, often years. Customization: unlimited. Cost: 10-100x that of tier 3.

Where most companies should be

Tier 3 (custom on foundation APIs) is the right tier for roughly 70% of the companies that think they need tier 4, and for about half of the companies that started at tier 1 and have outgrown it. This is where we do most of our work.

The framework: when to pick each tier

Three axes drive the decision: differentiation (how much does this system need to feel like your product?), data (how unique is your data, and how much does the system need it?), and scale (how much volume will this see?).

Pick tier 1 (off-the-shelf) when:

The problem is generic (most teams have it), not specific to you.
Differentiation isn't through this capability — it's operational overhead you want to solve and move on.
A good vendor exists and has the core features you need.
Total annual spend on the capability is under ~$100K (building custom is rarely worth it below this threshold).

Pick tier 2 (configured SaaS) when:

The generic vendor product is 70-80% of what you need.
The gap is integration work — fitting the tool into your specific process, data, and systems.
Your team has configuration and integration capacity but not AI engineering capacity.

Pick tier 3 (custom on foundation APIs) when:

The capability is or will become part of your product, not just internal tooling.
Your data is specific enough that generic vendors do poorly.
You want to own the UX completely — no vendor logo, no vendor UI.
The workload produces more than $100K a year in AI-driven value or more than $20K a year in API spend (below this, the cost of custom engineering exceeds the savings).
You have or can hire the engineering capacity to maintain it.

Pick tier 4 (full custom with training) when:

Your domain is sufficiently specialized that foundation models underperform even with good RAG.
You have millions of domain-specific examples to train on.
AI is a core differentiator of your business (not a feature, but the product itself).
You have or can hire ML researcher-level talent, not just engineers.
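The checklists above can be condensed into a small rule function. What follows is a rough sketch rather than a definitive scoring model: the `Workload` fields and the `recommend_tier` name are hypothetical, the tier 3 check is simplified to three of the listed conditions, and the 0.9 cutoff for "vendor covers enough" is our own assumption. Only the $100K, $20K, and "millions of examples" thresholds come from the lists.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """One AI workload. All field names are hypothetical, chosen for this sketch."""
    part_of_product: bool               # capability is or will become part of the product
    ai_is_the_product: bool             # AI is the core differentiator, not a feature
    foundation_models_fall_short: bool  # they underperform even with good RAG
    domain_examples: int                # domain-specific training examples available
    has_ml_researchers: bool            # researcher-level ML talent, not just engineers
    has_ai_engineers: bool              # capacity to build and maintain a custom system
    annual_value_usd: float             # AI-driven value per year
    annual_api_spend_usd: float         # foundation-API spend per year
    vendor_fit: float                   # 0.0-1.0 share of needs a vendor covers (estimate)


def recommend_tier(w: Workload) -> int:
    """Return a suggested tier (1-4), checking the most demanding tier first."""
    # Tier 4: every condition in the tier 4 checklist must hold.
    if (w.ai_is_the_product and w.foundation_models_fall_short
            and w.domain_examples >= 1_000_000 and w.has_ml_researchers):
        return 4

    # Tier 3: product capability, enough volume, and people to maintain it.
    enough_volume = w.annual_value_usd > 100_000 or w.annual_api_spend_usd > 20_000
    if w.part_of_product and enough_volume and w.has_ai_engineers:
        return 3

    # Assumption: a vendor covering at least 90% of needs is a clean tier 1 fit;
    # anything less implies configuration and integration work (tier 2).
    return 1 if w.vendor_fit >= 0.9 else 2


billing_escalations = Workload(
    part_of_product=True, ai_is_the_product=False, foundation_models_fall_short=False,
    domain_examples=20_000, has_ml_researchers=False, has_ai_engineers=True,
    annual_value_usd=250_000, annual_api_spend_usd=30_000, vendor_fit=0.5,
)
print(recommend_tier(billing_escalations))  # 3
```

Checking tier 4 first and falling through to tier 1 keeps the rules short, but a real assessment weighs these factors with judgment rather than hard cutoffs.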

Common mistakes

Mistake 1: Overshooting to tier 4 too early

Many ambitious teams start by talking about fine-tuning and custom models before they've tried RAG with a good foundation model. See our post on when to fine-tune — in 80% of cases, tier 3 with good RAG patterns beats a custom-trained model in both cost and quality. Tier 4 is rarely the right starting point.

Mistake 2: Staying at tier 1 when you've outgrown it

Companies pick a tier 1 SaaS, hit its limits at scale, work around the limits with hacks, pay premium prices for limited differentiation, and eventually end up with a worse system than if they'd gone tier 3 from the start. The signal: you're working around the vendor's limitations more than using its features. Time to rebuild.

Mistake 3: Tier 3 without the engineering capacity

Tier 3 only works if you can staff the ongoing maintenance. The system needs an owner, an eval framework, and a monitoring stack. Teams that build tier 3 and then can't maintain it end up with a custom system that degrades and eventually gets replaced, often by a tier 1 SaaS, at 2x the total cost.

Migration paths between tiers

Most companies end up at multiple tiers simultaneously — tier 1 for some things, tier 3 for others. That's fine. The pattern we see most often and recommend:

Start tier 1 for anything where a good vendor exists. Prove AI value cheaply.
Graduate to tier 3 for the things that matter most to your product. Usually 1-2 workloads.
Stay at tier 1 for internal tooling that isn't worth custom work.
Only consider tier 4 after tier 3 plateaus and you have clear evidence that foundation models are the bottleneck.

The economics at each tier

Rough ranges from our experience:

Tier 1: $10-100K/year in SaaS fees. Zero build cost. Predictable.
Tier 2: $20-200K/year SaaS plus $50-300K in one-time implementation.
Tier 3: $50K-$500K build cost, $50-500K/year operational (API + infra + engineering).
Tier 4: $500K-$5M+ build cost, $200K-$2M+/year operational.

For a full cost breakdown including the hidden categories, see our LLM TCO post.
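To compare tiers over a planning horizon, the ranges above can be rolled up into a multi-year total. This is a back-of-the-envelope sketch: the `TIER_COSTS` table and `total_cost_range` function are names we made up for illustration, tier 2's one-time implementation is treated as its build cost, and the open-ended "+" upper bounds for tier 4 are treated as fixed numbers, so the tier 4 ceiling is understated.

```python
# (build_low, build_high, annual_low, annual_high) in USD, copied from the ranges above.
# Tier 4's upper bounds are open-ended ("+") in the text; they are capped here.
TIER_COSTS = {
    1: (0, 0, 10_000, 100_000),
    2: (50_000, 300_000, 20_000, 200_000),
    3: (50_000, 500_000, 50_000, 500_000),
    4: (500_000, 5_000_000, 200_000, 2_000_000),
}


def total_cost_range(tier: int, years: int) -> tuple[int, int]:
    """Return (low, high) total cost: one-time build plus annual cost times years."""
    build_low, build_high, annual_low, annual_high = TIER_COSTS[tier]
    return (build_low + annual_low * years, build_high + annual_high * years)


for tier in TIER_COSTS:
    low, high = total_cost_range(tier, years=3)
    print(f"Tier {tier}: ${low:,} to ${high:,} over 3 years")
```

Over three years this puts tier 1 at $30,000 to $300,000 and tier 3 at $200,000 to $2,000,000, which is why the value threshold for going custom matters.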

The messy middle: partial build

The most valuable pattern we see: take a tier 1 SaaS for 70% of the problem, then build tier 3 custom for the 30% where the SaaS can't deliver. Example: a client uses Intercom Fin for generic support questions but built a custom tier 3 pipeline for their billing escalation flow where their data is too specific for Intercom to handle well. This hybrid approach delivers tier 1 speed for the majority of cases and tier 3 customization where it matters. Most mature AI programs look like this.
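In practice the hybrid pattern comes down to a router in front of two backends. The sketch below is illustrative only: `vendor_answer` and `custom_billing_pipeline` are placeholder functions standing in for the SaaS and the custom pipeline, and the keyword rule is a deliberately naive assumption. A production router would more likely use a classifier or the vendor's own handoff mechanism.

```python
def vendor_answer(question: str) -> str:
    """Placeholder for the tier 1 SaaS; a real system would call the vendor here."""
    return f"[vendor] {question}"


def custom_billing_pipeline(question: str) -> str:
    """Placeholder for the custom tier 3 pipeline (retrieval plus a foundation API)."""
    return f"[custom] {question}"


# Hypothetical routing rule: these keywords mark a question as billing-related.
BILLING_KEYWORDS = ("invoice", "refund", "charge", "billing")


def route(question: str) -> str:
    """Send billing questions to the custom pipeline and everything else to the vendor."""
    text = question.lower()
    if any(keyword in text for keyword in BILLING_KEYWORDS):
        return custom_billing_pipeline(question)
    return vendor_answer(question)


print(route("Why was I charged twice this month?"))  # handled by the custom pipeline
print(route("How do I reset my password?"))          # handled by the vendor
```

Keeping the routing rule in one place makes it cheap to move a category of questions from one tier to the other as the vendor improves or your needs change.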

Buy for operational overhead. Build for differentiation. Know which is which.

Our bias (disclosed)

Eazyware makes money when clients choose tier 3 and hire us to build it, so we are not neutral. The framework above is the one we actually use, and we regularly recommend tier 1 or tier 2 to potential clients when that's the right answer. It saves everyone the disappointment of a tier 3 engagement that shouldn't have started. If in doubt, we'll tell you on a call which tier your problem falls into.

Closing

Build vs buy isn't a one-time decision; it's made per workload and should be revisited every year. A tier 1 SaaS that fit perfectly in year one may be limiting you by year three. A tier 3 build that was necessary in 2024 may be replicable with a tier 1 vendor by 2026. Stay honest about where each workload actually sits, and don't let past decisions dictate present ones.

Build vs buy: when custom AI beats off-the-shelf