eazyware
Strategy·December 25, 2023·11 min read

AI open-source strategy: what to open, what to hold, why

Models, weights, infra, tools — what to open-source and what to keep proprietary. The strategic logic at AI companies in 2026.

Kushal R.
Engineering lead

Open source in AI is a strategic choice with significant downstream implications. Which layers to open (model weights, training code, inference infrastructure, tooling) shapes ecosystem position, competitive moat, and commercial opportunity. This post covers the strategic logic at AI companies in 2026 and the patterns that work at different company stages.

What to open
Figure: Open source strategy, what to open. Strategic logic: open what drives adoption, not what drives revenue; keep OSS near the business and commercial where unique value lives; trademark and brand matter, so control project identity even when the code is open.
Typically open: SDKs, docs, integrations. Sometimes open: smaller weights, fine-tuning scripts, evals. Rarely open: frontier weights, training data, commercial tooling.

Typically open

SDKs and client libraries. Near-universal. Developers want to see the code they integrate. Opens adoption.

Documentation, examples, cookbooks. Open, permissive licenses. Community can fork and adapt.

Integration layers, connectors. Open-sourced connectors to other tools enable ecosystem growth.

No competitive risk. These don't undermine your commercial product; they enable its use.

Sometimes open

Smaller model weights. Meta's Llama, Mistral's open weights, smaller open models. Open weights can attract developer mindshare even if your commercial offering is larger models.

Fine-tuning scripts. Tools to customize base models for specific use cases. Open-sourcing these can drive model adoption.

Eval harnesses and benchmarks. Open benchmarks (MMLU, SWE-Bench, HumanEval) drive industry progress; open eval frameworks make commercial products easier to evaluate.

Strategic decision. Open if it drives adoption of commercial layers; hold if it undermines them.

Rarely open

Frontier model weights. Anthropic, OpenAI, Google keep their best models closed. Commercial and safety reasons dominate.

Training pipelines and data. The process of training models is valuable IP. Rarely shared.

Commercial tooling. Deployment, serving, monitoring infrastructure that generates revenue. Proprietary or limited license (BSL, SSPL, etc.).
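The three tiers above can be sketched as a small decision helper. This is an illustrative sketch only: the `Component` fields, the `classify` and `should_open` functions, and the example components are assumptions made for the example, not a framework any company is known to use.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    TYPICALLY_OPEN = "typically open"
    SOMETIMES_OPEN = "sometimes open"
    RARELY_OPEN = "rarely open"


@dataclass(frozen=True)
class Component:
    name: str
    core_ip_or_revenue: bool    # frontier weights, training data, paid tooling
    competitive_risk: bool      # could opening it substitute for the paid product?
    drives_paid_adoption: bool  # does opening it pull users toward commercial layers?


def classify(c: Component) -> Tier:
    """Place a component in one of the three tiers (hypothetical rule of thumb)."""
    if c.core_ip_or_revenue:
        return Tier.RARELY_OPEN
    if not c.competitive_risk:
        return Tier.TYPICALLY_OPEN
    return Tier.SOMETIMES_OPEN


def should_open(c: Component) -> bool:
    """Open by default, hold core IP, and decide the middle tier on adoption."""
    tier = classify(c)
    if tier is Tier.TYPICALLY_OPEN:
        return True
    if tier is Tier.RARELY_OPEN:
        return False
    return c.drives_paid_adoption


sdk = Component("python-sdk", False, False, True)
small_weights = Component("small-model-weights", False, True, True)
frontier = Component("frontier-weights", True, True, False)

for component in (sdk, small_weights, frontier):
    print(component.name, classify(component).value, should_open(component))
```

The middle tier is the only one where the answer depends on a judgment call, which is why it gets its own flag rather than a fixed answer.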

Licensing choices

Permissive (MIT, Apache 2). Widest adoption, fewest restrictions. Fine for SDKs, examples, non-core tooling.

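Copyleft (GPL, AGPL). Requires distributed derivative works to be released under the same license; AGPL extends that requirement to software offered over a network. Useful when you want modifications contributed back while still permitting commercial use.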

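Source available (BSL, SSPL, Elastic License). The source is readable, but certain commercial uses are restricted, typically offering the software as a competing hosted service. Covers many databases and AI infrastructure projects marketed as 'open source'. Not OSI-approved, and a recurring source of community friction.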

Custom licenses (Llama license, Gemma license). Hybrid: open for most users, with use restrictions or scale thresholds on top. The Llama license, for example, requires a separate agreement above 700 million monthly active users.
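The license families above can be written down as a lookup keyed by SPDX identifier, which is handy when auditing what a set of repositories actually ships. A minimal sketch: the SPDX identifiers are real, while the `LICENSE_FAMILIES` table and the helper functions are hypothetical names for illustration. Identifiers missing from the table return `None`, which is where custom licenses would land in this sketch.

```python
from typing import Optional

# License families from the section above, keyed to SPDX identifiers.
LICENSE_FAMILIES = {
    "permissive": ["MIT", "Apache-2.0"],
    "copyleft": ["GPL-3.0-only", "AGPL-3.0-only"],
    "source-available": ["BUSL-1.1", "SSPL-1.0", "Elastic-2.0"],
}

# Families whose listed licenses are OSI-approved.
OSI_APPROVED_FAMILIES = {"permissive", "copyleft"}


def family_of(spdx_id: str) -> Optional[str]:
    """Return the family name for a known SPDX identifier, else None."""
    for family, ids in LICENSE_FAMILIES.items():
        if spdx_id in ids:
            return family
    return None


def is_osi_approved(spdx_id: str) -> bool:
    """True only for identifiers listed in an OSI-approved family."""
    return family_of(spdx_id) in OSI_APPROVED_FAMILIES


for spdx_id in ("Apache-2.0", "AGPL-3.0-only", "BUSL-1.1", "Some-Custom-License"):
    print(spdx_id, family_of(spdx_id), is_osi_approved(spdx_id))
```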

Business rationale

Open to drive adoption. If your commercial product depends on ecosystem scale, open the ecosystem layer.

Open to commoditize competitors. If you're a platform provider, open-sourcing a layer competitors charge for undermines their business. (Microsoft's history here is instructive.)

Open for hiring. Strong open source presence attracts talent. Engineers want to work on open projects.

Open for trust. Enterprise buyers increasingly prefer open-source components (or at least source-available) for verification and lock-in avoidance.

Community management when open

Governance. Who decides what goes in? Purely company-controlled projects attract fewer contributions; community-governed projects give the company less strategic control.

Trademark vs code. Even with open code, control the brand. Linux is open; 'Linux' trademark is managed.

Contribution friction. CLA or DCO required? Maintainer commitments to review? Balance between protecting the project and welcoming contributors.

Commercial boundary. Be clear about what's commercial vs community. Don't surprise the community with 'oh, that's actually commercial now.'
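On the contribution-friction point, a DCO is the lighter-weight option: contributors add a `Signed-off-by:` trailer to each commit, which `git commit -s` does automatically. A minimal sketch of a check a project might run on incoming commit messages follows. The function names and sample messages are hypothetical, and the pattern is a simplified approximation of the trailer format rather than a full validator.

```python
import re

# Simplified pattern for a DCO trailer line: "Signed-off-by: Name <email>".
SIGNOFF = re.compile(
    r"^Signed-off-by: .+ <[^<>\s]+@[^<>\s]+>$",
    re.MULTILINE,
)


def has_dco_signoff(commit_message: str) -> bool:
    """True if any line of the message is a Signed-off-by trailer."""
    return SIGNOFF.search(commit_message) is not None


def unsigned_commits(messages: dict) -> list:
    """Return the ids of commits (id -> message) that lack a sign-off."""
    return [cid for cid, msg in messages.items() if not has_dco_signoff(msg)]


sample = {
    "a1b2c3": "Fix typo in README\n\nSigned-off-by: Jane Doe <jane@example.com>",
    "d4e5f6": "Add retry logic to client",
}
print(unsigned_commits(sample))
```

A CLA, by contrast, is usually tracked outside the commit history, which is part of why it adds more friction for first-time contributors.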

Common mistakes

Open-washing. Announcing open source as PR without real community intent. Developers see through this fast.

License changes in bad faith. MongoDB, HashiCorp, and Redis all changed licenses in recent years, and each change drew a community response: OpenTofu forked Terraform, Valkey forked Redis, and Linux distributions including Debian and Fedora dropped MongoDB after its move to SSPL.

Under-investing in maintenance. An open project without maintenance atrophies. Decide upfront whether you're committing to it.

Over-complicated licensing. Creative custom licenses confuse users and scare enterprise legal teams. Keep it simple.

Examples in 2026

Meta's Llama — open weights, semi-permissive license. Drove developer mindshare, pressured closed providers.

Mistral — open weights for smaller models, commercial for frontier. Dual-path strategy.

Databricks MLflow — permissively open; drives platform adoption.

Anthropic, OpenAI — SDKs open; weights closed. Typical for frontier labs.
