Open source in AI is a strategic choice with significant downstream implications. Which layers to open — model weights, training code, inference infrastructure, tooling — shapes ecosystem position, competitive moat, commercial opportunity. This post is the strategic logic at AI companies in 2026 and the patterns that work at different company stages.
Typically open
SDKs and client libraries. Near-universal. Developers want to see the code they integrate. Opens adoption.
Documentation, examples, cookbooks. Open, permissive licenses. Community can fork and adapt.
Integration layers, connectors. Connectors to other tools open-sourced enable ecosystem growth.
No competitive risk. These don't undermine your commercial product; they enable its use.
Sometimes open
Smaller model weights. Meta's Llama, Mistral's open weights, smaller open models. Open weights can attract developer mindshare even if your commercial offering is larger models.
Fine-tuning scripts. Tools to customize base models for specific use cases. Open source of these can drive model adoption.
Eval harnesses and benchmarks. Open benchmarks (MMLU, SWE-Bench, HumanEval) drive industry progress; open eval frameworks make commercial products easier to evaluate.
Strategic decision. Open if it drives adoption of commercial layers; hold if it undermines them.
Rarely open
Frontier model weights. Anthropic, OpenAI, Google keep their best models closed. Commercial and safety reasons dominate.
Training pipelines and data. The process of training models is valuable IP. Rarely shared.
Commercial tooling. Deployment, serving, monitoring infrastructure that generates revenue. Proprietary or limited license (BSL, SSPL, etc.).
Licensing choices
Permissive (MIT, Apache 2). Widest adoption, fewest restrictions. Fine for SDKs, examples, non-core tooling.
Copyleft (GPL, AGPL). Requires derivative works to also be open. Useful when you want contributions back but accept some commercial use.
Source available (BSL, SSPL, Elastic License). Restricts commercial use of the source. Covers many 'open source' databases, AI infrastructure projects. Not OSI-approved, causes community friction.
Custom licenses (Llama license, Gemma license). Hybrid — open for non-commercial or small commercial, restricted above thresholds.
Business rationale
Open to drive adoption. If your commercial product depends on ecosystem scale, open the ecosystem layer.
Open to commoditize competitors. If you're a platform provider, open-sourcing a layer competitors charge for undermines their business. (Microsoft's history here is instructive.)
Open for hiring. Strong open source presence attracts talent. Engineers want to work on open projects.
Open for trust. Enterprise buyers increasingly prefer open-source components (or at least source-available) for verification and lock-in avoidance.
Community management when open
Governance. Who decides what goes in? Purely company-controlled has fewer contributions; community-governed has less strategic control.
Trademark vs code. Even with open code, control the brand. Linux is open; 'Linux' trademark is managed.
Contribution friction. CLA or DCO required? Maintainer commitments to review? Balance between protecting the project and welcoming contributors.
Commercial boundary. Clear about what's commercial vs community. Don't surprise community with 'oh, that's actually commercial now.'
Common mistakes
Open-washing. Announcing open source as PR without real community intent. Developers see through this fast.
License changes in bad faith. MongoDB, HashiCorp, Redis all changed licenses in recent years. Each caused community fork (MariaDB for MySQL, OpenTofu for Terraform, Valkey for Redis).
Under-investing in maintenance. Open project without maintenance atrophies. Decide upfront if you're committing.
Over-complicated licensing. Creative custom licenses confuse users and scare enterprise legal teams. Keep it simple.
Examples in 2026
Meta's Llama — open weights, semi-permissive license. Drove developer mindshare, pressured closed providers.
Mistral — open weights for smaller models, commercial for frontier. Dual-path strategy.
Databricks MLflow — permissively open; drives platform adoption.
Anthropic, OpenAI — SDKs open; weights closed. Typical for frontier labs.