20250801-More_model_releases_on_31st_July

Original summary

Here are a few more model releases from today, to round out a very busy July:

Cohere released Command A Vision, their first multi-modal (image input) LLM. Like their others it's open weights under Creative Commons Attribution Non-Commercial, so you need to license it (or use their paid API) if you want to use it commercially.
San Francisco AI startup Deep Cogito released four open weights hybrid reasoning models, cogito-v2-preview-deepseek-671B-MoE, cogito-v2-preview-llama-405B, cogito-v2-preview-llama-109B-MoE and cogito-v2-preview-llama-70B. These follow their v1 preview models in April at smaller 3B, 8B, 14B, 32B and 70B sizes. It looks like their unique contribution here is "distilling inference-time reasoning back into the model’s parameters" - demonstrating a form of self-improvement. I haven't tried any of their models myself yet.
Mistral released Codestral 25.08, an update to their Codestral model which is specialized for fill-in-the-middle autocomplete as seen in text editors like VS Code, Zed and Cursor.
And an anonymous stealth preview model called Horizon Alpha running on OpenRouter was released yesterday and is attracting a lot of attention.

<p>Tags: <a href="https://simonwillison.net/tags/llm-release">llm-release</a>, <a href="https://simonwillison.net/tags/openrouter">openrouter</a>, <a href="https://simonwillison.net/tags/mistral">mistral</a>, <a href="https://simonwillison.net/tags/generative-ai">generative-ai</a>, <a href="https://simonwillison.net/tags/cohere">cohere</a>, <a href="https://simonwillison.net/tags/ai">ai</a>, <a href="https://simonwillison.net/tags/llms">llms</a></p>

Original link

Further speculation

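Cohere's commercial restrictions: Command A Vision is open weights, but it is released under a CC-NC license. Commercial use requires a separate license or the paid API, which may expose businesses to hidden costs or legal risk.
Deep Cogito's "reasoning distillation" technique: The models are said to self-improve by "distilling inference-time reasoning back into the model's parameters". The implementation details have not been made public; this may be the company's core competitive advantage, but the results lack third-party verification.
The hype strategy behind the anonymous Horizon Alpha model: Releasing it anonymously on OpenRouter and quickly drawing attention may be deliberate "mystery marketing" that uses community curiosity to drive early testing, while its actual performance may remain unverified.
Mistral Codestral's vertical specialization: Its focus on code completion (in editors such as VS Code and Zed) suggests it may be optimized for developer toolchains, but nothing is said about whether private partnerships or custom agreements exist with these editors.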
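Deep Cogito's jump in model scale: Moving from 3B-70B in v1 to 70B-671B in v2 may depend on undisclosed compute resources or training techniques, and such rapid iteration may carry a risk of technical debt.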
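Commercialization path for open models: Cohere pairs open weights with a commercial license, and Deep Cogito may follow a similar model, although the summary above does not state Deep Cogito's license terms. This reflects an industry trend: attract developers with open releases, then earn revenue from enterprise services. The actual barrier to commercial use may be higher than expected.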