Deep Cogito Launches with Open-Source LLMs Featuring Advanced Reasoning Capabilities

San Francisco-based startup Deep Cogito has emerged from stealth with the release of Cogito v1, a new series of open-source large language models (LLMs). The models are fine-tuned from Meta’s LLaMA 3.2 and offer hybrid reasoning: they can respond immediately or first engage in deeper, self-reflective thinking, similar to OpenAI’s “o” series models and DeepSeek R1.

Aiming Beyond Human-Guided AI

Deep Cogito’s mission is to advance AI systems beyond current limitations by developing models that can self-improve their reasoning over time. The company ultimately aims to create superintelligent AI—systems that outperform humans across all domains. Importantly, Deep Cogito has committed to making all of its models open-source.

Backed by Experience

The company is led by Drishan Arora, a former Senior Software Engineer at Google, where he helped build LLMs for Google’s generative search. On X (formerly Twitter), Arora claimed that Cogito’s models are the strongest open models at their scale, outperforming peers such as LLaMA, DeepSeek, and Qwen.

Model Lineup and Availability

Cogito v1 includes models with 3B, 8B, 14B, 32B, and 70B parameters, all available on Hugging Face and Ollama, and via APIs through Fireworks AI and Together AI. The models are released under Meta’s LLaMA license, which permits commercial use by organizations with up to 700 million monthly active users; organizations above that threshold must obtain a separate license from Meta.

The company also plans to launch larger models soon, including versions with 109B, 400B, and 671B parameters, as well as Mixture-of-Experts (MoE) variants.

Unique Training Method: Iterated Distillation and Amplification (IDA)

Instead of relying on traditional training techniques like RLHF (Reinforcement Learning from Human Feedback), Deep Cogito uses a method called Iterated Distillation and Amplification (IDA). In the amplification step, the model is given more compute so it can reach better answers than it would produce directly. In the distillation step, the model is trained to internalize those improved thought processes, so the next round starts from a stronger model. Repeating the two steps creates a loop of continuous improvement. Arora compares it to AlphaGo’s self-play, but applied to natural language.
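
The amplify-then-distill loop can be illustrated with a toy sketch. This is not Deep Cogito's training code: the "model" here is a single number, and the names `score`, `amplify`, `distill`, and `run_ida`, the sampling budget, and the learning rate are all hypothetical choices made for illustration. It also assumes a scoring function is available to rank candidate answers, which the article does not describe.

```python
import random

def score(answer: float, target: float) -> float:
    """Hypothetical quality measure: higher is better (closer to the target)."""
    return -abs(answer - target)

def sample_answer(policy_mean: float, rng: random.Random, noise: float = 1.0) -> float:
    """One cheap, direct answer from the current toy model."""
    return rng.gauss(policy_mean, noise)

def amplify(policy_mean: float, target: float, rng: random.Random, budget: int = 16) -> float:
    """Amplification: spend extra compute (budget samples) and keep the best one."""
    candidates = [sample_answer(policy_mean, rng) for _ in range(budget)]
    return max(candidates, key=lambda a: score(a, target))

def distill(policy_mean: float, amplified_answer: float, lr: float = 0.5) -> float:
    """Distillation: move the model toward the answer found with extra compute."""
    return policy_mean + lr * (amplified_answer - policy_mean)

def run_ida(target: float = 10.0, rounds: int = 30, seed: int = 0) -> list:
    """Alternate amplification and distillation, recording the model after each round."""
    rng = random.Random(seed)
    policy_mean = 0.0  # the toy model starts far from the target
    history = [policy_mean]
    for _ in range(rounds):
        better = amplify(policy_mean, target, rng)
        policy_mean = distill(policy_mean, better)
        history.append(policy_mean)
    return history

if __name__ == "__main__":
    history = run_ida()
    print(f"start error: {abs(history[0] - 10.0):.2f}")
    print(f"final error: {abs(history[-1] - 10.0):.2f}")
```

Each round, the amplified answer is better than a typical direct answer, and distillation shifts the model toward it, so later rounds amplify from a stronger starting point. In a real LLM setting the distillation step would be a fine-tuning run rather than a one-line update.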

Strong Performance Across Benchmarks

Deep Cogito’s models show impressive performance on industry-standard benchmarks:

Cogito 3B (Standard) beats LLaMA 3.2 3B by 6.7 points on MMLU and 18.8 points on HellaSwag.
In reasoning mode, Cogito 3B scores 72.6% on MMLU and 84.2% on ARC, improving significantly over standard mode.
Cogito 8B (Standard) scores 80.5% on MMLU, outperforming LLaMA 3.1 8B by over 12 points.
In reasoning mode, Cogito 8B reaches 83.1% on MMLU and 92.0% on ARC, surpassing DeepSeek R1 Distill 8B in most areas except the MATH benchmark.
Cogito 14B and 32B models outperform Qwen2.5 counterparts by 2–3 points on average.
Cogito 70B (Standard) outperforms LLaMA 3.3 70B and even exceeds LLaMA 4 Scout 109B on aggregate benchmarks. In reasoning mode, it achieves 91.0% on MMLU and 92.7% on MGSM.

While the models excel in reasoning, there are trade-offs, particularly in complex math tasks, where DeepSeek R1 still holds an edge.

Tool-Calling Capabilities

Deep Cogito also evaluated its models on tool calling, a capability that is becoming vital for AI agents and API integrations.

Cogito 3B supports native tool-calling across four types: simple, parallel, multiple, and parallel-multiple.
Cogito 3B scored 92.8% on simple tasks and over 91% on multi-tool calls.
Cogito 8B performed consistently well, scoring above 89% in all tool-calling tasks—far ahead of LLaMA 3.1 8B, which ranged from 35% to 54%.

These results are due to a mix of architectural choices, training data, and task-specific post-training, which many competing models currently lack.
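
The four tool-calling categories named above can be made concrete with a small sketch. It assumes the categories follow the definitions commonly used in function-calling evaluations such as the Berkeley Function-Calling Leaderboard (how many tools are offered versus how many calls the model emits); the article does not confirm which definitions Deep Cogito used. `ToolCall` and `classify_scenario` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ToolCall:
    """One function call emitted by a model (hypothetical structure)."""
    name: str
    arguments: dict

def classify_scenario(num_tools_offered: int, calls: List[ToolCall]) -> str:
    """Label a tool-calling scenario by tools offered and calls made.

    Assumed definitions:
      simple            -> one tool offered, one call made
      multiple          -> several tools offered, one call made
      parallel          -> one tool offered, several calls made
      parallel-multiple -> several tools offered, several calls made
    """
    if num_tools_offered < 1 or not calls:
        raise ValueError("need at least one offered tool and one call")
    many_tools = num_tools_offered > 1
    many_calls = len(calls) > 1
    if many_tools and many_calls:
        return "parallel-multiple"
    if many_calls:
        return "parallel"
    if many_tools:
        return "multiple"
    return "simple"

if __name__ == "__main__":
    # Hypothetical example: one weather tool offered, called for two cities at once.
    calls = [
        ToolCall("get_weather", {"city": "Paris"}),
        ToolCall("get_weather", {"city": "Tokyo"}),
    ]
    print(classify_scenario(1, calls))
```

Under these assumed definitions, the harder categories require the model both to choose the right tool from several candidates and to emit every needed call in a single turn.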

What’s Next?

Deep Cogito plans to release even larger and more capable models, with updates to current versions as training continues. The company views IDA as a long-term solution for building scalable, self-improving AI without depending on human feedback or fixed teacher models.

Arora stressed that while benchmark scores are important, real-world utility and adaptability are the ultimate goals—and that this is only the beginning of what could be a rapid evolution.

Industry Partnerships

Deep Cogito is collaborating with infrastructure and research partners including Hugging Face, RunPod, Fireworks AI, Together AI, and Ollama. All released models are open source and available now.