DeepSeek V4: Silicon Valley Builds Walls While China Builds Roads
On April 24, DeepSeek V4 was finally unveiled, quickly topping the Hugging Face open-source model leaderboard. Two significant innovations were highlighted:
- A million-token ultra-long context, with a KV cache only 10% the size of V3.2's, praised by Amazon engineers for easing HBM shortages (a rough sense of the memory savings is sketched after this list).
- Adaptation to domestic silicon, developed in close collaboration with Huawei and quickly made compatible with Ascend and Cambricon accelerators.
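To see why a 90% KV-cache reduction matters at this scale, here is a back-of-envelope sizing calculation. The layer count and per-token KV width are illustrative assumptions, not published DeepSeek specs; only the 10% ratio comes from the announcement.

```python
# Back-of-envelope KV-cache sizing at million-token context.
# n_layers and kv_dim are illustrative assumptions, not DeepSeek specs.

def kv_cache_gib(seq_len: int, n_layers: int, kv_dim: int,
                 bytes_per_elem: int = 2) -> float:
    """Cache bytes = tokens * layers * (K + V) * width * element size (fp16)."""
    per_token = n_layers * 2 * kv_dim * bytes_per_elem
    return seq_len * per_token / 2**30

SEQ = 1_000_000                                          # million-token context
baseline = kv_cache_gib(SEQ, n_layers=60, kv_dim=4096)   # hypothetical uncompressed cache
compressed = baseline * 0.10                             # the claimed 10% footprint

print(f"uncompressed: {baseline:6.1f} GiB")              # ~915.5 GiB
print(f"at 10%:       {compressed:6.1f} GiB")            # ~91.6 GiB
```

Even under these rough assumptions, an uncompressed cache at a million tokens would overflow any single accelerator's HBM many times over, which is why cache compression, not raw FLOPs, becomes the binding constraint at this context length.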
Coincidentally, the second-ranked model on Hugging Face was Kimi K2.6, released and open-sourced on April 20.
In Silicon Valley, competition between giants like OpenAI and Anthropic plays out through public disputes and closed-source strategies. The Chinese AI landscape tells a different story: companies there collaborate openly rather than trade public barbs.
01 Silicon Valley’s Power Games
In Silicon Valley, the leading AI labs, OpenAI, Anthropic, and Google's Gemini team, are staunch advocates of closed source. Their cutting-edge innovations stay confined within their own data centers, and the race has hardened into a zero-sum contest for power.
Over the past two years the competition has escalated into public confrontation, with companies timing major releases to overshadow rivals' announcements. In May 2024, for instance, OpenAI and Google launched new products back to back, triggering a social media spat between their CEOs.
The OpenAI-Anthropic rivalry has grown sharper still: Anthropic announced a new model just hours before OpenAI unveiled a major update to Codex, a calculated bid to blunt its competitor's moment.
The inherent logic of closed source is to build moats that block technology diffusion. The result is a Nash equilibrium of perpetual escalation, in which any visible truce risks collapsing a company's brand narrative.
02 Collaborative Evolution in the Open-Source Camp
In stark contrast, the Chinese AI ecosystem has embraced open-source collaboration. The launch of DeepSeek-R1 over a year ago catalyzed a shift in the competitive landscape, encouraging many startups to adopt open-source principles.
For example, Moonshot AI, a startup founded in 2023, closely mirrors DeepSeek's growth trajectory: both maintain small but highly skilled teams, and both are committed believers in scaling laws.
In July 2025, Moonshot AI released the world's first trillion-parameter open-source model, Kimi K2, openly acknowledging its use of DeepSeek's MLA (Multi-head Latent Attention) architecture. MLA compresses attention keys and values into a compact latent vector, sharply easing memory constraints and enabling efficient processing of ultra-long texts.
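As a rough illustration of the MLA idea, low-rank compression of keys and values into a shared latent that is all the decoder needs to cache, here is a minimal single-head sketch in PyTorch. The dimensions and layer names are invented for illustration; the real architecture adds rotary-position handling, multiple heads, and query compression.

```python
import torch
import torch.nn as nn

class LatentKVAttention(nn.Module):
    """Minimal single-head sketch of MLA-style KV compression.

    Instead of caching full K/V (2 * d_model floats per token), we cache one
    small latent vector (d_latent floats per token) and re-expand it to K and V
    at attention time. All dimensions are illustrative, not DeepSeek's.
    """
    def __init__(self, d_model: int = 512, d_latent: int = 64):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.down = nn.Linear(d_model, d_latent)   # compress: only this is cached
        self.k_up = nn.Linear(d_latent, d_model)   # expand latent -> keys
        self.v_up = nn.Linear(d_latent, d_model)   # expand latent -> values
        self.scale = d_model ** -0.5

    def forward(self, x: torch.Tensor, cache: list):
        # x: (batch, 1, d_model) -- one new token at decode time
        cache.append(self.down(x))                 # store the latent, not K/V
        latent = torch.cat(cache, dim=1)           # (batch, seq, d_latent)
        q = self.q_proj(x)
        k, v = self.k_up(latent), self.v_up(latent)
        attn = torch.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)
        return attn @ v, cache
```

In this toy setting the cache holds 64 floats per token instead of the 1,024 a full K and V would need, a 16x reduction; the trade-off is a small re-expansion cost at attention time.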
DeepSeek V4's technical documentation reveals further upgrades, including a switch from AdamW to the Muon optimizer, improving convergence speed and training stability; Kimi K2.6's report cites a roughly twofold efficiency gain from the same optimizer.
Muon was first proposed by Keller Jordan in late 2024 and subsequently refined by Moonshot AI, whose Moonlight work demonstrated stable pre-training with zero loss spikes. DeepSeek adopted this proven recipe for training V4.
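For readers curious what Muon actually does: its core move is to orthogonalize each weight matrix's momentum-smoothed gradient with a few Newton-Schulz iterations before applying it. The sketch below follows the publicly described update rule, with the iteration coefficients from Keller Jordan's reference implementation; the learning rate, momentum handling, and everything else here is a simplified assumption (real implementations add update scaling and fall back to AdamW for non-matrix parameters).

```python
import torch

@torch.no_grad()
def newton_schulz_orthogonalize(g: torch.Tensor, steps: int = 5) -> torch.Tensor:
    """Approximately map g to the nearest orthogonal matrix (the U V^T of its
    SVD) via the quintic Newton-Schulz iteration from Jordan's reference code."""
    a, b, c = 3.4445, -4.7750, 2.0315
    x = g / (g.norm() + 1e-7)           # normalize so the iteration converges
    transposed = x.shape[0] > x.shape[1]
    if transposed:
        x = x.T                          # iterate on the wide orientation
    for _ in range(steps):
        s = x @ x.T
        x = a * x + (b * s + c * (s @ s)) @ x
    return x.T if transposed else x

@torch.no_grad()
def muon_step(weight, grad, momentum_buf, lr=0.02, beta=0.95):
    """One simplified Muon update for a single 2-D weight matrix (illustrative)."""
    momentum_buf.mul_(beta).add_(grad)                  # standard momentum buffer
    update = newton_schulz_orthogonalize(momentum_buf)  # orthogonalized direction
    weight.add_(update, alpha=-lr)
    return weight, momentum_buf
```

Because the orthogonalization is only defined for 2-D weight matrices, Muon is typically applied to the transformer's matrix parameters while embeddings and norms keep a conventional optimizer.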
Notably, this open-source evolution is producing not homogenization but divergent innovation: DeepSeek V4 doubles down on core model capability, while Kimi K2.6 tackles practical deployment challenges, each playing to its own strengths.
03 Building Roads vs. Walls
Despite the impressive collaborative evolution in China, a stark commercial reality persists. OpenAI and Anthropic have annual revenues exceeding $10 billion, while leading Chinese firms have just crossed the $100 million mark.
OpenAI’s market valuation is around $880 billion, with Anthropic reaching approximately $1 trillion. In contrast, Kimi and DeepSeek’s latest funding rounds valued them at $18 billion and $20 billion, respectively.
The competition is entering a new phase, focusing on training efficiency and architectural innovation rather than just parameters and benchmarks. This shift is driven by the necessity to reduce computational costs.
Chinese open-source models are demonstrating significant cost advantages, with Kimi K2’s training costs around $4.6 million, compared to over $500 million for GPT-5. Moreover, usage data indicates that Chinese models are gaining traction globally, driven by their reputation for being effective and affordable.
04 Conclusion
As the AI wave continues, the ultimate victor will depend not only on model performance but also on the autonomy of computational power. The integration of domestic models with local computational resources is becoming increasingly vital.
DeepSeek V4's documentation lists both Ascend NPUs and NVIDIA GPUs for hardware validation, a sign of growing synergy between domestic models and domestic compute.
As of 2026, the Chinese open-source AI community is actively creating new rules for the industry, showcasing a collaborative approach that contrasts sharply with the competitive, closed-source strategies of Silicon Valley.