Entries
DeepSeek V4 Pro is the biggest open model ever: 1.6T total parameters (49B active), trained on 33T tokens, a 1M-token context window, two new attention mechanisms, Muon, mHC, open-source kernels, FP4 QAT, an MIT license, and one of the best tech reports of the year.
Available on OpenRouter and opencode, and on most other providers from day 0.
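Some back-of-the-envelope arithmetic on the spec numbers above (my own math, not figures from the report): the MoE active-parameter fraction, and a rough FP4 weight footprint ignoring embeddings and quantization-scale overhead.

```python
# Rough arithmetic from the headline specs: 1.6T total / 49B active, FP4 weights.
TOTAL_PARAMS = 1.6e12   # 1.6T total parameters
ACTIVE_PARAMS = 49e9    # 49B activated per token

# Fraction of weights used per forward pass in the MoE
active_ratio = ACTIVE_PARAMS / TOTAL_PARAMS

# FP4 = 4 bits = 0.5 bytes per weight (overheads like scales ignored)
fp4_tb = TOTAL_PARAMS * 0.5 / 1e12

print(f"active fraction: {active_ratio:.1%}")  # ~3.1% of weights per token
print(f"FP4 weights: ~{fp4_tb:.1f} TB")        # ~0.8 TB
```

So even at FP4 the full weight set is on the order of 800 GB, which is why the sparse 49B active path is what makes serving tractable.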
DeepSeek just dropped V4, and it’s a massive flex for the open-source community. Released in two open-weights variants—the compute-heavy V4-Pro and the low-latency V4-Flash—this model brings frontier-level performance straight to local environments. It features a beastly 1-million-token context window, making massive RAG pipelines and repo-level code ingestion a total breeze. The real kicker, though, is the hardware stack. DeepSeek has decoupled from the Nvidia ecosystem by deeply optimizing V4 for Huawei's Ascend silicon. This proves that SOTA AI training and inference can completely bypass traditional US hardware chokepoints while still trading blows with the best closed-source models from OpenAI and Google.
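Before actually stuffing a whole repo into that 1M-token window, it's worth a quick budget check. A minimal sketch, using the common chars/4 heuristic (an approximation, not DeepSeek's tokenizer); the function names are mine:

```python
from pathlib import Path

CONTEXT_TOKENS = 1_000_000  # V4's advertised context window
CHARS_PER_TOKEN = 4         # crude average for code and English text

def estimate_repo_tokens(root: str, exts=(".py", ".md", ".txt")) -> int:
    """Walk the tree and roughly estimate total tokens for matching files."""
    total_chars = 0
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in exts:
            total_chars += len(path.read_text(errors="ignore"))
    return total_chars // CHARS_PER_TOKEN

def fits_in_context(root: str, budget: int = CONTEXT_TOKENS) -> bool:
    """True if the estimated repo size fits the context budget."""
    return estimate_repo_tokens(root) <= budget
```

For real counts you'd swap the heuristic for the model's actual tokenizer; the point is just that many mid-sized repos genuinely fit under 1M tokens whole.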
https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main/DeepSeek_V4.pdf