Latest Articles

Dive deep into the world of Artificial Intelligence with our curated collection of articles, covering the latest breakthroughs and insights from leading researchers and engineers.

Filtering by tag: MoE (3 articles)

May 15, 2026 · Artificial Intelligence

DeepSeek-V4 MegaMoE: Overlapping Communication and Compute

How DeepSeek-V4 MegaMoE overlaps expert-parallel communication with GPU compute using wave scheduling and TMA, MMA, and epilogue warp pipelines for faster serving.

Qing Ke AI

DeepSeek-V4, MegaMoE, MoE (+2 more)

November 15, 2025 · Large Language Models

Kimi K2: A Trillion-Parameter Open-Source LLM

Explore Kimi K2, the 1.04T-parameter open-source MoE model. Our deep dive covers its MuonClip optimizer, agentic AI training, and benchmark performance.

Ji Zhi Liu

Kimi K2, MoE, LLM architecture (+4 more)

July 22, 2025 · Technology

LLM Architecture Explained: DeepSeek V3 vs Llama 4 (MLA vs GQA 2025)

Compare the DeepSeek V3 and Llama 4 architectures: MLA vs GQA attention, MoE vs dense models. Learn how a 671B-parameter MoE model activates only about 37B parameters per token. Includes code examples and design trade-offs.

Alex

LLM architecture, DeepSeek V3, Kimi K2 (+5 more)