AI Architecture
Inside Ant Ling 2.5: Rebuilding Attention With MLA + Lightning Attention
How Ling 2.5 replaces part of its GQA attention stack with a 1:7 mix of MLA and Lightning Attention layers to improve long-context throughput, cut KV cache cost, and keep training quality stable.
Qingke AI