DeepSeek Claims 85% Inference Speedup with DSpark

- DeepSeek detailed DSpark, a speculative decoding framework for its V4 models, claiming it speeds up AI inference by up to 85%.
- The DSpark framework was tested on Gemma and Qwen models, according to the company's reporting.
Why it matters: DeepSeek says it validated DSpark on external models such as Gemma and Qwen rather than only on its own stack, so the claimed speedup of up to 85% is being tested against rivals' architectures.



