
What happened: Intel and AMD released the full specification for ACE, an x86 CPU extension that adds silicon dedicated to matrix multiplication, the core math operation behind AI workloads. ACE reuses the existing AVX10 registers and can perform 16x as many operations as AVX10 for the same number of input vectors.
Why it matters: Many AI tasks (small models, single-user latency-sensitive operations, or situations where no GPU is available) run better on CPUs. ACE makes this practical by cutting instruction overhead, improving power efficiency, and letting ML frameworks like PyTorch and TensorFlow maintain one code path instead of multiple variations for different hardware. Developers can also move NPU-specific workloads back to the CPU without having to account for the differences between individual NPUs.
What to watch: ACE natively supports most ML data types (INT8, INT32, FP8, FP16, FP32, BF16) and can use the Open Compute Project's MX block-scaled formats, which AVX10 does not provide. The actual performance gains will depend on how much silicon Intel and AMD dedicate to ACE in future chip designs.
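For readers unfamiliar with block scaling, here is a minimal Python sketch of the general idea behind MX-style formats: a block of values shares one power-of-two scale, and each value is stored as a small integer. It is an illustration only, not ACE code and not the exact OCP encoding. The block size of 32 follows the OCP MX specification as we understand it, while the function names `mx_quantize_block` and `mx_dequantize_block` and the simple rounding rule are assumptions made for this example.

```python
import math

BLOCK = 32  # assumed block size, per the OCP MX specification


def mx_quantize_block(values):
    """Quantize one block to (shared exponent, list of int8-range integers).

    Simplified sketch: picks the smallest power-of-two scale that keeps
    every element within [-127, 127]. The real MX encodings differ in detail.
    """
    amax = max(abs(v) for v in values)
    if amax == 0.0:
        return 0, [0] * len(values)
    # Smallest exponent e such that amax / 2**e <= 127.
    e = math.ceil(math.log2(amax / 127))
    scale = 2.0 ** e
    # Round to the nearest integer and clamp as a guard against float error.
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return e, q


def mx_dequantize_block(e, q):
    """Recover approximate values from the shared exponent and integers."""
    scale = 2.0 ** e
    return [x * scale for x in q]


if __name__ == "__main__":
    block = [i * 0.1 - 1.6 for i in range(BLOCK)]
    e, q = mx_quantize_block(block)
    restored = mx_dequantize_block(e, q)
    worst = max(abs(a - b) for a, b in zip(block, restored))
    print("shared exponent:", e)
    print("worst-case error:", worst)
```

The point of the design is that the scale is stored once per block rather than once per value, so the per-element storage stays at 8 bits or fewer while the block as a whole can still cover a wide dynamic range.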