Tensordyne unveiled Napier, a new AI inference chip that uses unconventional logarithmic math to cut power draw and rack count. The company claims a single 120kW rack delivers 1,300 tokens per second per user, a workload it says takes NVIDIA and Groq nine racks.

Hacker News · 2d ago · 3 min read


3 Key Points

1. What happened: Tensordyne announced Napier, a 3nm AI processor with 138 billion transistors, 2.1 petaflops of compute per die, and 144GB of HBM3E memory per chip. The company is packaging nine Napier chips into racks with a total of 42TB of HBM and claims five times the SRAM of NVIDIA Blackwell. Beta programs are planned for Q1 2027, with system shipments expected by the end of Q2 2027.
2. Why it matters: Current AI inference infrastructure is bottlenecked by memory, interconnect, and power efficiency rather than by raw compute speed. Tensordyne's logarithmic math approach replaces multiplication with addition, which shrinks the transistor area spent on multipliers and frees space for more on-chip memory. If that carries over to real-world workloads, it could give operators of large language models a materially different cost per token than the dominant incumbents offer.
3. What to watch: The company claims a single TDN72 rack (120kW, air-cooled) can serve two-trillion-parameter models at 1,300 tokens per second per user, while NVIDIA and Groq require nine racks at 1.5MW and AWS plus Cerebras require fourteen racks at 800kW. The critical unknowns are whether the logarithmic math delivers accuracy and software portability in production workloads, and whether Tensordyne can build a credible supply chain and customer base by the end of Q2 2027.
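The "multiplication becomes addition" idea in the second key point can be illustrated with a generic logarithmic number system. The sketch below is only an illustration of that general principle, not Tensordyne's actual number format or hardware design, which the article does not describe. The function names (`to_log`, `log_mul`, `from_log`) and the sign-plus-log2 encoding are assumptions made for this example.

```python
import math

# Assumption: a value is stored as (sign, log2 of its magnitude).
# This is a textbook-style encoding, not a documented Napier format.

def to_log(x: float) -> tuple[int, float]:
    """Encode a nonzero real number as (sign, log2|x|)."""
    if x == 0:
        # Zero has no finite logarithm; a real format would need a special code.
        raise ValueError("zero cannot be encoded in this simple sketch")
    return (1 if x > 0 else -1, math.log2(abs(x)))

def log_mul(a: tuple[int, float], b: tuple[int, float]) -> tuple[int, float]:
    """Multiply two encoded values: signs multiply, logarithms add."""
    return (a[0] * b[0], a[1] + b[1])

def from_log(v: tuple[int, float]) -> float:
    """Decode (sign, log2|x|) back to an ordinary float."""
    sign, exponent = v
    return sign * 2.0 ** exponent

# 6.0 * -0.5 computed with one addition in the log domain.
product = from_log(log_mul(to_log(6.0), to_log(-0.5)))
print(product)
```

The multiply step needs only an adder, which is the area saving the summary refers to. The trade-off in log-domain arithmetic generally is that adding two encoded values is no longer a simple operation and usually requires approximation, which is one reason the accuracy question raised in the third key point matters.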
