Amazon unveiled its third-generation Trainium AI chip Tuesday at its annual conference in Las Vegas. It’s a big jump, the company said, from its second version — but just wait for Trainium4, presumably next year.
The rapid iteration could be a financial problem: Investors worry AI chips in data centers are going to depreciate faster than companies hope.
The Trainium3 rollout offers a glimpse at another puzzle: the difficulty of separating individual chips from an entire data center working as a complete system. Amazon has configured its data centers so that old Trainium2 chips can be yanked out and replaced with new Trainium3 chips without other major changes. Indeed, many of the gains come not from the chips themselves but from how they are connected. Not only are the latest Trainium racks using a new (somewhat secretive) kind of interconnect between server racks, but up to 144 chips inside each rack are also linked so that every chip can talk directly to any other chip. In the past, messages had to make multiple “hops” to get there.
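To see why direct links matter, here is a rough back-of-the-envelope sketch. Amazon has not disclosed its actual rack topology, so this simply contrasts a hypothetical ring of 144 chips, where messages pass through intermediate chips, with an all-to-all fabric where every pair of chips shares a direct link:

```python
# Illustrative only -- not Amazon's actual topology. Compares worst-case
# hop counts for 144 chips arranged in a ring versus an all-to-all fabric.

N = 144  # chips per rack, per the Trainium3 announcement

# Ring: a message travels around the loop, so the worst case is
# halfway around (hosts on opposite sides of the ring).
ring_worst_case_hops = N // 2

# All-to-all: every pair of chips has a direct link, so any chip
# reaches any other in a single hop.
all_to_all_hops = 1

print(f"Ring of {N} chips, worst case: {ring_worst_case_hops} hops")
print(f"All-to-all fabric: {all_to_all_hops} hop")
```

In a ring, the worst-case path is 72 hops; with direct links it is always 1. Each hop adds latency, which is why wiring, not just silicon, drives much of the performance gain.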
The AI industry’s short sellers could be right about the chips depreciating faster than expected. But companies are also not being completely transparent about how these data centers actually work, and for good reason — those are valuable trade secrets.