AMD knew it needed to make radical changes in its Zen CPU chip to become a force in the PC and server markets again.
So when the chip designers sat down four years ago to sketch out the Zen design, they had two things in mind: to drive up CPU performance as much as possible and to keep power efficiency stable.
The company ultimately settled on a target of a 40 percent performance improvement in Zen over its predecessor, Excavator.
"We had a hard time convincing the team we were going for 40 percent," said Mike Clark, a senior fellow at AMD. "It was a very aggressive goal, and we knew we had to do it to be competitive."
AMD first promoted the 40 percent CPU improvement goal when it introduced Zen in 2015 during an overhaul of its chip roadmap. The company recently demonstrated chips to prove it has achieved the goal.
If benchmarks of PCs with Zen hold up, a 40 percent boost in CPU performance will be a radical improvement compared with the low double-digit improvements claimed by Intel and AMD for recent x86 chips.
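To put that figure in perspective, here is a rough back-of-the-envelope sketch of how many generations of incremental gains would compound to a 40 percent jump. The 10 and 12 percent per-generation rates are illustrative assumptions standing in for "low double-digit" gains, not figures claimed by Intel or AMD, and the function names are hypothetical.

```python
import math


def generations_to_match(target_gain, per_gen_gain):
    """Number of compounding generations needed to reach target_gain.

    Both arguments are fractions, e.g. 0.40 for 40 percent.
    """
    return math.log(1 + target_gain) / math.log(1 + per_gen_gain)


def compounded_gain(per_gen_gain, generations):
    """Total fractional gain after a whole number of generations."""
    return (1 + per_gen_gain) ** generations - 1


# Assumed per-generation improvements (illustrative only).
for rate in (0.10, 0.12):
    n = generations_to_match(0.40, rate)
    print(f"{rate:.0%} per generation: about {n:.1f} generations to reach 40%")
```

Under those assumed rates, a single 40 percent jump corresponds to roughly three to three and a half generations of incremental gains.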
Intel ran away with the PC and server markets after AMD encountered design and manufacturing problems with some recent chips. AMD's CPU performance has fallen behind rival Intel's, and the Bulldozer architecture, even in the company's eyes, was an unmitigated failure. The problems with the Bulldozer chips, which started shipping in 2011, carried over to the follow-up chips derived from that architecture.
With Zen, AMD is looking to relive its glory days in chip design. Zen could be as significant as AMD's introduction of 64-bit server chips in 2003 and dual-core chips in 2004. Both moves gave AMD a competitive advantage over Intel at the time.
The Zen chips will first reach high-end gaming workstations early next year, followed by servers and then laptops. The chips will have eight to 32 cores, and the 32-core chips could come in quad-CPU configurations, but those details aren't finalized yet, Clark said.
From day one, AMD's designers knew they had to keep the aggressive power-efficiency goals in mind, a consideration the company usually introduced a lot later in the chip design process, Clark said.
Designers needed to "treat power as an equal citizen" to the 40 percent performance improvement goal, Clark said.
"It's a very dynamic iterative process through the design phase," Clark said. "We had good design metrics so we knew we didn't lose power ... and instructions per clock."
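The two metrics Clark mentions, power and instructions per clock, can be related with a common first-order model in which throughput is IPC multiplied by clock speed. The sketch below uses that simplified model only; every number in it is a hypothetical placeholder, not an AMD measurement, and the function names are made up for illustration.

```python
def throughput(ipc, clock_ghz):
    """Billions of instructions per second under the first-order model."""
    return ipc * clock_ghz


def perf_per_watt(ipc, clock_ghz, watts):
    """Throughput divided by power draw, a simple efficiency metric."""
    return throughput(ipc, clock_ghz) / watts


# Hypothetical baseline chip, and a chip with 40 percent higher IPC
# at the same assumed clock speed and power budget.
baseline = perf_per_watt(ipc=1.0, clock_ghz=3.0, watts=65.0)
improved = perf_per_watt(ipc=1.4, clock_ghz=3.0, watts=65.0)

ratio = improved / baseline
print(f"Efficiency ratio at equal clock and power: {ratio:.2f}x")
```

In this simplified model, an IPC gain translates fully into a performance-per-watt gain only if clock speed and power stay flat, which is why tracking both metrics together during design matters.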
Designers looked at architectural weaknesses in the previous chips, sketched out a Zen chip design on paper, then got to work. The new architecture runs two threads per CPU core, which could boost performance while remaining power efficient.
They also introduced a new chip structure for Zen, with a restructured memory subsystem that is faster and shorter. They designed a wider pipeline that can feed more instructions to the machine, allowing commands to execute faster.
AMD also changed the cache structure in Zen. It reduced the size of the L2 cache to 512KB and enlarged the L3 cache. Clark didn't disclose the size of the L3 cache but said it will be much faster than in previous chips. However, a slide about Zen shown at the Hot Chips conference this week listed the L3 cache size as 8MB.
The company also made the chip's integer and floating point processing units more dynamic and accessible to single- and multithreaded workloads. It will take fewer cycles to load operations on the processing units. The units in Bulldozer and its derivatives weren't as dynamic, which was widely considered a problem.
The designers also sharpened the chip's execution units. Zen has a distributed scheduler, which provides visibility into more threads in a window. Bulldozer had a unified scheduler with more complexity.
Clark was involved in the development of AMD's first in-house chip, code-named K5, which was introduced in 1996. Meeting the goals set for Zen is encouraging, but there's still room to improve performance, he said.
"We still have a long way to go as far as compute performance," he added. "Zen is definitely not the destination, it's the first stop to improve performance."