Sunday, 24 January 2016

Intel talks concurrency and Knights Landing

Interview The Intel Software Development Conference was on in London last week, and we took the opportunity to catch up with James Reinders, director and evangelist for parallel programming and HPC tools.

Reinders is talking up Knights Landing, the next generation of Xeon Phi, Intel's MIC (many integrated core) processors, which are designed for high-performance concurrent programming.

The first Xeon Phi, Knights Corner, was released in 2012 and had up to 61 cores. China's Tianhe-2, the world's most powerful supercomputer, has 48,000 of the chips installed.

Knights Landing has up to 72 cores, but the more significant difference is that the new Xeon Phi is a processor rather than a co-processor. Co-processors use a host/device programming model, where an application running on the host (the CPU) offloads compute-intensive tasks to the device (the co-processor), with huge potential speed-ups. Nvidia's Tesla range of GPU accelerator boards (installed in Titan, the world's second most powerful supercomputer) also uses this model.
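The host/device split can be illustrated with a toy model. The sketch below is plain Python, not real offload code: `ToyDevice`, `upload`, `launch` and `download` are hypothetical names invented for this illustration, and the 8-byte element size is an assumption. It shows only the shape of the model, in which the host copies data to the card's own memory, runs a kernel there, and copies the results back.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDevice:
    """Hypothetical stand-in for a co-processor card with private memory."""
    memory: dict = field(default_factory=dict)
    bytes_transferred: int = 0

    def upload(self, name, values):
        # Host -> device copy: the card works on its own copy of the data.
        self.memory[name] = list(values)
        self.bytes_transferred += 8 * len(values)  # assumes 8-byte doubles

    def launch(self, kernel, src, dst):
        # Run the compute-intensive kernel entirely on device memory.
        self.memory[dst] = [kernel(x) for x in self.memory[src]]

    def download(self, name):
        # Device -> host copy of the results.
        values = list(self.memory[name])
        self.bytes_transferred += 8 * len(values)
        return values


def offload_squares(host_data):
    """Host-side driver: ship data out, compute on the device, ship results back."""
    device = ToyDevice()
    device.upload("in", host_data)
    device.launch(lambda x: x * x, "in", "out")
    return device.download("out"), device.bytes_transferred


if __name__ == "__main__":
    result, traffic = offload_squares([1.0, 2.0, 3.0])
    print(result, traffic)
```

Every value crosses the host/device boundary twice in this model, once in each direction, and that traffic is tracked in `bytes_transferred`. On a standalone processor there is no such boundary to cross.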

Processor versus co-processor

Why did Intel go the co-processor route with Knights Corner, but is now changing tack? "One issue was software," says Reinders. "[Knights Corner] being a co-processor fitted with a mould that people seemed to be more ready for. The other thing was a bit of legacy. The cluster on a chip design came from Larrabee, a project for something else that we didn't bring to market. We could introduce a co-processor faster. It was an engineering trade-off.

"In a co-processor you can control your ecosystem more: everything that runs on it we had control of. The host was standard. We weren't quite ready to understand how 512-bit vectors should be done on a processor.

"Personally I was, I'll deal with this co-processor and where it is taking us, but I can't wait for Knights Landing."

From the programmer's perspective, a processor is easier to code for since you no longer have to worry about the host/device boundary. "Co-processors have a big issue, a controlling program that already has the data, but has to ship the data over to the co-processor. You buy the memory twice. You have memory on the host that stores the data, then you transfer it to the memory on the card," says Reinders.

"The other thing is integrating the fabric onto the package. Knights Landing will be our first processor that does that. Then driving the latency down on that fabric will allow scaling out," he adds.

Supercomputers spend much of their time analysing huge datasets, a trend that will continue as IoT (Internet of Things) sensors supply more and more data. "By turning [Xeon Phi] into a processor rather than a co-processor, it unleashes our ability to handle huge amounts of data. The processor nature will enable machines to be built with arbitrarily large amounts of memory," Reinders explains.

Intel's Xeon Phi has far fewer cores than its GPU-based competition, but each core is more capable. "It's a classic computer architecture question. Are you better off with a few fat cores that do everything well, or a bunch of smaller cores? We're doing it in a way that's compatible," says Reinders.

From 61 to just 72 cores over three years' development may seem disappointing, but Reinders says the core count is not the only important thing. "Can people figure out how to get three times as much parallelism? Or will they be better off if we became a processor, ran at a higher clock rate, gave high-bandwidth memory, and did out-of-order execution to accelerate the per-thread experience? That's the design trade-off we've made."
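The trade-off Reinders describes can be put into rough numbers with Amdahl's law, which is our choice of model here and not something Intel cites. In the sketch below the core counts 61 and 72 come from the article, while the 1.5x per-thread speed-up and the parallel fractions are purely hypothetical assumptions. The function names are ours.

```python
def relative_runtime(parallel_fraction, cores, per_thread_speed=1.0):
    """Amdahl-style runtime, normalised so one core at speed 1.0 takes 1.0."""
    serial = 1.0 - parallel_fraction
    return (serial + parallel_fraction / cores) / per_thread_speed


def compare(parallel_fraction, thread_boost=1.5):
    """Return (runtime with 3x the cores, runtime with 72 faster cores)."""
    # Option A: triple the Knights Corner core count, same per-thread speed.
    many_cores = relative_runtime(parallel_fraction, cores=3 * 61)
    # Option B: 72 cores, each thread faster by a hypothetical factor.
    faster_threads = relative_runtime(
        parallel_fraction, cores=72, per_thread_speed=thread_boost
    )
    return many_cores, faster_threads


if __name__ == "__main__":
    for fraction in (0.95, 0.999):
        many, fast = compare(fraction)
        winner = "faster threads" if fast < many else "more cores"
        print(f"parallel fraction {fraction}: {winner} wins")
```

Under these assumed figures, code that is 95 per cent parallel finishes sooner on 72 faster cores, while code that is 99.9 per cent parallel would do better with three times the cores. The crossover depends entirely on the assumed per-thread gain, so treat the output as an illustration of the question, not a benchmark.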

When do we get Knights Landing, which Intel originally promised for 2015? "We have three systems outside of Intel now," Reinders told the Reg. "Cray has one, Sandia National Laboratories, and CEA in France. They are on A0 (first stepping level) silicon. You'll see a gradual ramp-up as the new year starts. We haven't said when general availability comes."

 

 
