19 October 2020

IBM Reveals Next-Generation IBM POWER10 Processor


New CPU co-optimized for Red Hat OpenShift for enterprise hybrid cloud

IBM revealed the next generation of its IBM POWER central processing unit (CPU) family: IBM POWER10. 

OpenPOWER Summit 2020 Sponsor Showcase: IBM POWER10

Intel and AMD have some fresh competition in the enterprise and data center markets as IBM just launched its next-generation Power10 processor.

The Power9 processor was introduced back in 2017. It's a 14nm processor that was used in the Summit supercomputer, which held the top spot as the world's fastest supercomputer from Nov. 2018 to June 2020. Now IBM is set to replace Power9 with the company's first 7nm processor, and Power10 will be manufactured through a partnership with Samsung.

Power10 promises some massive improvements over Power9. IBM claims a 3x improvement in both capacity and processor energy efficiency over its previous chip generation within the same power envelope. Power10 also includes a new feature called "memory inception," allowing clusters of physical memory to be shared across a pool of systems. Each system in the pool can access all of the memory, and memory clusters can be scaled up to petabytes in size.

IBM says there's up to a 20x improvement in speed for artificial intelligence workloads compared to Power9, and there's also been a focus on bolstering security. IBM added "quadruple the number of AES encryption engines per core" while also anticipating "future cryptographic standards like quantum-safe cryptography and fully homomorphic encryption."

"Enterprise-grade hybrid clouds require a robust on-premises and off-site architecture inclusive of hardware and co-optimized software," said Stephen Leonard, GM of IBM Cognitive Systems. "With IBM POWER10 we've designed the premier processor for enterprise hybrid cloud, delivering the performance and security that clients expect from IBM. With our stated goal of making Red Hat OpenShift the default choice for hybrid cloud, IBM POWER10 brings hardware-based capacity and security enhancements for containers to the IT infrastructure level."

Considering that the Summit supercomputer has only dropped to second place on the fastest list and still counts as the fifth most efficient supercomputer operating today, it seems likely a supercomputer using Power10 processors is going to appear and jump immediately to the top of the charts within a few years. 



Built to meet the unique needs of enterprise hybrid cloud computing, the IBM POWER10 processor focuses on energy efficiency and performance in a 7nm form factor, with an expected improvement of up to 3x in processor energy efficiency, workload capacity, and container density over the IBM POWER9 processor. [1]

Designed over five years with hundreds of new and pending patents, the IBM POWER10 processor is an important evolution in IBM's roadmap for POWER. Systems taking advantage of IBM POWER10 are expected to be available in the second half of 2021. Some of the new processor innovations include:

  • IBM's First Commercialized 7nm Processor that is expected to deliver up to a 3x improvement in capacity and processor energy efficiency within the same power envelope as IBM POWER9, allowing for greater performance. [1]
  • Support for Multi-Petabyte Memory Clusters with a breakthrough new technology called Memory Inception, designed to improve cloud capacity and economics for memory-intensive workloads from ISVs like SAP, the SAS Institute, and others as well as large-model AI inference.
  • New Hardware-Enabled Security Capabilities including transparent memory encryption designed to support end-to-end security. The IBM POWER10 processor is engineered to achieve significantly faster encryption performance with quadruple the number of AES encryption engines per core compared to IBM POWER9 for today's most demanding standards and anticipated future cryptographic standards like quantum-safe cryptography and fully homomorphic encryption. It also brings new enhancements to container security.
  • New Processor Core Architectures in the IBM POWER10 processor with an embedded Matrix Math Accelerator which is extrapolated to provide 10x, 15x and 20x faster AI inference for FP32, BFloat16 and INT8 calculations per socket respectively than the IBM POWER9 processor to infuse AI into business applications and drive greater insights.

IBM's POWER10 Processor - William Starke & Brian W. Thompto, IBM

IBM POWER10 7nm Form Factor Delivers Energy Efficiency and Capacity Gains

IBM POWER10 is IBM's first commercialized processor built using 7nm process technology. IBM Research has been partnering with Samsung Electronics Co., Ltd. on research and development for more than a decade, including demonstration of the semiconductor industry's first 7nm test chips through IBM's Research Alliance.

With this updated technology and a focus on designing for performance and efficiency, IBM POWER10 is expected to deliver up to a 3x gain in processor energy efficiency per socket, increasing workload capacity in the same power envelope as IBM POWER9. This anticipated improvement in capacity is designed to allow IBM POWER10-based systems to support up to 3x increases in users, workloads and OpenShift container density for hybrid cloud workloads as compared to IBM POWER9-based systems. 

This can affect multiple datacenter attributes to drive greater efficiency and reduce costs, such as space and energy use, while also allowing hybrid cloud users to achieve more work in a smaller footprint.

Hardware Enhancements to Further Secure the Hybrid Cloud

IBM POWER10 offers hardware memory encryption for end-to-end security and faster cryptography performance thanks to additional AES encryption engines for both today's leading encryption standards as well as anticipated future encryption protocols like quantum-safe cryptography and fully homomorphic encryption.

Further, to address new security considerations associated with the higher density of containers, IBM POWER10 is designed to deliver new hardware-enforced container protection and isolation capabilities co-developed with the IBM POWER10 firmware. If a container were to be compromised, the POWER10 processor is designed to be able to prevent other containers in the same Virtual Machine (VM) from being affected by the same intrusion.

Cyberattacks are continuing to evolve, and newly discovered vulnerabilities can cause disruptions as organizations wait for fixes. To better enable clients to proactively defend against certain new application vulnerabilities in real-time, IBM POWER10 is designed to give users dynamic execution register control, meaning users could design applications that are more resistant to attacks with minimal performance loss.

Multi-Petabyte Size Memory Clustering Gives Flexibility for Multiple Hybrid Deployments

IBM POWER has long been a leader in supporting a wide range of flexible deployments for hybrid cloud and on-premises workloads through a combination of hardware and software capabilities. The IBM POWER10 processor is designed to elevate this with the ability to pool or cluster physical memory across IBM POWER10-based systems, once available, in a variety of configurations. In a breakthrough new technology called Memory Inception, the IBM POWER10 processor is designed to allow any of the IBM POWER10 processor-based systems in a cluster to access and share each other's memory, creating multi-Petabyte sized memory clusters.

For both cloud users and providers, Memory Inception offers the potential to drive cost and energy savings, as cloud providers can offer more capability using fewer servers, while cloud users can lease fewer resources to meet their IT needs. 

Infusing AI into the Enterprise Hybrid Cloud to Drive Deeper Insights

As AI continues to be more and more embedded into business applications in transactional and analytical workflows, AI inferencing is becoming central to enterprise applications. The IBM POWER10 processor is designed to enhance in-core AI inferencing capability without requiring additional specialized hardware.

With an embedded Matrix Math Accelerator, the IBM POWER10 processor is expected to achieve 10x, 15x, and 20x faster AI inference for FP32, BFloat16 and INT8 calculations respectively to improve performance for enterprise AI inference workloads as compared to IBM POWER9 [2], helping enterprises take the AI models they trained and put them to work in the field. With IBM's broad portfolio of AI software, IBM POWER10 is expected to help infuse AI workloads into typical enterprise applications to glean more impactful insights from data.
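To make the three data types concrete, here is a minimal Python sketch of how an FP32 value maps to BFloat16 and INT8. It relies on the fact that BFloat16 keeps the sign bit, the 8 exponent bits, and the top 7 mantissa bits of an IEEE float32. The function names are hypothetical, truncation (rather than rounding) is a simplifying assumption, and the scale-based INT8 quantization is one common scheme, not something the announcement describes for POWER10.

```python
import struct


def float32_to_bfloat16_bits(x: float) -> int:
    """Return the 16-bit bfloat16 pattern for x by truncating its float32 encoding.

    Assumption: simple truncation (round toward zero); real hardware or
    libraries may round to nearest instead.
    """
    (bits32,) = struct.unpack(">I", struct.pack(">f", x))
    # Keep the sign bit, 8 exponent bits, and the top 7 mantissa bits.
    return bits32 >> 16


def bfloat16_bits_to_float(bits16: int) -> float:
    """Expand a 16-bit bfloat16 pattern back to a Python float."""
    (value,) = struct.unpack(">f", struct.pack(">I", bits16 << 16))
    return value


def quantize_int8(x: float, scale: float) -> int:
    """Map x to a signed 8-bit integer using a symmetric scale (assumed scheme)."""
    q = round(x / scale)
    return max(-128, min(127, q))


if __name__ == "__main__":
    bits = float32_to_bfloat16_bits(3.14159)
    print(hex(bits), bfloat16_bits_to_float(bits))
    print(quantize_int8(0.5, 0.25))
```

The narrower formats trade precision for density: each BFloat16 value takes half the storage of FP32 and each INT8 value a quarter, which is one reason reduced-precision inference can run faster on the same hardware.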

Building the Enterprise Hybrid Cloud of the Future

With hardware co-optimized for Red Hat OpenShift, IBM POWER10-based servers will deliver the future of the hybrid cloud when they become available in the second half of 2021. Samsung Electronics will manufacture the IBM POWER10 processor, combining Samsung's industry-leading semiconductor manufacturing technology with IBM's CPU designs.

OpenPOWER Summit EU 2019: Microwatt: Make Your Own POWER CPU

IBM today introduced its next-generation Power10 microprocessor, a 7nm device manufactured by Samsung. The chip features a new microarchitecture, broad new memory support, PCIe Gen 5 connectivity, hardware-enabled security, impressive energy efficiency, and a host of other improvements. Unveiled at the annual Hot Chips conference (virtual this year), Power10 won’t turn up in IBM systems until this time next year. IBM didn’t disclose when the chip would be available to other systems makers.

IBM says Power10 offers a ~3x performance gain and ~2.6x core efficiency gain over Power9. No benchmarks against non-IBM chips were presented. Power9, of course, was introduced in 2017 and manufactured by GlobalFoundries on a 14nm process. While the move to a 7nm process provides many of Power10’s gains, there are also significant new features, not least what IBM calls Memory Inception, which allows Power10 to access up to “multi petabytes” of pooled memory from diverse sources.

“You’re able to kind of trick a system into thinking that memory in another system belongs to this system. It isn’t like traditional [techniques] and doing an RDMA over InfiniBand to get access to people’s memory. This is programs running on my computer [that] can do load-store-access directly, coherently,” said William Starke, IBM distinguished engineer and a Power10 architect in a pre-briefing. “They use their caches [to] play with memory as if it’s in my system, even if it’s bridged by a cable over to another system. If we’re using short-reach cabling, we can actually do this with only 50-to-100 nanoseconds of additional latency. We’re not talking adding a microsecond or something like you might have over an RDMA.”

IBM is promoting Inception as a major achievement.

“HP came out with their big thing a few years ago. They called it The Machine and it was going to be their way of revolutionizing things largely by disaggregating memory. Intel you’ve seen from their charts talking about their Rack Scale architectures [that] they’re evolving toward. Well, this is IBM’s version of this and we have it today, in silicon. We are announcing we are able to take things outside of the system and aggregate the multiple systems together to directly share memory.”

OpenPOWER Summit NA 2019: An Overview of the Self Boot Engine (SBE) in POWER9 base OpenPOWER Systems

Inception is just one of many interesting features of Power10, which has roughly 18 billion transistors. IBM plans to offer two core types, 4 SMT (simultaneous multi-threaded) cores and 8 SMT cores; IBM focused on the latter in today’s presentation. There are 16 cores on the chip, and on/off-chip bandwidth is provided via the OMI interface, PowerAXON (for adding OpenCAPI accelerators), or the PCIe5 interface, all of which are shown delivering up to 1 terabyte per sec on IBM’s slides.

CXL interconnect is not supported by Power10, which is perhaps surprising given the increasingly favorable comments about CXL from IBM over the past year.

Starke said as part of a Slack conversation tied to Hot Chips, “Does POWER10 support CXL? No, it does not. IBM created OpenCAPI because we believe in Open, and we have 10+ years of experience in this space that we want to share with the industry. We know that an asymmetric, host-dominant attach is the only way to make these things work across multiple companies. We are encouraged to see the same underpinnings in CXL. It’s open. It’s asymmetric. So it’s built on the right foundations. We are CXL members and we want to bring our know-how into CXL. But right now, CXL is a few years behind OpenCAPI. Until it catches up, we cannot afford to take a step backwards. Right now OpenCAPI provides a great opportunity to get in front of things that will become more mainstream as CXL matures.”

Below is the block diagram of IBM’s new Power10 chip showing major architecture elements.

How open is OpenPOWER? - DevConf.CZ 2020

The process shrink does play a role in allowing IBM to offer the two packaging options shown in the slide below.

IBM is offering two versions of the processor module and was able to do this primarily because of the energy efficiency gains. “We’re bringing out a single chip module. There is one Power10 chip and exposing all those high bandwidth interfaces, so very high bandwidth per compute type of characteristics. [O]n the upper right you can see [it]. We build a 16-socket, large system that’s very robustly scalable. We’ve enjoyed success over the last several generations with this type of offering, and Power10 is going to be no different.

“On the bottom you see something a little new. We can basically take two Power10 processor chips and cram them into the same form factor where we used to put just one Power9 processor. We’re taking 1200 square millimeters of silicon and putting it into the same form factor. That’s going to be very valuable in compute-dense, energy-dense, volumetric space-dense cloud configurations, where we can build systems ranging from one to four sockets where those are dual chip module sockets as shown.”

IBM POWER10 technical preview of chip capabilities

It will be interesting to see what sort of traction the two different offerings gain among non-IBM systems builders as well as hyperscalers. Broadly IBM is positioning Power10 as a strong fit for hybrid cloud, AI, and HPC environments. Hardware and firmware enhancements were made to support security, containerization, and inferencing, with IBM pointedly suggesting Power10 will be able to handle most inferencing workflows as well as GPUs.

Talking about security, Satya Sharma, IBM Fellow and CTO, IBM Cognitive Systems, said “Power10 implements transparent memory encryption, which is memory encryption without any performance degradation. When you do memory encryption in software, it usually leads to performance degradation. Power10 implements transparent hardware memory encryption.”

Sharma cited similar features for containers and for accelerating cryptographic standards. IBM’s official announcement says Power10 is designed to deliver hardware-enforced container protection and isolation optimized with the IBM firmware and that Power10 can encrypt data 40 percent faster than Power9.

Architecture innovations in POWER ISA v3.01 and POWER10

IBM also reports Power10 delivers a 10x-to-20x advantage over Power9 on inferencing workloads. Memory bandwidth and new instructions helped achieve those gains. One example is a new special purpose-built matrix math accelerator that was tailored for the demands of machine learning and deep learning inference and includes a lot of AI data types.

Focusing for a moment on dense-math-engine microarchitecture, Brian Thompto, distinguished engineer and Power10 designer, noted, “We also focused on algorithms that were hungry for flops, such as the matrix math utilized in deep learning. Every core has built in matrix math acceleration and efficiently performs matrix outer product operations. These operations were optimized across a wide range of data types. Recognizing that various precisions can be best suited for specific machine learning algorithms, we included very broad support: double precision, single precision, two flavors of half-precision doing both IEEE and bfloat16, as well as reduced precision integer 16-, eight-, and four-bit. The result is 64 flops per cycle, double precision, and up to one K flops per cycle of reduced precision per SMT core. These operations were tailor made to be efficient while applying machine learning.”

At the socket level, you get 10 times the performance per socket for double and single-precision, and using reduced precision, bfloat16 sped up to over 15x and int8 inference sped up to over 20x over Power9. More broadly, he said, “We have a host of new capabilities in ISA version 3.1. This is the new instruction set architecture that supports Power10 and is contributed to the OpenPOWER Foundation. The new ISA supports 64-bit prefixed instructions in a RISC-friendly way. This is in addition to the classic way that we’ve delivered 32-bit instructions for many decades. It opens the door to adding new capabilities such as adding new addressing modes as well as providing rich new opcode space for future expansion.”
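The "matrix outer product operations" Thompto mentions refer to a standard way of organizing matrix multiplication: the product of two matrices can be accumulated as a sum of rank-1 outer products, one per column of the left matrix and matching row of the right matrix. The pure-Python sketch below illustrates only that mathematical formulation; the function name is hypothetical and nothing here models the actual Power10 instructions, accumulator sizes, or data layouts.

```python
def matmul_outer_product(a, b):
    """Compute C = A @ B as a running sum of rank-1 outer products.

    a is an m x k matrix and b is a k x n matrix, both given as lists of rows.
    Step p adds the outer product of column p of A with row p of B to C.
    """
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]  # accumulator tile, starts at zero
    for p in range(k):
        col = [a[i][p] for i in range(m)]  # column p of A
        row = b[p]                         # row p of B
        for i in range(m):
            for j in range(n):
                # Every accumulator element gets one multiply-add per step.
                c[i][j] += col[i] * row[j]
    return c


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    print(matmul_outer_product(a, b))
```

The appeal of this ordering is that each step reuses one short column and one short row to update an entire tile of results, so many multiply-adds are performed per value loaded.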

POWER Up Your Insights - IBM System Summit

IBM promises 1000-qubit quantum computer—a milestone—by 2023

IBM today, for the first time, published its road map for the future of its quantum computing hardware. There is a lot to digest here, but the most important news in the short term is that the company believes it is on its way to building a quantum processor with more than 1,000 qubits, and somewhere between 10 and 50 logical qubits, by the end of 2023.

Currently, the company’s quantum processors top out at 65 qubits. It plans to launch a 127-qubit processor next year and a 433-qubit machine in 2022. To get to this point, IBM is also building a completely new dilution refrigerator to house these larger chips, as well as the technology to connect multiple of these units to build a system akin to today’s multi-core architectures in classical chips.

Dario Gil, IBM’s director of research, believes that 2023 will be an inflection point in the industry, with the road to the 1,121-qubit machine driving improvements across the stack. The most important (and most ambitious) of these performance improvements that IBM is trying to execute on is bringing down the error rate from about 1% today to something closer to 0.0001%. But looking at the trajectory of where its machines were just a few years ago, that’s the number the line is pointing toward.

Q-CTRL and Quantum Machines, two of the better-known startups in the quantum control ecosystem, today announced a new partnership that will see Quantum Machines integrate Q-CTRL’s quantum firmware into Quantum Machines’ Quantum Orchestration hardware and software solution.

Building quantum computers takes so much specialized knowledge that it’s no surprise that we are now seeing some of the best-of-breed startups cooperate — and that’s pretty much why these two companies are now working together and why we’ll likely see more of these collaborations over time.

“The motivation [for quantum computing] is this immense computational power that we could get from quantum computers and while it exists, we didn’t make it happen yet. We don’t have full-fledged quantum computers yet,” Itamar Sivan, the co-founder and CEO of Quantum Machines, told me.

IBM Power10 A Glimpse Into the Future of Servers

For 20 years scientists and engineers have been saying that “someday” they’ll build a full-fledged quantum computer able to perform useful calculations that would overwhelm any conventional supercomputer. But current machines contain just a few dozen quantum bits, or qubits, too few to do anything dazzling. Today, IBM made its aspirations more concrete by publicly announcing a “road map” for the development of its quantum computers, including the ambitious goal of building one containing 1000 qubits by 2023. IBM’s current largest quantum computer, revealed this month, contains 65 qubits.

“We’re very excited,” says Prineha Narang, co-founder and chief technology officer of Aliro Quantum, a startup that specializes in code that helps higher level software efficiently run on different quantum computers. “We didn’t know the specific milestones and numbers that they’ve announced,” she says. The plan includes building intermediate-size machines of 127 and 433 qubits in 2021 and 2022, respectively, and envisions following up with a million-qubit machine at some unspecified date. Dario Gil, IBM’s director of research, says he is confident his team can keep to the schedule. “A road map is more than a plan and a PowerPoint presentation,” he says. “It’s execution.”

IBM is not the only company with a road map to build a full-fledged quantum computer—a machine that would take advantage of the strange rules of quantum mechanics to breeze through certain computations that just overwhelm conventional computers. At least in terms of public relations, IBM has been playing catch-up to Google, which 1 year ago grabbed headlines when the company announced its researchers had used their 53-qubit quantum computer to solve a particular abstract problem that they claimed would overwhelm any conventional computer—reaching a milestone known as quantum supremacy. Google has its own plan to build a million-qubit quantum computer within 10 years, as Hartmut Neven, who leads Google’s quantum computing effort, explained in an April interview, although he declined to reveal a specific timeline for advances.

AI in Automobile :Solutions for ADAS and AI data engineering using OpenPOWER/POWER systems

IBM’s declared timeline comes with an obvious risk that everyone will know if it misses its milestones. But the company decided to reveal its plans so that its clients and collaborators would know what to expect. Dozens of quantum-computing startup companies use IBM’s current machines to develop their own software products, and knowing IBM’s milestones should help developers better tailor their efforts to the hardware, Gil says.

One company joining those efforts is Q-CTRL, which develops software to optimize the control and performance of the individual qubits. The IBM announcement shows venture capitalists the company is serious about developing the challenging technology, says Michael Biercuk, founder and CEO of Q-CTRL. “It’s relevant to convincing investors that this large hardware manufacturer is pushing hard on this and investing significant resources,” he says.

A 1000-qubit machine is a particularly important milestone in the development of a full-fledged quantum computer, researchers say. Such a machine would still be 1000 times too small to fulfill quantum computing’s full potential, such as breaking current internet encryption schemes, but it would be big enough to spot and correct the myriad errors that ordinarily plague the finicky quantum bits.

IBM Power Systems at FIS InFocus 2019

A bit in an ordinary computer is an electrical switch that can be set to either zero or one. In contrast, a qubit is a quantum device—in IBM’s and Google’s machines, each is a tiny circuit of superconducting metal chilled to nearly absolute zero—that can be set to zero, one, or, thanks to the strange rules of quantum mechanics, zero and one at the same time. But the slightest interaction with the environment tends to distort those delicate two-ways-at-once states, so researchers have developed error-correction protocols to spread information ordinarily encoded in a single physical qubit to many of them in a way that the state of that “logical qubit” can be maintained indefinitely.
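The idea of spreading one unit of information across many error-prone carriers can be illustrated with a classical analogy: a repetition code decoded by majority vote. The Python sketch below is only that analogy. Real quantum error correction cannot copy qubits and instead detects errors through indirect measurements, so none of the function names or parameters here correspond to IBM's protocols. The 1% physical error rate is used purely as an example figure.

```python
from math import comb


def encode(bit, n=3):
    """Spread one logical bit across n physical copies."""
    return [bit] * n


def decode(bits):
    """Recover the logical bit by majority vote (len(bits) should be odd)."""
    return 1 if sum(bits) > len(bits) / 2 else 0


def logical_error_rate(p, n):
    """Probability that a majority of n independent copies are flipped.

    p is the per-copy error probability; n is assumed to be odd.
    """
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )


if __name__ == "__main__":
    received = [1, 0, 1]  # one of three copies was corrupted
    print(decode(received))
    for n in (1, 3, 5, 7):
        print(n, logical_error_rate(0.01, n))
```

In this toy model, adding more copies drives the logical error rate down rapidly as long as the per-copy error rate is small, which conveys why many physical qubits are needed for each logical qubit.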

With their planned 1121-qubit machine, IBM researchers would be able to maintain a handful of logical qubits and make them interact, says Jay Gambetta, a physicist who leads IBM’s quantum computing efforts. That’s exactly what will be required to start to make a full-fledged quantum computer with thousands of logical qubits. Such a machine would mark an “inflection point” in which researchers’ focus would switch from beating down the error rate in the individual qubits to optimizing the architecture and performance of the entire system, Gambetta says.

IBM is already preparing a jumbo liquid-helium refrigerator, or cryostat, to hold a quantum computer with 1 million qubits. The IBM road map doesn’t specify when such a machine could be built. But if company researchers really can build a 1000-qubit computer in the next 2 years, that ultimate goal will sound far less fantastical than it does now.

IBM Power Systems at the heart of Cognitive Solutions
