22 July 2020

Red Hat Powers the Future of Supercomputing with Red Hat Enterprise Linux

World’s leading enterprise Linux platform provides the operating system for the top 3 supercomputers globally and four out of the top 10

Red Hat, Inc., the world's leading provider of open source solutions, today announced that Red Hat Enterprise Linux provides the operating system backbone for the top three supercomputers in the world and four out of the top 10, according to the newest TOP500 ranking. Red Hat Enterprise Linux already serves as a catalyst for enterprise innovation across the hybrid cloud, and these rankings show that the world’s leading enterprise Linux platform can also deliver a foundation for even the most demanding computing environments.

In the top ten of the current TOP500 list, Red Hat Enterprise Linux serves as the operating system for:

  • Fugaku, the top-ranked supercomputer in the world based at RIKEN Center for Computational Sciences in Kobe, Japan.
  • Summit, the number two-ranked supercomputer based at Oak Ridge National Laboratory in Oak Ridge, Tennessee.
  • Sierra, the third-ranked supercomputer globally based at Lawrence Livermore National Laboratory in Livermore, California.
  • Marconi-100, the ninth-ranked supercomputer installed at CINECA research center in Italy.

High-performance computing across architectures

Red Hat Enterprise Linux is engineered to deliver a consistent, standardized and high-performance experience across nearly any certified architecture and hardware configuration. These same exacting standards and consistency are also brought to supercomputing environments, providing a predictable and reliable interface regardless of the underlying hardware.

Fugaku is the first Arm-based system to take first place on the TOP500 list, highlighting Red Hat’s commitment to the Arm ecosystem from the datacenter to the high-performance computing laboratory. Sierra, Summit and Marconi-100 all boast IBM POWER9-based infrastructure with NVIDIA GPUs; combined, these four systems produce more than 680 petaflops of processing power to fuel a broad range of scientific research applications.

In addition to enabling this immense computational power, Red Hat Enterprise Linux also underpins six of the top 10 most power-efficient supercomputers on the planet, according to the Green500 list. Systems on that list are measured by both their performance results and the power consumed in achieving them. In sustainable supercomputing, the premium is on a balanced approach that delivers the most energy-efficient performance.

In the top ten of the Green500 list, Red Hat Enterprise Linux serves as the operating system for:

  • A64FX prototype, the fourth-ranked system, created as the prototype to test and develop the Fugaku supercomputer and based at Fujitsu’s plant in Numazu, Japan.
  • AIMOS, the number five supercomputer on the Green500 list based at Rensselaer Polytechnic Institute in Troy, New York.
  • Satori, the seventh-ranked most power-efficient system in the world, installed at MIT Massachusetts Green High Performance Computing Center (MGHPCC) in Holyoke, Massachusetts. It serves as the home for the Mass Open Cloud (MOC) project, where Red Hat supports a number of activities.
  • Summit at number eight.
  • Fugaku at number nine.
  • Marconi-100 at number ten.

From the laboratory to the datacenter and beyond

Modern supercomputers are no longer purpose-built monoliths constructed from expensive bespoke components. Each supercomputer deployment powered by Red Hat Enterprise Linux uses hardware that can be purchased and integrated into any datacenter, making it feasible for organizations to use enterprise systems that are similar to those breaking scientific barriers. Regardless of the underlying hardware, Red Hat Enterprise Linux provides the common control plane for supercomputers to be run, managed and maintained in the same manner as traditional IT systems.

Red Hat Enterprise Linux also opens supercomputing applications up to advancements in enterprise IT, including Linux containers. Working closely in open source communities with organizations like the Supercomputing Containers project, Red Hat is helping to drive advancements to make Podman, Skopeo and Buildah, components of Red Hat’s distributed container toolkit, more accessible for building and deploying containerized supercomputing applications.

Supporting Quote

Stefanie Chiras, vice president and general manager, Red Hat Enterprise Linux Business Unit, Red Hat

"Supercomputing is no longer the domain of custom-built hardware and software. With the proliferation of Linux across architectures, high-performance computing has now become about delivering scalable computational power to fuel scientific breakthroughs. Red Hat Enterprise Linux already provides the foundation for innovation to the enterprise world and, with the recent results of the TOP500 list, we’re pleased to now provide this same accessible, flexible and open platform to the world’s fastest and some of the most power-efficient computers."
Steve Conway, senior advisor, HPC Market Dynamics, Hyperion Research
"Every one of the world's Top500 most powerful supercomputers runs on Linux, and a recent study we did confirmed that Red Hat is the most popular vendor-supported Linux solution in the global high performance computing market. Red Hat Enterprise Linux is designed to run seamlessly on a variety of architectures underlying leading supercomputers, playing an important part in driving HPC into new markets and use cases, including AI, enterprise computing, quantum computing and cloud computing."
Satoshi Matsuoka, director, RIKEN Center for Computational Science (R-CCS); professor, Department of Mathematical and Computing Sciences, Tokyo Institute of Technology
"Fugaku represents a new wave of supercomputing, delivering the performance, scale and efficiency to help create new scientific breakthroughs and further drive research innovation. A key consideration of the project was to deliver an open source software stack, starting with the addition of Red Hat Enterprise Linux. With Red Hat Enterprise Linux running on Arm-based processors, we have been able to make supercomputing resources accessible and manageable by our distributed community of scientists and simplify development and deployment of a broader range of workloads and applications."
Professor Jack Dongarra, University of Tennessee, Oak Ridge National Laboratory, and the University of Manchester
"Computing innovation and scientific advancement is not done in a vacuum - the supercomputing community, from laboratories to the vendor ecosystem, collaborates to help drive breakthroughs at both the architectural and the research level. Red Hat is a key part of this global community, helping to deliver a standards-based, open control plane that can make all of this processing power accessible and usable to an extensive range of scientists across disciplines."

Around the world, innumerable supercomputers are sifting through billions of molecules in a desperate search for a viable therapeutic to treat COVID-19. Those molecules are pulled from enormous databases of known compounds, ranging from preexisting drugs to plants and other natural substances. But now, researchers at the University of Washington are using supercomputing power to revisit a decades-old concept that would allow researchers to design a completely new drug from the ground up.

This approach – called de novo protein design – works by linking amino acids together to create specific proteins. Thus far, de novo design has been used for only a few drugs, all still undergoing trials. In large part, it has been stymied by the extreme difficulty of predicting how the amino acids in a protein will fold, which makes predicting the protein's full three-dimensional shape and other drug-critical properties exceedingly troublesome.

At the University of Washington’s Institute for Protein Design, David Baker – a professor of biochemistry and head of the institute – applied supercomputing to tackle this roadblock. Baker and his colleagues developed methods for the prediction of proteins’ folded forms and for the rapid design of targeted protein binders. The researchers use computer simulations to generate a library of candidates, after which the most promising candidates are tested in-depth in further simulations and wet labs.

For the last six months, the Baker Lab has been using this approach to zero in on COVID-19, predicting the folded shapes of millions of proteins and then matching them with various parts of the SARS-CoV-2 virus. This massive undertaking requires correspondingly massive computing – and for that, the researchers turned to Stampede2 at the Texas Advanced Computing Center (TACC). Stampede2 is a Dell EMC system with Intel Xeon Phi CPUs rated at 10.7 Linpack petaflops, which placed it 21st on the most recent Top500 list of the world’s most powerful supercomputers.

For their COVID-19 efforts, the team started by testing 20,000 “scaffold” proteins – starting points for drug design – each of which possesses more than a thousand possible orientations, with each orientation tested around a thousand times: in total, 20 billion interactions to test. The best million candidates from these went on to the second stage, sequence design, where the “scaffold” is covered in amino acids, with 20 possibilities at each position.

In the third stage, the best hundred thousand protein candidates are forwarded to Agilent, a DNA synthesis firm, which returns physical DNA samples of those proteins that the Baker Lab can test against the real-life virus. The team then mutates individual amino acids on the proteins to see whether docking performance improves or worsens. Finally, the proteins undergo a barrage of other tests and modifications, eventually yielding the 50 promising leads found so far.

“TACC has a lot of computing power and that has been really helpful for us,” said Brian Coventry, a PhD student working on the research, in an interview with TACC’s Aaron Dubrow. “Everything we do is purely parallel. We’re able to rapidly test 20 million different designs and the calculations don’t need to talk to each other.”
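Coventry's point about purely parallel work can be sketched in miniature: because each design is scored independently, the evaluations need no coordination and fan out trivially across workers. This is a toy illustration; the score arithmetic is a hypothetical stand-in for the real docking calculation.

```shell
# Score each "design" independently -- no inter-process communication,
# so the work fans out across 4 parallel workers (-P 4) with no changes
# to the per-design calculation itself.
seq 1 8 | xargs -n1 -P4 sh -c 'echo "design $0 score $(( $0 * $0 % 97 ))"'
```

The same shape scales from 8 toy designs on a laptop to 20 million real ones on a cluster, which is exactly what makes this class of workload such a good fit for large machines like Stampede2.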

“Our goal for the next pandemic will be to have computational methods in place that, coupled with high performance computing centers like TACC, will be able to generate high affinity inhibitors within weeks of determination of the pathogen genome sequence,” Baker said. “To get to this stage will require continued research and development, and centers like TACC will play a critical role in this effort as they do in scientific research generally.”

Header image: antiviral protein binders (blue) targeting the spike proteins of the coronavirus. Image courtesy of Ian Haydon, Institute for Protein Design.

The full reporting on this research from TACC’s Aaron Dubrow is available at https://www.tacc.utexas.edu/-/designing-anew-radical-covid-19-drug-development-approach-shows-promise.

The supercomputer Fugaku, which is being developed jointly by RIKEN and Fujitsu Limited based on Arm® technology, has taken the top spot on the TOP500, a ranking of the world’s fastest supercomputers. It also swept the other rankings of supercomputer performance, taking first place on HPCG, a ranking of supercomputers running real-world applications; HPL-AI, which ranks supercomputers on tasks typically used in artificial intelligence applications; and Graph500, which ranks systems on data-intensive loads. This is the first time in history that the same supercomputer has taken the No. 1 spot on TOP500, HPCG, and Graph500 simultaneously. The results were announced on June 22 at ISC High Performance 2020 Digital, an international high-performance computing conference.

On the TOP500, it achieved a LINPACK score of 415.53 petaflops, far ahead of the 148.6 petaflops of its nearest competitor, Summit in the United States, using 152,064 of its eventual 158,976 nodes. This marks the first time a Japanese system has taken the top ranking since June 2011, when the K computer—Fugaku’s predecessor—took first place. On HPCG, it scored 13,400 teraflops using 138,240 nodes, and on HPL-AI it achieved 1.421 exaflops—the first time a computer has ever earned an exascale rating on any list—using 126,720 nodes.

The top ranking on Graph500 was won by a collaboration involving RIKEN, Kyushu University, Fixstars Corporation, and Fujitsu Limited. Using 92,160 nodes, the system solved a breadth-first search of an enormous graph with 1.1 trillion nodes and 17.6 trillion edges in approximately 0.25 seconds, earning a score of 70,980 gigaTEPS—more than double the 31,303 gigaTEPS of the K computer and far ahead of China’s Sunway TaihuLight, currently second on the list at 23,756 gigaTEPS.
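
The Graph500 score is simple arithmetic: TEPS (traversed edges per second) is the number of edges in the search divided by the solve time. As a quick sanity check of the published figures, the implied solve time can be recovered from the score:

```shell
# Graph500 scores are measured in traversed edges per second (TEPS).
# Implied solve time = edges / score: 17.6 trillion edges at
# 70,980 gigaTEPS works out to roughly a quarter of a second.
edges=17600000000000          # 17.6 trillion edges
score_gigateps=70980          # published Fugaku score
awk -v e="$edges" -v s="$score_gigateps" \
    'BEGIN { printf "implied solve time: %.3f s\n", e / (s * 1e9) }'
```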

Fugaku, which is currently installed at the RIKEN Center for Computational Science (R-CCS) in Kobe, Japan, is being developed under a national plan to design Japan’s next-generation flagship supercomputer and to carry out a wide range of applications that will address high-priority social and scientific issues. It will be put to use in applications aimed at achieving the Society 5.0 plan, by running applications in areas such as drug discovery; personalized and preventive medicine; simulations of natural disasters; weather and climate forecasting; energy creation, storage, and use; development of clean energy; new material development; new design and production processes; and—as a purely scientific endeavor—elucidation of the fundamental laws and evolution of the universe. In addition, Fugaku is currently being used on an experimental basis for research on COVID-19, including diagnostics, therapeutics, and simulations of the spread of the virus. The new supercomputer is scheduled to begin full operation in fiscal 2021 (which starts in April 2021).

According to Satoshi Matsuoka, director of RIKEN R-CCS, “Ten years after the initial concept was proposed, and six years after the official start of the project, Fugaku is now near completion. Fugaku was developed based on the idea of achieving high performance on a variety of applications of great public interest, such as the achievement of Society 5.0, and we are very happy that it has shown itself to be outstanding on all the major supercomputer benchmarks. In addition to its use as a supercomputer, I hope that the leading-edge IT developed for it will contribute to major advances on difficult social challenges such as COVID-19.”

According to Naoki Shinjo, Corporate Executive Officer of Fujitsu Limited, “I believe that our decision to use a co-design process for Fugaku, which involved working with RIKEN and other parties to create the system, was a key to our winning the top position on a number of rankings. I am particularly proud that we were able to do this just one month after the delivery of the system was finished, even during the COVID-19 crisis. I would like to express our sincere gratitude to RIKEN and all the other parties for their generous cooperation and support. I very much hope that Fugaku will show itself to be highly effective in real-world applications and will help to realize Society 5.0.”

“The supercomputer Fugaku illustrates a dramatic shift in the type of compute that has traditionally been used in these powerful machines, and it is proof of the innovation that can happen with flexible computing solutions driven by a strong ecosystem,” said Rene Haas, president, IPG, Arm. “For Arm, this achievement showcases the power efficiency, performance and scalability of our compute platform, which spans from smartphones to the world’s fastest supercomputer. We congratulate RIKEN and Fujitsu Limited for challenging the status quo and showing the world what is possible in Arm-based high-performance computing.”

Following the rise of Linux container use in commercial environments, the adoption of container technologies has gained momentum in technical and scientific computing, commonly referred to as high-performance computing (HPC). Containers can help solve many HPC problems, but the mainstream container engines didn't quite tick all the boxes. Podman is showing a lot of promise in bringing a standards-based, multi-architecture enabled container engine to HPC. Let’s take a closer look.

The trend toward AI-accelerated solutions often requires repackaging applications and staging data for easier consumption, breaking up the otherwise massively parallel flow of purely computational solutions.

The ability to package application code, its dependencies and even user data, combined with the demand to simplify sharing of scientific research with a global community across multiple locations and the ability to migrate applications into public or hybrid clouds, makes containers very relevant for HPC environments. A number of supercomputing sites already have portions of their workflows containerized, especially those related to artificial intelligence (AI) and machine learning (ML) applications.

Another aspect of why containerized deployments are becoming more and more important for HPC environments is the ability to provide an effective and inexpensive way to isolate the workloads. Partitioning large systems for use by multiple users or multiple applications running side by side has always been a challenge.

The desire to protect applications and their data from other users and potentially malicious actors is not new, and has been addressed by virtualization in the past. With Linux cgroups, and later Linux containers, it became possible to partition system resources with practically no overhead, making containers particularly suitable for HPC environments where maximum system utilization is the goal.
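
As a hypothetical sketch of that partitioning, cgroup-backed limits can be set per container at launch time. The flags shown are standard podman run options; the image name and solver command are placeholders.

```shell
# Pin a containerized solver to a slice of the node: 8 CPUs and 16 GB of
# memory, enforced by cgroups with effectively no runtime overhead.
podman run --rm \
    --cpus 8 \
    --memory 16g \
    registry.example.com/hpc/solver:latest ./run_solver --input /data/case1
```

Two such containers on one node get hard, kernel-enforced shares of the machine without the cost of a hypervisor.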

However, most recent implementations of mainstream container runtime environments have been focused on enabling CI/CD pipelines and microservices and have not been able to address supercomputing requirements, prompting the creation of several incompatible implementations just for use in HPC.

Podman and Red Hat Universal Base Image

That landscape changed when Podman arrived. Based on standards from the Open Container Initiative (OCI), Podman is rootless (it does not require superuser privileges) and daemonless (it does not need constantly running background processes), and it focuses on delivering performance and security benefits.
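
A minimal illustration of the rootless, daemonless model: an unprivileged user pulls and runs an image directly, with no background service involved. The image name is the UBI image Red Hat publishes; output is elided.

```shell
# No daemon to talk to and no root required: podman is invoked directly
# by an unprivileged user, and the container runs under that user's UID.
podman run --rm registry.access.redhat.com/ubi8/ubi cat /etc/os-release
```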

Most importantly, Podman and the accompanying container development tools, Buildah and Skopeo, are being delivered with Red Hat Enterprise Linux (RHEL), making it relevant to many HPC environments that have standardized and rely on this operating system (OS).

Another important aspect is that Podman shares many underlying components with other container engines, like CRI-O, providing a proving ground for new and interesting features and maintaining a direct technology linkage to Kubernetes and Red Hat OpenShift. The benefits of technology continuity, the ability to contribute to and tinker with code at the lowest layers of the stack, and the presence of a thriving community were the fundamental reasons for Red Hat’s investment in Podman, Buildah and Skopeo.

To further foster collaboration in the community and enable participants to freely redistribute their applications and the containers that encapsulate them, Red Hat introduced the Red Hat Universal Base Image (UBI). UBI is an OS container image that does not run directly on bare-metal hardware and is not supported as a standalone product; however, it offers the same proven quality and reliability as Red Hat Enterprise Linux, since it is tested by the same quality, security and performance teams.

UBI carries a different end user license agreement (EULA) that allows users to freely redistribute containerized applications built with it. Moreover, when a container built from a UBI image runs on top of Red Hat platforms, like RHEL with Podman or OpenShift, it can inherit the support terms of the host system it runs on. For the many sites that are required to run supported software, this seamlessly creates a trusted software stack based on a verified OS container image.
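
As a sketch, a containerized research application built on UBI needs nothing more than a standard Containerfile. The base image name is the one Red Hat publishes; the application file and paths are hypothetical.

```dockerfile
# Containerfile: freely redistributable base image from Red Hat's registry.
FROM registry.access.redhat.com/ubi8/ubi

# Install runtime dependencies from the UBI repositories.
RUN dnf install -y python3 && dnf clean all

# Hypothetical analysis code staged into the image.
COPY analyze.py /opt/app/analyze.py
CMD ["python3", "/opt/app/analyze.py"]
```

Built with `podman build -t myapp .` (or Buildah), the resulting image can be pushed to a public registry and shared under the UBI EULA.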

Podman for HPC

Podman offers several features that are critical to HPC. For example, it can run containers with a single UID/GID pair based on the logged-in user’s UID/GID (that is, with no root privileges), and it can enforce additional security requirements via advanced kernel features like SELinux and seccomp. Podman also allows users to set up or disable namespaces, specify mount points for every container and modify default security settings across the cluster by declaring these settings in the containers.conf file.
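
A site-wide policy of that kind might look like the following containers.conf fragment. This is a sketch: the keys shown are standard containers.conf settings, but the values and the seccomp profile path are placeholders for a site's own policy.

```toml
# /etc/containers/containers.conf -- cluster-wide defaults for Podman
[containers]
# Keep containers in the invoking user's namespace mapping.
userns = "host"
# Site-mandated seccomp policy (path is a placeholder).
seccomp_profile = "/etc/containers/hpc-seccomp.json"
# Use host networking so containers see the cluster interconnect.
netns = "host"
```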

To make Podman truly useful for mainstream HPC, it needs the ability to run jobs via the Message Passing Interface (MPI). MPI applications still represent the bulk of HPC workloads, and that is not going to change overnight; in fact, even AI/ML workflows often use MPI for multi-node execution. Red Hat engineers worked in the community to enable Podman to run MPI jobs with containers. This feature was made available in RHEL 8 and was further tested and benchmarked against different container runtime implementations by members of the community and independent researchers, resulting in a published paper.
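
The pattern this enables looks roughly like the following: mpirun starts one container per rank, with host namespaces shared so the MPI library can communicate between ranks. This is a sketch under assumptions: the image name, hostfile and application path are hypothetical, while the podman flags shown are standard options.

```shell
# Launch a containerized MPI application: mpirun spawns one podman
# container per rank; host network/PID/IPC namespaces and the shared
# /tmp volume let the ranks find and talk to each other.
mpirun --hostfile hostfile \
    podman run --rm --env-host \
        --userns=keep-id --net=host --pid=host --ipc=host \
        -v /tmp/podman-mpirun:/tmp/podman-mpirun \
        mpi-app /opt/app/ring
```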

This ecosystem consisting of the container runtime, associated tools and container base image offers tangible benefits to scientists and HPC developers. They can create and prototype containers on their laptop, test and validate containers in a workflow using a single server (referred to as "node" in HPC) and then successfully deploy containers on thousands of similarly configured nodes across large supercomputing clusters using MPI. Moreover, with UBI scientists can now distribute their applications and data within the global community more easily.

All these traits of Podman have not gone unnoticed in the scientific community and at large national supercomputing sites. Red Hat has a long history of collaborating with supercomputing sites and building software stacks for many TOP500 supercomputers in the world. We have a keen interest in the Exascale Computing Project (ECP) and are tracking the next generation of systems that seek to break the exascale threshold. So when ECP kicked off the SuperContainers project, one of ECP’s newest efforts, Andrew Younge of Sandia National Laboratories, a lead investigator for the project, reached out to Red Hat to see how we could collaborate on and expand container technologies for use in the first exascale supercomputers, which are expected to arrive as soon as 2021.

Red Hat contributes to upstream Podman and has engineers with deep Linux expertise and background in HPC who were able to work out a multi-phase plan. The plan expedites the development of HPC-friendly features in Podman, Buildah and Skopeo tools that come with Red Hat Enterprise Linux, with the goal of getting these features into Kubernetes and then into OpenShift.

SuperContainers and multiple architectures

The first phase of the collaboration plan with ECP would focus on enabling a single host environment, incorporating UBI for ease of sharing container packages and providing support for accelerators and other special devices that make containers aware of the hardware that exists on the host. In the second phase, we would enable support for container runtime on the vast majority of the pre-exascale systems using MPI, across multiple architectures, like Arm and POWER. And the final phase calls for using OpenShift for provisioning containers, managing their life cycle and enabling scheduling at exascale.

Here is what Younge shared with us in a recent conversation: "When the ECP Supercomputing Containers project (aka SuperContainers) was launched, several container technologies were in use at different Department of Energy (DOE) Labs. However, a more robust production-quality container solution is desired as we are anticipating the arrival of exascale systems. Due to a culture of open source software development, support for standards, and interoperability, we’ve looked to Red Hat to help coalesce container runtimes for HPC."

Sandia National Labs is home to Astra, the world's first Arm-based petascale supercomputer. Red Hat collaborated with HPE, Mellanox and Marvell to deliver this supercomputer to Sandia in 2018 as part of the Vanguard program, which aims to expand the high-performance computing ecosystem by evaluating and accelerating the development of emerging technologies to increase their viability for future large-scale production platforms. That collaboration was enabled by Red Hat’s multi-architecture strategy, which helps customers design and build infrastructure based on their choice of several commercially available hardware architectures using a fully open, enterprise-ready software stack.

Astra is now fully operational and Sandia researchers are using it to build and validate containers with Podman on 64-bit Arm v8 architecture. Younge provided the following insight: "Building containers on less widespread architectures such as Arm and POWER can be problematic, unless you have access to servers of the target architecture. Having Podman and Buildah running on Astra hardware is of value to our researchers and developers as it enables them to do unprivileged and user-driven container builds. The ability to run Podman on Arm servers is a great testament to the strength of that technology and the investment that Red Hat made in multi-architecture enablement."

International Supercomputing Conference and the TOP500 list

If you are following or virtually attending the International Supercomputing Conference (ISC) that starts today, be sure to check out the "Introduction to Podman for HPC use cases" keynote by Daniel Walsh, senior distinguished engineer at Red Hat, presented during the Workshop on Virtualization in High-Performance Cloud Computing. For a deeper dive into the practical implementation of HPC containers, check out the High Performance Container Workshop, where a panel of industry experts, including Andrew Younge and engineers from Red Hat, will provide insights into the most popular container technologies and the latest trends.

While it is fascinating to see Red Hat Enterprise Linux running Podman and containers on the world’s first Arm-based petascale supercomputer, the latest edition of the TOP500 list, published today at ISC 2020, shows that RHEL also powers the world’s largest Arm supercomputer. Fugaku, the newest and fastest supercomputer in the world, runs RHEL 8. Installed at RIKEN and built by Fujitsu on the Arm architecture, it is the first Arm-based system ever to top the list, with a score of 415.5 Pflop/s on the HPL benchmark.

RHEL now claims the top three spots on the TOP500 list as it continues to power the #2 and #3 supercomputers in the world, Summit and Sierra, that are based on IBM POWER architecture.

RHEL is also powering the new #9 system on the list, the Marconi-100 supercomputer installed at Cineca and built by IBM for a grand total of four out of 10 top systems on the list.

RHEL also underpins six out of the top ten most power-efficient supercomputers on the planet according to the Green500 list.

So what does the road ahead look like for Podman and RHEL in supercomputing?

RHEL serves as the unifying glue that makes many TOP500 supercomputers run reliably and uniformly across various architectures and configurations. It enables the underlying hardware and creates a familiar interface for users and administrators.

New container capabilities in Red Hat Enterprise Linux 8 are paving the way for SuperContainers and can help smooth the transition of HPC workloads into the exascale space.

In the meantime, growing HPC capabilities in OpenShift could be the next logical step for successful provisioning and managing containers at exascale while also opening up a path for deploying them into the public or hybrid clouds.
