Lenovo Infrastructure Solutions Group, Intel and the Leibniz Supercomputing Centre (LRZ) of the Bavarian Academy of Sciences and Humanities are embarking on Phase Two of the LRZ’s SuperMUC-NG supercomputer. The system will utilise artificial intelligence to implement advanced simulations, modelling, and data analysis that will accelerate research.
Funding is coming from the Free State of Bavaria and the German Federal Ministry of Education and Research.
Since SuperMUC-NG Phase One was launched, the supercomputer has been used not only for traditional simulation and modelling, but also to automate image and pattern recognition in planet observations, climate data from satellites, medical visuals and health records, and demographic data. Given the successful utilisation of SuperMUC-NG in these projects, the demand for high-performance data analytics, machine learning and fast memory performance has further increased.
SuperMUC-NG will now be enhanced with next-generation Intel Xeon Scalable processors (codenamed Sapphire Rapids) and Intel’s upcoming HPC GPUs based on the Xe HPC architecture, codenamed “Ponte Vecchio”.
Phase Two will also use distributed asynchronous object storage (DAOS), leveraging 3rd Gen Intel Xeon Scalable processors (codenamed “Ice Lake”) integrated into Lenovo’s ThinkSystem SR630 V2 platform. The DAOS system provides 1 petabyte of data storage and will enable fast throughput of large data volumes, while the system architecture is designed to support highly scalable compute- and data-intensive workloads and artificial intelligence applications. Overall, the SuperMUC-NG Phase Two compute nodes will deliver four times higher performance per watt (measured with High Performance Linpack) than Phase One.
“The Leibniz Supercomputing Centre has long been an important innovation partner for both Lenovo and Intel. Phase Two is an exciting opportunity to share our expertise in what Lenovo calls ‘Exascale for Everyscale’ (solutions using advanced exascale technologies in any size cluster) and provide researchers with the specialist resources needed to accelerate projects,” explains Scott Tease, Vice President, HPC and AI, Lenovo Infrastructure Solutions Group. “Through the implementation of our Neptune™ warm-water cooling and a smarter integrated system for artificial intelligence and deep learning, LRZ can continue to be a thought leader in advanced technologies for many years to come, and set new standards for research and development.”
Ensuring a sustainable approach
The enhancements made in Phase Two will ensure SuperMUC-NG is capable of performing additional tasks in a way that’s as energy-efficient as possible. The key to this is the integration of 240 Intel compute nodes into Lenovo’s ThinkSystem SD650, cooled with Neptune warm-water technology and connected to the DAOS storage system via a high-speed network. Lenovo’s Neptune direct water-cooling technology removes approximately 90% of the heat from the compute system, reducing overall energy consumption, increasing overall efficiency and ultimately allowing the processors to perform at their peak.
In addition, the components for SuperMUC-NG Phase Two will be manufactured within Europe, in Lenovo’s new dedicated manufacturing facility in Hungary, to help further improve the eco-footprint of the project’s supply chain.
“Delivering resources and services that empower researchers to accelerate their projects is at the heart of everything we do at LRZ,” says Prof. Dr. Dieter Kranzlmüller, Director of the LRZ. “Our work with Lenovo and other partners to integrate advanced AI capabilities into this next phase will help the centre better achieve this, and ensure researchers are given what they need to excel in their scientific fields. Not only that, but with Lenovo’s warm-water cooling technology we’re able to deliver these enhancements in a way that’s as sustainable and energy-efficient as possible.”
Phase Two kick-off
LRZ will receive the DAOS storage system in the last quarter of 2021, and the compute system will follow in the second quarter of 2022. The LRZ team are preparing their user community for Phase Two’s enhancements by offering support and consultations for adapting and optimising codes and AI algorithms, and by giving researchers access to GPU systems specialised in AI applications. The LRZ training programme also offers a wide variety of machine learning and deep learning courses, teaching users how to adapt existing algorithms or develop and train their own.