TL;DR:
- Traditional HPC (High Performance Computing) is rooted in physics-based simulation. AI brings a new layer of adaptability, efficiency, and automation.
- The synergy between AI and HPC is unlocking faster, more precise modeling across industries like manufacturing, energy, and climate science.
- Steve Heibein, Federal AI Chief Technologist @ HPE, laid out practical architectures and real-world examples, including NASA’s generative design and NVIDIA Earth-2.
- The conversation covered everything from retrieval-augmented generation (RAG) to liquid-cooled data centers and AI factory blueprints.
What happens when AI meets HPC?
In this session hosted by Data Science Connect, Steve Heibein framed the convergence of AI and HPC as both inevitable and essential. Traditional HPC workloads — like computational fluid dynamics or finite element modeling — are foundational to innovation in fields from aerospace to pharmaceuticals. But they’re increasingly being pushed to their limits.
Enter AI. Not to replace simulation, but to augment it.
AI helps HPC systems zero in on relevant data, narrow simulation parameters, and optimize results faster. For example, early in the COVID-19 pandemic, the U.S. Department of Energy used AI to drastically reduce the number of protein configurations that required simulation. HPC then modeled those high-probability configurations to identify the shape of the virus's spike protein, accelerating vaccine development.
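The session stayed at the architecture level, but the filtering pattern is easy to sketch. Below is a minimal, hypothetical illustration of the idea: a cheap surrogate model trained on configurations that have already been simulated scores the remaining candidates, and only the top scorers go to the expensive solver. The `featurize` and `run_hpc_simulation` functions are placeholders, not actual DOE tooling.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical placeholders: featurize() maps a configuration to a numeric
# vector; run_hpc_simulation() stands in for the expensive physics solver.
def featurize(config):
    return np.asarray(config, dtype=float)

def run_hpc_simulation(config):
    print(f"simulating {config} on the cluster...")

# Configurations already simulated, labeled 1 if they proved relevant.
simulated = [([0.1, 0.9, 0.3], 1), ([0.7, 0.2, 0.5], 0),
             ([0.2, 0.8, 0.4], 1), ([0.9, 0.1, 0.6], 0)]
X = np.array([featurize(c) for c, _ in simulated])
y = np.array([label for _, label in simulated])

# Train a cheap surrogate on past results.
surrogate = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Score a large pool of untested candidates and simulate only the top 1%.
candidates = [list(np.random.rand(3)) for _ in range(1000)]
scores = surrogate.predict_proba([featurize(c) for c in candidates])[:, 1]
for idx in np.argsort(scores)[::-1][: len(candidates) // 100]:
    run_hpc_simulation(candidates[idx])
```

The win is that the surrogate's inference cost is trivial next to a single solver run, so even a mediocre ranking model pays for itself.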
How are organizations applying this today?
Generative design at NASA
NASA is using LLMs (large language models) to generate highly optimized parts for space missions. Engineers specify material, load conditions, and mounting points. AI proposes designs that are then 3D-printed and tested — many with geometries a human wouldn’t naturally consider. This isn’t just futuristic — it’s a response to real-world constraints like needing to manufacture parts in space.
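NASA's actual tooling wasn't named in the session, so the sketch below is purely illustrative: it shows how the engineer-supplied constraints (material, load conditions, mounting points) might be captured as structured input and flattened into a prompt for a generative model. `generate_candidate_geometry` is a hypothetical stand-in for whatever model the team uses.

```python
from dataclasses import dataclass, field

@dataclass
class DesignSpec:
    """Engineer-supplied constraints for a generatively designed part."""
    material: str
    max_load_newtons: float
    mounting_points: list[tuple[float, float, float]] = field(default_factory=list)

    def to_prompt(self) -> str:
        # Flatten the constraints into text a generative model can consume.
        points = "; ".join(str(p) for p in self.mounting_points)
        return (f"Design a {self.material} bracket supporting "
                f"{self.max_load_newtons} N with mounts at {points}.")

def generate_candidate_geometry(prompt: str) -> str:
    # Hypothetical placeholder for the model call; a real pipeline would
    # return printable geometry (e.g., a mesh) for simulation and testing.
    return f"<candidate geometry for: {prompt}>"

spec = DesignSpec(material="titanium alloy",
                  max_load_newtons=1200.0,
                  mounting_points=[(0, 0, 0), (80, 0, 0), (40, 60, 0)])
print(generate_candidate_geometry(spec.to_prompt()))
```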
NVIDIA Earth-2: The digital twin of the planet
One of the most ambitious examples is NVIDIA Earth-2, a high-resolution digital twin that combines AI, visualization, and physics-based simulation. It’s used to forecast weather and model climate events with unprecedented granularity, supporting better disaster preparedness and environmental planning.
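Earth-2's internals weren't covered in the session, but the core pattern behind AI weather models is an autoregressive surrogate: a trained network maps the atmospheric state at time t to the state at t+6h, so a forecast is just repeated application of the model. The sketch below fakes the trained model with a toy smoothing operator purely to show the rollout loop; nothing here is Earth-2's actual API.

```python
import numpy as np

def surrogate_step(state: np.ndarray) -> np.ndarray:
    # Stand-in for a trained AI weather model that advances the state by
    # one step (e.g., 6 hours). Here: a toy average over neighboring cells.
    return (state + np.roll(state, 1, axis=0) + np.roll(state, -1, axis=0)
            + np.roll(state, 1, axis=1) + np.roll(state, -1, axis=1)) / 5.0

# Initial condition: a coarse lat/lon grid of one atmospheric variable.
state = np.random.rand(181, 360)

# Autoregressive rollout: each call is a cheap inference pass, so a
# multi-day forecast costs milliseconds instead of solver-hours.
forecast = [state]
for _ in range(20):  # 20 steps of 6 h, roughly a 5-day forecast
    forecast.append(surrogate_step(forecast[-1]))
```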
Manufacturing, energy, and federal use cases
Across industries, we’re seeing:
- Predictive maintenance using sensor data and ML to anticipate equipment failure (see the sketch after this list)
- Supply chain optimization blending simulation with real-time AI models
- Power grid routing improved via AI-enhanced modeling of electrical loads
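The session named the use cases rather than the tooling, so here is one hedged illustration of the first bullet: an unsupervised model flags anomalous sensor readings so maintenance can be scheduled before a failure. The sensor data is synthetic and the model choice (scikit-learn's IsolationForest) is an assumption, not something specified in the talk.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Synthetic history: vibration (mm/s) and temperature (deg C) for a healthy pump.
healthy = np.column_stack([rng.normal(2.0, 0.3, 5000),
                           rng.normal(65.0, 2.0, 5000)])

# Fit an unsupervised anomaly detector on normal operation only.
detector = IsolationForest(contamination=0.01, random_state=0).fit(healthy)

# Score incoming readings; -1 means anomalous, a candidate for early maintenance.
incoming = np.array([[2.1, 66.0],    # looks normal
                     [5.8, 83.0]])   # looks like bearing wear
for reading, label in zip(incoming, detector.predict(incoming)):
    status = "ANOMALY: schedule inspection" if label == -1 else "ok"
    print(reading, status)
```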
What infrastructure is needed to scale this?
As workloads become more complex and data-intensive, infrastructure becomes a critical differentiator. Heibein outlined a clear path:
Three tiers of AI infrastructure
- Packaged applications: No-code tools for vertical-specific use cases (e.g., video surveillance, medical imaging)
- AI appliances: Turnkey systems such as the racks HPE co-develops with NVIDIA, pre-loaded with accelerators and a GenAI software stack
- AI factories: Bespoke, large-scale environments designed for massive AI and HPC workloads. These can span 16 to 16,000 GPUs, with advanced cooling and workload orchestration.
Liquid cooling is no longer optional
Modern GPUs can draw up to 1,000 watts. That kind of heat load makes traditional air cooling unviable at scale. Heibein pointed out that liquid cooling is becoming table stakes for any serious AI/HPC deployment.
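The arithmetic behind that claim is worth making explicit. The per-GPU figure comes from the talk; the GPU count per rack and the air-cooling ceiling below are rough, commonly cited assumptions, not numbers Heibein gave.

```python
# Back-of-the-envelope rack thermal load.
GPU_POWER_W = 1_000          # per-GPU draw cited in the session
GPUS_PER_RACK = 72           # assumption: a dense modern AI rack
OVERHEAD_FACTOR = 1.3        # assumption: CPUs, NICs, fans, power conversion

rack_load_kw = GPU_POWER_W * GPUS_PER_RACK * OVERHEAD_FACTOR / 1_000
AIR_COOLING_LIMIT_KW = 35    # rough practical ceiling for air-cooled racks

print(f"Estimated rack load: {rack_load_kw:.0f} kW")
print(f"Air cooling covers ~{AIR_COOLING_LIMIT_KW} kW -> "
      f"{rack_load_kw / AIR_COOLING_LIMIT_KW:.1f}x over the limit")
```

Even with conservative assumptions, a dense rack lands several times past what air can remove, which is exactly the case for direct liquid cooling.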
Private clouds with public cloud UX
HPE is bringing cloud-like orchestration and monitoring to on-prem deployments. Through its GreenLake platform, users get Kubernetes management, resource provisioning, and cost monitoring with the governance and data residency benefits of private infrastructure.
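GreenLake's own APIs weren't covered in the session, so as a generic illustration of "cloud UX on private infrastructure," the sketch below uses the official Kubernetes Python client to inspect workloads on an on-prem cluster, the kind of raw data a provisioning or cost-monitoring dashboard is built on.

```python
# Illustrative only; requires the official client: pip install kubernetes
from kubernetes import client, config

config.load_kube_config()    # reads the on-prem cluster's kubeconfig
v1 = client.CoreV1Api()

# Enumerate pods and their node placement across all namespaces.
for pod in v1.list_pod_for_all_namespaces().items:
    print(pod.metadata.namespace, pod.metadata.name, pod.spec.node_name)
```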
Questions answered in this session
- How can AI reduce simulation runtime in HPC workflows?
- What is retrieval-augmented generation (RAG) and how does it apply to enterprise data? (See the sketch below.)
- Why are digital twins like NVIDIA Earth-2 reshaping climate forecasting?
- What does it take to deploy a high-density, liquid-cooled AI cluster?
- How are agencies and enterprises integrating AI without scrapping their HPC investments?
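Since RAG comes up twice above without being unpacked, here is the pattern in miniature: embed the enterprise documents once, retrieve the passages nearest to the user's query, and prepend them to the prompt so the model answers from the organization's own data rather than from memory. `embed` and `call_llm` below are placeholders for whatever embedding model and LLM a deployment actually uses.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Placeholder: a real system would call an embedding model here.
    rng = np.random.default_rng(abs(hash(text)) % 2**32)
    return rng.standard_normal(384)

def call_llm(prompt: str) -> str:
    # Placeholder for the generation step.
    return f"<answer grounded in: {prompt[:60]}...>"

docs = ["Q3 maintenance logs for turbine 7...",
        "Procurement policy for GPU clusters...",
        "Incident report: cooling loop pressure drop..."]
doc_vecs = np.stack([embed(d) for d in docs])

def rag_answer(question: str, k: int = 2) -> str:
    q = embed(question)
    # Cosine similarity between the query and every document vector.
    sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
    context = "\n".join(docs[i] for i in np.argsort(sims)[::-1][:k])
    return call_llm(f"Context:\n{context}\n\nQuestion: {question}")

print(rag_answer("What happened to the cooling loop?"))
```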