Objectives
- Understand the principles and challenges of infrastructure management for hybrid HPC.
- Design hybrid HPC environments, considering hardware and software components.
- Implement hybrid HPC environments.
- Optimise the infrastructure for performance and efficiency in hybrid HPC configurations.
- Evaluate the trade-offs between different infrastructure options for hybrid HPC.
- Apply infrastructure management best practices to real hybrid HPC scenarios.
Program
- Introduction
  - Computing models: on-premises, cloud and HPC.
  - HPC architectures: clusters and supercomputers.
  - Cloud architectures: public, private and hybrid.
- Services
  - Task scheduling.
  - Infrastructure and cloud services relevant to hybrid HPC.
  - Virtualisation and containers in hybrid HPC: benefits and use cases.
  - Management of virtual machines and containers for HPC workloads.
- Optimisation and performance analysis
  - Hybrid HPC networking: interconnects, latency, bandwidth and optimisation.
  - Network topologies for efficient data movement in hybrid HPC.
  - Performance analysis in hybrid HPC: identifying bottlenecks and optimising resource use.
  - Load balancing and job queue management in hybrid HPC clusters.
Bibliography
- High-Performance Computing: Modern Systems and Practices. Thomas Sterling, Matthew Anderson, Maciej Brodowicz. Morgan Kaufmann, 2017.
- Distributed Storage Networks: Architecture, Protocols and Management. Thomas C. Jepsen. Wiley, 2013.
- Network Storage: Tools and Technologies for Storing Your Company’s Data, 1st ed. James O’Reilly. Morgan Kaufmann, 2016.