Overview
This course provides hands-on training for implementation of high-performance computing (HPC) networks with the Cisco® Server Fabric Switch (SFS) platform.
The course begins with an overview of server fabrics, high-performance computing networks, and the InfiniBand protocol. Through a series of structured lectures and hands-on labs, you will then learn how to install, cable, configure, and manage the InfiniBand fabric. You will also learn the basics of HPC performance tuning.
For those students who are implementing multi-fabric I/O (MFIO) with the SFS platform, the course also covers implementation of the SFS Ethernet and Fibre Channel gateway modules, and remote server boot over InfiniBand.
Prerequisites
The knowledge and skills required for delegates to attend this course are as follows:
- Ethernet and TCP/IP data networks
- Fibre Channel storage networks
- Linux and/or Microsoft Windows system administration
- Server hardware maintenance
- Familiarity with network management and troubleshooting
Content
Module 1: Fundamentals of High Performance Computing
Lesson 1: HPC Fundamentals
- Technical Computing Applications
- Technical Computing Challenges
- What Is a Cluster?
- Types of Clusters
- Linux HPC
- Windows HPC
Lesson 2: The Basics of HPC Cluster Design
- HPC Cluster Components
- HPC Network Requirements
- HPC Cluster Architecture Solutions
- I/O Consolidation with InfiniBand
Lesson 3: InfiniBand Fundamentals
- Why InfiniBand?
- Solution Architecture
- Server Fabric Switches
- Blade Switches
- Gateways
- Host Channel Adapters
- Physical Layer
- Subnet Manager
- Addressing
- Quality of Service
- Remote Direct Memory Access
- Upper-Layer Protocols
Module 2: Designing HPC Server Fabrics
Lesson 1: HPC Cluster General Design Considerations
- Design Overview
- Networking Considerations
- Data Access Considerations
- Server Considerations
- Physical Infrastructure Considerations
Lesson 2: HPC Cluster InfiniBand Design
- HPC Cluster InfiniBand Technical Design Considerations
- HPC Cluster InfiniBand Cable Plant Design
- HPC Cluster InfiniBand Topologies
Lesson 3: Windows Compute Cluster Server and HPC Cluster Design
- Windows CCS Technical Design Considerations
- Windows CCS Topologies
Lesson 4: Linux Clusters
- Linux Cluster Technical Design Considerations
Lesson 5: The Clustomizer Tool
Module 3: Implementing Server Fabrics with the Cisco SFS
Lesson 1: Cisco SFS Components
- The Cisco SFS 7000s
- InfiniBand Cabling Best Practices
Lesson 2: Switch Management
- Ports and Interfaces
- Command Line Interface
- Element Manager
- Chassis Manager
- SNMP
- Image Management
- User Management
Lesson 3: Subnet Management
- How Subnet Management Works
- High-Performance Subnet Manager
Lesson 4: Configuring InfiniBand Partitions
- What Are InfiniBand Partitions?
- How Do P_Keys Work?
- Creating Partitions
Lesson 5: Cluster Bringup
- Cluster Bringup Process Overview
- Cluster Bringup Process: Planning
- Installing and Cabling Fabric Components
- Configuring Ethernet Attributes
- Validating the Physical Installation
- Preparing to Bring Up Switches
- Bringing Up the Pod
- Troubleshooting Pods
- Connecting Pods and Core Switches
Lesson 6: Logging and Monitoring
- Port and Card Statistics
- Log Viewing
- HCA Self-Testing
Module 4: Building the Unified Compute Fabric
Lesson 1: Unified Fabric Overview
- Traditional Versus Unified Fabric Data Center Configuration
- Server Fabric Virtual I/O
Lesson 2: MFIO Components
Module 5: Configuring the Ethernet Gateway
Lesson 1: Ethernet Gateway Overview
- Virtual IP Interfaces
- Configuration Concepts
Lesson 2: SFS Ethernet Gateway Module Installation
- Installing and Removing an Ethernet Gateway
- Interpreting Ethernet Gateway LEDs
Lesson 3: Ethernet Gateway Configuration
- Dependencies
- Trunking
- Bridge Groups
- Redundancy Groups
- Bridging with Additional Partitions
Module 6: Configuring the Fibre Channel Gateway
Lesson 1: Fibre Channel Gateway Overview
- Fibre Channel Gateways as Virtual HBAs
- Fibre Channel Gateway Control
- Fibre Channel Gateway Redundancy
Lesson 2: Installing the Fibre Channel Gateway
- Installing and Removing a Fibre Channel Gateway
- Interpreting Fibre Channel Gateway LEDs
- Recovering from an FC Gateway Failure or Disconnect
Lesson 3: Configure and Verify Fibre Channel over InfiniBand
- Configuration Process
- Configuring Global Attributes
- Virtual WWNN and WWPN Generation
- Fibre Channel Switch Zoning
- LUN Discovery
- Creating Storage Association
- Editing ITL
Lesson 4: Storage Setup Process
- Linux Storage Setup
- Windows Storage Setup
- Storage Monitoring
- Recovering from an FC Gateway Failure or Disconnect
Appendix A: Boot over InfiniBand
Lesson 1: Boot over InfiniBand Overview
- How Boot over InfiniBand Works
- Boot over InfiniBand SAN
- Boot over InfiniBand PXE
Lesson 2: Configure Remote Server Boot over InfiniBand on Linux
- Configuring a Fibre Channel Connection
- Installing an Image onto Fibre Channel Storage
- Booting an Existing Image
- Booting from PXE
Objectives
At the end of this course, delegates will be able to:
- Describe the basic architecture, applications, and components of server fabrics, InfiniBand, and HPC
- Install HCAs on hosts, and rack and cable the fabric components
- Configure and verify the InfiniBand fabric
- Use switch management interfaces and identify recommended management practices for images and users
- Describe the subnet manager and IB addressing
- Monitor port and card statistics and view logs
- Configure InfiniBand partitions
- Describe what comprises a unified fabric and how it functions
- Install and configure the Ethernet and Fibre Channel Gateways
- Configure remote server boot over InfiniBand
Target Audience
This course is designed for field engineers who are implementing the Cisco Server Fabric Switch in HPC environments.