The ThinkSystem SD650 V3 Neptune DWC server is the next-generation high-performance server based on the fifth generation Lenovo Neptune™ direct water cooling platform.
With two 5th Gen Intel Xeon Scalable or Intel Xeon Max Series CPUs, the ThinkSystem SD650 V3 server combines the latest Intel processors and Lenovo's market-leading water-cooling solution, which results in extreme performance in extremely dense packaging.
This product guide provides essential pre-sales information to understand the SD650 V3 server, its key features and specifications, components and options, and configuration guidelines. This guide is intended for technical specialists, sales specialists, sales engineers, IT architects, and other IT professionals who want to learn more about the SD650 V3 and consider its use in IT solutions.
The Lenovo ThinkSystem SD650 V3 dual-node server tray is designed for High Performance Computing (HPC), large-scale cloud, heavy simulations, and modeling. It implements Lenovo Neptune™ Direct Water Cooling (DWC) technology to optimally support workloads from technical computing, grid deployments, and analytics, and is ideally suited for fields such as research, life sciences, energy, simulation, and engineering.
The unique design of ThinkSystem SD650 V3 provides the optimal balance of serviceability, performance, and efficiency. By using a standard rack with the ThinkSystem DW612S enclosure equipped with patented stainless steel drip-less quick connectors, the SD650 V3 provides easy serviceability and extreme density that is well suited for clusters ranging from small enterprises to the world's largest supercomputers.
Lenovo Neptune™ direct liquid cooling does not rely on risky plastic retrofitting. Instead, it uses custom-designed copper water loops, so you have peace of mind implementing a platform with liquid cooling at the core of the design.
Compared to other technology, the SD650 V3 direct water cooling:
- Reduces data center energy costs by up to 40%
- Increases system performance by up to 10%
- Delivers up to 100% heat removal efficiency into water (depending on environment)
- Creates a quieter data center with its fan-less design
- Enables data center growth without adding computer room air conditioning
Lenovo’s direct water-cooled solutions are factory-integrated and are re-tested at the rack-level to ensure that a rack can be directly deployed at the customer site. This careful and consistent quality testing has been developed as a result of over a decade of experience designing and deploying DWC solutions to the very highest standards.
Scalability and performance
The ThinkSystem SD650 V3 server tray and DW612S enclosure offer the following features to boost performance, improve scalability, and reduce costs:
- Each SD650 V3 node supports two high-performance Intel Xeon Scalable processors, 16x TruDDR5 DIMMs, up to two PCIe 5.0 slots for high-speed I/O, and up to two drive bays, in a half-wide 1U form factor.
- Up to 12x SD650 V3 nodes are installed in 6x trays in the DW612S enclosure, occupying only 6U of rack space. It is a highly dense, scalable, and price-optimized offering.
- Each node supports one or two 5th Gen Intel Xeon Scalable processors:
  - Up to 64 cores and 128 threads
  - Core speeds of up to 3.9 GHz
  - TDP ratings of up to 385 W
- Alternatively, each node supports one or two 4th Gen Intel Xeon Scalable processors:
  - Up to 60 cores and 120 threads
  - Core speeds of up to 3.7 GHz
  - TDP ratings of up to 350 W
- Alternatively, each node supports one or two Intel Xeon Max Series processors:
  - Integrated 64GB High Bandwidth Memory (HBM)
  - Up to 56 cores and 112 threads
  - Core speeds of up to 2.7 GHz
  - TDP ratings of up to 350 W
- Support for DDR5 memory DIMMs to maximize the performance of the memory subsystem. Each node supports the following:
  - Up to 16 DDR5 memory DIMMs, 8 DIMMs per processor
  - 8 memory channels per processor (1 DIMM per channel)
  - Supports 1 DIMM per channel operating at 5600 MHz (5th Gen processors) or 4800 MHz (4th Gen processors)
  - Using 128GB 3DS RDIMMs, the server supports up to 2TB of system memory
- Each node supports combinations of PCIe 5.0 x16 slots and SSDs, as follows:
  - One PCIe 5.0 x16 slot and either two 7mm SSDs or two E3.S EDSFF SSDs
  - One PCIe 5.0 x16 slot and one 15mm SSD
  - Two PCIe 5.0 x16 slots without SSDs (M.2 still supported)
- The server is Compute Express Link (CXL) v1.1 Ready. With CXL 1.1 for next-generation workloads, you can reduce compute latency in the data center and lower TCO. CXL is a protocol that runs across the standard PCIe physical layer and can support both standard PCIe devices as well as CXL devices on the same link.
- Drives can be either SATA or high-performance NVMe drives, to maximize I/O performance in terms of throughput, bandwidth, and latency.
- Supports a PCIe 4.0 x4 high-speed M.2 NVMe drive installed in an adapter for convenient operating system boot and internal storage functions.
- The node includes one Gigabit and two 25 Gb Ethernet onboard ports for cost-effective networking. High-speed networking can be added through the included PCIe slots.
- The node offers PCI Express 5.0 I/O expansion capabilities that double the theoretical maximum bandwidth of PCIe 4.0 (32 GT/s per lane in each direction for PCIe 5.0, compared to 16 GT/s with PCIe 4.0). A PCIe 5.0 x16 slot provides 128 GB/s of bidirectional bandwidth (approximately 64 GB/s in each direction), enough to support a 400GbE network connection.
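The capacity and bandwidth figures in the list above follow from simple arithmetic. The Python sketch below reproduces them from the per-node numbers quoted in this guide. The function names are illustrative and are not part of any Lenovo tool, and the PCIe figure is the raw signaling rate, which ignores encoding and protocol overhead.

```python
"""Back-of-the-envelope figures for one SD650 V3 node and one DW612S enclosure."""

# Values quoted in the feature list above.
DIMMS_PER_PROCESSOR = 8
PROCESSORS_PER_NODE = 2
LARGEST_DIMM_GB = 128
NODES_PER_ENCLOSURE = 12
MAX_CORES_PER_PROCESSOR = 64  # 5th Gen Intel Xeon Scalable


def node_memory_gb(dimm_gb=LARGEST_DIMM_GB):
    """Maximum memory per node with every DIMM slot populated."""
    return DIMMS_PER_PROCESSOR * PROCESSORS_PER_NODE * dimm_gb


def enclosure_cores(cores_per_processor=MAX_CORES_PER_PROCESSOR):
    """Maximum core count in a fully populated 6U enclosure."""
    return NODES_PER_ENCLOSURE * PROCESSORS_PER_NODE * cores_per_processor


def pcie_raw_gbytes_per_s(gt_per_s, lanes):
    """Raw one-direction rate in GB/s (one bit per transfer per lane, no overhead)."""
    return gt_per_s * lanes / 8


if __name__ == "__main__":
    print("Memory per node (GB):", node_memory_gb())
    print("Cores per enclosure:", enclosure_cores())
    one_way = pcie_raw_gbytes_per_s(32, 16)
    print("PCIe 5.0 x16, one direction (GB/s):", one_way)
    print("PCIe 5.0 x16, both directions (GB/s):", 2 * one_way)
    print("PCIe 4.0 x16, one direction (GB/s):", pcie_raw_gbytes_per_s(16, 16))
    # A 400GbE link carries 400 Gb/s, i.e. 50 GB/s, in each direction.
    print("400GbE fits in one direction:", 400 / 8 <= one_way)
```

With 128GB DIMMs the node total is 2048 GB, which matches the 2TB maximum stated above, and the 400GbE requirement of 50 GB/s per direction fits within the raw 64 GB/s that a PCIe 5.0 x16 slot carries in each direction.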
Energy efficiency
The direct water cooled solution offers the following energy efficiency features to save energy, reduce operational costs, increase energy availability, and contribute to a green environment:
- Water cooling eliminates the power that is drawn by cooling fans in the enclosure and dramatically reduces the required air movement in the server room, which also saves power. In combination with an Energy Aware Runtime environment, savings of as much as 40% are possible in the data center due to the reduced need for air conditioning.
- Water chillers may not be required with a direct water cooled solution. Chillers are a major expense for most geographies and can be reduced or even eliminated because the water temperature can now be 45°C instead of 18°C in an air-cooled environment.
- With the new water-cooled power supplies, essentially 100% system heat recovery is possible, depending on the water and ambient temperatures chosen. At a water temperature of 45°C and a room temperature of 30°C, heat recovery is typically around 95%, with the remainder lost as heat radiated from surfaces. The heat energy absorbed may be reused for heating buildings in the winter, or for generating cold through adsorption chillers, for further operating expense savings.
- The processors and other microelectronics run at lower temperatures because they are water cooled, which uses less power and allows for higher performance through Turbo Mode.
- The processors run at uniform temperatures because they are cooled in parallel loops, which avoids thermal jitter and provides higher and more reliable performance at the same power.
- Low-voltage 1.1V DDR5 memory offers energy savings compared to 1.2V DDR4 DIMMs, an approximately 20% decrease in power consumption.
- 80 Plus Titanium power supplies ensure energy efficiency.
- There are power monitoring and management capabilities through the System Management Module in the DW612S enclosure.
- A Lenovo power/energy meter based on the TI INA226 measures DC power for the CPU with better than 97% accuracy at a 100 Hz sampling frequency and reports it to the XCC. The readings can be accessed both in-band and out-of-band using IPMI raw commands.
- Optional Lenovo XClarity Energy Manager provides advanced data center power notification, analysis, and policy-based management to help achieve lower heat output and reduced cooling needs.
- Optional Energy Aware Runtime provides sophisticated power monitoring and energy optimization at the job level during application runtime without negatively impacting performance.
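Two of the figures in the list above can be sanity-checked with first-order arithmetic. The sketch below is illustrative only: the function names are hypothetical, the 10 kW load is an arbitrary example value rather than a measured figure, and the voltage model assumes that dynamic power scales with the square of the supply voltage, which ignores every other difference between DDR4 and DDR5.

```python
def heat_split(load_watts, recovery_fraction):
    """Return (heat captured by the water loop, heat released to room air) in watts."""
    if not 0.0 <= recovery_fraction <= 1.0:
        raise ValueError("recovery_fraction must be between 0 and 1")
    to_water = load_watts * recovery_fraction
    return to_water, load_watts - to_water


def voltage_scaling_saving(v_old, v_new):
    """Fractional dynamic-power saving under the first-order assumption P ~ V^2."""
    return 1.0 - (v_new / v_old) ** 2


if __name__ == "__main__":
    # Hypothetical 10 kW load at the 95% recovery quoted for 45°C water, 30°C room.
    water, air = heat_split(10_000, 0.95)
    print(f"to water: {water:.0f} W, to room air: {air:.0f} W")
    # DDR4 at 1.2 V versus DDR5 at 1.1 V, voltage effect only.
    print(f"voltage-only saving: {voltage_scaling_saving(1.2, 1.1):.1%}")
```

Under the stated assumption, the drop from 1.2 V to 1.1 V alone accounts for a saving of roughly 16%, which is in the same range as the approximately 20% quoted above.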
Manageability and security
The following powerful systems management features simplify local and remote management of the SD650 V3 server:
- The server includes an XClarity Controller 2 (XCC2) to monitor server availability. An optional upgrade to XCC Platinum provides remote control (keyboard video mouse) functions, support for the mounting of remote media files, FIPS 140-3 security, enhanced NIST 800-193 support, boot capture, power capping, and other management and security features.
- Lenovo XClarity Administrator offers comprehensive hardware management tools that help to increase uptime, reduce costs, and improve productivity through advanced server management capabilities.
- Lenovo XClarity Provisioning Manager, based in UEFI and accessible from F1 during boot, provides system inventory information, graphical UEFI Setup, platform update function, RAID Setup wizard, operating system installation function, and diagnostic functions.
- Support for Lenovo XClarity Energy Manager, which captures real-time power and temperature data from the server and provides automated controls to lower energy costs.
- Support for industry-standard management protocols: IPMI 2.0, SNMP 3.0, Redfish REST API, and serial console via IPMI.
- The SD650 V3 is enabled with Lenovo HPC & AI Software Stack, so you can support multiple users and scale within a single cluster environment.
- Lenovo HPC & AI Software Stack provides HPC customers with a fully tested and supported open-source software stack that enables administrators and users to make the most effective and environmentally sustainable use of Lenovo supercomputing capabilities.
- Our Confluent management system and Lenovo Intelligent Computing Orchestration (LiCO) web portal provide an interface designed to abstract users from the complexity of HPC cluster orchestration and AI workload management, making open-source HPC software consumable for every customer.
- LiCO web portal provides workflows for both AI and HPC, and supports multiple AI frameworks, allowing you to leverage a single cluster for diverse workload requirements.
- Integrated Trusted Platform Module (TPM) 2.0 support enables advanced cryptographic functionality, such as digital signatures and remote attestation.
- Supports Secure Boot to ensure only a digitally signed operating system can be used.
- Industry-standard Advanced Encryption Standard (AES) NI support for faster, stronger encryption.
- With the System Management Module (SMM) installed in the enclosure, only one Ethernet connection is needed to provide remote systems management functions for all SD650 V3 servers and the enclosure.
- The SMM has two Ethernet ports, which allow a single Ethernet connection to be daisy-chained across 7 enclosures and 84 servers, thereby significantly reducing the number of Ethernet switch ports needed to manage an entire rack of SD650 V3 servers and DW612S enclosures.
- The DW612S enclosure includes drip sensors that monitor the inlet and outlet manifold quick connect couplers; leaks are reported via the SMM.
- The server supports Lenovo XClarity suite software with Lenovo XClarity Administrator, Lenovo XClarity Provisioning Manager, and XClarity Energy Manager. They are described further in the Software section of this product guide.
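As an illustration of the standards-based management interfaces listed above, the Python sketch below parses a Redfish Power resource and totals the reported power draw. It assumes that the controller exposes the DMTF Power schema with PowerControl entries carrying PowerConsumedWatts. The sample payload is invented for illustration and is not output captured from an SD650 V3, and retrieving the document over HTTPS from the controller is left out. The helper managed_servers is a hypothetical name that simply restates the daisy-chain arithmetic.

```python
import json

# Hypothetical payload shaped like a DMTF Redfish Power resource (values invented).
SAMPLE_POWER_RESOURCE = """
{
  "Id": "Power",
  "PowerControl": [
    {"MemberId": "0", "PowerConsumedWatts": 412.0},
    {"MemberId": "1", "PowerConsumedWatts": 88.5},
    {"MemberId": "2"}
  ]
}
"""


def consumed_watts(power_resource):
    """Sum PowerConsumedWatts over all PowerControl entries that report a value."""
    total = 0.0
    for entry in power_resource.get("PowerControl", []):
        watts = entry.get("PowerConsumedWatts")
        if watts is not None:  # the property may be absent or null
            total += float(watts)
    return total


def managed_servers(enclosures, nodes_per_enclosure=12):
    """Servers reachable through one daisy-chained SMM management connection."""
    return enclosures * nodes_per_enclosure


if __name__ == "__main__":
    resource = json.loads(SAMPLE_POWER_RESOURCE)
    print("Total reported power (W):", consumed_watts(resource))
    print("Servers behind one uplink:", managed_servers(7))
```

Seven enclosures of twelve nodes each give the 84 servers quoted above for a single daisy-chained management connection.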
Availability and serviceability
The SD650 V3 node and DW612S enclosure provide the following features to simplify serviceability and increase system uptime:
- Designed to run 24 hours a day, 7 days a week
- Depending on the configuration and node population, the DW612S enclosure supports N+1 power policies for its power supplies, which means greater system uptime.
- All supported power supplies are hot-swappable, including the water-cooled power supplies.
- Toolless cover removal on the trays provides easy access to upgrades and serviceable parts, such as adapters and memory.
- The server uses ECC memory and supports memory RAS features including Single Device Data Correction (SDDC, also known as Chipkill), Patrol/Demand Scrubbing, Bounded Fault, DRAM Address Command Parity with Replay, DRAM Uncorrected ECC Error Retry, On-die ECC, ECC Error Check and Scrub (ECS), and Post Package Repair.
- Proactive Platform Alerts (including PFA and SMART alerts): Processors, voltage regulators, memory, internal storage (HDDs and SSDs, NVMe SSDs, M.2 storage), fans, power supplies, and server ambient and subcomponent temperatures. Alerts can be surfaced through the XClarity Controller to managers such as Lenovo XClarity Administrator and other standards-based management applications. These proactive alerts let you take appropriate actions in advance of possible failure, thereby increasing server uptime and application availability.
- The XCC offers optional remote management capability and can enable remote keyboard, video, and mouse (KVM) control and remote media for the node.
- Built-in diagnostics in UEFI, using Lenovo XClarity Provisioning Manager, speed up troubleshooting tasks to reduce service time.
- Lenovo XClarity Provisioning Manager supports diagnostics and can save service data to a USB key drive or a remote CIFS share folder to aid troubleshooting and reduce service time.
- Auto restart in the event of a momentary loss of AC power (based on power policy setting in the XClarity Controller service processor)
- Virtual reseat is a supported feature of the System Management Module (SMM2). From a remote location, it simulates physically removing the node from AC power and reconnecting it.
- There is a three-year customer replaceable unit and onsite limited warranty, with next business day 9x5 coverage. Optional warranty upgrades and extensions are available.
- With water cooling, system fans are not required. This results in significantly reduced noise levels on the data center floor, a significant benefit to personnel having to work on site.