HIGH PERFORMANCE COMPUTING CLUSTER

Cluster computing is, at its simplest, two or more computers networked together to solve problems. This idea should not be confused with the more general client-server model of computing, because the idea behind clusters is quite different.

A cluster joins the computational power of its compute nodes to provide greater combined computational power. Rather than a single client making requests of one or more servers, as in the client-server model, cluster computing uses multiple machines to provide a more powerful computing environment, perhaps presented through a single system image.

In its simplest structure, an HPC cluster is intended to use parallel computing to apply more processing power to the solution of a problem. There are numerous examples of scientific computing that use many low-cost processors in parallel to perform huge numbers of operations; this is referred to as parallel computing. A high-performance cluster, as seen in Figure 1, is typically made up of nodes (also called blades). HPC clusters will usually have a large number of nodes and, in general, most of these nodes will be configured identically. Though from the outside the cluster may look like a single system, the internal workings that make this happen can be quite complex. The idea is that the individual tasks that make up a parallel application should run equally well on whichever node they are dispatched to. However, some nodes in a cluster often have physical and logical differences.
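The core idea of parallel computing described above, splitting one large workload into independent tasks that run on many processors at once, can be sketched in a few lines. The example below is only an illustration: it uses Python's standard multiprocessing module to spread work across processes on a single machine, not across the nodes of an actual cluster.

```python
from multiprocessing import Pool

def partial_sum(bounds):
    """Compute the sum of squares over one chunk of the range."""
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))

def parallel_sum_of_squares(n, workers=4):
    """Split [0, n) into one chunk per worker and sum the chunks in parallel."""
    step = n // workers
    chunks = [(w * step, (w + 1) * step if w < workers - 1 else n)
              for w in range(workers)]
    with Pool(workers) as pool:
        # Each chunk is dispatched to a worker process; results are combined.
        return sum(pool.map(partial_sum, chunks))

if __name__ == "__main__":
    # Same answer as a serial loop, but the work is spread over 4 processes.
    print(parallel_sum_of_squares(1_000_000))
```

On a real cluster the same decomposition is done across nodes, typically with a message-passing library such as MPI rather than a single machine's process pool.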

Master node

Clusters are complex environments, and management of each individual component is essential. The master node provides many administrative functions, including monitoring the status of individual nodes and issuing management commands to individual nodes, either to correct problems or to perform administrative actions such as power on/off. The importance of cluster management cannot be overstated: it is imperative when coordinating the activities of a large number of systems.
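As an illustration of the monitoring role described above, the sketch below polls a set of compute nodes and reports which ones need attention. The node names and the check_node function are hypothetical placeholders; a real master node would probe each node over the management network (for example via IPMI or SSH) rather than consult a local table.

```python
# Hypothetical status table standing in for real network probes.
NODE_STATUS = {
    "compute01": "up",
    "compute02": "up",
    "compute03": "down",   # e.g. powered off or unreachable
}

def check_node(name):
    """Placeholder probe: a real master node would contact the node
    over the management network (IPMI, SSH, SNMP, ...)."""
    return NODE_STATUS.get(name, "unknown")

def survey(nodes):
    """Return the list of nodes that need administrator attention."""
    return [n for n in nodes if check_node(n) != "up"]

if __name__ == "__main__":
    print(survey(sorted(NODE_STATUS)))   # nodes needing attention
```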

Compute nodes

A compute node is where the actual computation is performed. Most of the nodes in a cluster are ordinarily compute nodes. To serve as a general-purpose resource, a compute node can execute one or more tasks, as directed by the scheduling system.
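The relationship between tasks and the scheduling system can be pictured with a minimal round-robin dispatcher. This is a simplified sketch only, not the scheduler actually used on the cluster; production clusters rely on a batch system that also accounts for node load, memory, and queue policy.

```python
from itertools import cycle

def dispatch(tasks, nodes):
    """Assign each task to a compute node in round-robin order.
    Returns a mapping of node name -> list of tasks placed on it."""
    placement = {n: [] for n in nodes}
    node_cycle = cycle(nodes)          # endlessly repeats the node list
    for task in tasks:
        placement[next(node_cycle)].append(task)
    return placement

if __name__ == "__main__":
    jobs = [f"job{i}" for i in range(5)]
    print(dispatch(jobs, ["compute01", "compute02"]))
```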

Storage

For applications that run on a cluster, compute nodes must have fast, dependable, and concurrent access to a storage system. Storage devices are either attached directly to the nodes or connected to a centralized storage node that is responsible for servicing storage requests. The HPCC in the Central Computer Centre is designed with 13 compute nodes, one of which is a Knights Landing node. Details are as follows.






Master / Head Node


  • Make: Lenovo System X 3650 M5
  • Form Factor: 2U rack-mountable server
  • Processor: 2x Intel Xeon E5-2640 v3 (8C, 2.6GHz, 20MB, 1866MHz, 90W)
  • Installed Memory: 8x 8GB TruDDR4 (1Rx4, 1.2V) PC4-17000 CL15 2133MHz LP RDIMM
  • Accelerator: 1x NVIDIA K20c, 5GB, 2496 cores
  • HDD: 8x 1TB 7.2K 6Gbps NL SATA 3.5" G2HS HDD
  • Operating System: 64-bit CentOS 6.7
  • Network Interface: Broadcom NetXtreme BCM5719 Gigabit Ethernet PCIe
  • Advanced Interface: Mellanox Technologies MT27520 Family (FDR)
  • Storage Interface: QLogic 8Gb FC dual-port HBA with Fibre Channel
  • System Management: Lenovo Integrated Management Module (IMM)

GPU Compute Nodes


  • Make: Lenovo System X 3650 M5 (6 Nos.)
  • Form Factor: 2U rack-mountable server
  • Processor: 2x Intel Xeon E5-2640 v3 (8C, 2.6GHz, 20MB, 1866MHz, 90W)
  • Installed Memory: 8x 8GB TruDDR4 (1Rx4, 1.2V) PC4-17000 CL15 2133MHz LP RDIMM
  • Accelerator: 2x NVIDIA K20c, 5GB, 2496 cores
  • HDD: 1x 1TB 7.2K 6Gbps NL SATA 3.5" G2HS HDD
  • Operating System: 64-bit CentOS 6.7
  • Network Interface: Broadcom NetXtreme BCM5719 Gigabit Ethernet PCIe
  • Advanced Interface: Mellanox Technologies MT27520 Family (FDR)
  • System Management: Lenovo Integrated Management Module (IMM)

CPU Compute Node (Old)


  • Make: Lenovo System X 3650 M5 (6 Nos.)
  • Form Factor: 2U rack-mountable server
  • Processor: 2x Intel Xeon E5-2640 v3 (8C, 2.6GHz, 20MB, 1866MHz, 90W)
  • Installed Memory: 8x 8GB TruDDR4 (1Rx4, 1.2V) PC4-17000 CL15 2133MHz LP RDIMM
  • HDD: 1x 1TB 7.2K 6Gbps NL SATA 3.5" G2HS HDD
  • Operating System: 64-bit CentOS 6.7
  • Network Interface: Broadcom NetXtreme BCM5719 Gigabit Ethernet PCIe
  • Advanced Interface: Mellanox Technologies MT27520 Family (FDR)
  • System Management: Lenovo Integrated Management Module (IMM)

CPU Compute Node (New)


  • Make: Lenovo System X 3550 M5
  • CPU: Dual 64-bit x86 processors, Intel Broadwell E5-2640 v4
  • RAM: 8x 8GB TruDDR4, 2133MHz, scalable up to 1.5TB
  • CPU Cache: 25MB L3 cache
  • InfiniBand Ports: FDR port with 3m copper cable
  • Hard Disk Storage: 500GB SATA disk, scalable up to 8 disks
  • Remote Management: Remote management with IPMI 2.0
  • Trusted Platform Module: TPM 1.2
  • Power Supply: Redundant, hot-swap, energy-efficient (90%) power supply
  • Fans: Redundant and hot-swap
  • System Management: Advanced failure analysis support for CPU, memory, HDD, power supplies, and fans

Knights Landing Node (KNL node)


  • Processor: 1x Intel Xeon Phi 7250 (68 cores, 272 threads)
  • RAM: 64GB (4x 16GB) DDR4, 2400MHz ECC memory
  • Network: Two 1GbE network ports with PXE boot capability
  • Interconnect: Single FDR port with cable
  • Installed Memory: 8x 8GB TruDDR4 (1Rx4, 1.2V) PC4-17000 CL15 2133MHz LP RDIMM
  • HDD: 480GB
  • Remote Management: Remote management with IPMI 2.0
  • OS Support: Fully certified/compatible with the latest RHEL 7.x
  • Power Supply: 80 Plus Platinum or better certified power supply with IEC 14 type power cables
  • Form Factor: Half-width 1U or equivalent (e.g. 4 servers in 2U), rack-mountable

Storage


  • Make: IBM Storwize V3700 storage
  • Processor: 2x Intel Xeon Processor
  • Installed Memory: 2x 4GB PC3L-10600E
  • HDD: 16x 1TB 7,200rpm 6Gb SAS NL 2.5" HDD (12TB usable after RAID 5; Fibre Channel storage)
  • Interface: 8Gb FC 4-port host interface card with FC

Infiniband Switch


  • Model: Mellanox SX6036
  • 36 FDR (56Gb/s) ports in a 1U switch
  • 4.032Tb/s switching capacity
  • FDR/FDR10 support for Forward Error Correction (FEC)
  • Remote Management Tool CLI, SNMP
  • Port Mirroring

Ethernet Switch


  • Model: Lenovo RackSwitch G7028
  • Performance: 128Gbps switching throughput (full duplex); latency of 3.3 microseconds; 96Mpps
  • Interface Options: 24x 1GbE (24 RJ-45), 4x 10GbE SFP+

Central Computer Centre, National Institute of Technology Calicut, 673601

maincc@nitc.ac.in

0495 228 6852, 0495 228 6854