Building the EdenN Cluster

A photo report from the installation of our new computers.

January 01, 2021
Krzysztof Kaczmarski
Faculty of Mathematics and Information Science,
Warsaw University of Technology

How it started

In September 2019, Prof. Dariusz Plewczyński from the Laboratory of Bioinformatics and Computational Genomics and I jointly applied to the Ministry of Science and Higher Education for a grant to build research equipment, titled “Artificial intelligence system for interpreting the DNA sequence of the human genome”. The application was approved in 2020 and, despite the Covid pandemic, we received funding of 2 million PLN at the end of 2020. The dean of the Faculty of Mathematics and Information Science, Prof. Wojciech Domitrz, decided to extend the initial application, and we opened a public tender in November 2020. The final purchase budget was close to 4 million PLN.

Arrival

The hardware delivery was delayed by the Covid pandemic, but in February 2021 the boxes finally arrived. They were transported from the PNY factory in France directly to our building on the Warsaw University of Technology Main Campus.

Picture1

Picture2

We decided to place the new cluster in two empty racks. DGX computers require a lot of power, so new 400 V connections were installed, together with professional PDUs (power distribution units) with full network, humidity, and temperature monitoring.

Picture3

Installation

The four servers after mounting them in the racks.

Picture4

Filling the DDN storage with HDDs. The total of 800 TB was only the initial capacity we could afford at the time. Two years later, it has already doubled.

Picture5

Installing additional memory and a faster Ethernet connection.

Picture6

The final view of the cluster without cables.

Picture7

Complete cabling:

  • White - DGX power supply
  • Blue - storage optical fiber interconnection
  • Black with blue marking - InfiniBand 200 Gbps network

Picture8

Initial Tests

The system ran the initial tests smoothly. It took us half a year to finish the configuration and get all the services and monitoring running.

Picture9

Final configuration

As of 2021 the cluster contained:

  • 512 CPU Cores, 1 024 CPU Threads
  • 5 TB CPU RAM
  • 221 184 GPU Cores
  • 1 280 GB GPU RAM
  • 768 TB Long Term Storage (HDD)
  • 256 TB Ultra Fast Cache (SSD)
  • 56 TB Local Storage (SSD)

It all weighed around 504 kg.
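For readers who want a feel for the scale of a single machine, the totals in the list above can be broken down per server and per GPU. The short Python sketch below is only an illustration. It assumes that the compute resources are spread evenly over the four servers shown in the photos, and that each server holds 8 GPUs. Neither assumption is stated in this report, and the names `TOTALS`, `per_server` and `per_gpu` are made up for this example.

```python
# Totals taken from the 2021 configuration list.
TOTALS = {
    "cpu_cores": 512,
    "cpu_threads": 1024,
    "gpu_cores": 221_184,
    "gpu_ram_gb": 1280,
}

SERVERS = 4          # the four servers mounted in the racks
GPUS_PER_SERVER = 8  # ASSUMPTION: not stated in the report


def per_server(totals, servers=SERVERS):
    """Split every total evenly across the servers (assumes identical servers)."""
    return {name: value / servers for name, value in totals.items()}


def per_gpu(totals, servers=SERVERS, gpus_per_server=GPUS_PER_SERVER):
    """Split the GPU-related totals evenly across all GPUs in the cluster."""
    gpu_count = servers * gpus_per_server
    return {name: totals[name] / gpu_count for name in ("gpu_cores", "gpu_ram_gb")}


if __name__ == "__main__":
    print("per server:", per_server(TOTALS))
    print("per GPU:   ", per_gpu(TOTALS))
```

Under these assumptions each server would have 128 CPU cores, and each GPU 6 912 cores and 40 GB of memory.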