How it started
In September 2019, Prof. Dariusz Plewczyński from the Laboratory of Bioinformatics and Computational Genomics and I applied together to the Ministry of Science and Higher Education for a grant to build research equipment: “Artificial intelligence system for interpreting the DNA sequence of the human genome”. The application was approved in 2020 and, despite the Covid pandemic, we received funding of 2 million PLN at the end of 2020. The dean of the Faculty of Mathematics and Information Science, Prof. Wojciech Domitrz, decided to extend the initial application, and we opened a public tender in November 2020. The final purchase budget was close to 4 million PLN.
Arrival
The hardware delivery was delayed by the Covid pandemic, but in February 2021 the boxes finally arrived. They were transported from the PNY factory in France directly to our building on the Warsaw University of Technology Main Campus.
We decided to install the new cluster in two empty racks. DGX computers draw a lot of power, so new 400 V connections were installed, together with professional PDUs with full network, humidity and temperature monitoring.
Installation
The four servers after mounting them in the racks.
Filling the DDN storage with HDDs. The overall 800 TB was just the beginning: the initial capacity we could afford at the time. After two years, it has already doubled.
Mounting additional memory and a faster Ethernet connection.
The final view of the cluster without cables.
Complete cabling.
- White: DGX power supply
- Blue: storage optical fiber interconnection
- Black with blue marking: InfiniBand 200 Gbps network
Initial Tests
The system ran the initial tests smoothly. It took us half a year to finish the configuration and get all the services and monitoring running.
Final configuration
As of 2021 the cluster contained:
- 512 CPU Cores, 1 024 CPU Threads
- 5 TB CPU RAM
- 221 184 GPU Cores
- 1 280 GB GPU RAM
- 768 TB Long Term Storage (HDD)
- 256 TB Ultra Fast Cache (SSD)
- 56 TB Local Storage (SSD)
It all weighed around 504 kg.
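As a rough sanity check, the CPU and GPU totals in the list can be reproduced from per-node figures. The sketch below is only an illustration under stated assumptions: the post mentions four DGX servers but does not name the exact model, so the per-node values (8 GPUs per node, 6 912 GPU cores and 40 GB of memory per GPU, 128 CPU cores with 2 threads each) are assumed here because they happen to match the published totals. The names `NodeSpec` and `cluster_totals` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Per-node figures. All values are ASSUMPTIONS, not taken from the post."""
    cpu_cores: int = 128          # assumed: CPU cores per node
    threads_per_core: int = 2     # assumed: two hardware threads per core
    gpus: int = 8                 # assumed: GPUs per node
    cores_per_gpu: int = 6912     # assumed: GPU cores per GPU
    gpu_ram_gb: int = 40          # assumed: memory per GPU in GB


def cluster_totals(nodes: int, spec: NodeSpec) -> dict:
    """Multiply the per-node figures by the number of nodes."""
    return {
        "cpu_cores": nodes * spec.cpu_cores,
        "cpu_threads": nodes * spec.cpu_cores * spec.threads_per_core,
        "gpu_cores": nodes * spec.gpus * spec.cores_per_gpu,
        "gpu_ram_gb": nodes * spec.gpus * spec.gpu_ram_gb,
    }


# The post mentions four servers; everything else comes from the assumed NodeSpec.
totals = cluster_totals(nodes=4, spec=NodeSpec())
for name, value in totals.items():
    print(f"{name}: {value}")
```

With these assumed values the computed totals agree with the CPU core, CPU thread, GPU core and GPU RAM figures in the list; the storage figures depend on the DDN configuration and are not modeled here.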