To respond quickly to our collaborators' modeling requests, the simulation codes we develop and use are designed to run on parallel architectures.
The framework is organized as follows.

Methods

Community code development

We centralize all code development on Git hosting platforms such as Bitbucket.org.

Local resources

Shared-memory computers (< 1,000,000 core-hours per year)

| Machine | CPU | GPU | RAM | HDD | Equiv. core-hours per year (CPU + GPU) |
|---|---|---|---|---|---|
| Office 201, station 1 | 2×8 cores, Intel Xeon E5-2650 v2m @ 2.6 GHz (TDP 95 W) | Nvidia, 96 CUDA cores @ 700 MHz, 2 GB, CUDA capability 2.1 | 32 GB | 886 GB + 466 GB | 140,000 + 840,960 |
| Office 206, station 1 | 2×6 cores, Intel Xeon E5-2630 v2 @ 2.60 GHz (TDP 80 W) | Radeon Tahiti PRO HD 7950/8950 OEM / R9 280 | 16 GB | 884 GB | 105,000 |
| Office 201, station 2 | 2×12 cores, Intel Xeon E5-2650 v4 @ 2.2 GHz | Nvidia, 640 CUDA cores, 4 GB, CUDA capability 6.1 | 64 GB | | 210,240 + 5,606,400 |
| Luzicka | 2×4 cores, Intel Core i7-3770 @ 3.40 GHz | Nvidia, ~2304 CUDA cores, 8 GB, CUDA v11.1 | 8 GB | > 4 TB | 70,080 + 20,183,040 |
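
The "Equiv. core-hours per year" figures in the table appear consistent with a simple estimate: number of cores multiplied by the 8,760 hours in a year (24 h × 365 days), computed separately for CPU and GPU cores. The short sketch below reproduces that arithmetic. The function name and the dictionary layout are illustrative assumptions, and year-round availability of each machine is assumed as well.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 h; assumes year-round availability


def core_hours_per_year(cores: int) -> int:
    """Return the equivalent core-hours per year for `cores` cores."""
    return cores * HOURS_PER_YEAR


# (CPU cores, GPU cores) read from the table; None where no GPU figure is listed.
MACHINES = {
    "Office 201, station 1": (2 * 8, 96),
    "Office 206, station 1": (2 * 6, None),
    "Office 201, station 2": (2 * 12, 640),
    "Luzicka": (2 * 4, 2304),
}

for name, (cpu_cores, gpu_cores) in MACHINES.items():
    cpu_hours = core_hours_per_year(cpu_cores)
    line = f"{name}: {cpu_hours:,} CPU core-hours"
    if gpu_cores is not None:
        line += f" + {core_hours_per_year(gpu_cores):,} GPU core-hours"
    print(line)
```

Under this assumption, 16 CPU cores give 140,160 core-hours and 12 give 105,120, which the table seems to round to 140,000 and 105,000; the other entries match exactly.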

Remote resources

The real-time job queue of the Barbora cluster (IT4I.cz) is available here, with an estimate of the remaining simulation time. If you are in a hurry, please let me know.

Performance currently reached with Octopus

We try to make efficient use of the computing time allocated to us. In particular, we prepare large numbers of time-dependent (TD) runs with the Octopus software.

Here we keep brief notes on the performance reached on various computers.

Example of a silicon primitive cell

Machine Octopus Version Compiler & Parameters Time per SCF step Time per TD step
CPU GPU CPU GPU
IBM Power9 (1 node + 1xGPU 16 GB) Octopus 10.4 (distrib. binary) GCC/GFortran 1539 s 500 s (single-kgrid, StatePack=no)  1860 s  115 s (StatePack=no)
IBM Power9 (1 node + 4xGPU 16 GB) Octopus 10.4 (distrib. binary) GCC/GFortran  36 s (single-kgrid, StatePack=no) – 22 s (single-kgrid, StatePack=no)
IBM Power9 (1 node + 4xGPU 16 GB) Octopus 10.4 (distrib. binary) GCC/GFortran  113 s (single-kgrid, StatePack=yes) – 22 s (single-kgrid, StatePack=yes)
IBM Power9 (1 node + 4xGPU 16 GB) Octopus 10.4 (distrib. binary) GCC/GFortran 139 s (4x kgrid, StatePack=no) xx s (4x kgrid, StatePack=no)
 MPCDF Draco (4 nodes)  Octopus (version of 2019 09 11)  Intel compilers  –  2.5 s  –
IT4I Salomon (4 nodes) Octopus (version of 2019 06 10) Intel compilers 5.0 s
IT4I Barbora (8 nodes) Octopus (version of 2019 06 10) Intel compilers  1.91 s xx s
 PRACE Prometheus (Poland) (4 nodes)  Octopus (version of 2019 10 16)  Intel compilers  5.0 s  –
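
To put the timings in perspective, the sketch below computes the GPU speedup for the single-GPU IBM Power9 row, reading its four values as CPU and GPU times for the SCF step (1539 s and 500 s) and the TD step (1860 s and 115 s). It also converts a per-step time into a rough wall-clock estimate for a full TD run. The helper names and the step count of 1,000 are hypothetical and serve only as an illustration.

```python
def speedup(cpu_seconds: float, gpu_seconds: float) -> float:
    """Ratio of CPU time to GPU time for the same step."""
    return cpu_seconds / gpu_seconds


def run_hours(seconds_per_step: float, n_steps: int) -> float:
    """Rough wall-clock time in hours for n_steps at a fixed time per step."""
    return seconds_per_step * n_steps / 3600.0


# IBM Power9, 1 node + 1 GPU (values from the table above)
scf_speedup = speedup(1539, 500)
td_speedup = speedup(1860, 115)
print(f"SCF step speedup: {scf_speedup:.1f}x")  # 3.1x
print(f"TD step speedup: {td_speedup:.1f}x")    # 16.2x

# Hypothetical TD run of 1,000 steps (the step count is an assumption)
n_steps = 1000
print(f"CPU: {run_hours(1860, n_steps):.0f} h, GPU: {run_hours(115, n_steps):.0f} h")
```

The estimate ignores initialization, I/O, and queue waiting time, so it should be read as a lower bound on the actual duration.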

Example of a silica primitive cell

  • dx=0.22 Bohr. k=8³. RAM: 22 GB. Duration per SCF cycle on Barbora: <72 s.