We try to make efficient use of our computing time. In particular, we prepare a large number of TD runs with the Octopus software.
Here we keep brief notes on the performance achieved on various machines.
Example of silicon primitive cell
| Machine | Octopus version | Compiler & parameters | Time per SCF step (CPU) | Time per SCF step (GPU) | Time per TD step (CPU) | Time per TD step (GPU) |
|---|---|---|---|---|---|---|
| IBM Power9 (1 node + 1xGPU 16 GB) | Octopus 10.4 (distrib. binary) | GCC/GFortran | 1539 s | 500 s (single-kgrid, StatesPack=no) | 1860 s | 115 s (StatesPack=no) |
| IBM Power9 (1 node + 4xGPU 16 GB) | Octopus 10.4 (distrib. binary) | GCC/GFortran | – | 36 s (single-kgrid, StatesPack=no) | – | 22 s (single-kgrid, StatesPack=no) |
| IBM Power9 (1 node + 4xGPU 16 GB) | Octopus 10.4 (distrib. binary) | GCC/GFortran | 113 s (single-kgrid, StatesPack=yes) | – | 22 s (single-kgrid, StatesPack=yes) | |
| IBM Power9 (1 node + 4xGPU 16 GB) | Octopus 10.4 (distrib. binary) | GCC/GFortran | 139 s (4x kgrid, StatesPack=no) | – | xx s (4x kgrid, StatesPack=no) | |
| MPCDF Draco (4 nodes) | Octopus (version of 2019 09 11) | Intel compilers | – | 2.5 s | – | |
| IT4I Salomon (4 nodes) | Octopus (version of 2019 06 10) | Intel compilers | – | 5.0 s | – | |
| IT4I Barbora (8 nodes) | Octopus (version of 2019 06 10) | Intel compilers | 1.91 s | – | xx s | – |
| PRACE Prometheus (Poland) (4 nodes) | Octopus (version of 2019 10 16) | Intel compilers | – | 5.0 s | – | |
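To read the first Power9 row more easily, the CPU and GPU timings can be turned into speedup factors. The following is a minimal sketch in Python; the dictionary layout and the `speedup` helper are illustrative names, not part of Octopus, and the numbers are copied from the table. The GPU runs carry extra settings noted in the table, so the ratios are only indicative.

```python
# Timings in seconds, copied from the first table row
# (IBM Power9, 1 node + 1 GPU, Octopus 10.4).
timings = {
    "SCF step": {"cpu": 1539.0, "gpu": 500.0},
    "TD step": {"cpu": 1860.0, "gpu": 115.0},
}


def speedup(cpu_seconds, gpu_seconds):
    """Return how many times faster the GPU run is than the CPU run."""
    return cpu_seconds / gpu_seconds


# One speedup factor per kind of step.
speedups = {step: speedup(t["cpu"], t["gpu"]) for step, t in timings.items()}

for step, factor in speedups.items():
    print(f"{step}: GPU is {factor:.1f}x faster than CPU")
```

On these numbers the single GPU gives roughly a 3x gain per SCF step and roughly a 16x gain per TD step.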
Example of silica primitive cell
- Grid spacing dx = 0.22 Bohr; k-point grid 8³ (8×8×8); RAM: 22 GB; time per SCF cycle on Barbora: < 72 s.