This service offers dedicated computing resources as part of the Helmholtz AI platform.
Access to computing resources is key for the Helmholtz AI community to accelerate innovative AI applications. The “Helmholtz AI COmputing REsources” (HAICORE) installation at the Max Delbrück Center provides easy, low-barrier access to dedicated GPU resources for the Helmholtz Foundation Models Initiative (HFMI).
The service provides a total of 64 Nvidia H100 GPUs with 5.1 TB of high-bandwidth GPU RAM, 1760 CPU cores with 7.7 TB of system memory, and a low-latency interconnect for massively parallel AI training. Additionally, fast local scratch storage (NVMe) and large project storage (1.6 PB) are provided.
Basic information about other HAICORE sites can be found at helmholtz.ai.
Connect
You can access the system via a secure shell (ssh) connection. Before accessing the system, you must prepare your account as explained below.
Connecting to the headnode of HAICORE.berlin
Please connect via ssh to login.haicore.berlin using your HAICORE.berlin account and the ssh private key matching the public key you uploaded as explained in step 2 below.
Once connected, you can start using the workload management tool Slurm by loading the respective module with the following command: module load slurm
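A first session could look like the following sketch; the key path is just a suggestion, and the user name placeholder follows the naming scheme described in step 3:

```shell
# Connect to the head node with the private key matching the
# public key you deposited (key path and user name are examples):
ssh -i ~/.ssh/id_haicore_berlin forename.familyname@login.haicore.berlin

# On the head node, make the Slurm tools available:
module load slurm

# Verify that Slurm works, e.g. by listing the available partitions:
sinfo
```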
Step 1.1: Sign in with Helmholtz AAI
On the login page, click the link at the bottom of the dialogue to sign in with Helmholtz AAI.
Step 1.2: Find your Helmholtz-organization
In the drop-down list provided, find your institution to authenticate with. You can use the search field to avoid scrolling through the more than 6000 institutions that are currently members of Helmholtz AAI.
Tip for members of the Max Delbrück Center: you will not find “MDC”. Please search for “Max” (Max Delbrück Center for Molecular Medicine); searching for “MDC” or “Berlin” will not work.
Step 1.3: Authenticate with your institutional credentials
Please ensure that the URL now shows your home institution's address; never enter your credentials on any website other than your home institution's.
Type in your home institution's username and password, plus your second factor if MFA is active at your institution.
In the next dialogue, “keycloak@haicore.berlin” must get your permission to access the Helmholtz AAI information. Please click on “Allow”.
Step 2: Store your HAICORE.berlin ssh-key
After authenticating with Helmholtz AAI (step 1.3), you are connected to the HAICORE.berlin Account Management System.
If you have no ssh key stored in the Helmholtz Cloud Portal, you have to paste an individual ssh public key for HAICORE.berlin into the field ssh_key and click on “Save”.
Notice: Only one ssh key can be deposited in the Account Management System. Passwords and multi-factor information are ignored.
To create your ssh key pair, you can use, for example, the “PuTTY Key Generator” on Windows or the ssh-keygen command on the Linux command line.
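On the Linux command line, generating a key pair might look like this; the file name is only a suggestion:

```shell
# Generate an Ed25519 key pair dedicated to HAICORE.berlin
# (you will be prompted for an optional passphrase):
ssh-keygen -t ed25519 -f ~/.ssh/id_haicore_berlin -C "haicore.berlin"

# Print the public key; paste this whole line into the
# ssh_key field of the Account Management System:
cat ~/.ssh/id_haicore_berlin.pub
```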
Step 3: Bind your HAICORE.berlin account to an existing HFMI Project
Please provide the following information:
- Your name: the user name that your HAICORE.berlin account got, usually <forename>.<familyname>.
- The HFMI project you belong to (HFMI = Helmholtz Foundation Model Initiative). Currently, only the six named HFMI pilot projects get access to HAICORE.berlin.
- The fundamental tools you need to start your research. We provide basic software (git, gcc) and global access via the http, https, and ssh protocols to download anything else. We cannot provide access to Anaconda.
The administrators of HAICORE.berlin will then grant you access to the named project if your name is on the list of pre-named project members. Otherwise, we will check back with the HFMI project lead at your home institution. The result will be communicated to you by e-mail.
Once your account is confirmed, you can connect to HAICORE.berlin (see above).
Notice: Your account can also be bound to multiple HFMI projects if needed.
Compute
All computation is managed by our workload management system, Slurm.
You must request the resources (runtime, GPUs, CPU cores, CPU memory) that your job needs. The following resources are available at most:
| Resource | Name | Default | Maximum per node | Maximum per cluster | Remarks |
|---|---|---|---|---|---|
| Runtime | --time=<hh>:<mm>:<ss> | 24 hours | 24 hours | 24 hours | Please use proper checkpointing for your jobs. Please specify a shorter runtime when you expect your job to finish in less than one full day. |
| Nvidia H100 | --gpus=<ngpu> | 0 | 8 GPUs | 64 GPUs | 80 GB of high-bandwidth memory per GPU. |
| Xeon Platinum 8480+ | -c <ncpu> | 1 | 220 cores | 1760 cores | This is 27.5 CPU cores per GPU. (Four cores per node are reserved for the OS.) |
| Memory | --mem=<size><unit> | 4 GB | 985 GB | 7.7 TB | (Some RAM is consumed by the OS and caches.) |
| Infiniband | --mpi=<foo> | none | | | |
You can consume a total of 50,000 GPU hours per HFMI project. The command <missing> gives you a report of how much you and your project mates have already consumed.
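As an illustration, a batch script requesting one full node within these limits might look as follows; the job name and the training program are placeholders, not a prescribed setup:

```shell
#!/bin/bash
#SBATCH --job-name=hfmi-train   # illustrative job name
#SBATCH --time=12:00:00         # request less than the 24 h maximum when possible
#SBATCH --gpus=8                # maximum GPUs per node
#SBATCH -c 220                  # maximum usable CPU cores per node
#SBATCH --mem=985GB             # maximum memory per node

# Run the (hypothetical) training program on the allocated resources:
srun python train.py
```

Submit the script with sbatch after loading the Slurm module on the head node.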
Storage on HAICORE.berlin
We provide 100 GB of storage capacity in your home directory /fast/home/<your.alias> (or simply ~).
We provide 10 TB of storage capacity in your project directory /fast/project/hfmi_<name>.
Please use the df command to find out how much quota is consumed and how much is left for the current directory.
You can transfer data to and from HAICORE.berlin using your ssh credentials and the scp or sftp application on your remote system.
For the runtime of your computations, you can use the local NVMe storage at /scratch (30 TB) for temporary files.
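Typical commands for these storage locations might look like the following sketch; the file name, key path, and project name are placeholders:

```shell
# From your own machine: copy data into the project directory
scp -i ~/.ssh/id_haicore_berlin data.tar.gz \
    forename.familyname@login.haicore.berlin:/fast/project/hfmi_<name>/

# On HAICORE.berlin: check the quota of the current directory
df -h .

# Inside a job: point temporary files at the fast node-local NVMe scratch
mkdir -p /scratch/"$USER"
export TMPDIR=/scratch/"$USER"
```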
This work on HAICORE.berlin was supported by the Helmholtz Association's “Initiative and Networking Fund”. HAICORE.berlin is the Helmholtz AI computing resources partition for the Helmholtz Foundation Model Initiative.