1slurm
Differences
This shows you the differences between two versions of the page.
1slurm [2020/05/09 17:12] – admin | 1slurm [2020/06/26 12:38] (current) – admin | ||
---|---|---|---|
Line 3: | Line 3: | ||
Our cluster runs [[https:// | Our cluster runs [[https:// | ||
- | The queuing system gives you access to computers owned by LCM, LTHC, LTHI, LINX and IC Faculty; sharing the computational resources among as many groups as possible will result in a more efficient use of the resources (including the electric power). | + | The queuing system gives you access to computers owned by LCM, LTHC, LTHI, LINX and IC Faculty; sharing the computational resources among as many groups as possible will result in a more efficient use of the resources (including the electric power), you can take advantage of many more machines for your urgent calculations and get results faster. |
- | As user you can take advantage of many more machines for your urgent calculations and get results faster. On the other hand, since the machines | + | On the other hand, since the machines |
- | We have configured the system almost | + | We have configured the system |
- number of CPU/cores: you must indicate the correct number of cores you're going to use; | - number of CPU/cores: you must indicate the correct number of cores you're going to use; | ||
- Megabytes/ | - Megabytes/ | ||
- Time for the execution: if your job is not completed by the indicated time, it will be automatically terminated; | - Time for the execution: if your job is not completed by the indicated time, it will be automatically terminated; | ||
- | you can find better and more complete guides on how to use S.L.U.R.M. control commands on internet; e.g: | + | here we provide just a fast and dirty guide for the most basic commands/ |
- [[https:// | - [[https:// | ||
- [[https:// | - [[https:// | ||
- [[https:// | - [[https:// | ||
- | here we provide just a fast and dirty guide for the most basic commands/ | ||
==== partitions (a.k.a. queues) ==== | ==== partitions (a.k.a. queues) ==== | ||
- | If you used other types of cluster management, you will already know the term " | + | If you used other types of cluster management, you will already know the term " |
===== Mini User Guide ===== | ===== Mini User Guide ===== | ||
- | The 3 most used commands are: | + | The most used/ |
- '' | - '' | ||
- '' | - '' | ||
Line 29: | Line 28: | ||
- '' | - '' | ||
- | * '' | + | * '' |
< | < | ||
$ sinfo | $ sinfo | ||
Line 49: | Line 48: | ||
here you can see that the command provides the ID of the jobs, the PARTITION used to run the jobs (hence the nodes where these jobs will run), the NAME assigned to the jobs, the name of the USER that submitted the jobs, the STATUS of the job (R=Run, PD=Waiting), | here you can see that the command provides the ID of the jobs, the PARTITION used to run the jobs (hence the nodes where these jobs will run), the NAME assigned to the jobs, the name of the USER that submitted the jobs, the STATUS of the job (R=Run, PD=Waiting), | ||
- | * '' | + | * '' |
- | Once a job is submitted (and accepted by the cluster, you'll receive the ID assigned to the job: | + | Once a job is submitted (and accepted by the cluster), you'll receive the ID assigned to the job: |
< | < | ||
$ sbatch sheepit.slurm | $ sbatch sheepit.slurm | ||
Submitted batch job 552 | Submitted batch job 552 | ||
</ | </ | ||
- | * '' | + | * '' |
* '' | * '' | ||
Line 108: | Line 107: | ||
< | < | ||
- | At the beginning of the file you can read the line ''# | + | At the beginning of the file, you can read the line ''# |
</ | </ | ||
Inside a script, all the lines that start with the '#' | Inside a script, all the lines that start with the '#' | ||
Line 116: | Line 115: | ||
* ''# | * ''# | ||
* ''# | * ''# | ||
- | * ''# | + | * ''# |
* ''# | * ''# | ||
- | * ''# | + | * ''# |
* ''# | * ''# | ||
* ''# | * ''# | ||
Line 132: | Line 131: | ||
* '' | * '' | ||
* '' | * '' | ||
+ | * '' | ||
* '' | * '' | ||
Line 141: | Line 141: | ||
<note important> | <note important> | ||
- | It is **mandatory** to specify at least the estimated run time of the job and the memory needed, so the scheduler can optimize the machines/ | + | It is **mandatory** to specify at least the estimated run time of the job and the memory needed, so the scheduler can optimize the nodes/ |
Please keep in mind that longer jobs are less likely to enter the queue when the cluster load is high. Therefore, don't be lazy and do not always ask for // | Please keep in mind that longer jobs are less likely to enter the queue when the cluster load is high. Therefore, don't be lazy and do not always ask for // | ||
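The requirement above (declare run time and memory, and the real number of cores) can be sketched as a small job script. This is only an illustration, not the script used on this cluster: the temporary file and all the values are hypothetical, and the partition is left out because partition names are site-specific. The options ''--ntasks'', ''--cpus-per-task'', ''--mem'' (megabytes by default) and ''--time'' (HH:MM:SS) are standard ''sbatch'' options.

```shell
# Write a minimal, hypothetical job script to a temporary file.
# All values below are examples; adapt them to what your job really needs.
job_script="$(mktemp)"
cat > "$job_script" <<'EOF'
#!/bin/bash
# One task using 4 cores (ask only for the cores you will actually use).
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
# Memory for the job, in megabytes.
#SBATCH --mem=2048
# Run-time limit (HH:MM:SS); the job is terminated when it is exceeded.
#SBATCH --time=01:30:00
echo "running on $(hostname)"
EOF

# Submission itself needs access to the cluster, so it is only shown here:
#   sbatch "$job_script"
```

Asking for a realistic ''--time'' rather than the maximum makes it more likely that the job enters the queue when the cluster load is high.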
Line 164: | Line 164: | ||
< | < | ||
- | scancel | + | squeue |
</ | </ | ||
1slurm.1589037144.txt.gz · Last modified: 2020/05/09 17:12 by admin