Resource Allocations on the Frontenac Cluster

This Wiki entry explains how resources are shared on the CAC Frontenac cluster. This includes default allocations of compute time as well as extended resources that were either allocated by Compute Canada or come from contributed systems. We also point out the differences between the current Frontenac allocation scheme and the older scheme used on the now-decommissioned SW/CAC clusters.

Default accounts

Our job scheduler on Frontenac is SLURM (https://slurm.schedmd.com/). All resource allocations and limitations are applied through this scheduler. For a basic intro on how to use it, please see our scheduler help file.

Every user on our systems has at least one SLURM account, the default account. Users with access to extended resources have additional accounts corresponding to those allocations. Each SLURM account carries intrinsic restrictions, and jobs can be scheduled under it up to those limits.
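
If you are unsure which SLURM accounts you can submit under, SLURM's accounting tools can list them. The sketch below uses standard SLURM commands; the account name passed to sbatch is a placeholder, not a real account on our systems.

<pre>
# List the SLURM accounts (associations) available to you
sacctmgr show associations where user=$USER format=Account,Partition,MaxWall

# Submit a job under a specific account ("myaccount" is a placeholder)
sbatch --account=myaccount job.sh
</pre>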

The limitations of a default account are as follows (a sample job script is shown after the list):

  • Cumulative default allocation limit (over one year): 50 core years
  • Jobs are scheduled to the standard partition
  • This partition excludes the high-memory and large-core-count nodes
  • The default priority is 1 (low)
  • No core limits
  • Continued usage lowers the relative priority with respect to other jobs
  • Default time limit: 3 hrs
  • Maximum time limit: 14 days (2 weeks)
  • Default memory limit: 1 GB
  • Default number of cores: 1
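
As a rough sketch (the resource values are illustrative, not site requirements), a job script under the default account might look like this:

<pre>
#!/bin/bash
#SBATCH --job-name=demo
#SBATCH --time=12:00:00      # override the 3 hr default; up to 14 days here
#SBATCH --mem=4G             # override the 1 GB default memory limit
#SBATCH --cpus-per-task=4    # override the 1-core default
./my_program                 # placeholder for your application
</pre>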

RAC accounts

Users who have applied for and received a RAC (Resource Allocation Competition) allocation from Compute Canada access it through a special RAC account.

The limitations of a RAC account are as follows (a submission example follows the list):

  • RAC allocation limit: as granted by Compute Canada (typically less than requested; 2017 RAC)
  • Jobs are scheduled to the reserved partition
  • No scheduling on the standard partition
  • This partition includes the high-memory and large-core-count nodes
  • The RAC priority is 5 (enhanced)
  • No core limits
  • Continued usage lowers the relative priority with respect to other RAC jobs
  • Maximum time limit: 28 days (4 weeks)
  • Default time limit: 3 hrs
  • Default memory limit: 1 GB
  • Default number of cores: 1
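
Submitting under a RAC account is a matter of pointing sbatch at that account; the reserved-partition settings listed above then apply. The account name below is a placeholder for the name assigned to your RAC allocation.

<pre>
# "rac-xyz" is a placeholder; substitute the account name of your RAC allocation
sbatch --account=rac-xyz --time=21-00:00:00 job.sh   # RAC jobs may run up to 28 days
</pre>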

Contributed accounts

Users whose research groups have contributed systems to the cluster access the corresponding resources through a special contributed account.

The limitations of a contributed account are as follows:

  • Contributed allocation limit: 1 core year per core contributed
  • Jobs are scheduled to the reserved partition
  • No scheduling on the standard partition
  • This partition includes the high-memory and large-core-count nodes
  • The contributed priority is 10 (high)
  • Core limit: number of cores contributed
  • Continued usage lowers the relative priority with respect to other contributed jobs (see the sshare example below)
  • Maximum time limit: 28 days (4 weeks)
  • Default time limit: 3 hrs
  • Default memory limit: 1 GB
  • Default number of cores: 1
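
Since continued usage lowers relative priority for all three account types, it can be useful to check your fair-share standing. sshare is a standard SLURM utility; the exact columns it reports depend on the site configuration.

<pre>
# Show fair-share information for your own associations
# (-U: current user only, -l: long format including RawUsage and FairShare)
sshare -U -l
</pre>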

Time and memory limits

The SLURM scheduler uses time limits to predict when a given resource will become available (at the latest). This allows it to backfill small, fragmented resources with short-running jobs without forcing larger allocations to wait unduly long. Such waiting periods would result in very inefficient scheduling and waste valuable resources. For time limits to have this beneficial effect, they have to be enforced stringently.

Time limits are "hard" limits: jobs that exceed their time limit are terminated.

To avoid having a job terminated, you must specify a time limit in excess of your maximum expected run time. We also strongly recommend checkpointing your jobs. The system cannot do this automatically; checkpointing must be implemented within your application and is the user's responsibility. It also safeguards against losing all your work when a job runs into a time-limit termination.
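
How checkpointing is done depends entirely on your application; the toy shell loop below only illustrates the general idea of periodically saving progress and resuming from the last saved state. All file and variable names are made up.

<pre>
#!/bin/bash
ckpt=checkpoint.txt                                  # made-up checkpoint file
start=$( [ -f "$ckpt" ] && cat "$ckpt" || echo 1 )   # resume where we left off
for (( i=start; i<=1000000; i++ )); do
    :   # ... one unit of real work would go here ...
    (( i % 1000 == 0 )) && echo "$i" > "$ckpt"       # save progress periodically
done
</pre>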

Please specify a reasonable time limit and checkpoint your jobs. The default time limit is short (3 hrs).

If you don't specify a time limit, a short default will be assigned. Time limits cannot be changed once a job is running. The maximum time limit for a standard job is 14 days; for an allocated (RAC or contributed) job it is 28 days.
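
Because a running job's time limit is fixed, it is worth verifying what limit a job actually received. The squeue format codes below are standard (%M is elapsed time, %l the time limit).

<pre>
# Show job id, partition, state, elapsed time, and time limit for your jobs
squeue -u $USER -o "%.12i %.10P %.8T %.10M %.10l"
</pre>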

Comparison: Allocation on SW/CAC (SGE) vs Frontenac (SLURM)

Each feature below is contrasted between the old SW/CAC (SGE) scheme and the Frontenac (SLURM) scheme.

Default allocation (compute)
  SW/CAC (SGE):
  • 48 core limit
  • excludes dedicated systems
  • job limits may apply
  Frontenac (SLURM):
  • 50 core years over 1 year
  • standard partition
  • low priority (1)
  • no scheduling to "large nodes"
  • job limits may apply

RAC allocation (compute)
  SW/CAC (SGE):
  • fixed core limit, according to the allocation from Compute Canada
  • associated with a dedicated node
  • scheduled to a specific node by the user
  Frontenac (SLURM):
  • fixed core-years allocation over one year
  • allocation from Compute Canada (less than the ask!)
  • scheduled to the "reserved" partition at higher priority (5)
  • no dedicated resources; reserved partition
  • competes with other users of the same priority

Contributed allocation (compute)
  SW/CAC (SGE):
  • fixed core limit depending on the size of the contributed system
  • allocation depends on system size
  • associated with a dedicated node
  • scheduled to a specific node by the user
  Frontenac (SLURM):
  • fixed core-years allocation over one year
  • allocation depends on system size
  • scheduled to the "reserved" partition at very high priority (10)
  • no dedicated resources; reserved partition
  • competes with other users of the same priority

Limits
  SW/CAC (SGE):
  • core limits: default 48
  • temporary extensions available
  • RAC/contributed allocations run with extended limits
  • global job limits may apply
  Frontenac (SLURM):
  • no permanent core limits
  • sanity core limits may apply to support fair-share
  • usage lowers priority (fair share)
  • RAC/contributed allocations increase priority
  • no global job limits