Filesystems:Frontenac
Overview
The Frontenac cluster uses a shared GPFS filesystem for all file storage. User files are located under /global/home (3 TB quota), shared project space under /global/project, and network scratch space under /global/scratch (5 TB quota).
In addition to the network storage, each compute node has a 1.5 TB local hard disk that jobs can use as fast local scratch space, at the location specified by the $TMPDISK environment variable. All files in the local scratch space are deleted automatically when the corresponding job finishes.
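As a hedged sketch of how a job might use this local scratch space (the scheduler directive, program name, and file names below are illustrative assumptions, not taken from this page):

```bash
#!/bin/bash
#SBATCH --time=01:00:00                        # illustrative scheduler directive
# Stage input to node-local scratch ($TMPDISK), compute there, copy results back.
cp /global/home/$USER/input.dat "$TMPDISK/"    # stage input onto the local disk
cd "$TMPDISK"
./my_simulation input.dat > output.dat         # hypothetical program; fast local I/O
cp output.dat /global/scratch/$USER/           # copy results off before the job ends
```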
Note that it is the user's responsibility to manage the age of their data: these filesystems do not provide archiving. If data are no longer needed, they need to be moved off the system. If you need assistance with this, please contact us.
Storage Areas
Unlike your personal computer, a Compute Canada system will typically have several storage spaces or filesystems, and you should ensure that you are using the right space for the right task. In this section we discuss the principal filesystems available on most Compute Canada systems, the intended use of each one, and its characteristics. Storage options are distinguished by the available hardware, the access mode, and the write system. Typically, most Compute Canada systems offer the following storage types:
- Global Parallel File System (GPFS)
- This file system is visible on both login and compute nodes. Combining multiple disk arrays and fast servers, it offers excellent performance for large files and large input/output operations. Two types of storage are distinguished on such systems: long-term storage and temporary storage (scratch). Performance is subject to variations caused by other users.
- Local Filesystem
- This is a local hard drive attached to each of the nodes. Its advantage is high performance (because it is rarely shared). Its disadvantage is that local files must be re-copied to a global area to be visible on other nodes, such as the login (workup) node. Typically, local disk is regularly "cleaned", i.e. data kept there are considered transitory.
- RAM (memory) Filesystem
- This is a filesystem that exists within a node's RAM, so it reduces the available memory. This makes it very fast but low-capacity. A RAM disk must be cleaned at the end of a job.
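As a sketch of the RAM filesystem idea (assuming the node mounts a RAM-backed tmpfs at /dev/shm, which is common on Linux but not stated on this page; names below are illustrative):

```bash
RAMDIR="/dev/shm/$USER.$$"            # hypothetical per-job directory in RAM
mkdir -p "$RAMDIR"
cp input.db "$RAMDIR/"                # stage a heavily re-read file into memory
./my_lookup_tool "$RAMDIR/input.db"   # hypothetical program; reads now come from RAM
rm -rf "$RAMDIR"                      # RAM disks must be cleaned at the end of a job
```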
The following table summarizes the properties of these storage types.
Type | Accessibility | Throughput | Latency | Longevity |
---|---|---|---|---|
Network Filesystem (NFS) | All nodes | Poor | High | Long term |
Long-Term Parallel Filesystem | All nodes | Fair | High | Long term |
Short-Term Parallel Filesystem | All nodes | Fair | High | Short term (periodically cleaned) |
Local Filesystem | Local to the node | Fair | Medium | Very short term |
Memory (RAM) Filesystem | Local to the node | Good | Very low | Very short term, cleaned after every job |
Throughput describes the efficiency of the file system for large operations, such as those involving a megabyte or more per read or write.
Latency describes the efficiency of the file system for multiple small operations. Low latency is good; however, if one has a choice between a small number of large operations and a large number of small ones, it is almost always better to use a small number of large operations.
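One rough way to see the throughput/latency trade-off is to write the same volume of data as many small operations versus a few large ones; the dd invocations below are an illustrative sketch (sizes and file names are arbitrary):

```bash
# Same total volume (100 MiB), very different operation counts.
dd if=/dev/zero of=small_ops.dat bs=4k count=25600   # 25,600 small writes: latency dominates
dd if=/dev/zero of=large_ops.dat bs=1M count=100     # 100 large writes: throughput dominates
rm small_ops.dat large_ops.dat
```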
Best practices
- Only use text format for files that are smaller than a few megabytes.
- As far as possible, use local storage for temporary files. It is best to use the temporary directory created by the job scheduler for this, named $SLURM_TMPDIR.
- If your program must search within a file, it is fastest to read the file completely into memory before searching, or to use a RAM disk.
- Regularly clean up your data in the scratch and project spaces, because those filesystems are used for huge data collections.
- If you no longer use certain files but they must be retained, archive and compress them, and if possible copy them elsewhere (see the sketch after this list).
- If your needs are not well served by the available storage options please contact us by sending an e-mail to Compute Canada support.
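A minimal sketch of the archive-and-compress practice mentioned above (the directory path is an illustrative assumption):

```bash
# Bundle and compress a directory that is no longer actively used.
tar czf old_results.tar.gz /global/project/mygroup/old_results/
# Verify the archive is readable before removing the originals.
tar tzf old_results.tar.gz > /dev/null && rm -rf /global/project/mygroup/old_results/
```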
Filesystem Quotas and Policies
In order to ensure that there is adequate space for all Compute Canada users, there are a variety of quotas and policy restrictions concerning backups and the automatic purging of certain filesystems. On a cluster, each user has access to the home and scratch spaces by default, and each group has access to 1 TB of project space by default. The nearline space has a default quota of 5 TB per group and is made available upon request by writing to Compute Canada support. The nearline filesystem consists of very high-capacity, medium- to low-performance storage; it should be used for data that are infrequently accessed but need to be kept for long periods of time. Both the nearline and project spaces may have their group-based quotas increased through the annual RAC (resource allocation) process. You can see your current usage and quota for the various filesystems on Cedar and Graham using the command diskusage_report.
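For example, a hedged sketch of checking usage (the command is named above; its output format varies by system, and the file-count check is an illustrative addition):

```bash
diskusage_report               # usage/quota summary for each filesystem
find $HOME -type f | wc -l     # rough count against the file-count quota (illustrative)
```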
Filesystem | Default Quota | Lustre-based? | Backed up? | Purged? | Available by Default? | Mounted on Compute Nodes? |
---|---|---|---|---|---|---|
Home Space | 50 GB and 500K files per user | Yes for Cedar, No for Graham (NFS) | Yes | No | Yes | Yes |
Scratch Space | 20 TB (100 TB on Graham) and 1M files per user[1] | Yes | No | Yes, all files older than 60 days are subject to purging. | Yes | Yes |
Project Space | 1 TB and 5M files per group[2] | Yes | Yes | No | Yes | Yes |
Nearline Space | 5 TB per group | No | No | No | No | No |
The backup policy for the home and project spaces is nightly backups retained for 30 days; deleted files are retained for a further 60 days. If you wish to recover a previous version of a file or directory, write to Compute Canada support with the full path of the file(s) and the desired version (by date). To copy data from nearline storage to the project, home, or scratch space, you should also write to Compute Canada support.
- ↑ Scratch space on Cedar can be increased to 100 TB per user upon request to Compute Canada support.
- ↑ Project space can be increased to 10 TB per group upon request to Compute Canada support and requests by different members of the same group will be summed together up to the ceiling of 10 TB.