Difference between revisions of "Renewal"

From CAC Wiki
(Created page with "== '''Account Renewals''' == This is a guide to the account renewal process at the Centre for Advanced Computing. It explains what you need to do in order to keep your accoun...")
 
(What if my CCRI has expired / was de-activated ?)
 
(15 intermediate revisions by the same user not shown)
Line 1: Line 1:
== '''Account Renewals''' ==
+
= '''Account Renewals''' =
  
 
This is a guide to the account renewal process at the Centre for Advanced Computing. It explains what you need to do in order to keep your account active.
 
 
It also explains details of our present effort to bring our accounts up to date and synchronize the account structure with the one at Compute Canada.
 
 +
 +
== How to renew an account at the Centre for Advanced Computing ==
  
* Visit https://login.cac.queensu.ca/pwr to obtain a temporary password. You must use the original email you registered with.
+
Almost all accounts that entitle users to access our '''Frontenac''' cluster require an active '''Compute Canada Role'''. The exceptions are temporary and teaching accounts. If you have a username starting with "hpc", you had to obtain Compute Canada credentials to get it. Compute Canada conducts annual renewals of their accounts/roles and will notify you of the deadline. Please follow those instructions. A separate renewal of your account at the Centre for Advanced Computing is not necessary as long as you keep the associated Compute Canada role active.
  
* Log on to the new system using an SSH client (MobaXterm on Windows, Terminal on macOS/Linux): <code>ssh yourUsername@login.cac.queensu.ca</code>. The first time you log in, the system will prompt you to change your temporary password, then log you out (so you can test logging in with the new password).
+
'''If you let your Compute Canada role expire without renewing it, it will be de-activated. The CAC account that is associated with this role will also be de-activated and you will lose access to your CAC account'''.
  
'''A set of guides on how to:'''
+
Find details about the Compute Canada renewal process at https://www.computecanada.ca/research-portal/account-management/account-renewals
  
* [[Access:Frontenac|... log into the system]]
+
<pre>
* [[Software:Frontenac|... setup and use software]]
+
Important: The next deadline for active Compute Canada accounts is April 23, 2018.
* [[Filesystems:Frontenac|... find your way around the filesystems]]
+
CAC accounts whose associated CCRI is not active after that date will be de-activated.
* [[SLURM|... submit jobs using SLURM]]
+
</pre>
  
= Migrating to the new Frontenac cluster =
+
If your account at the Centre for Advanced Computing is de-activated because you failed to renew or activate the associated Compute Canada role, you can re-activate it by contacting us at cac.admin@queensu.ca with a request for re-activation. However, you must supply us with an active CCRI from Compute Canada, which means that you have to renew or re-activate your Compute Canada role first.
  
This is a basic guide for users of our current CentOS 6 production systems ("SW cluster") to explain and facilitate migration to our new CentOS 7 systems ("Frontenac", "CAC cluster").
+
== Account cleanup at CAC ==
  
'''Note: We are in the final phase of the migration process. All users will gain access to the new systems by mid-November, and lose access to the old systems in early January 2018. Scheduling of new jobs on the old system will stop in mid-December! Please make yourself familiar with the new systems.'''
+
In the spring of 2018, we are conducting a review of our accounts to bring them in line with Compute Canada's practice of regular renewals. We will contact users whose accounts are associated with inactive CCRIs with a request to participate in the annual Compute Canada account renewal to keep their accounts with us active.
  
== Why migrate ? ==
+
=== Old accounts ===
  
Our systems underwent a substantial refresh last year with the retirement of the Solaris-based M9000 systems, and their replacement by new X86/Intel based hardware. This hardware was largely added to the existing "SW cluster" and eventually replaced it completely. However, this gradual replacement did not address issues in the base structure of that cluster, such as an old scheduler system, and a less than cutting-edge file system. To enable our users to make efficient use of the new hardware, we decided that it is time for a re-design of our main compute cluster. Some of our storage components reach their "end of life" phase and will be retired within a year.
+
Some of our users have accounts that are associated with Compute Canada roles that are not active anymore. In past years we have kept these accounts active unless a de-activation was requested or became necessary for other reasons. We cannot continue this practice, and will '''de-activate CAC accounts that are found to be associated with inactive CCRIs after April 23, 2018'''.
  
Rather than permanently operating two separate clusters, we will move both our users and the compute hardware from one cluster/network to the other. In the interest of consistency, we cannot make this process optional. '''We must move all our users to the new cluster by early 2018''' when service contracts for the old components run out.
+
If you receive an email reminding you to renew or re-activate your Compute Canada role to maintain an active account with the Centre for Advanced Computing, please do so before the deadline set by Compute Canada for account renewal. We cannot make exceptions, as active accounts (roles) with Compute Canada are a pre-condition for an account with the CAC and are essential for proper accounting and usage reporting.
  
== What's Different ? ==
+
=== What if I don't have a Compute Canada account ? ===
  
The new cluster is based on a newer version of the same CentOS operating system. We have replaced the scheduler with SLURM, which is the same scheduler used on the new Compute Canada "GP" systems. We also replaced the "use" system with the more powerful and standard "lmod". Here are the main changes in table format.
+
In some rare cases, users may not have Compute Canada credentials. We will contact these users and direct them to apply for credentials. This is done through the [https://ccdb.computecanada.ca/account_application Compute Canada Database Registration Page]. Once you have obtained a CCRI (Compute Canada Role Identifier), please contact us at cac.admin@queensu.ca with that information, and we will link the role to your account. Make sure you apply for the right type of role: if you are a Principal Investigator (PI), apply for a role as faculty or another PI role; if you are "sponsored" by a PI, apply for one of the sponsored roles (student, post-doc, researcher, etc.) and provide the CCRI of your sponsor. It is important that this matches your account with us so we can link it.
  
{| class="wikitable" | '''Difference between "old" SW (Linux) and "new" CAC (Frontenac) clusters'''
+
=== What if my CCRI has expired / was de-activated ? ===
|-
+
|
+
|'''old SW (Linux) cluster'''
+
|'''new CAC (Frontenac) cluster'''
+
|-
+
| '''Operating system'''
+
| CentOS 6
+
| [https://wiki.centos.org/ CentOS] 7
+
|-
+
| '''File system type'''
+
| ZFS
+
| [https://www.ibm.com/support/knowledgecenter/en/SSFKCN/gpfs_welcome.html GPFS]
+
|-
+
| '''Scheduler'''
+
| Sun Grid Engine (SGE)
+
| [https://slurm.schedmd.com/ SLURM]
+
|-
+
| '''Software manager'''
+
| usepackage
+
| [https://lmod.readthedocs.io/en/latest/ lmod]
+
|-
+
| '''Backup management'''
+
| samfs
+
| [https://en.wikipedia.org/wiki/Hierarchical_storage_management Hierarchical Storage Management (HSM)]
+
|}
+
  
== Migration Time Table ==
+
If you have a CCRI with Compute Canada, and it has expired, you will need to re-activate it or get a new one:
  
Different users will migrate at different times. We have been moving data to the new file system for months, so that at the time when "it's your turn" your data will already be available on the new system. Here is a month-by-month outline of who will move when. If you want to migrate ahead of schedule, or you have compelling reasons to delay the move, please get in touch with us at cac.help@queensu.ca
+
* If the role has expired because you failed to renew it, but otherwise still reflects your current status, you need to log in to the Compute Canada database and follow the instructions for role renewal there. If you are a PI, this may require providing a "CCV" (Canadian Curriculum Vitae).
 +
* If your role was de-activated because the account (role) of your "sponsor" (PI) became inactive, you need to ask your sponsor to renew his or her role first.
 +
* If your role was de-activated because your current status changed and is no longer reflected by that role, you need to apply for a new role. If this is the case, once you have this role, you may have to re-apply for a different account with CAC as well, because it is likely that the old account is no longer appropriate for you. If you are in doubt, please contact cac.help@queensu.ca and ask. We will help you navigate the re-activation or account migration.
  
{| class="wikitable" | '''Difference between "old" SW (Linux) and "new" CAC (Frontenac) clusters'''
+
'''Important''': Note that we will check the CCRI that we have associated with your account at CAC after the deadline on April 23, 2018. If it is found to be inactive, we will de-activate the CAC account as well. This is done automatically. If your CCRI has changed recently, please contact us at cac.help@queensu.ca so we can reflect the change in our records and avoid de-activation.
|-
+
|'''Month (2017)'''
+
|'''Who moves ?'''
+
|-
+
| September
+
|
+
* De-activated users
+
* Users who have not run a scheduled job for > 6 months
+
* Volunteers
+
|-
+
| October
+
|
+
* New accounts (i.e. new users will be going straight to Frontenac)
+
* Users who have not run a scheduled job for > 3 months
+
* Volunteers
+
|-
+
| November
+
|
+
* New accounts (i.e. new users will be going straight to Frontenac)
+
|-
+
| December
+
|
+
* New accounts (i.e. new users will be going straight to Frontenac)
+
* '''Everyone'''
+
|}
+
  
We will transfer hardware from the "old" cluster (SW) to the new one (Frontenac) to accommodate the migrated users. This means that in the transition period, the old cluster will gradually become smaller while the new one grows. Dedicated hardware will be moved when its users migrate.
+
=== Why ? ===
  
== '''IMPORTANT DEADLINES''' ==
+
Regular account renewals are necessary to keep account information up-to-date and to avoid issues with account conditions no longer applying. Since Compute Canada account credentials are a necessary condition for a default CAC account, we are using the renewal cycle of Compute Canada to test for the continued existence of this condition. This also makes it possible to conduct regular "syncs" between the Compute Canada database and ours, which is necessary for proper reporting of usage.
 
+
In the final phase of the migration process, all users receive a notification email and are asked to make themselves familiar with the new systems. Here is a list of important dates that our users should keep in mind when planning to use our systems in the time period between November 2017 and February 2018.
+
{| class="wikitable" | '''Important Migration Dates'''
+
|-
+
|'''Date'''
+
|'''Migration Event'''
+
|'''System'''
+
|-
+
| November 6, 2017
+
| Scheduling halted for all nodes with more than 24 cores
+
| SW ("old system")
+
|-
+
| December 1, 2017
+
|
+
* User notification by email
+
* '''All users receive access to new systems'''
+
| Frontenac ("new system")
+
|-
+
| January 3, 2018
+
|
+
* '''Data synchronization stops'''
+
* User data that differ after this date must be transferred by users
+
* Grid Engine '''scheduling disabled''' (nodes "draining")
+
| SW ("old system")
+
|-
+
| January 19, 2018
+
|
+
* '''All running jobs are terminated'''
+
* Remaining hardware is transferred to new system
+
| SW ("old system")
+
|-
+
| January 26, 2018
+
|
+
* User access to '''sflogin0/swlogin1 closed'''
+
* SNOlab (SX) cluster jobs terminated
+
* SNOlab (SX) login nodes closed
+
| SW ("old system")
+
|}
+
 
+
Until year-end, we are continuously "syncing" user data from the old to the new systems. Note that these are two independent copies of the data. This synchronization stops after January 3, 2018. After this date, '''it is the responsibility of the user''' to transfer data from the old to the new system if desired. If you encounter inconsistencies and need assistance, please contact us.
+
 
+
== Migration Schedule ==
+
 
+
The migration proceeds according to a scheme that was devised to minimize the impact on operations and users' research activities. Research groups migrate as a whole during a 1-month time period. The migration procedure has three steps:
+
 
+
* '''1 - Initiation of migration process'''
+
** Email notification of the user (mid-November).
+
** Create account on new cluster.
+
** Issue temporary credentials to the new cluster and request initial login to change password.
+
* '''2 - Rolling rsync of user data'''
+
** Will be repeated until update requires less than 2 hrs
+
*** ''/home/hpcXXXX ''
+
*** ''/u1/work/hpcXXXX''
+
*** ''/scratch/hpcXXXX if required''
+
*** other directories ''if required''
+
** Users can access both new and old systems for 1 month.
+
*** Data on the old system that are newer than on the new one are rsync'ed.
+
* '''3 - Final migration'''
+
** Final rsync.
+
** Jobs on old cluster are terminated.
+
** User access to old system closed.
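
The "rolling rsync" in step 2 can be pictured as a loop that repeats the transfer until one pass finishes within the 2-hour threshold. The following is only a minimal sketch of that idea, not the script used by the Centre: the function names, the <code>newcluster</code> host name, the <code>max_passes</code> limit and the choice of rsync options are all assumptions made for illustration.

```python
import subprocess
import time
from typing import Callable


def rsync_home(user: str, target_host: str = "newcluster") -> None:
    """Copy one home directory with rsync in archive mode.

    'newcluster' is a placeholder host name, not a real CAC machine.
    """
    subprocess.run(
        ["rsync", "-a", f"/home/{user}/", f"{target_host}:/home/{user}/"],
        check=True,
    )


def rolling_sync(
    sync_once: Callable[[], None],
    threshold_seconds: float = 2 * 60 * 60,
    max_passes: int = 10,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Repeat sync_once until a single pass takes less than the threshold.

    Returns the number of passes that were needed. The cap on passes is an
    assumption added here so that the loop cannot run forever.
    """
    for pass_number in range(1, max_passes + 1):
        start = clock()
        sync_once()
        elapsed = clock() - start
        if elapsed < threshold_seconds:
            return pass_number
    raise RuntimeError("sync did not converge within max_passes")
```

A call such as <code>rolling_sync(lambda: rsync_home("hpcXXXX"))</code> would then repeat the copy for one user; each later pass only has to move the data that changed since the previous one, which is why the passes get shorter.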
+
 
+
== Migration Q&A ==
+
 
+
* '''Q''': Who migrates ?
+
: '''A''': All of our users will migrate from the old SW cluster to the new "Frontenac" cluster
+
 
+
* '''Q''': Can I use my old "stuff" ?
+
: '''A''': Much of the old data and software will be usable on the new systems. However, the data will have to be copied over as the new systems use a separate file system, and cross access is not possible.
+
 
+
* '''Q''': Do I have to re-compile ?
+
: '''A''': It is possible that you will have to re-compile some of the software you are using. We will assist you with this.
+
 
+
* '''Q''': Do I copy my files over myself ?
+
: '''A''': Initially, we transfer your data for you. This synchronization process will end on December 15. If you are still altering your data after this date, it is your responsibility to transfer the data manually.
+
 
+
* '''Q''': Is this optional ?
+
: '''A''': No. We move both user data and hardware according to a schedule.
+
 
+
* '''Q''': Can I decide when to move ?
+
: '''A''': We are open to "early adopters", but we cannot grant extensions on the old systems.
+
 
+
* '''Q''': Will this disrupt my research ?
+
: '''A''': The moving of hardware and users causes unavoidable scheduling bottlenecks, as substantial portions of the clusters have to be kept inactive to "drain". Also, in the intermediate period when one cluster is dismantled and the other is being built up, both are substantially smaller. Especially larger jobs will be hard or impossible to schedule in the period between November'17 and February'18.
+
 
+
* '''Q''': How are resources allocated on the new cluster ?
+
: '''A''': Please read through our help file "[[Allocation|Resource Allocations on Frontenac]]"
+
 
+
== Help ==
+
If you have questions that you can't resolve by checking documentation, [mailto:cac.help@queensu.ca email to cac.help@queensu.ca].
+

Latest revision as of 16:23, 29 March 2018

Account Renewals

This is a guide to the account renewal process at the Centre for Advanced Computing. It explains what you need to do in order to keep your account active. It also explains details of our present effort to bring our accounts up to date and synchronize the account structure with the one at Compute Canada.

How to renew an account at the Centre for Advanced Computing

Almost all accounts that entitle users to access our Frontenac cluster require an active Compute Canada Role. The exceptions are temporary and teaching accounts. If you have a username starting with "hpc", you had to obtain Compute Canada credentials to get it. Compute Canada conducts annual renewals of their accounts/roles and will notify you of the deadline. Please follow those instructions. A separate renewal of your account at the Centre for Advanced Computing is not necessary as long as you keep the associated Compute Canada role active.

If you let your Compute Canada role expire without renewing it, it will be de-activated. The CAC account that is associated with this role will also be de-activated and you will lose access to your CAC account.

Find details about the Compute Canada renewal process at https://www.computecanada.ca/research-portal/account-management/account-renewals

Important: The next deadline for active Compute Canada accounts is April 23, 2018.
CAC accounts whose associated CCRI is not active after that date will be de-activated.

If your account at the Centre for Advanced Computing is de-activated because you failed to renew or activate the associated Compute Canada role, you can re-activate it by contacting us at cac.admin@queensu.ca with a request for re-activation. However, you must supply us with an active CCRI from Compute Canada, which means that you have to renew or re-activate your Compute Canada role first.

Account cleanup at CAC

In the spring of 2018, we are conducting a review of our accounts to bring them in line with Compute Canada's practice of regular renewals. We will contact users whose accounts are associated with inactive CCRIs with a request to participate in the annual Compute Canada account renewal to keep their accounts with us active.

Old accounts

Some of our users have accounts that are associated with Compute Canada roles that are not active anymore. In past years we have kept these accounts active unless a de-activation was requested or became necessary for other reasons. We cannot continue this practice, and will de-activate CAC accounts that are found to be associated with inactive CCRIs after April 23, 2018.

If you receive an email reminding you to renew or re-activate your Compute Canada role to maintain an active account with the Centre for Advanced Computing, please do so before the deadline set by Compute Canada for account renewal. We cannot make exceptions, as active accounts (roles) with Compute Canada are a pre-condition for an account with the CAC and are essential for proper accounting and usage reporting.

What if I don't have a Compute Canada account ?

In some rare cases, users may not have Compute Canada credentials. We will contact these users and direct them to apply for credentials. This is done through the Compute Canada Database Registration Page. Once you have obtained a CCRI (Compute Canada Role Identifier), please contact us at cac.admin@queensu.ca with that information, and we will link the role to your account. Make sure you apply for the right type of role: if you are a Principal Investigator (PI), apply for a role as faculty or another PI role; if you are "sponsored" by a PI, apply for one of the sponsored roles (student, post-doc, researcher, etc.) and provide the CCRI of your sponsor. It is important that this matches your account with us so we can link it.

What if my CCRI has expired / was de-activated ?

If you have a CCRI with Compute Canada, and it has expired, you will need to re-activate it or get a new one:

  • If the role has expired because you failed to renew it, but otherwise still reflects your current status, you need to log in to the Compute Canada database and follow the instructions for role renewal there. If you are a PI, this may require providing a "CCV" (Canadian Curriculum Vitae).
  • If your role was de-activated because the account (role) of your "sponsor" (PI) became inactive, you need to ask your sponsor to renew his or her role first.
  • If your role was de-activated because your current status changed and is no longer reflected by that role, you need to apply for a new role. If this is the case, once you have this role, you may have to re-apply for a different account with CAC as well, because it is likely that the old account is no longer appropriate for you. If you are in doubt, please contact cac.help@queensu.ca and ask. We will help you navigate the re-activation or account migration.

Important: Note that we will check the CCRI that we have associated with your account at CAC after the deadline on April 23, 2018. If it is found to be inactive, we will de-activate the CAC account as well. This is done automatically. If your CCRI has changed recently, please contact us at cac.help@queensu.ca so we can reflect the change in our records and avoid de-activation.
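
To make the rule concrete, the automatic check described above can be sketched as a small filter over account records. This is an illustration only and not the Centre's actual tooling: the record layout, the field names and the sample usernames and CCRI strings are all invented, and the exemption flag stands in for the temporary and teaching accounts that do not require a Compute Canada role.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Deadline taken from this page; everything else below is hypothetical.
RENEWAL_DEADLINE = date(2018, 4, 23)


@dataclass
class Account:
    username: str
    ccri: Optional[str]   # Compute Canada Role Identifier on record, if any
    ccri_active: bool     # status of that role in the Compute Canada database
    exempt: bool = False  # temporary or teaching account (no role required)


def accounts_to_deactivate(
    accounts: List[Account],
    today: date,
    deadline: date = RENEWAL_DEADLINE,
) -> List[str]:
    """Return usernames whose associated CCRI is inactive after the deadline."""
    if today <= deadline:
        # The check only takes effect once the deadline has passed.
        return []
    return [
        account.username
        for account in accounts
        if not account.exempt and not account.ccri_active
    ]
```

With made-up records, an account such as <code>Account("hpc0002", "abc-123-02", False)</code> would be listed when the check runs on April 24, 2018, while an account with an active role or an exempt teaching account would not.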

Why ?

Regular account renewals are necessary to keep account information up-to-date and to avoid issues with account conditions no longer applying. Since Compute Canada account credentials are a necessary condition for a default CAC account, we are using the renewal cycle of Compute Canada to test for the continued existence of this condition. This also makes it possible to conduct regular "syncs" between the Compute Canada database and ours, which is necessary for proper reporting of usage.