- Compute per Project
- Storage per Project
- Backup and Restore
- Availability, Maintenance, and Unplanned Disruptions
- Monitoring and Arbitration
- Support and Documentation
Compute per Project
Storage per Project
- 50GB of resilient, backed-up personal home space.
- In addition, users are allocated a quota of 1TB scratch space for working storage.
- Additional Vault storage may be available by negotiation. Vault facilitates longer-term storage on Maxwell. It is useful when projects need to repeatedly use the same data over time.
Backup and Restore
- The scheduler balances the availability of slots among all users to permit fair access to the system.
- It considers the specific requirements of each job (e.g., number of CPUs, amount of RAM, job duration, and node affinity requirements) and prioritisation.
- The scheduler starts queued jobs as space becomes available and can be set to advise users of job status by email.
- Larger jobs requiring more time and resources are more difficult to schedule, so it is to the advantage of all users to make sure the requested resource is as accurate as possible.
- When insufficient memory is requested for a job, the job will not run and will need to be rescheduled with more memory requested.
- Where more memory is requested than is used, the full requested amount is still allocated to the user, because the reserved memory is blocked and cannot be used by other jobs.
- The default runtime of any job is 24 hours.
- If more time is needed this must be explicitly stated.
- Less time can also be requested.
- When insufficient time is allocated to a job, the job will be stopped when the allocated time has elapsed and will need to be rescheduled with more time requested.
- Where the actual time used is less than the time requested, only the actual time will be attributed to a user’s account.
- Interactive jobs can run only when the requested resources (e.g. CPUs and memory) are immediately available on Maxwell.
- Once a job is scheduled, its details can be viewed on Maxwell with the ‘squeue’ command, which reports the status of a running job or the priority of a queued job.
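
The points above about requesting accurate CPU, memory, and time limits can be illustrated with a small sketch. It assumes the scheduler is Slurm (suggested by the ‘squeue’ command, but not stated on this page) and uses standard `#SBATCH` options. The function name `build_batch_script`, its defaults other than the 24-hour runtime, and the example email address are hypothetical and are not Maxwell settings.

```python
def build_batch_script(command, cpus=1, mem_gb=4, hours=24, email=None, job_name="example"):
    """Return the text of a Slurm batch script that requests explicit resources.

    Assumption: Maxwell accepts standard Slurm #SBATCH directives.
    The 24-hour default mirrors the default runtime stated above.
    """
    if cpus < 1 or mem_gb < 1 or hours <= 0:
        raise ValueError("cpus and mem_gb must be >= 1 and hours must be > 0")

    # Convert the requested hours into Slurm's days-hours:minutes:seconds format.
    total_minutes = round(hours * 60)
    days, remainder = divmod(total_minutes, 24 * 60)
    hh, mm = divmod(remainder, 60)
    time_limit = f"{days}-{hh:02d}:{mm:02d}:00"

    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --cpus-per-task={cpus}",   # number of CPUs for the job
        f"#SBATCH --mem={mem_gb}G",          # memory reserved for the job
        f"#SBATCH --time={time_limit}",      # job is stopped when this elapses
    ]
    if email:
        # Ask the scheduler to send job status updates by email.
        lines.append("#SBATCH --mail-type=BEGIN,END,FAIL")
        lines.append(f"#SBATCH --mail-user={email}")
    lines += ["", command]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical job: 4 CPUs, 16 GB of memory, 2.5 hours.
    print(build_batch_script("python analysis.py", cpus=4, mem_gb=16,
                             hours=2.5, email="user@example.org"))
```

Requesting close to what the job actually needs keeps it easier to schedule, while leaving a small margin on memory and time reduces the chance of having to resubmit.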
Availability, Maintenance, and Unplanned Disruptions
- Planned maintenance will be communicated in advance to all users and will be scheduled to cause minimum disruption.
- Every effort will be made to ensure there are no unplanned disruptions to the service. Where events, either internal or external to the HPC, do cause disruption, every effort will be made by Digital Research to restore service as quickly as possible. This may involve work with our suppliers.
Monitoring and Arbitration
- The HPC uses a queuing algorithm which prioritises funded projects.
Costs must be included in Worktribe grant applications:
- £100 minimum per funded project (HPC account, set-up support, and up to 1,000 CPU core hours)
- 10p per core hour for compute
- plus 10p per core hour per GPU (where GPU required)
- £400 per day for additional support (e.g. installation and troubleshooting of bespoke applications)
- Additional storage by negotiation.
- Unfunded PGR projects
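
As a rough aid for grant applications, the funded-project charges listed above can be turned into a simple estimate. This is a sketch under stated assumptions, not an official calculator: it assumes the £100 minimum covers the first 1,000 core hours, that further core hours cost 10p each, and that the GPU surcharge of 10p per core hour per GPU applies to every core hour of the job. Storage is excluded because it is set by negotiation. The function name `estimate_cost` is hypothetical.

```python
MIN_CHARGE_GBP = 100.0        # minimum per funded project
INCLUDED_CORE_HOURS = 1000    # assumed to be covered by the minimum charge
CPU_RATE_GBP = 0.10           # 10p per core hour
GPU_RATE_GBP = 0.10           # 10p per core hour per GPU
SUPPORT_DAY_RATE_GBP = 400.0  # per day of additional support


def estimate_cost(core_hours, gpus=0, support_days=0):
    """Estimate the charge in pounds for a funded project.

    Assumptions (not confirmed by the policy text): the minimum charge
    includes the first 1,000 core hours, and the GPU surcharge applies
    to all core hours, including those covered by the minimum.
    """
    if core_hours < 0 or gpus < 0 or support_days < 0:
        raise ValueError("inputs must be non-negative")

    extra_hours = max(0, core_hours - INCLUDED_CORE_HOURS)
    compute = MIN_CHARGE_GBP + extra_hours * CPU_RATE_GBP
    gpu = core_hours * gpus * GPU_RATE_GBP
    support = support_days * SUPPORT_DAY_RATE_GBP
    return round(compute + gpu + support, 2)


if __name__ == "__main__":
    # Hypothetical project: 5,000 core hours, one GPU, one support day.
    print(f"Estimated cost: £{estimate_cost(5000, gpus=1, support_days=1):.2f}")
```

Because only the actual time used is attributed to a user's account, an estimate based on requested time would be an upper bound on the compute charge.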
Support and Documentation