STORAGE OVERVIEW¶
Types of storage¶
There are three types of data storage spaces available on the JHPCE cluster.
Type | Example Path | Quota? | Use | Cost |
---|---|---|---|---|
Home directory | /users/USERID | 100GB | Small datasets, Programs, Applications | $350/TB/yr - max $35/yr if 100GB used |
Project space | /dcs0?/*grpname*/data | Purchased Allocation | Research data | Between $25/TB/yr and $40/TB/yr |
Scratch space | $MYSCRATCH | 1TB | Temporary files | Free |
Home directories¶
All users have access to their own personal home directory space. The path to your home directory space is /users/USERID. By default, only you have access to your home directory.
Home directories can be used for storing programs you write, data that you will be working with, or applications that you need to run. In a Linux shell, your home directory can be referred to by ~ or $HOME.
All home directories have a 100 GB quota. You can see how much space you are using by running the "hquota" command. This information is also shown to you each time you log in to the cluster. If you find that you need more space than the 100GB provided, please email us at bitsupport@lists.jhu.edu, and we will work with you to find additional available space. This may include using your 1TB of temporary scratch space.
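As a rough cross-check against the 100 GB quota, the following Python sketch sums the sizes of regular files under a directory. It is only an illustration: "hquota" remains the authoritative tool, and the function name `directory_usage_gb` and the decimal convention (1 GB = 10^9 bytes) are assumptions made for this example. The result is the apparent file size, which may differ from what the storage system actually charges against your quota.

```python
import os
import stat

HOME_QUOTA_GB = 100  # home directory quota stated above


def directory_usage_gb(root):
    """Return the total apparent size of regular files under root, in GB (10**9 bytes)."""
    total_bytes = 0
    # os.walk does not descend into symlinked directories by default
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symbolic links
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode):
                total_bytes += st.st_size
    return total_bytes / 1e9


def quota_fraction(used_gb, quota_gb=HOME_QUOTA_GB):
    """Return usage as a fraction of the quota (1.0 means the quota is full)."""
    return used_gb / quota_gb
```

Calling `directory_usage_gb(os.path.expanduser("~"))` walks your whole home directory, which can take a while if it holds many small files.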
Home directories are backed up, but other storage areas may not be. If you are generating unique data in another directory, you should copy it to your home directory or copy it off of the JHPCE cluster.
For home directory space you are only charged for the actual storage you are using at a rate of $0.35/GB/yr. As you add or remove files, your charge will increase or decrease based on your usage.
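The arithmetic behind this charge can be sketched in a few lines of Python. The rate and the quota come from the text above; the function name `home_annual_charge` is hypothetical, and capping usage at 100 GB assumes your quota has not been raised.

```python
HOME_RATE_PER_GB_YEAR = 0.35  # dollars per GB per year, from the rate quoted above
HOME_QUOTA_GB = 100           # home directory quota


def home_annual_charge(used_gb):
    """Estimated yearly charge in dollars for used_gb of home directory storage."""
    if used_gb < 0:
        raise ValueError("used_gb must be non-negative")
    # Assumption: usage cannot exceed the standard quota
    billable_gb = min(used_gb, HOME_QUOTA_GB)
    return round(billable_gb * HOME_RATE_PER_GB_YEAR, 2)


for gb in (10, 40, 100):
    print(f"{gb:>3} GB -> ${home_annual_charge(gb):.2f}/yr")
```

At the full 100 GB the estimate is $35.00 per year, which matches the maximum shown in the storage types table.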
Project Spaces¶
Every 12 to 18 months, a new large storage array is purchased for the JHPCE cluster, and allocations on it are sold to the PIs or groups that need storage space. As part of the planning process for bringing a new storage array online, we reach out to all active PIs on the cluster and survey them for their expected storage needs. We then size the new storage array based on those needs.
Our last large storage build was in 2023, for the DCS07 storage array. We do still have some unsold capacity on this array, so please reach out to us at bitsupport@lists.jhu.edu if you have a need for additional storage.
As of 2024-05-01, the current project storage arrays in place are:
Storage Name | Year Built | # of Disks | Disk Size | # of JBODs | Useable Space | Cost | Cost per TB |
---|---|---|---|---|---|---|---|
DCL02 | 2018 | 440 | 8TB | 10 | 2.4PB | $164,870.14 | $66.57 |
DCS04 | 2020 | 720 | 12TB | 10 | 5.0PB | $202,451.29 | $40.45 |
DCS05 | 2021 | 480 | 20TB | 8 | 6.2PB | $240,881.36 | $38.83 |
DCS06 | 2021 | 16 | 7.68TB | 1 | 88TB | $34,207.00 | $305.17 |
DCS07 | 2023 | 300 | 22TB | 5 | 4.8PB | $145,453.99 | $30.61 |
Other Storage Arrays currently in use on the JHPCE cluster:
Storage Name | Use | Year Built | # of Disks | Disk Size | # of JBODs | Useable Space | Cost | Cost per TB |
---|---|---|---|---|---|---|---|---|
DCS02 | /home,/jhpce,/legacy | 2016 | 40 | 6TB | 1 | 172TB | $21,168.50 | $122.50 |
DCS03 | Backups | 2017 | 450 | 4TB,6TB | 10 | 2.1PB | $136,919.94 | $62.55 |
Fastscratch | Scratch | 2018 | 24 | 1TB | 1 | 24TB | $17,983.45 | $749.29 |
These storage arrays are built on the Dirt Cheap Storage and Dirt Cheap Lustre architecture, hence the "DCS" and "DCL" names, as described in this paper from 2013. The storage arrays have historically been built on commodity servers and large JBODs (Just a Bunch Of Disks) from Supermicro. They run ZFS, with Lustre on top of it in the case of the DCL arrays.
Scratch Space¶
We have a "Fastscratch" storage array that can be used by all users for storing files for less than 30 days. This storage array is in place to provide an additional 1TB of temporary storage space beyond the 100GB of home directory space. It is not for long-term storage of data, and all files older than 30 days are purged from this "Fastscratch" space. More information about using Fastscratch, including important restrictions, can be found here.
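Because files older than 30 days are purged, it can help to list files that are approaching that age before they disappear. The sketch below does this in Python. It assumes that file age is measured by modification time, which this page does not state, and the function names and the 25-day warning threshold are hypothetical choices for illustration.

```python
import os
import stat
import time

PURGE_AGE_DAYS = 30      # purge threshold stated above
SECONDS_PER_DAY = 86400


def is_at_risk(mtime, now, warn_days):
    """True if a file last modified at mtime is at least warn_days old at time now."""
    return (now - mtime) >= warn_days * SECONDS_PER_DAY


def files_at_risk(root, warn_days=25, now=None):
    """Yield (path, age_in_days) for regular files under root at least warn_days old."""
    now = time.time() if now is None else now
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # do not follow symbolic links
            except OSError:
                continue  # file vanished or is unreadable; skip it
            # Assumption: age is judged by modification time (st_mtime)
            if stat.S_ISREG(st.st_mode) and is_at_risk(st.st_mtime, now, warn_days):
                yield path, (now - st.st_mtime) / SECONDS_PER_DAY
```

A typical call would be `files_at_risk(os.environ["MYSCRATCH"])`, using the $MYSCRATCH variable listed in the storage types table. Anything it reports that you still need should be copied to your home directory or project space.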
Backing up storage¶
You need to ensure that you have copies of your most vital files located somewhere else. See this document for more information.
Encrypted filesystem¶
Encrypted filesystems are used to provide “Encryption At Rest”, meaning that the data on disk is stored in an encrypted format and is only available in an unencrypted state when it is accessed by an approved user. This may be desirable when working with more sensitive data sources, or where Data Use Agreements require “Encryption At Rest”.
The JHPCE Cluster currently supports the following mechanisms for providing encrypted filesystems:
- Userspace encrypted filesystems using encfs. See ENCFS on JHPCE.
- If need be, a Project Storage space can be encrypted with ZFS encryption.
- The DCL02 storage array is built on encrypted disk devices.
Deprecated Storage Arrays:¶
For historical purposes, these are the storage arrays that have been used over time on the JHPCE cluster (information deemed reliable but not guaranteed):
Storage Name | Year Built | Year Decom. | # of Disks | Disk Size | # of JBODs | Total Useable Space | Cost | Cost per TB |
---|---|---|---|---|---|---|---|---|
DCL01+exp | 2015 | 2023 | 440 | 8TB | 20 | 3.4PB | $164,870.14 | $66.57 |
DCS01 | 2013 | 2021 | 360 | 3TB | 8 | 688TB | $109,961.00 | $159.82 |
amber03 | 2012 | 2020 | 72 | 2TB | 2 | 100TB | $64,861.00 | $648.61 |
amber02 | 2011 | 2016 | 24 | 1TB | 1 | 16TB | $14,730.00 | $920.62 |
dexter | 2011 | 2016 | 12 | 1TB | 1 | 30TB | $13,690.00 | $456.33 |
thumper02 | 2010 | 2016 | 24 | 500GB | 1 | 16TB | $21,025.00 | $1,314.06 |
amber01 (/home) | 2009 | 2016 | 96 | 500GB+1TB | 2 | 72TB | $92,984.00 | $1,291.44 |
nexsan2 | 2009 | 2016 | 12 | 2TB | 1 | 12TB | $14,436.00 | $1,203.00 |
thumper01 | 2008 | 2016 | 24 | 500GB | 1 | 16TB | $17,079.00 | $1,067.00 |
nexsan1 | 2006 | 2016 | 12 | 1TB | 1 | 6TB | $19,060.00 | $3,176.66 |