Dear JHPCE community,
We are pleased to announce the availability of a new high-speed scratch storage device on the JHPCE cluster. The device is a 22TB NVMe-based network-attached storage device. It is designed as a personal staging area for high-speed I/O and file transfers, or for cases where multiple jobs access the same data/files repeatedly.
You can access your Personal Scratch space by using the $MYSCRATCH environment variable from within a qrsh session, or within a job submitted via qsub. The actual absolute path to your Personal Scratch space is /fastscratch/myscratch/$USER.
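As a minimal sketch of the access pattern described above, the shell function below confirms that the scratch directory is visible before you use it. The function name scratch_dir is hypothetical, and the fallback simply mirrors the absolute path given above; it assumes you are already inside a qrsh session or a qsub job.

```shell
# Hypothetical helper: print the Personal Scratch path, or fail if it is not visible.
scratch_dir() {
    # Prefer $MYSCRATCH; fall back to the documented absolute path.
    _sd="${MYSCRATCH:-/fastscratch/myscratch/${USER:-$(id -un)}}"
    if [ ! -d "$_sd" ]; then
        echo "Scratch space not visible at $_sd (are you on a login node?)" >&2
        return 1
    fi
    printf '%s\n' "$_sd"
}
```

For example, running cd "$(scratch_dir)" changes into your Personal Scratch space, and prints an error instead if the directory cannot be seen from the current node.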
Due to the shared nature of the space, and the cost of the array, there are some very important restrictions for using this Personal Scratch space.
- There is currently a 1TB quota set on the Personal Scratch space, though we reserve the right to adjust this quota at a later date should it become necessary.
- All files older than 30 days will be mercilessly removed without exception from the Personal Scratch space area. Additionally, any overly large use/abuse of this space may result in files being deleted on an as-needed basis. This Personal Scratch space is meant to be a short-term storage location; it is not a long-term storage solution.
- Because of this, if you “untar” or “unzip” a file, and the extracted files have a timestamp older than 30 days, they will be removed when the daily purge begins. To work around this, you can use the Unix “touch” command to update the timestamp on the extracted files.
- Even though data is automatically deleted after 30 days, we ask that you please remove data from your Personal Scratch space once you have finished using it.
- The Fast Scratch space is not visible on the jhpce01/jhpce02 login nodes. You will need to a) qrsh into a compute node or a transfer node, or b) qsub a job in order to see this scratch space.
- There is no cost for using your Personal Scratch space.
- We are not changing the location of the $TMPDIR space used by Sun Grid Engine (SGE). Your jobs can still use the existing $TMPDIR as always, and this $TMPDIR will remain on the local internal disk space of each compute node.
- Any exceptions to the above stated policies must be approved by the Director of the JHPCE cluster, Dr. Fernando Pineda. Please note that Dr. Pineda will not approve any exceptions.
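The "touch" workaround mentioned in the list above could look something like the following sketch. The function name extract_fresh and its arguments are hypothetical. It extracts a tar archive and then resets the modification time of every regular file under the destination, so that the 30-day clock starts at extraction time rather than at the original file dates.

```shell
# Hypothetical helper: extract a tar archive and refresh file timestamps.
# Usage: extract_fresh ARCHIVE DEST_DIR
extract_fresh() {
    _archive="$1"
    _dest="$2"
    mkdir -p "$_dest" || return 1
    # tar restores the original (possibly old) modification times by default.
    tar -xf "$_archive" -C "$_dest" || return 1
    # Set the modification time of every regular file under DEST_DIR to "now".
    # Note: this also touches files that were already in DEST_DIR.
    find "$_dest" -type f -exec touch {} +
}
```

If your version of tar supports it (GNU tar does), the -m option achieves a similar result by not restoring the archived modification times in the first place.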
Some examples of jobs that could benefit from the use of the Personal Scratch space would be:
- Downloading data from an external site into the JHPCE cluster that is only needed temporarily for a job.
- Staging data that will be used by multiple jobs on multiple nodes. The Personal Scratch space is significantly faster than, say, one’s home directory, so there may be a significant performance benefit from staging frequently used data in one’s Personal Scratch space.
- A temporary location for storing intermediate files that a program creates but that are not needed long term. For an SGE job, $TMPDIR is still the recommended location for small intermediate files, as it is cleaned up once the job completes; for larger files, you should use the $MYSCRATCH directory.
With all of these examples, please bear in mind that any data that you wish to retain long term will need to be copied to some other long-term location (your home directory, or space on DCS/DCL).
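To illustrate how these pieces might fit together in a job, here is a rough sketch rather than a recommended template. The function name stage_and_run, the staged_input directory, and the output file name are all hypothetical, and the sort/uniq step is only a placeholder for a real analysis. It assumes $MYSCRATCH and $TMPDIR are set, as they would be inside a qrsh session or a job submitted via qsub.

```shell
# Hypothetical job body: stage input in scratch, use $TMPDIR for small
# intermediates, and copy the result to a long-term location.
# Usage: stage_and_run INPUT_FILE RESULTS_DIR
stage_and_run() {
    _input="$1"      # long-term copy of the input (e.g. in your home directory)
    _results="$2"    # long-term directory where results should end up
    _stage="$MYSCRATCH/staged_input"
    _staged="$_stage/$(basename "$_input")"
    mkdir -p "$_stage" "$_results" || return 1
    # Stage the input only if it is not already there, so later jobs reuse it.
    # Assumes one job finishes staging before the others start.
    [ -e "$_staged" ] || cp "$_input" "$_stage/" || return 1
    # Small intermediate file goes in $TMPDIR, which SGE cleans up after the job.
    sort "$_staged" > "$TMPDIR/sorted.tmp" || return 1
    # Placeholder analysis: count distinct lines, writing the result
    # directly to long-term storage rather than leaving it in scratch.
    uniq "$TMPDIR/sorted.tmp" | wc -l > "$_results/distinct_lines.txt"
}
```

In a real job, the body of this function would sit in the script you submit with qsub, and anything left under $MYSCRATCH should be removed once no other job needs it.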
One last technical note… In order to optimize performance and capacity of the Personal Scratch space, the storage array was set up without any redundancy. What this means is that if one of the NVMe drives fails, the Personal Scratch space may become unavailable and data stored on it may be lost. While we do not expect this to happen any time soon, it is a possibility. All other storage on the JHPCE cluster has several layers of redundancy in place to minimize the impact of a hardware failure.
We hope that this Personal Scratch space will help in situations where there is a need to store a large amount of data for a short period of time. Please let us know if you have any questions.
Mark