The “Your "qrsh" request could not be scheduled, try again later.” message can appear either when you have requested resources that are not available or when the cluster is busy. Some possible causes and suggested steps are:
– Add the “-now n” option to your qrsh request. This will cause your qrsh to wait until resources become available on a compute node rather than timing out after 5 seconds.
– Make sure you are not making resource requests that cannot be met. For example, our largest compute nodes have 512GB of RAM, and those nodes are typically heavily used, so a request for more than 300GB of RAM will likely never be satisfied.
– Similarly, keep in mind that RAM is requested per core, so if you are requesting multiple cores along with RAM, you will need to divide your total RAM request by the number of cores requested. For example, if your job needs 160GB of RAM and 8 cores, you would add “-pe local 8 -l mem_free=20G,h_vmem=20G” to your qrsh request (160GB / 8 cores = 20GB per core). If you were to mistakenly use “-pe local 8 -l mem_free=160G,h_vmem=160G”, you would in effect be requesting 8 x 160GB = 1280GB (about 1.25TB) of RAM, which is more than any of our nodes has, so your request would never be satisfied.
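
As a rough illustration of the per-core arithmetic and the “-now n” option described above, here is a minimal shell sketch. The variable names (total_mem_gb, cores, per_core_gb, qrsh_cmd) are hypothetical and the values come from the 160GB / 8-core example. The script only prints the resulting qrsh command so you can check it before running it yourself.

```shell
#!/bin/sh
# Hypothetical job requirements (illustrative values, not site defaults).
total_mem_gb=160   # total RAM the job needs, in GB
cores=8            # number of cores requested with -pe local

# mem_free and h_vmem are requested per core, so divide the total by the
# core count. Round up so the job is not under-provisioned.
per_core_gb=$(( (total_mem_gb + cores - 1) / cores ))

# -now n asks qrsh to wait for resources instead of giving up quickly.
qrsh_cmd="qrsh -now n -pe local ${cores} -l mem_free=${per_core_gb}G,h_vmem=${per_core_gb}G"

# Print the command for review rather than executing it.
echo "$qrsh_cmd"
```

If the printed per-core value multiplied by the number of cores exceeds what a single node can realistically offer, the request is unlikely to be scheduled even with “-now n”.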