Known Issues
This is a collection of known issues that have been encountered on Polaris. Documentation will be updated as issues are resolved. Users are encouraged to email support@alcf.anl.gov to report issues.
Submitting Jobs
- For batch job submissions, if the parameters within your submission script do not meet the parameters of any of the execution queues (`small`, ..., `backfill-large`), you might not receive the "Job submission" error on the command line at all, and the job will never appear in the history `qstat -xu <username>` (current bug in PBS). For example, if a user submits a script to the `prod` routing queue requesting 10 nodes for 24 hours, exceeding the "Time Max" of 6 hours of the `small` execution queue (which handles jobs with 10-24 nodes), then it may behave as if the job was never submitted.
- Job scripts are copied to temporary locations after `qsub`, and any changes to the original script while the job is queued will not be reflected in the copied script. Furthermore, `qalter` requires `-A <allocation name>` when changing job properties. Currently, there is a request for a `qalter`-like command to trigger a re-copy of the original script to the temporary location.
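Because a request that matches no execution queue can be dropped silently, it may help to check the request before calling `qsub`. The following is a minimal sketch, not an ALCF tool: `to_seconds` and `fits_small_queue` are hypothetical helper names, and the only limits encoded are the two quoted above for the `small` queue (10-24 nodes, "Time Max" of 6 hours). Limits for other queues, and any minimum walltime, are not covered.

```shell
#!/usr/bin/env bash
# Sketch: pre-submission check against the `small` queue limits quoted above.
# Both function names are hypothetical and exist only for illustration.

# Convert a walltime written as HH:MM:SS into seconds.
to_seconds() {
    local h m s rest
    h="${1%%:*}"; rest="${1#*:}"
    m="${rest%%:*}"; s="${rest#*:}"
    # Strip one leading zero so "08" is not read as octal; empty means 0.
    h="${h#0}"; m="${m#0}"; s="${s#0}"
    echo $(( ${h:-0} * 3600 + ${m:-0} * 60 + ${s:-0} ))
}

# Return 0 if the request fits `small`, otherwise print the reason and return 1.
fits_small_queue() {
    local nodes="$1"      # requested node count
    local walltime="$2"   # requested walltime as HH:MM:SS
    local total
    total="$(to_seconds "$walltime")"

    if [ "$nodes" -lt 10 ] || [ "$nodes" -gt 24 ]; then
        echo "node count $nodes is outside 10-24"
        return 1
    fi
    if [ "$total" -gt $(( 6 * 3600 )) ]; then
        echo "walltime $walltime exceeds 6 hours"
        return 1
    fi
    echo "ok"
}
```

For the example above, `fits_small_queue 10 24:00:00` reports that the walltime exceeds 6 hours. A request that fails this check may still match a different execution queue, so treat a failure as a prompt to check the queue limits rather than as a definitive rejection.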
Compiling & Running Applications
- If your job fails to start with an `RPC launch` message like the one below, please forward the complete message to support@alcf.anl.gov.
- The message below is an XALT-related warning that can be ignored when running `apptainer`. For other commands, please forward the complete message to support@alcf.anl.gov so we are aware of your use case.
ssh'ing between Polaris Compute Nodes
- You should be able to `ssh` freely (without needing a password) between your assigned compute nodes on Polaris. If you are running into `ssh` issues, check for the following causes:
  - Your `/home/<username>` directory permissions should be set to `700` (`chmod 700 /home/<username>`).
  - Confirm the following files exist in your `.ssh` directory and the permissions are set to the following:
    - `-rw------- (600) authorized_keys`
    - `-rw-r--r-- (644) config`
    - `-rw------- (600) id_rsa`
    - `-rw-r--r-- (644) id_rsa.pub`
  - If you do not have the files mentioned above, you will need to create them.
    - You can generate an `id_rsa` file with the following command: `ssh-keygen -t rsa`
  - Copy the contents of your `.ssh/id_rsa.pub` file to `.ssh/authorized_keys`.
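The permission checks in this list can be run in one pass. The sketch below is an illustration, not an ALCF-provided script: `check_ssh_perms` is a hypothetical helper name, and it assumes GNU `stat` (the `-c '%a'` form), as found on typical Linux systems. It only reports problems and does not change any permissions.

```shell
#!/usr/bin/env bash
# Sketch: compare a home directory and its .ssh files with the modes listed
# above. check_ssh_perms is a hypothetical name; GNU `stat` is assumed.

check_ssh_perms() {
    local home_dir="$1"
    local status=0
    local entry path want have

    # Each entry is "expected-mode:path relative to the home directory".
    for entry in "700:." "600:.ssh/authorized_keys" "644:.ssh/config" \
                 "600:.ssh/id_rsa" "644:.ssh/id_rsa.pub"; do
        want="${entry%%:*}"
        path="$home_dir/${entry#*:}"
        if [ ! -e "$path" ]; then
            echo "MISSING $path"
            status=1
            continue
        fi
        have="$(stat -c '%a' "$path")"
        if [ "$have" != "$want" ]; then
            echo "MODE $path is $have, expected $want"
            status=1
        fi
    done
    return "$status"
}
```

Running `check_ssh_perms "$HOME"` prints one line per missing file or mismatched mode and returns a non-zero status if anything was reported. Each reported mode can then be corrected with `chmod`, for example `chmod 600 ~/.ssh/id_rsa`.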