Known Issues
This is a collection of known issues that have been encountered on Polaris. Documentation will be updated as issues are resolved. Users are encouraged to email support@alcf.anl.gov to report issues.
Submitting Jobs
- For batch job submissions, if the parameters within your submission script do not meet the parameters of any of the execution queues (small, ..., backfill-large), you might not receive the "Job submission" error on the command line at all, and the job will never appear in the history shown by qstat -xu <username> (current bug in PBS). For example, if a user submits a script to the prod routing queue requesting 10 nodes for 24 hours, exceeding the "Time Max" of 6 hours of the small execution queue (which handles jobs with 10-24 nodes), then it may behave as if the job was never submitted.
- Job scripts are copied to temporary locations after qsub, and any changes to the original script while the job is queued will not be reflected in the copied script. Furthermore, qalter requires -A <allocation name> when changing job properties. Currently, there is a request for a qalter-like command to trigger a re-copy of the original script to the temporary location.
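Because a request that fits no execution queue may be dropped without any message, it can help to sanity-check the request before calling qsub. The sketch below is only an illustration: the function name fits_small_queue is hypothetical, and the limits (10-24 nodes, 6 hours) are copied from the example above, so they should be checked against the current queue documentation before being relied on. It accepts whole hours only.

```shell
#!/usr/bin/env bash
# Sketch: pre-submission check against the limits quoted in the example above
# (small execution queue: 10-24 nodes, "Time Max" of 6 hours).
# ASSUMPTION: these values are taken from this page and may be out of date.

# fits_small_queue NODES HOURS
# Returns 0 if the request fits the quoted limits, non-zero otherwise.
fits_small_queue() {
    local nodes="$1" hours="$2"
    [ "$nodes" -ge 10 ] && [ "$nodes" -le 24 ] && [ "$hours" -le 6 ]
}
```

For instance, `fits_small_queue 10 24 || echo "request exceeds the quoted limits"` would flag the 10-node, 24-hour request described above before it is submitted.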
Compiling & Running Applications
- If your job fails to start with an RPC launch message like the one below, please forward the complete messages to support@alcf.anl.gov.
- The message below is an XALT-related warning that can be ignored when running apptainer. For other commands, please forward the complete message to support@alcf.anl.gov so we are aware of your use case.
ssh'ing between Polaris Compute Nodes
- You should be able to ssh freely (without needing a password) between your assigned compute nodes on Polaris. If you are running into ssh issues, check for the following causes:
  - Your /home/<username> directory permissions should be set to 700 (chmod 700 /home/<username>).
  - Confirm the following files exist in your .ssh directory and the permissions are set to the following:
      -rw------- (600) authorized_keys
      -rw-r--r-- (644) config
      -rw------- (600) id_rsa
      -rw-r--r-- (644) id_rsa.pub
  - If you do not have the files mentioned above, you will need to create them.
    - You can generate an id_rsa file with the following command: ssh-keygen -t rsa
  - Copy the contents of your .ssh/id_rsa.pub file to .ssh/authorized_keys.
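The file checks above can be scripted. The following is a minimal sketch, not an official ALCF tool: the function name check_ssh_perms is hypothetical, and it assumes GNU stat (stat -c '%a'), as found on Linux systems. It only reports missing files or unexpected modes and changes nothing.

```shell
#!/usr/bin/env bash
# Sketch: report .ssh files that are missing or whose mode differs from the
# values listed above. ASSUMPTION: GNU stat is available (stat -c '%a').

# check_ssh_perms [DIR]
# DIR defaults to $HOME/.ssh. Returns 0 if every file exists with the
# expected mode, 1 otherwise.
check_ssh_perms() {
    local ssh_dir="${1:-$HOME/.ssh}"
    local status=0
    local entry name want have
    for entry in authorized_keys:600 config:644 id_rsa:600 id_rsa.pub:644; do
        name="${entry%%:*}"   # file name before the colon
        want="${entry##*:}"   # expected octal mode after the colon
        if [ ! -e "$ssh_dir/$name" ]; then
            echo "MISSING $name"
            status=1
            continue
        fi
        have="$(stat -c '%a' "$ssh_dir/$name")"
        if [ "$have" != "$want" ]; then
            echo "WRONG $name: mode $have, expected $want"
            status=1
        fi
    done
    return "$status"
}
```

Running `check_ssh_perms` with no argument inspects $HOME/.ssh; any reported file can then be corrected with chmod. The home directory mode itself (700) can be checked the same way with `stat -c '%a' /home/<username>`.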