Argonne Leadership Computing Facility

Tunneling and Forwarding Ports

Port forwarding is covered here. This is specifically for TensorBoard.

TensorBoard Port-Forwarding

This section describes how to set up port forwarding for applications, such as TensorBoard, that run on the SambaNova system and bind to one or more ports. This example uses ports 6006 and 16006. Choosing different port numbers may help avoid collisions with other users.
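One way to choose a port that is less likely to collide with another user's is to derive it from your numeric user ID. The following is only a minimal sketch: pick_port is a hypothetical helper, not part of any ALCF or SambaNova tooling, and the base of 20000 and the range of 10000 are arbitrary assumptions.

```shell
#!/bin/sh
# Hypothetical helper (an illustration, not ALCF tooling): map a numeric
# user ID to a port in the range 20000-29999 so that different users tend
# to get different ports. The base and range are arbitrary choices.
pick_port() {
    uid="${1:-$(id -u)}"              # default to the current user's numeric ID
    echo $(( 20000 + uid % 10000 ))
}

# Print a candidate port for the current user.
pick_port
```

The printed value could then replace 16006 in the commands that follow. It does not guarantee that the port is free, so a bind error from ssh still means another port should be tried.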

From your local machine

Run

ssh -v -N -f -L localhost:16006:localhost:16006 ALCFUserID@sambanova.alcf.anl.gov
...
Password: < MobilePASS+ code >

ssh ALCFUserID@sambanova.alcf.anl.gov
...
Password: < MobilePASS+ code >

replacing ALCFUserID with your ALCF User ID.

From sambanova.alcf.anl.gov

Run

NOTE: The fully qualified name, sm-01.ai.alcf.anl.gov, may be used in place of sm-01.

ssh -N -f -L localhost:16006:localhost:6006 ALCFUserID@sm-01
ssh ALCFUserID@sm-01

On sm-01

Execute the following command:

ALCFUserID@sm-01:~$ source /software/sambanova/envs/sn_env.sh
(venv) ALCFUserID@sm-01:~$

Navigate to the appropriate directory for your model. Launch your model using srun or sbatch.

cd /path/to/your/project
sbatch --output=pef/my_model/output.log submit-my_model-job.sh

On Another sm-01 Terminal Window

The SambaNova system provides a bash shell script to set up the required software environment. It sets up the SambaFlow software stack and the associated environment variables, and activates a pre-configured virtual environment.

Use

ALCFUserID@sm-01:~$ source /software/sambanova/envs/sn_env.sh
(venv) ALCFUserID@sm-01:~$

Navigate to the appropriate directory for your model and start TensorBoard on port 6006.

cd /path/to/your/project
tensorboard --logdir /logs --port 6006

Browser on Local Machine

Then, on your local machine, open http://localhost:16006 (the port used in this example) in your browser.
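If the page does not load, it can help to check from a local terminal whether anything is answering on the forwarded port. The sketch below assumes curl is installed on your local machine; check_tunnel is a hypothetical helper name used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if something answers HTTP
# on the given local port, and fail otherwise.
# Assumes curl is available on the local machine.
check_tunnel() {
    port="${1:-16006}"                # default to the port used in this example
    curl --silent --output /dev/null --max-time 5 "http://localhost:${port}/"
}
```

For example, running check_tunnel 16006 && echo "tunnel is up" prints the message only when both forwarding hops and TensorBoard itself are running.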

Notes

Explanation of ssh command:

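-v : verbose output (optional; useful for debugging the connection)

-N : do not execute a remote command; only forward ports

-f : send ssh to the background once authentication has completed

-L <machine1>:<portA>:<machine2>:<portB> :

Listen on <machine1>:<portA> on the machine where ssh is run (local scope) and forward each connection to <machine2>:<portB> as seen from the remote host (remote scope)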
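To make the two-hop chain in this example concrete, the sketch below prints (but does not run) the forwarding command for each hop. forward_cmd is a hypothetical helper introduced only for illustration; the ports and host names are the ones used above.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the ssh command for one hop.
#   $1 = port to listen on where ssh is run (local scope)
#   $2 = port to connect to on the remote host (remote scope)
#   $3 = user@host to log in to
forward_cmd() {
    local_port="$1"
    remote_port="$2"
    target="$3"
    echo "ssh -N -f -L localhost:${local_port}:localhost:${remote_port} ${target}"
}

# Hop 1, run on the local machine: local 16006 -> login node 16006
forward_cmd 16006 16006 ALCFUserID@sambanova.alcf.anl.gov
# Hop 2, run on the login node: login node 16006 -> sm-01 6006 (TensorBoard)
forward_cmd 16006 6006 ALCFUserID@sm-01
```

Because -f leaves each ssh process running in the background, the forwarding stays active after the command returns. When you are finished, find the process ID with ps and stop it with kill.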

Adapted from: How can I run Tensorboard on a remote server?