Tunneling and Forwarding Ports

This section covers port forwarding, specifically for TensorBoard.

TensorBoard Port-Forwarding

This section describes how to set up port forwarding for applications, such as TensorBoard, that run on the SambaNova system and bind to one or more ports. This example uses port numbers 6006 and 16006. Choosing other port numbers may avoid collisions with other users.
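One possible way to choose less collision-prone ports is to derive them from a numeric user ID. The sketch below is only an illustration, not an ALCF convention: `pick_port` is a hypothetical helper, and the bases 6000 and 16000 are arbitrary choices modeled on the example ports.

```shell
# Hypothetical helper: derive a per-user port from a base and a numeric user ID.
# Usage: pick_port BASE USER_NUM  ->  prints BASE + (USER_NUM mod 1000)
pick_port() {
    echo $(( $1 + $2 % 1000 ))
}

# Compute the ports once, then reuse the same two numbers on every host.
# (id -u may differ between machines, so do not recompute it per host.)
REMOTE_PORT=$(pick_port 6000 "$(id -u)")    # port TensorBoard would bind to
LOCAL_PORT=$(pick_port 16000 "$(id -u)")    # port used for the forwarded side
echo "TensorBoard port: $REMOTE_PORT, forwarded port: $LOCAL_PORT"
```

If you use ports chosen this way, substitute them for 6006 and 16006 everywhere in the commands that follow.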

From your local machine

Run

ssh -v -N -f -L localhost:16006:localhost:16006 ALCFUserID@sambanova.alcf.anl.gov
...
Password: < MobilePASS+ code >

ssh ALCFUserID@sambanova.alcf.anl.gov
...
Password: < MobilePASS+ code >

replacing ALCFUserID with your ALCF User ID.

From sambanova.alcf.anl.gov

The commands below are specific to sm-01. Replace sm-01 with sm-02 if you are using that system.

NOTE: The full name, sm-01.ai.alcf.anl.gov, may also be used.

Run

ssh -N -f -L localhost:16006:localhost:6006 ALCFUserID@sm-01
ssh ALCFUserID@sm-01

On sm-01

Execute the following command:

ALCFUserID@sm-01:~$ source /software/sambanova/envs/sn_env.sh
(venv) ALCFUserID@sm-01:~$

Navigate to the appropriate directory for your model. Launch your model using srun or sbatch.

cd /path/to/your/project
sbatch --output=pef/my_model/output.log submit-my_model-job.sh

In Another sm-01 Terminal Window

The SambaNova system has a bash shell script to set up the required software environment. It sets up the SambaFlow software stack and the associated environment variables, and activates a pre-configured virtual environment.

Use

ALCFUserID@sm-01:~$ source /software/sambanova/envs/sn_env.sh
(venv) ALCFUserID@sm-01:~$

Navigate to the appropriate directory for your model and launch TensorBoard.

cd /path/to/your/project
tensorboard --logdir /logs --port 6006

Browser on Local Machine

Then, on your local machine, open http://localhost:16006 (in this example) in your browser.
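If the page does not load, it can help to test the forwarded port from a terminal first. The following is a minimal sketch, assuming `curl` is installed on your local machine; `check_tunnel` is a hypothetical helper name, not part of any ALCF or SambaNova tooling.

```shell
# Hypothetical helper: succeed if an HTTP server answers on the given local port.
# Usage: check_tunnel PORT
check_tunnel() {
    # -s: silent, -o /dev/null: discard the response body,
    # --max-time 5: give up after five seconds
    curl -s -o /dev/null --max-time 5 "http://localhost:$1/"
}

# Example (run on your local machine once both tunnels and TensorBoard are up):
#   check_tunnel 16006 && echo "TensorBoard reachable" || echo "no response on 16006"
```

A failure here suggests that one of the two ssh tunnels is not running or that TensorBoard is not listening on port 6006 on the node.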

Notes

Explanation of the ssh options:

-v : verbose mode; prints debugging messages about the connection

-N : do not execute a remote command

-f : put ssh in the background

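-L <machine1>:<portA>:<machine2>:<portB> :

The full command forwards connections made to <machine1>:<portA> (local scope) to <machine2>:<portB> (remote scope). <machine2> is resolved on the remote host, so localhost in that position refers to the host that ssh connects to.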
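To make the two hops concrete, the sketch below assembles both forwarding commands from variables and prints them for review instead of running them. The variable names (`USER_ID`, `NODE`, `LOCAL_PORT`, `REMOTE_PORT`, `hop1`, `hop2`) are hypothetical; the host names and ports are the ones used in this example.

```shell
# Illustrative values; all variable names here are hypothetical.
USER_ID="ALCFUserID"   # replace with your ALCF User ID
NODE="sm-01"           # or sm-02
LOCAL_PORT=16006       # port opened on your machine and on the login node
REMOTE_PORT=6006       # port TensorBoard binds to on the node

# Hop 1 (run on your local machine): local LOCAL_PORT -> login node LOCAL_PORT
hop1="ssh -N -f -L localhost:${LOCAL_PORT}:localhost:${LOCAL_PORT} ${USER_ID}@sambanova.alcf.anl.gov"

# Hop 2 (run on the login node): login node LOCAL_PORT -> NODE REMOTE_PORT
hop2="ssh -N -f -L localhost:${LOCAL_PORT}:localhost:${REMOTE_PORT} ${USER_ID}@${NODE}"

# Print the commands rather than executing them.
echo "$hop1"
echo "$hop2"
```

Chained together, a browser request to localhost:16006 on your machine travels to port 16006 on the login node and from there to port 6006 on the node. Because -f puts ssh in the background, both tunnels stay open until their ssh processes are terminated.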

Adapted from: How can I run Tensorboard on a remote server?