Instructions for gpt-neox:
The instructions below describe how to get EleutherAI/gpt-neox running on Polaris.
A batch submission script for the following example is available here.
Warning
The instructions below should be run directly from a compute node.
Explicitly, to request an interactive job (from polaris-login):
Refer to job scheduling and execution for additional information.
- Load and activate the base conda environment:
- We've installed the requirements for running gpt-neox into a virtual environment. To activate this environment:
- Clone the EleutherAI/gpt-neox repository if it doesn't already exist:
- Navigate into the gpt-neox directory:
  Note
  The remaining instructions assume you're inside the gpt-neox directory.
- Create a DeepSpeed-compliant hostfile (each line is formatted as hostname slots=N):
- Create a .deepspeed_env file to ensure a consistent environment across all workers:
- Prepare data:
- Train:
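The hostfile and .deepspeed_env steps above can be sketched as follows. This is a minimal illustration, not the commands from the original page: the helper names make_hostfile and make_deepspeed_env are hypothetical, slots=4 assumes one slot per GPU on a four-GPU node, and PATH and LD_LIBRARY_PATH are only example variables to propagate. It relies on two things that are standard: PBS sets PBS_NODEFILE inside a job, and the DeepSpeed launcher reads VAR=VALUE lines from a .deepspeed_env file in the current directory or the home directory.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and default values are illustrative assumptions.

# Write a DeepSpeed hostfile ("hostname slots=N" per line) from a node list.
# $1: file with one hostname per line, $2: output path, $3: slots per node.
make_hostfile() {
    local nodefile="$1" out="$2" slots="${3:-4}"
    # Skip blank lines and repeated hostnames, keep the original node order,
    # and append the slot count to each remaining hostname.
    awk -v s="$slots" 'NF && !seen[$0]++ { print $0 " slots=" s }' \
        "$nodefile" > "$out"
}

# Write VAR=VALUE lines for the named environment variables.
# $1: output path, remaining arguments: variable names to record.
make_deepspeed_env() {
    local out="$1" name value
    shift
    : > "$out"  # create or truncate the file
    for name in "$@"; do
        # printenv fails for variables that are not in the environment,
        # so those names are skipped.
        if value="$(printenv "$name")"; then
            printf '%s=%s\n' "$name" "$value" >> "$out"
        fi
    done
}

# Inside a PBS job, PBS_NODEFILE points at the list of allocated hosts.
if [ -n "${PBS_NODEFILE:-}" ]; then
    make_hostfile "$PBS_NODEFILE" hostfile 4
    make_deepspeed_env .deepspeed_env PATH LD_LIBRARY_PATH
fi
```

Run from inside the gpt-neox directory on a compute node, this leaves a hostfile and a .deepspeed_env in the current directory. Outside a PBS job the final block does nothing, so the helpers can be tried on a hand-written node list first.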