Project Layout¶
Overview¶
This page summarizes the Balsam architecture at a high level, in terms of the roles that the various modules play. Here are the files and folders you'll find right under balsam/:

balsam.schemas
: The source of truth on the REST API schema. This defines the various resources that are imported by the FastAPI code in balsam.server. FastAPI uses this to generate the OpenAPI schema and interactive documentation at /docs. Moreover, the user-facing SDK in balsam/_api/models.py is dynamically generated from the schemas herein. Running make generate-api (which is invoked during make all) will re-generate _api/models.py from the schemas.

balsam.server
: The backend REST API server code that is deployed using Gunicorn, alongside PostgreSQL, to host the Balsam web service. Virtually all User- or Site-initiated actions via balsam.api will ultimately create an HTTPS request in balsam.client that gets handled by this server.

balsam.client
: Implementation of the lower-level HTTP clients that are used by the Managers of balsam._api. Auth and retry logic live here. This is what ultimately talks to the server.

balsam._api
: Implementation of a Django ORM-like library for interacting with the Balsam REST API (e.g. look here to understand how Job.objects.filter() is implemented).

balsam.api
: Public re-export of the user-facing resources in _api.

balsam.analytics
: User-facing helper functions to analyze Job statistics.

balsam.cmdline
: The command line interfaces.

balsam.config.config
: Defines the central ClientSettings class, which loads credentials from ~/.balsam/client.yml, as well as the Settings class, which loads Site-local settings from settings.yml. To understand the magic in these classes, look at Pydantic settings management.

balsam.config.site_builder
: Does the heavy lifting for balsam site init.

balsam.config.defaults
: The default settings for the various HPC platforms that Balsam supports out of the box. Add new defaults here to extend the list of systems that users are prompted to select from in balsam site init.

balsam.platform
: Defines generic interfaces to compute nodes, MPI app launchers, and schedulers. Concrete subclasses implement the platform-specific interfaces (e.g. the Theta aprun launcher or the Slurm scheduler).

balsam.shared_apps
: Pre-packaged ApplicationDefinitions that users can import, subclass, and deploy to their Sites.

balsam.site
: The implementation of the Balsam Site Agent and the Launcher pilot jobs that it ultimately dispatches. Thanks to the generic platform interfaces, all the heavy-lifting code here is totally platform-agnostic: if you find yourself modifying code in here to accommodate new systems, you're doing it wrong!

balsam.util
: Logging, signal handling, datetime parsing, and miscellaneous helpers.
balsam.schemas
¶
Pydantic is used to define the REST API data structures, (de-)serialize HTTP JSON payloads, and perform validation. The schemas under balsam.schemas
are used both by the user-facing balsam._api
classes and the backend balsam.server.routers
API. Thus when an update to the schema is made, both the client and server-side code inherit the change.
The script schemas/api_generator.py is invoked to re-generate balsam/_api/models.py whenever the schema changes. Thus, users benefit from always-up-to-date docstrings and type annotations across the Balsam SDK, while the implementation is handled by the internal Model, Manager, and Query classes in balsam/_api.
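As a toy illustration of this generation step (hypothetical names and field specs; the real schemas/api_generator.py emits much richer Pydantic-backed models), a generator can turn a schema's field list into Python source:

```python
# Toy sketch of schema-driven code generation (hypothetical names; the
# real generator is schemas/api_generator.py and targets _api/models.py).
from dataclasses import dataclass

@dataclass
class FieldSpec:
    name: str
    type_name: str

def generate_model_source(class_name: str, fields: list) -> str:
    """Emit Python source for a model class with typed attributes."""
    lines = [f"class {class_name}:"]
    for f in fields:
        lines.append(f"    {f.name}: {f.type_name}")
    return "\n".join(lines) + "\n"

source = generate_model_source(
    "Job", [FieldSpec("workdir", "str"), FieldSpec("state", "str")]
)
print(source)
```

Because the generated file is checked in and rebuilt by make generate-api, client code always matches the schema without hand-editing.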
balsam.server
¶
This is the self-contained codebase for the API server, implemented with FastAPI and SQLAlchemy. We do not expect Balsam users to ever run or touch this code, unless they are interested in standing up their own server.
- server/conf.py contains the server settings class, which loads server settings from the environment using Pydantic.
- server/main.py imports all of the API routers which define the HTTP endpoints for /auth/, /jobs, etc.
- server/auth contains authentication routes and logic.
    - __init__.py::build_auth_router uses the server settings to determine which login methods will be exposed under the API /auth URLs.
    - authorization_code_login.py and device_code_login.py comprise the OAuth2 login capability.
    - db_sessions.py defines the get_admin_session and get_webuser_session functions that manage database connections and connection pooling. The latter can be used to obtain user-level sessions with RLS (row-level security).
    - token.py has the JWT logic, which is used on every single request (not just login!) to authenticate the client request prior to invoking the FastAPI route handler.
- server.main defines the top-level URL routes into views located in balsam.server.routers.
- balsam.server.routers defines the possible API actions. These routes ultimately call into various methods under balsam.server.models.crud, where the business logic is defined.
- balsam.server.models encapsulates the database and any actions that involve database communication.
    - alembic contains the database migrations
    - crud contains the business logic invoked by the various FastAPI routes
    - tables.py contains the SQLAlchemy model definitions
balsam.client
¶
This package defines the low-level RESTClient
interface used by all the Balsam Python clients.
The implementations capture the details of authentication to the Balsam API and performing HTTP requests.
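A minimal sketch of such a client, assuming a hypothetical transport callable and exception (the real clients in balsam.client handle auth headers and many more failure modes):

```python
# Sketch of a low-level client with retry logic (hypothetical names).
import time

class TransientServerError(Exception):
    pass

class RetryingClient:
    def __init__(self, transport, max_retries=3, backoff=0.0):
        self.transport = transport  # callable: (method, path, payload) -> dict
        self.max_retries = max_retries
        self.backoff = backoff

    def request(self, method, path, payload=None):
        for attempt in range(self.max_retries + 1):
            try:
                return self.transport(method, path, payload)
            except TransientServerError:
                if attempt == self.max_retries:
                    raise
                time.sleep(self.backoff * 2 ** attempt)  # exponential backoff

# A fake transport that fails twice, then succeeds:
calls = {"n": 0}
def flaky_transport(method, path, payload):
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientServerError
    return {"ok": True, "path": path}

client = RetryingClient(flaky_transport, backoff=0.0)
print(client.request("GET", "/jobs"))  # succeeds on the third attempt
```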
balsam._api
¶
Whereas the RESTClient
provides a lower-level interface (exchanging JSON data over HTTP),
the balsam._api
defines Django ORM-like Models
and Managers
to emulate the original Balsam API:
```python
from balsam.api import Job

finished_jobs = Job.objects.filter(state="JOB_FINISHED")
```
This is the primary user-facing SDK. Under the hood, it uses a RESTClient
implementation from the balsam.client
subpackage to actually make HTTP requests.
For example, the Job
model has access via Job.objects
to an instance of JobManager
which in turn was initialized with an authenticated RESTClient
. The Manager is responsible for auto-chunking large requests, lazy-loading queries, handling pagination transparently, and other features inspired by Django ORM that go beyond a typical auto-generated OpenAPI SDK. Rather than emitting SQL, these Models and Managers work together to generate HTTP requests.
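The pagination-hiding behavior can be sketched as follows (hypothetical names; the real Managers also chunk writes and build server-side query parameters rather than filtering client-side):

```python
# Sketch of a Manager that lazily iterates across API pages.

class FakeClient:
    """Stands in for an authenticated RESTClient; serves small pages."""
    def __init__(self, items, page_size=2):
        self.items = items
        self.page_size = page_size

    def get(self, path, offset=0):
        page = self.items[offset:offset + self.page_size]
        next_offset = offset + len(page)
        return {
            "results": page,
            "next_offset": next_offset if next_offset < len(self.items) else None,
        }

class Manager:
    def __init__(self, client):
        self.client = client

    def filter(self, **criteria):
        """Lazily yield matching items, fetching pages as needed."""
        offset = 0
        while offset is not None:
            resp = self.client.get("/jobs", offset=offset)
            for item in resp["results"]:
                if all(item.get(k) == v for k, v in criteria.items()):
                    yield item
            offset = resp["next_offset"]

jobs = [{"id": i, "state": "JOB_FINISHED" if i % 2 else "RUNNING"}
        for i in range(5)]
mgr = Manager(FakeClient(jobs))
print([j["id"] for j in mgr.filter(state="JOB_FINISHED")])  # -> [1, 3]
```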
ApplicationDefinition
¶
Users write their own subclasses of ApplicationDefinition
(defined in
_api/app.py
) to configure the Apps that may run at a particular Balsam Site.
Each ApplicationDefinition
is serialized and synchronized with the API when users run the sync()
method or submit Jobs using the App.
balsam.platform
¶
The platform
subpackage contains all the platform-specific interfaces to various HPC systems.
The goal of this architecture is to make porting Balsam to new HPC systems easier: a developer
should only have to write minimal interface code under balsam.platform
that subclasses and implements well-defined interfaces.
Here is a summary of the key Balsam platform interfaces:
AppRun
¶
This is the application launcher (mpirun
, aprun
, jsrun
) interface used by the Balsam launcher (pilot job).
It encapsulates the environment, workdir, output streams, compute resource specification (such as MPI ranks and GPUs), and the running process.
AppRun
implementations may or may not use a subprocess implementation to invoke the run.
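A stripped-down subprocess-based implementation might look like this (hypothetical class name; real implementations build mpirun/aprun/jsrun command lines and manage resource specifications):

```python
# Sketch of an AppRun-style launcher wrapper (hypothetical names).
import subprocess
import sys

class LocalAppRun:
    def __init__(self, cmdline, cwd=None, env=None, outfile=None):
        self.cmdline = cmdline
        self.cwd = cwd
        self.env = env
        self.outfile = outfile
        self.proc = None

    def start(self):
        out = open(self.outfile, "w") if self.outfile else subprocess.DEVNULL
        self.proc = subprocess.Popen(
            self.cmdline, cwd=self.cwd, env=self.env,
            stdout=out, stderr=subprocess.STDOUT,
        )

    def poll(self):
        return self.proc.poll()  # None while running, else the return code

    def wait(self):
        return self.proc.wait()

run = LocalAppRun([sys.executable, "-c", "print('hello from app')"])
run.start()
print("return code:", run.wait())
```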
ComputeNode
¶
The Balsam launcher uses this interface to discover available compute resources within a batch job, as well as to enumerate resources (CPU cores, GPUs) on a node and track their occupancy.
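Occupancy tracking amounts to bookkeeping per node; a minimal sketch (hypothetical names and resource counts):

```python
# Sketch of ComputeNode-style resource tracking (hypothetical names).

class Node:
    def __init__(self, hostname, num_cpus=64, num_gpus=4):
        self.hostname = hostname
        self.free_cpus = num_cpus
        self.free_gpus = num_gpus

    def can_fit(self, cpus, gpus):
        return self.free_cpus >= cpus and self.free_gpus >= gpus

    def assign(self, cpus, gpus):
        assert self.can_fit(cpus, gpus)
        self.free_cpus -= cpus
        self.free_gpus -= gpus

    def release(self, cpus, gpus):
        self.free_cpus += cpus
        self.free_gpus += gpus

node = Node("nid00001")
node.assign(cpus=32, gpus=2)
print(node.free_cpus, node.free_gpus)  # -> 32 2
```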
Scheduler
¶
The Balsam Site uses this interface to interact with the local resource manager (e.g. Slurm, Cobalt) to submit new batch jobs, check on job statuses, and inspect other system-wide metrics (e.g. backfill availability).
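The shape of such an interface, with a fake implementation that parses squeue-like output instead of shelling out (hypothetical names and output format):

```python
# Sketch of a Scheduler platform interface (hypothetical names).
from abc import ABC, abstractmethod

class SchedulerInterface(ABC):
    @abstractmethod
    def submit(self, script_path): ...

    @abstractmethod
    def get_statuses(self): ...

class FakeSlurm(SchedulerInterface):
    """Parses canned squeue-like text for illustration only."""
    def __init__(self, squeue_output):
        self.squeue_output = squeue_output

    def submit(self, script_path):
        # a real implementation would run sbatch and parse the job id
        return 12345

    def get_statuses(self):
        statuses = {}
        for line in self.squeue_output.strip().splitlines()[1:]:
            job_id, state = line.split()
            statuses[int(job_id)] = state
        return statuses

sched = FakeSlurm("JOBID STATE\n100 RUNNING\n101 PENDING\n")
print(sched.get_statuses())  # -> {100: 'RUNNING', 101: 'PENDING'}
```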
TransferInterface
¶
The Balsam Site uses this interface to submit new transfer tasks and poll on their status. A GlobusTransfer interface is implemented for batching Job stage-ins/stage-outs into Globus Transfer tasks.
balsam.config
¶
A comprehensive configuration is stored in each Balsam Site as settings.yml
.
This per-site config improves isolation between sites, and enables more flexible configs when multiple sites share a filesystem.
The Settings are also described by a Pydantic schema, which is used to validate
the YAML file every time it is loaded. The loaded settings are held in a
SiteConfig
instance that's defined within this subpackage.
The SiteConfig is available to all users via:
from balsam.api import site_config
. This import statement depends on the resolution of the Balsam site path from the local filesystem (i.e. it can only work when cwd is inside a Balsam site, which is the case wherever a Job is running or pre/post-processing.)
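The site-path resolution can be sketched as a walk up the directory tree looking for settings.yml (a simplification with hypothetical function names; the real logic lives in balsam.config):

```python
# Sketch of resolving the Site path by walking up from cwd.
import tempfile
from pathlib import Path

def resolve_site_path(start):
    """Return the nearest ancestor directory containing settings.yml, or None."""
    for candidate in [start, *start.parents]:
        if (candidate / "settings.yml").is_file():
            return candidate
    return None

with tempfile.TemporaryDirectory() as tmp:
    site = Path(tmp) / "my-site"
    workdir = site / "data" / "job01"
    workdir.mkdir(parents=True)
    (site / "settings.yml").write_text("# site settings\n")
    print(resolve_site_path(workdir) == site)  # -> True
```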
balsam.site
¶
This subpackage contains the real functional core of Balsam: the various
components that run on a Site to execute workflows. The Site settings.yml
specifies which platform adapters are needed, and these adapters are injected
into the Site components, which use them generically. This enables all the code
under balsam.site
to run across platforms without modification.
JobSource
¶
Launchers and pre/post-processing modules use this interface to fetch Jobs
from the API. The abstraction
keeps specific API calls out of the launcher code base, and permits different implementation strategies:
- FixedDepthJobSource maintains a queue of pre-fetched jobs using a background process.
- SynchronousJobSource performs a blocking API call to fetch jobs according to a specification of available resources.
Most importantly, both JobSources
use the Balsam Session
API to acquire
Jobs by performing an HTTP POST /sessions/{session_id}
. This prevents race
conditions when concurrent launchers acquire Jobs at the same Site, and it
ensures that Jobs are not locked in perpetuity if a Session expires. Sessions
are ticked with a periodic heartbeat to refresh the lock on long-running jobs.
Eventually, Sessions are deleted when the corresponding launcher ends. Details of the job acquisition API are in schemas/sessions.py::SessionAcquire
.
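The acquire/heartbeat/expiry lifecycle can be sketched as follows (hypothetical names and a toy in-memory lock; the real schema is in schemas/sessions.py and the locking happens server-side):

```python
# Sketch of session-based job acquisition with heartbeats.

class Session:
    EXPIRY = 60.0  # seconds without a heartbeat before locks may be released

    def __init__(self, now=0.0):
        self.last_heartbeat = now
        self.locked_job_ids = set()

    def acquire(self, jobs, max_num_jobs, now):
        """Lock and return up to max_num_jobs currently-unlocked jobs."""
        self.last_heartbeat = now
        acquired = [j for j in jobs if not j["locked"]][:max_num_jobs]
        for job in acquired:
            job["locked"] = True
            self.locked_job_ids.add(job["id"])
        return acquired

    def tick(self, now):
        self.last_heartbeat = now  # the periodic heartbeat refreshes the lock

    def expired(self, now):
        return now - self.last_heartbeat > self.EXPIRY

jobs = [{"id": i, "locked": False} for i in range(4)]
s1, s2 = Session(), Session()
a1 = s1.acquire(jobs, max_num_jobs=3, now=0.0)
a2 = s2.acquire(jobs, max_num_jobs=3, now=1.0)  # only one job left unlocked
print(len(a1), len(a2))  # -> 3 1
```

Because each job is locked by exactly one session, two concurrent launchers never run the same Job, and a crashed launcher's jobs become reclaimable once its session expires.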
StatusUpdater
¶
The StatusUpdater
interface is used to manage job status updates, and also helps to keep API-specific code out of the other Balsam internals. The primary implementation BulkStatusUpdater
pools update events that are passed via queue to a background process, and performs bulk API updates to reduce the frequency of API calls.
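The batching idea, minus the background process, can be sketched like this (hypothetical names; the real BulkStatusUpdater drains its queue continuously in a separate process):

```python
# Sketch of bulk status updates: buffer events, flush in batches.
import queue

class BulkUpdater:
    def __init__(self, api_call, batch_size=10):
        self.api_call = api_call      # callable taking a list of updates
        self.batch_size = batch_size
        self.events = queue.Queue()

    def put(self, job_id, state):
        self.events.put({"id": job_id, "state": state})

    def drain(self):
        """Flush all pending events in batches of batch_size."""
        batch = []
        while not self.events.empty():
            batch.append(self.events.get())
            if len(batch) >= self.batch_size:
                self.api_call(batch)
                batch = []
        if batch:
            self.api_call(batch)

calls = []
updater = BulkUpdater(api_call=calls.append, batch_size=10)
for i in range(25):
    updater.put(i, "RUN_DONE")
updater.drain()
print([len(b) for b in calls])  # -> [10, 10, 5]
```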
ScriptTemplate
¶
The ScriptTemplate
is used to generate shell scripts for submission to the local resource manager, using a Site-specific job template file.
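In spirit, this is template substitution; a minimal sketch using a made-up template (the real templates are per-Site job-template.sh files with Site-specific scheduler directives):

```python
# Sketch of rendering a batch script from a job template (hypothetical fields).
from string import Template

JOB_TEMPLATE = Template("""\
#!/bin/bash
#SBATCH --nodes=$num_nodes
#SBATCH --time=$wall_time_min
$launch_command
""")

def render_script(num_nodes, wall_time_min, launch_command):
    return JOB_TEMPLATE.substitute(
        num_nodes=num_nodes,
        wall_time_min=wall_time_min,
        launch_command=launch_command,
    )

script = render_script(2, 30, "python -m my_launcher")
print(script)
```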
Launcher
¶
The MPI
and serial
job modes of the Balsam launcher are implemented here. These are standalone, executable Python scripts that carry out the execution of Balsam Jobs
(sometimes called a pilot job mechanism). The launchers are invoked from a shell script generated by the ScriptTemplate
which is submitted to the local resource manager (via the Scheduler
interface).
The mpi
mode is a simpler implementation that runs a single process on the
head node of a batch job which launches each job with the MPI launcher (e.g.
aprun
, mpiexec
). The serial
mode is more involved: a Worker process
is first started on each compute node; these workers then receive Jobs
and
send status updates to a Master process. The advantage of Serial mode is that
very large numbers of node-local jobs can be launched without incurring the
overhead of a single mpiexec
per job.
In both mpi
and serial
modes, the leader process uses a JobSource
to acquire jobs and StatusUpdater
to send state updates back to the API (see above).
Serial mode differs from MPI mode by using ZeroMQ to forward jobs from the
leader to the Workers. Moreover, users can pass the --partitions
flag to
split a single Batch job allocation into multiple leader/worker groups. This
allows for scalable launch to hundreds of thousands of simultaneous tasks. For
instance on ThetaKNL, one can launch 131,072 simultaneous jobs across 2048
nodes by packing 64 Jobs per node with node_packing_count=64
. The Workers
will prefetch Jobs and launch them, thereby hiding the overhead of communication
with the leader. To further speed this process, the 2048 node batch job can be
split into 16
partitions of 128 nodes
each. Thus: 1 central API server fans
out to 16 serial mode leaders, each of which fan out to 128 workers, and each
worker prefetches Jobs and launches 64 concurrently. Job launch and polling are almost entirely overlapped with communication in this paradigm, and the serial mode leaders use background processes for the JobSource
and StatusUpdater
, leaving the leader highly available to forward jobs to Workers and route updates back to the StatusUpdater
's queue.
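The fan-out arithmetic in the ThetaKNL example above can be checked with a couple of lines (the function name is illustrative):

```python
# Sketch of the serial-mode partition fan-out arithmetic described above.

def fanout(num_nodes, num_partitions, node_packing_count):
    """Return (nodes per partition, max concurrent jobs), assuming
    num_nodes divides evenly into num_partitions."""
    nodes_per_partition = num_nodes // num_partitions
    max_concurrent = num_nodes * node_packing_count
    return nodes_per_partition, max_concurrent

print(fanout(2048, 16, 64))  # -> (128, 131072)
```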
balsam.site.service
¶
The Balsam Site daemon comprises a group of background processes that run on behalf of the user. The daemon may run on a login node, or on any other resource appropriate for a long-running background process. The only requirements are that:
- The Site daemon can access the filesystem with the Site directory,
- The Site daemon can access the local resource manager (e.g. perform qsub), and
- The Site daemon can access the Balsam API.
The Site daemon is organized as a collection of BalsamService
classes, each of which describes a particular
background process. This setup is highly modular: users can easily configure which service modules are in use, and developers can implement additional services that hook directly into the Site.
main
¶
The balsam/site/service/main.py
is the entry point that ultimately loads settings.yml
into a SiteConfig
instance, which defines the various services that the Site Agent will run. Each of these services is launched as a background process and monitored here after invoking balsam site start
. When the daemon is terminated (e.g. with balsam site stop), the teardown happens here as well.
The following are some of the most common plugins to the Balsam site agent.
SchedulerService
¶
This BalsamService
component syncs with BatchJobs
in the Balsam API and uses the Scheduler
platform interface to submit new BatchJobs
and update the status of existing BatchJobs
. It does not automate the process of job submission -- it only serves to keep the API state and local resource manager state synchronized.
For example, a user performs balsam queue submit
to add a new BatchJob
to the REST API.
The SchedulerService
eventually detects this new BatchJob
, generates an appropriate script from the ScriptTemplate
and job-template.sh
(see above), and submits it to the local Slurm scheduler.
ElasticQueueService
¶
This BalsamService
monitors the backlog of Jobs
and locally available compute resources, and it automatically submits new BatchJobs
to the API to adapt to realtime workloads. This is a form of automated job submission, which works together with the SchedulerService
to fully automate resource allocation and execution.
QueueMaintainerService
¶
This is another, simpler form of automated job submission, in which a constant number of fixed-size BatchJobs are maintained at a Site (e.g. keep 5 jobs queued at all times). It is intended for getting through a long campaign of runs.
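The maintenance loop reduces to topping up the queue to a target depth (hypothetical names; the real service reads its target from settings.yml and submits via the API):

```python
# Sketch of QueueMaintainer-style submission: keep target_depth queued.

def maintain_queue(queued_batchjobs, target_depth, submit):
    """Submit enough new BatchJobs to keep target_depth in the queue."""
    for _ in range(max(0, target_depth - len(queued_batchjobs))):
        submit()

submitted = []
queued = ["bj-1", "bj-2"]  # two BatchJobs currently queued
maintain_queue(queued, target_depth=5, submit=lambda: submitted.append("new"))
print(len(submitted))  # -> 3
```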
ProcessingService
¶
This service carries out the execution of various workflow steps that are defined on the ApplicationDefinition
:
preprocess()
postprocess()
handle_error()
handle_timeout()
These are meant to be lightweight and IO-bound tasks that run in a process pool on the login node or similar resource. Compute-intensive tasks should be performed in the main body of an App.
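Running such hooks concurrently can be sketched as follows (hypothetical hook body; a thread pool stands in here for the real process pool to keep the sketch portable):

```python
# Sketch of running lightweight pre/post-processing hooks in a pool.
from concurrent.futures import ThreadPoolExecutor

def preprocess(job_id):
    # IO-bound setup work, e.g. writing input files for the job
    return (job_id, "PREPROCESSED")

def run_hooks(job_ids):
    with ThreadPoolExecutor(max_workers=2) as pool:
        return list(pool.map(preprocess, job_ids))

print(run_hooks([1, 2]))
```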
TransferService
¶
This service automates staging in data from remote locations prior to the preprocess()
step of a Job, and staging results out to other remote locations after postprocess()
. The service batches files and directories that are to be moved between a certain pair of endpoints, and creates batch Transfer tasks via the TransferInterface
.
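The batching step amounts to grouping pending items by their endpoint pair (hypothetical names and endpoint strings; the real service then hands each group to the TransferInterface):

```python
# Sketch of batching transfer items into one task per endpoint pair.
from collections import defaultdict

def batch_transfers(items):
    """Group (src_endpoint, dst_endpoint, path) items by endpoint pair."""
    tasks = defaultdict(list)
    for src, dst, path in items:
        tasks[(src, dst)].append(path)
    return dict(tasks)

items = [
    ("theta#dtn", "aps#data", "/jobs/1/input.h5"),
    ("theta#dtn", "aps#data", "/jobs/2/input.h5"),
    ("theta#dtn", "home#laptop", "/jobs/1/out.log"),
]
tasks = batch_transfers(items)
print(len(tasks))  # -> 2
```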
balsam.cmdline
¶
The command line interfaces to Balsam are written as Python functions decorated with Click