Nodes

A node is a single computational step. Each node has a node.yaml that defines what it runs, what data it consumes and produces, and what resources it needs.
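A typical layout keeps each node in its own directory, with its source files next to node.yaml (this tree is illustrative; only the nodes/<name>/node.yaml convention is shown elsewhere in these docs):

```
nodes/
  preprocess/
    node.yaml
    main.py
    requirements.txt
  solver/
    node.yaml
    Makefile
    src/
```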

Minimal Example

name: solver
kind: node
type: command

run:
  command: python3
  args: ["main.py"]

outputs:
  - name: result
    type: array[float64, 2]

Full node.yaml Reference

name: solver                    # Required. Unique within the project.
kind: node                      # Required. Always "node".
type: command                   # Required. Always "command".
cluster: archer2                # Optional. Overrides the default cluster.

build:                          # Optional. Runs once, cached automatically.
  command: make

run:                            # Required.
  command: ./bin/solver
  args: ["-n", "1000"]         # Optional. CLI arguments.

modules:                        # Optional. HPC modules to load before execution.
  - python                      # Simple list format
  - cray-fftw
  - cray-hdf5

deps:                           # Optional. Dependency installation settings.
  timeout: "30m"

env:                            # Optional. Environment variables.
  OMP_NUM_THREADS: "4"
  MY_CONFIG: "production"

resources:                      # Optional. Resource requirements.
  nodes: "2"
  memory: "16G"
  walltime: "02:00:00"
  qos: standard                 # Extra fields passed through as scheduler directives
  partition: gpu
  account: e01-myproject

inputs:                         # Optional. List of inputs from other nodes or data/.
  - name: mesh
    from: preprocess/mesh       # <node>/<output> or data/<path>
    type: array[float64, 2]
    required: true              # Optional. Defaults to true.

outputs:                        # Optional. List of outputs.
  - name: solution
    type: array[float64, 2]
    local_copy: true            # Optional. Download to local machine on completion.
    local_copy_ext: h5          # Optional. File extension for local copy.
    required: true              # Optional. Defaults to true.

cache:                          # Optional. Caching behaviour.
  outputs: true                 # Cache outputs when source + inputs are unchanged.

Field Reference

Top-Level Fields

| Field | Required | Type | Description |
|-------|----------|------|-------------|
| name | Yes | string | Unique node name within the project |
| kind | Yes | string | Always "node" |
| type | Yes | string | Always "command" |
| cluster | No | string | Target cluster (overrides project default) |
| build | No | object | Build step configuration |
| run | Yes | object | Run step configuration |
| modules | No | list or object | HPC modules to load (list, or object with load/init keys) |
| deps | No | object | Dependency installation settings |
| env | No | map | Environment variables |
| resources | No | object | Resource requirements |
| inputs | No | list | Input declarations |
| outputs | No | list | Output declarations |
| cache | No | object | Caching configuration |

build

| Field | Type | Description |
|-------|------|-------------|
| command | string | Shell command to build the node (e.g. make, pip install -r requirements.txt) |

The build step runs once and is cached automatically. If your source files haven’t changed, the build is skipped on subsequent runs.

modules

Simple list format:

modules:
  - python
  - cray-fftw
  - cray-hdf5

Structured format with module initialisation:

modules:
  load:
    - python
    - cray-fftw
  init: /etc/profile.d/modules.sh

| Field | Type | Description |
|-------|------|-------------|
| load | list | Module names to load via module load |
| init | string | Path to a shell script sourced before module load runs |

The init field is for clusters that require module system initialisation scripts (e.g. Lmod, Cray modules). The script is sourced before any module load commands execute.
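Conceptually, the structured form above amounts to a shell sequence like the following on the compute node (illustrative only, not the literal commands Expanse emits):

```shell
source /etc/profile.d/modules.sh   # init script, sourced first
module load python
module load cray-fftw
```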

deps

| Field | Type | Description |
|-------|------|-------------|
| timeout | string | Maximum time for dependency installation (e.g. "30m", "1h") |

Expanse automatically detects dependency manifest files (requirements.txt, pyproject.toml, conanfile.py, conanfile.txt, fpm.toml) in your node directory and installs dependencies before execution. Installed dependencies are content-addressably cached: if your manifest file hasn’t changed, installation is skipped entirely on subsequent runs.

run

| Field | Type | Description |
|-------|------|-------------|
| command | string | Shell command to execute |
| args | list | Optional CLI arguments |

inputs

| Field | Required | Type | Description |
|-------|----------|------|-------------|
| name | Yes | string | Local name used in your code via expanse_io |
| from | Yes | string | Source: `<node>/<output>` for node outputs or `data/<path>` for static files |
| type | Yes | string | Data type (see Supported Types) |
| required | No | bool | Default true. Set false for optional inputs. |

outputs

| Field | Required | Type | Description |
|-------|----------|------|-------------|
| name | Yes | string | Output name, referenced by downstream nodes as `<this-node>/<name>` |
| type | Yes | string | Data type (see Supported Types) |
| local_copy | No | bool | Download this output to your local machine on job completion |
| local_copy_ext | No | string | File extension for the local copy (e.g. csv, h5, json) |
| required | No | bool | Default true. Set false for optional outputs. |

Supported Types

| Type | Description | Python | Fortran | C |
|------|-------------|--------|---------|---|
| array[float64] | 1D float64 array | np.float64 | real(8) | double |
| array[float64, 2] | 2D float64 array | np.float64 | real(8) | double |
| array[float32] | 1D float32 array | np.float32 | real(4) | float |
| array[int64] | 1D int64 array | np.int64 | integer(8) | int64_t |
| array[int32] | 1D int32 array | np.int32 | integer(4) | int32_t |
| json | JSON object | dict | - | - |
| file | Arbitrary file | path string | path string | path string |
| string | Plain text | str | - | - |

The number after the comma is dimensionality: array[float64, 2] is 2D, array[float64] or array[float64, 1] is 1D.
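As a sketch of how the Python column maps onto NumPy dtypes, a node producing array[float64, 2] would typically build a 2D np.float64 array (shapes and names here are hypothetical):

```python
import numpy as np

# 2D float64 array, matching the declared type array[float64, 2]
solution = np.zeros((100, 3), dtype=np.float64)

# 1D int32 array, matching array[int32]
ids = np.arange(5, dtype=np.int32)

print(solution.ndim, solution.dtype)  # 2 float64
print(ids.ndim, ids.dtype)            # 1 int32
```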

resources

| Field | Type | Description |
|-------|------|-------------|
| cpu | string | Number of CPU cores |
| memory | string | Memory (e.g. "16G", "4096M") |
| walltime | string | Maximum execution time in HH:MM:SS format |
| gpu | string | Number of GPUs |
| storage | string | Disk storage |
| any other key | string | Passed through as scheduler directives (e.g. qos, partition, account, constraint) |

**Extra resource fields.** Any field in `resources` that isn't `cpu`, `memory`, `walltime`, `gpu`, or `storage` is passed directly to the scheduler. For Slurm, this means you can set `qos`, `partition`, `account`, `constraint`, or any other `#SBATCH` directive.
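For example, on a Slurm cluster a resources block with extra fields might translate into job directives along these lines (illustrative; the exact output depends on the scheduler backend):

```shell
#SBATCH --mem=16G
#SBATCH --time=02:00:00
#SBATCH --qos=standard
#SBATCH --partition=gpu
#SBATCH --account=e01-myproject
```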

cache

| Field | Type | Description |
|-------|------|-------------|
| outputs | bool | When true, cache this node’s outputs. On re-run, if source code and inputs haven’t changed, the cached outputs are restored instead of re-executing. |

See Caching for details.

Input/Output Wiring

Nodes connect through the from field of their inputs. When node B declares an input with from: A/output_name, Expanse knows that B depends on A and will:

  1. Run A first
  2. Transfer A’s output to B’s input directory, regardless of what cluster B is running on
  3. Run B

# nodes/generator/node.yaml
name: generator
outputs:
  - name: sample
    type: array[float64]

# nodes/processor/node.yaml
name: processor
inputs:
  - name: sample
    from: generator/sample      # <node-name>/<output-name>
    type: array[float64]

Language-Specific Setup

Python

Place a requirements.txt or pyproject.toml alongside node.yaml. Expanse automatically detects the manifest, installs dependencies into an isolated virtual environment, and caches the result. On subsequent runs, if the manifest hasn’t changed, dependency installation is skipped entirely.

name: preprocess
kind: node
type: command

run:
  command: python3
  args: ["main.py"]

resources:
  walltime: "00:30:00"

outputs:
  - name: mesh
    type: array[float64, 2]
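A matching requirements.txt placed next to node.yaml might look like this (the package choices are hypothetical):

```
numpy
h5py>=3.0
```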

C / Fortran with Build Step

Include a Makefile, or use a package manager for dependencies. Expanse supports Conan for C/C++ (reads conanfile.py or conanfile.txt) and fpm for Fortran (reads fpm.toml). Dependencies are automatically detected, installed, and cached. If your manifest file hasn’t changed, installation is skipped on subsequent runs.

Use the build field to compile before running.

**Long dependency installations.** If dependency installation takes a long time, raise the limit with `deps.timeout` in your `node.yaml`. The default is sufficient for most projects.

name: solver
kind: node
type: command

build:
  command: make

run:
  command: ./bin/solver

modules:
  - cray-fftw
  - cray-hdf5

env:
  OMP_NUM_THREADS: "4"

resources:
  cpu: "128"
  walltime: "02:00:00"

inputs:
  - name: mesh
    from: preprocess/mesh
    type: array[float64, 2]

outputs:
  - name: solution
    type: array[float64, 2]
    local_copy: true
    local_copy_ext: h5
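The build: make step above expects a Makefile next to node.yaml. A minimal sketch that produces the ./bin/solver executable referenced in run (the source file name and compiler flags are assumptions):

```make
# Hypothetical Makefile; adjust sources and flags to your project
bin/solver: src/main.c
	mkdir -p bin
	$(CC) -O2 -o $@ $<
```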

Black-Box Solver

For commercial or pre-built solvers, point the run command at the existing executable:

name: commercial_cfd
kind: node
type: command

run:
  command: sh -c "starccm+ -batch run.java -np $SLURM_NTASKS"

modules:
  - starccm+/2024.1

resources:
  cpu: "512"
  walltime: "12:00:00"
  partition: large
  qos: standard

inputs:
  - name: mesh
    from: mesher/volume_mesh
    type: file

outputs:
  - name: results
    type: file
    local_copy: true
    local_copy_ext: csv