Data & I/O

Expanse handles data transfer between nodes automatically with full cross-language compatibility. Use the expanse_io library in your code to read inputs and write outputs.

The expanse_io API

Python

from expanse_io import read_input, write_output, read_json, write_json

# Read array input from a previous node
mesh = read_input("mesh")           # Returns numpy.ndarray

# Read JSON input
config = read_json("config")        # Returns dict

# Write array output for downstream nodes
write_output("solution", result)    # Accepts numpy.ndarray

# Write JSON output
write_json("metrics", {"loss": 0.05, "accuracy": 0.98})

Fortran

use expanse_io

real(8), allocatable :: mesh(:,:), solution(:,:)
integer :: dims(2), ierr

! Read array input (dims receives the array extents)
call expanse_read_real64("mesh", mesh, dims, ierr)
if (ierr /= 0) stop "Failed to read mesh"

! Write array output
call expanse_write_real64("solution", solution, dims, ierr)

Available Fortran functions:

Read                    Write                    Type
expanse_read_real64     expanse_write_real64     real(8)
expanse_read_real32     expanse_write_real32     real(4)
expanse_read_int64      expanse_write_int64      integer(8)
expanse_read_int32      expanse_write_int32      integer(4)

C

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include "expanse_io.h"

double *mesh, *solution;
int64_t shape[2];
int err;

// Read array input
err = expanse_read_real64("mesh", &mesh, shape);
if (err != 0) { fprintf(stderr, "Failed to read mesh\n"); exit(1); }

// Write array output
err = expanse_write_real64("solution", solution, shape, 2);

C functions mirror the Fortran API: expanse_read_real64, expanse_write_real64, etc.

Supported Types

Type                 Description              Python        Fortran       C
array[float64, N]    N-dimensional float64    np.float64    real(8)       double
array[float32, N]    N-dimensional float32    np.float32    real(4)       float
array[int64, N]      N-dimensional int64      np.int64      integer(8)    int64_t
array[int32, N]      N-dimensional int32      np.int32      integer(4)    int32_t
json                 JSON object              dict          -             -
file                 Arbitrary file           path string   path string   path string
string               Plain text               str           -             -

N is the dimensionality: array[float64, 1] is 1D, array[float64, 2] is 2D. Omitting it defaults to 1D.
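To make the type syntax concrete, here is a small parser for declared type strings such as array[float64, 2]. This is an illustrative sketch, not part of the expanse_io API; the parse_type helper and its return shape are assumptions.

```python
import re

# Illustrative parser for the type syntax above (not part of expanse_io).
# Matches array[<dtype>] or array[<dtype>, <N>].
_ARRAY_RE = re.compile(r"^array\[(float64|float32|int64|int32)(?:,\s*(\d+))?\]$")

def parse_type(spec: str):
    """Parse a declared type such as 'array[float64, 2]' or 'json'.

    Returns ('array', dtype, ndim) for array types, or (spec, None, None)
    for the scalar-like types json, file, and string.
    """
    m = _ARRAY_RE.match(spec)
    if m:
        dtype = m.group(1)
        ndim = int(m.group(2)) if m.group(2) else 1  # omitting N defaults to 1D
        return ("array", dtype, ndim)
    if spec in ("json", "file", "string"):
        return (spec, None, None)
    raise ValueError(f"unknown type: {spec}")
```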

Data Directory

The data/ directory in your project root holds static input files. These are read-only and automatically transferred to remote clusters when needed.

Reference data files in node.yaml using the data/ prefix:

inputs:
  - name: mesh
    from: data/mesh.vtk
    type: file
  - name: config
    from: data/parameters.json
    type: json

Place your static files in the data directory:

my-simulation/
├── data/
│   ├── mesh.vtk
│   ├── parameters.json
│   └── initial_conditions.csv
├── nodes/
└── workflows/
Data is read-only
Expanse never writes to the `data/` directory. It exists for static inputs that your nodes consume. Node outputs flow through the artifact system.
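If a tool needs a plain filesystem path into the data directory rather than an expanse_io handle, the EXPANSE_DATA environment variable (documented below) gives its location on the target cluster. A minimal sketch; the data_path helper itself is hypothetical, not part of expanse_io:

```python
import os

def data_path(relative: str) -> str:
    """Resolve a file under the read-only data/ directory.

    Uses the EXPANSE_DATA environment variable that Expanse sets in every
    node execution. Illustrative helper, not part of the expanse_io API.
    """
    root = os.environ["EXPANSE_DATA"]
    return os.path.join(root, relative)
```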

Environment Variables

Expanse sets these environment variables automatically in every node execution. They are useful for advanced use cases or debugging.

Variable                 Description
EXPANSE_INPUTS           Directory containing input files from upstream nodes
EXPANSE_OUTPUTS          Directory for writing output files
EXPANSE_ARTIFACT_DIR     Working artifact directory for this node
EXPANSE_DATA             Path to the data directory on the target cluster
EXPANSE_NODE_NAME        Name of the current node
EXPANSE_RUN_ID           Unique identifier for the current run
EXPANSE_OUTPUT_{NAME}    Resolved filename for each declared output (e.g. EXPANSE_OUTPUT_SOLUTION)
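For debugging, a node can log what Expanse has given it by reading the variables above directly. The variable names come from the table; the helpers themselves are illustrative sketches, not part of expanse_io:

```python
import os

def describe_run() -> dict:
    """Collect the documented Expanse environment variables for logging.

    Returns whichever of the fixed-name variables are set in this
    execution. Illustrative helper, not part of expanse_io.
    """
    keys = ("EXPANSE_INPUTS", "EXPANSE_OUTPUTS", "EXPANSE_ARTIFACT_DIR",
            "EXPANSE_DATA", "EXPANSE_NODE_NAME", "EXPANSE_RUN_ID")
    return {k: os.environ[k] for k in keys if k in os.environ}

def output_path(name: str) -> str:
    """Resolve a declared output's filename via EXPANSE_OUTPUT_{NAME}."""
    return os.environ[f"EXPANSE_OUTPUT_{name.upper()}"]
```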
Prefer expanse_io
In most cases, use the `expanse_io` library instead of reading environment variables directly. The library handles serialisation, type safety, and cross-language compatibility.

Cross-Cluster Transfer

When workflow stages target different clusters, Expanse transfers data automatically:

stages:
  - name: preprocess
    ref: nodes/preprocess
    cluster: local            # Runs locally

  - name: solver
    ref: nodes/solver
    cluster: archer2          # Data transferred to archer2 automatically

The transfer happens transparently: your node code doesn’t need to know which cluster it’s running on.

Zero-Copy Transfer

Multi-language workflows transfer array data between nodes in a compact binary format with no serialisation overhead. A Python node can write an array and a C or Fortran node can read it directly — no CSV conversion, no extra code.

This is automatic when nodes use the expanse_io library with array types. All combinations of Python, C, and Fortran are supported in both directions. No configuration is needed.

# workflow.yaml
stages:
  - name: generate
    ref: nodes/generate     # Python node writes array output
  - name: process
    ref: nodes/process      # C node reads array input, writes array output
  - name: analyse
    ref: nodes/analyse      # Fortran node reads array input

How Data Flows

  1. Node A writes outputs using expanse_io (e.g. write_output("mesh", data))
  2. Expanse saves the data in a common format that all supported languages can read
  3. If Node B is on a different cluster, Expanse transfers the file automatically
  4. Node B reads the input using expanse_io (e.g. read_input("mesh"))

This means a Python node can produce data that a Fortran or C node consumes directly, with no manual conversion needed.
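To make the idea of a language-agnostic array container concrete, here is a toy version: a small header recording the element type and shape, followed by the raw element bytes. This is an illustration of the concept only; it is not Expanse's actual on-disk format, and the pack/unpack helpers are invented for this sketch.

```python
import struct
from array import array

# Toy language-agnostic array container (NOT Expanse's real format):
# header = magic (4 bytes), dtype code (1 byte), ndim (1 byte),
# then ndim uint64 dimension sizes, then the raw element bytes
# (native byte order, as produced by array.tobytes()).
_MAGIC = b"TOYA"
_DTYPES = {0: "d", 1: "f", 2: "q", 3: "i"}  # float64, float32, int64, int32
_CODES = {v: k for k, v in _DTYPES.items()}

def pack(values, typecode="d", shape=None):
    """Serialise a flat list of values plus a shape into one byte blob."""
    shape = shape or (len(values),)
    header = struct.pack("<4sBB", _MAGIC, _CODES[typecode], len(shape))
    header += struct.pack(f"<{len(shape)}Q", *shape)
    return header + array(typecode, values).tobytes()

def unpack(blob):
    """Deserialise a blob produced by pack(); returns (values, shape)."""
    magic, code, ndim = struct.unpack_from("<4sBB", blob, 0)
    assert magic == _MAGIC, "not a toy-format blob"
    offset = 6
    shape = struct.unpack_from(f"<{ndim}Q", blob, offset)
    offset += 8 * ndim
    values = array(_DTYPES[code])
    values.frombytes(blob[offset:])
    return list(values), shape
```

Because the header fully describes the payload, a reader in any language can reconstruct the array without knowing which language wrote it, which is the property the real format provides.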