# Data & I/O
Expanse handles data transfer between nodes automatically with full cross-language compatibility. Use the expanse_io library in your code to read inputs and write outputs.
## The expanse_io API

### Python

```python
from expanse_io import read_input, write_output, read_json, write_json

# Read array input from a previous node
mesh = read_input("mesh")  # Returns numpy.ndarray

# Read JSON input
config = read_json("config")  # Returns dict

# Write array output for downstream nodes
write_output("solution", result)  # Accepts numpy.ndarray

# Write JSON output
write_json("metrics", {"loss": 0.05, "accuracy": 0.98})
```

### Fortran
```fortran
use expanse_io

real(8), allocatable :: mesh(:,:), solution(:,:)
integer :: shape(2), ierr

! Read array input
call expanse_read_real64("mesh", mesh, shape, ierr)
if (ierr /= 0) stop "Failed to read mesh"

! Write array output
call expanse_write_real64("solution", solution, shape, ierr)
```

Available Fortran functions:
| Read | Write | Type |
|---|---|---|
| expanse_read_real64 | expanse_write_real64 | real(8) |
| expanse_read_real32 | expanse_write_real32 | real(4) |
| expanse_read_int64 | expanse_write_int64 | integer(8) |
| expanse_read_int32 | expanse_write_int32 | integer(4) |
### C

```c
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include "expanse_io.h"

double *mesh;
int64_t shape[2];
int err;

// Read array input
err = expanse_read_real64("mesh", &mesh, shape);
if (err != 0) { fprintf(stderr, "Failed to read mesh\n"); exit(1); }

// Write array output
err = expanse_write_real64("solution", solution, shape, 2);
```

C functions mirror the Fortran API: expanse_read_real64, expanse_write_real64, etc.
## Supported Types

| Type | Description | Python | Fortran | C |
|---|---|---|---|---|
| array[float64, N] | N-dimensional float64 | np.float64 | real(8) | double |
| array[float32, N] | N-dimensional float32 | np.float32 | real(4) | float |
| array[int64, N] | N-dimensional int64 | np.int64 | integer(8) | int64_t |
| array[int32, N] | N-dimensional int32 | np.int32 | integer(4) | int32_t |
| json | JSON object | dict | - | - |
| file | Arbitrary file | path string | path string | path string |
| string | Plain text | str | - | - |
N is the dimensionality: array[float64, 1] is 1D, array[float64, 2] is 2D. Omitting it defaults to 1D.
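As an illustration of this convention, the helper below (hypothetical, not part of expanse_io) parses a type string into a numpy dtype and dimensionality, applying the 1-D default:

```python
import re
import numpy as np

# Map the documented array element types to numpy dtypes
_DTYPES = {"float64": np.float64, "float32": np.float32,
           "int64": np.int64, "int32": np.int32}

def parse_array_type(spec):
    """Parse e.g. 'array[float64, 2]' -> (np.float64, 2).

    Omitting the dimensionality defaults to 1, as documented.
    """
    m = re.fullmatch(r"array\[(\w+)(?:,\s*(\d+))?\]", spec.strip())
    if not m or m.group(1) not in _DTYPES:
        raise ValueError(f"not an array type: {spec!r}")
    ndim = int(m.group(2)) if m.group(2) else 1
    return _DTYPES[m.group(1)], ndim
```

So parse_array_type("array[float64, 2]") yields (np.float64, 2), and parse_array_type("array[int32]") yields (np.int32, 1).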
## Data Directory
The data/ directory in your project root holds static input files. These are read-only and automatically transferred to remote clusters when needed.
Reference data files in node.yaml using the data/ prefix:

```yaml
inputs:
  - name: mesh
    from: data/mesh.vtk
    type: file
  - name: config
    from: data/parameters.json
    type: json
```

Place your static files in the data directory:
```
my-simulation/
├── data/
│   ├── mesh.vtk
│   ├── parameters.json
│   └── initial_conditions.csv
├── nodes/
└── workflows/
```

## Environment Variables
Expanse sets these environment variables automatically in every node execution. They are useful for advanced use cases or debugging.
| Variable | Description |
|---|---|
| EXPANSE_INPUTS | Directory containing input files from upstream nodes |
| EXPANSE_OUTPUTS | Directory for writing output files |
| EXPANSE_ARTIFACT_DIR | Working artifact directory for this node |
| EXPANSE_DATA | Path to the data directory on the target cluster |
| EXPANSE_NODE_NAME | Name of the current node |
| EXPANSE_RUN_ID | Unique identifier for the current run |
| EXPANSE_OUTPUT_{NAME} | Resolved filename for each declared output (e.g. EXPANSE_OUTPUT_SOLUTION) |
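For example, a node that writes a file output directly rather than through expanse_io could resolve its destination from these variables. The helper below is a sketch; the fallback to a file named after the output inside EXPANSE_OUTPUTS is an assumption, not documented behaviour:

```python
import os

def output_path(name, environ=os.environ):
    """Resolve the filename for a declared output.

    Prefers the per-output EXPANSE_OUTPUT_{NAME} variable; otherwise
    assumes a file named after the output inside EXPANSE_OUTPUTS.
    """
    specific = environ.get(f"EXPANSE_OUTPUT_{name.upper()}")
    if specific:
        return specific
    return os.path.join(environ.get("EXPANSE_OUTPUTS", "."), name)
```

For instance, output_path("solution", {"EXPANSE_OUTPUT_SOLUTION": "/tmp/run1/solution.bin"}) returns that resolved path directly.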
## Cross-Cluster Transfer
When workflow stages target different clusters, Expanse transfers data automatically:
```yaml
stages:
  - name: preprocess
    ref: nodes/preprocess
    cluster: local    # Runs locally
  - name: solver
    ref: nodes/solver
    cluster: archer2  # Data transferred to archer2 automatically
```

The transfer happens transparently: your node code doesn’t need to know which cluster it’s running on.
## Zero-Copy Transfer
Multi-language workflows transfer array data between nodes in a compact binary format with no serialisation overhead. A Python node can write an array and a C or Fortran node can read it directly — no CSV conversion, no extra code.
This is automatic when nodes use the expanse_io library with array types. All combinations of Python, C, and Fortran are supported in both directions. No configuration is needed.
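Node code never touches the on-disk representation, but the idea behind a language-neutral array format can be illustrated: a small fixed header carrying the dtype and shape, followed by the raw element bytes, is trivially readable from Python, C, or Fortran alike. The layout below is an invented example for illustration, not Expanse's actual format:

```python
import struct
import numpy as np

# Invented layout: dtype code (uint8), ndim (uint8), each dimension
# as little-endian uint64, then the raw contiguous element bytes.
_CODES = {np.dtype("float64"): 0, np.dtype("float32"): 1,
          np.dtype("int64"): 2, np.dtype("int32"): 3}
_DTYPES = {v: k for k, v in _CODES.items()}

def pack(arr):
    header = struct.pack("<BB", _CODES[arr.dtype], arr.ndim)
    header += struct.pack(f"<{arr.ndim}Q", *arr.shape)
    return header + np.ascontiguousarray(arr).tobytes()

def unpack(buf):
    code, ndim = struct.unpack_from("<BB", buf)
    shape = struct.unpack_from(f"<{ndim}Q", buf, 2)
    data = np.frombuffer(buf, dtype=_DTYPES[code], offset=2 + 8 * ndim)
    return data.reshape(shape)
```

Because the header is self-describing, a reader in any language can allocate the right buffer before touching the payload, which is what makes the Python-to-C-to-Fortran handoff possible without conversion code.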
```yaml
# workflow.yaml
stages:
  - name: generate
    ref: nodes/generate  # Python node writes array output
  - name: process
    ref: nodes/process   # C node reads array input, writes array output
  - name: analyse
    ref: nodes/analyse   # Fortran node reads array input
```

### How Data Flows
1. Node A writes outputs using expanse_io (e.g. write_output("mesh", data)).
2. Expanse saves the data in a common format that all supported languages can read.
3. If Node B is on a different cluster, Expanse transfers the file automatically.
4. Node B reads the input using expanse_io (e.g. read_input("mesh")).
This means a Python node can produce data that a Fortran or C node consumes directly, with no manual conversion needed.
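The flow above can be sketched as a minimal Python node. The try/except stub below stands in for expanse_io so the logic can be exercised outside an Expanse run; the run function and the doubling computation are placeholders, not part of the library:

```python
import numpy as np

# Stand-in for expanse_io so the node logic can be smoke-tested
# locally; inside an Expanse run the real library is imported instead.
try:
    from expanse_io import read_input, write_output
except ImportError:
    _store = {}

    def read_input(name):
        return _store[name]

    def write_output(name, value):
        _store[name] = value

def run():
    mesh = read_input("mesh")           # written by an upstream node
    solution = mesh * 2.0               # placeholder computation
    write_output("solution", solution)  # consumed by a downstream node

# Smoke test against the stub: seed an input, then run the node
write_output("mesh", np.linspace(0.0, 1.0, 5))
run()
solution = read_input("solution")
```

The same run body executes unchanged whether the node runs locally or on a remote cluster, which is the point of routing all I/O through expanse_io.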