Remote execution#
Workflows can be executed on a Dask cluster started as an external service.
Below is an example of executing a workflow with two independent nodes. Since the nodes are not connected, they can run in parallel:
from ewoksdask import execute_graph
workflow = {
"graph": {"id": "test"},
"nodes": [
{
"id": "node1",
"task_type": "method",
"task_identifier": "time.sleep",
"default_inputs": [{"name": 0, "value": 10}],
},
{
"id": "node2",
"task_type": "method",
"task_identifier": "time.sleep",
"default_inputs": [{"name": 0, "value": 10}],
},
],
}
result = execute_graph(workflow, scheduler="127.0.0.1:8786")
The scheduler parameter is the address of a Dask scheduler.
Local scheduler#
You can start a local scheduler with multiple workers using the following command:
ewoksdask local --n-workers 5 --threads-per-worker 10
This example launches:
5 worker processes
Each with 10 threads
Allowing up to 50 parallel tasks
Address: tcp://127.0.0.1:8786
Dashboard: http://127.0.0.1:8787/status
Scheduler is running. Press CTRL-C to stop.
More configuration options can be found in the Dask documentation.
The dashboard provides detailed real-time information about running jobs and worker activity.
Distributed scheduler#
To set up a distributed cluster, start the scheduler on a remote machine:
dask scheduler
Example output:
Scheduler at: tcp://192.168.1.47:8786
dashboard at: http://192.168.1.47:8787/status
Next, connect workers to this scheduler. The example below adds 3 workers, each with 4 processes (totaling 12 concurrent tasks):
dask worker 127.0.0.1:8786 --nprocs 4
dask worker 127.0.0.1:8786 --nprocs 4
dask worker 127.0.0.1:8786 --nprocs 4
Slurm scheduler#
To use Dask with a Slurm-managed cluster, launch a scheduler from a Slurm submitter node (i.e., a machine with Slurm client utilities configured):
ewoksdask slurm --minimum-jobs 3 --maximum-jobs 10 --cores=2 --memory=64GB --walltime="01:00:00" --queue=gpu --gpus=1 --log debug
This command will:
Launch 3 permanent jobs (restarted when terminated)
Scale up to 10 jobs as needed
Provide a maximum capacity of 10 concurrent tasks
Each job will submitted to the gpu queue
Each job will be terminated after one hour
Each job will have the following resources for the execution of workflow tasks
2 CPU cores
1 GPU
64GB of RAM
Refer to the Dask JobQueue SlurmCluster documentation for more scheduler configuration options.