Clusters

What is a Cluster?

Use Cases

Use clusters to organize task runners into logical groups, which can help with:

Targeting specific task runners for a particular job
Reserving a group of task runners for specific purposes, such as running certain types of batch jobs
Setting up different clusters for different environments (like development and production)

Even within the same cluster, task runners may have different capabilities, such as different registered tasks. Conversely, if multiple task runners have the same set of registered tasks, you can assign them to different clusters to target specific task runners for a particular job.
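The targeting rule described above can be modelled in a few lines. This is a minimal sketch, not Tilebox code: `Runner` and `eligible_runners` are hypothetical names, the cluster names are placeholders, and the sketch assumes a task can only go to a runner that is both in the requested cluster and has the task registered.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Runner:
    """Hypothetical stand-in for a task runner; not a Tilebox class."""

    name: str
    cluster: str
    registered_tasks: frozenset = field(default_factory=frozenset)


def eligible_runners(runners, cluster, task):
    """Return the names of runners in `cluster` that have `task` registered."""
    return [
        r.name
        for r in runners
        if r.cluster == cluster and task in r.registered_tasks
    ]


runners = [
    Runner("gpu-1", "production", frozenset({"Train", "Predict"})),
    Runner("cpu-1", "production", frozenset({"Predict"})),
    Runner("dev-1", "development", frozenset({"Train", "Predict"})),
]

# Same cluster, different capabilities: only gpu-1 can run "Train".
print(eligible_runners(runners, "production", "Train"))   # ['gpu-1']
# Same capabilities as gpu-1, but isolated in another cluster.
print(eligible_runners(runners, "development", "Train"))  # ['dev-1']
```

In this model, the cluster narrows the pool of runners first, and the registered tasks narrow it further.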

Adding Task Runners to a Cluster

You can add task runners to a cluster by specifying the cluster’s slug when registering a task runner. Each task runner must always be assigned to a cluster.

Default Cluster

Each team has a default cluster that is automatically created for them. This cluster is used when no cluster is specified when registering a task runner or submitting a job. This is useful when you are just getting started and don’t need to create any custom clusters yet.
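The fallback behaviour can be pictured as a simple resolution step. The sketch below is illustrative only: `resolve_cluster` and the placeholder slug are hypothetical, and the real resolution happens inside the service rather than in user code.

```python
from typing import Optional

# Placeholder value for illustration; a real default cluster has its own slug.
DEFAULT_CLUSTER_SLUG = "default-cluster-slug"


def resolve_cluster(requested: Optional[str], default: str = DEFAULT_CLUSTER_SLUG) -> str:
    """Use the requested cluster if one was given, else the team's default."""
    return requested if requested is not None else default


print(resolve_cluster(None))                      # default-cluster-slug
print(resolve_cluster("testing-CvufcSxcC9SKfe"))  # testing-CvufcSxcC9SKfe
```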

Managing Clusters

Before registering a task runner with a custom cluster or submitting a job to one, you must create that cluster; the default cluster needs no setup. You can also list, fetch, and delete clusters as needed. The following sections explain how to do this. To manage clusters, first instantiate a cluster client using the clusters method in the workflows client.

from tilebox.workflows import Client

client = Client()
clusters = client.clusters()

Creating a Cluster

To create a cluster, use the create method on the cluster client and provide a name for the cluster.

cluster = clusters.create("testing")
print(cluster)

Cluster(slug='testing-CvufcSxcC9SKfe', display_name='testing')

Cluster Slug

Each cluster has a unique slug that combines the cluster’s name with an automatically generated identifier. Use this slug to reference the cluster in other operations, such as submitting a job or subtasks.
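To make the shape of a slug concrete, here is a rough sketch of how a name and a random identifier could be combined. It is not the actual generation scheme, which is handled by the service: `make_slug` is a hypothetical helper, and the lowercase normalization and 14-character alphanumeric suffix are assumptions inferred from the example slugs on this page.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def make_slug(name: str, suffix_length: int = 14) -> str:
    """Combine a normalized name with a random suffix (illustrative only)."""
    base = "-".join(name.lower().split())
    suffix = "".join(secrets.choice(ALPHABET) for _ in range(suffix_length))
    return f"{base}-{suffix}"


slug = make_slug("Production")
print(slug)  # same shape as production-EifhUozDpwAJDL, suffix differs per call
```

Because the suffix is random, two clusters with the same name still receive distinct slugs, which is why operations reference the slug rather than the name.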

Listing Clusters

To list all available clusters, use the all method:

all_clusters = clusters.all()
print(all_clusters)

[Cluster(slug='testing-CvufcSxcC9SKfe', display_name='testing'),
Cluster(slug='production-EifhUozDpwAJDL', display_name='Production')]
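Since operations take a slug rather than a name, it can be handy to look a cluster up by its display name in the listed results. The helper below is a sketch under assumptions: `find_by_display_name` is hypothetical, and the `Cluster` dataclass is only a stand-in mirroring the two fields shown in the printed output, so that the example runs without a connection.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Cluster:
    """Stand-in mirroring the fields printed above; not the Tilebox class."""

    slug: str
    display_name: str


def find_by_display_name(clusters: Iterable[Cluster], name: str) -> Optional[Cluster]:
    """Return the first cluster whose display name matches, ignoring case."""
    for cluster in clusters:
        if cluster.display_name.lower() == name.lower():
            return cluster
    return None


all_clusters = [
    Cluster(slug="testing-CvufcSxcC9SKfe", display_name="testing"),
    Cluster(slug="production-EifhUozDpwAJDL", display_name="Production"),
]

match = find_by_display_name(all_clusters, "production")
print(match.slug if match else "not found")  # production-EifhUozDpwAJDL
```

Display names are not guaranteed to be unique here, so the helper returns only the first match.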

Fetching a Specific Cluster

To fetch a specific cluster, use the find method and pass the cluster’s slug:

cluster = clusters.find("testing-CvufcSxcC9SKfe")
print(cluster)

Cluster(slug='testing-CvufcSxcC9SKfe', display_name='testing')

Deleting a Cluster

To delete a cluster, use the delete method and pass the cluster’s slug:

clusters.delete("testing-CvufcSxcC9SKfe")

Jobs Across Different Clusters

When submitting a job, you need to specify which cluster the job’s root task should be executed on. This allows you to direct the job to a specific set of task runners. By default, all subtasks within a job are submitted to the same cluster as their parent task, but you can override this to submit subtasks to different clusters if needed. The example below shows a job that spans multiple clusters.

from tilebox.workflows import Task, ExecutionContext, Client

class MultiCluster(Task):
    def execute(self, context: ExecutionContext) -> None:
        # this submits a task to the same cluster as the one currently executing this task
        same_cluster = context.submit_subtask(DummyTask())
        
        other_cluster = context.submit_subtask(
            DummyTask(),
            # this task runs only on a task runner in the "other-cluster" cluster
            cluster="other-cluster-As3dcSb3D9SAdK",
            # dependencies can be specified across clusters
            depends_on=[same_cluster],
        )

class DummyTask(Task):
    def execute(self, context: ExecutionContext) -> None:
        pass

# submit a job to the "testing" cluster
client = Client()
job_client = client.jobs()
job = job_client.submit(
    "my-job",
    MultiCluster(),
    cluster="testing-CvufcSxcC9SKfe",
)

This workflow requires at least two task runners to complete: one in the “testing” cluster and one in the “other-cluster” cluster. If no task runner is available in the “other-cluster” cluster, the task submitted to that cluster remains queued until one becomes available. It won’t execute on a task runner in the “testing” cluster, even if that task runner has the DummyTask registered.
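The queuing behaviour described above can be illustrated with a toy model that keeps one queue per cluster. This is a sketch only, not the Tilebox scheduler: `ClusterQueues`, `submit`, and `poll` are hypothetical names, and the model ignores task dependencies and registered tasks.

```python
from collections import defaultdict, deque
from typing import Optional


class ClusterQueues:
    """Toy model with one FIFO queue per cluster slug."""

    def __init__(self) -> None:
        self._queues = defaultdict(deque)

    def submit(self, task: str, cluster: str) -> None:
        """Queue a task for the given cluster."""
        self._queues[cluster].append(task)

    def poll(self, runner_cluster: str) -> Optional[str]:
        """A runner only ever receives tasks queued for its own cluster."""
        queue = self._queues[runner_cluster]
        return queue.popleft() if queue else None


queues = ClusterQueues()
queues.submit("DummyTask", cluster="other-cluster-As3dcSb3D9SAdK")

# A runner in the "testing" cluster finds nothing; the task stays queued.
print(queues.poll("testing-CvufcSxcC9SKfe"))        # None
# A runner in the matching cluster picks it up.
print(queues.poll("other-cluster-As3dcSb3D9SAdK"))  # DummyTask
```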
