Tilebox Workflows
The Tilebox workflow orchestrator is a parallel processing engine. It simplifies the creation of dynamic tasks that can be executed across various computing environments, including on-premises infrastructure and auto-scaling clusters in public clouds.
This section provides guides showcasing how to use the Tilebox workflow orchestrator effectively. Here are some of the key learning areas:
Create Tasks
Learn how to create tasks using the Tilebox workflow orchestrator.
Submit Jobs
Learn how to submit jobs to the workflow orchestrator, which schedules tasks for execution.
Set up Task Runners
Learn how to set up task runners to execute tasks in a distributed manner.
Gain insights through observability
Understand how to gain insights into task executions using observability features like tracing and logging.
Configure shared data access
Learn to configure shared data access for all tasks of a job using caches.
Trigger Jobs in near-real-time
Trigger jobs automatically in response to events, such as the arrival of new data, or on recurring CRON schedules.
Terminology
Before exploring Tilebox Workflows in depth, familiarize yourself with some common terms used throughout this section.
Tasks
A task is the smallest unit of work, designed to perform a specific operation such as processing data, performing calculations, or managing resources. Tasks can operate independently or as components of a more complex set of connected tasks known as a workflow. Tasks are defined by their code, inputs, and dependencies on other tasks. To create a task, you define its input parameters and specify the action to perform during execution.
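As a rough illustration of "input parameters plus an action", the following minimal sketch models a task as a plain Python dataclass. It does not use the Tilebox SDK. The class name ComputeMean, its values field, and the execute method are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class ComputeMean:
    """Hypothetical task: the dataclass fields are its input parameters."""

    values: list[float]

    def execute(self) -> float:
        # The action performed when the task runs.
        return sum(self.values) / len(self.values)


# Creating the task only fixes its inputs; the work happens in execute().
task = ComputeMean(values=[2.0, 4.0, 6.0])
result = task.execute()
```

Separating the inputs (fields) from the action (execute) is what lets a task be described up front and run later, possibly on a different machine.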
Jobs
A job is a specific execution of a workflow with designated input parameters. It consists of one or more tasks that can run in parallel or sequentially, based on their dependencies. Submitting a job involves creating a root task with specific input parameters, which may trigger the execution of other tasks within the same job.
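The relationship between a job, its root task, and the tasks that root task triggers can be sketched with a toy in-process queue. This is an assumption-laden model, not the orchestrator's actual behavior or API: ProcessScene, ProcessTile, run_job, and the submit callback are hypothetical names, and the loop runs tasks one at a time, whereas a real orchestrator may run independent tasks in parallel.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class ProcessTile:
    """Hypothetical leaf task that handles a single tile."""

    tile_id: int

    def execute(self, submit) -> str:
        return f"processed tile {self.tile_id}"


@dataclass
class ProcessScene:
    """Hypothetical root task: splits a scene into per-tile subtasks."""

    tile_count: int

    def execute(self, submit) -> str:
        # The root task triggers further tasks within the same job.
        for tile_id in range(self.tile_count):
            submit(ProcessTile(tile_id))
        return f"split scene into {self.tile_count} tiles"


def run_job(root_task) -> list[str]:
    """Run a root task and every task it submits, in first-in, first-out order."""
    queue = deque([root_task])
    log = []
    while queue:
        task = queue.popleft()
        log.append(task.execute(queue.append))
    return log


# Submitting a job means creating a root task with concrete input parameters.
log = run_job(ProcessScene(tile_count=3))
```

Here the job is finished once the queue is empty, that is, once the root task and every task it submitted have run.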
Task Runners
Task runners are the agents within the Tilebox Workflows ecosystem that execute tasks. They can be deployed in different computing environments, including on-premises servers and cloud-based auto-scaling clusters. Task runners execute tasks as scheduled by the workflow orchestrator and provide the resources and environment those tasks need to run.
Clusters
Clusters are a logical grouping for task runners. Using clusters, you can scope certain tasks to a specific group of task runners. Tasks, which are always submitted to a specific cluster, are only executed on task runners assigned to the same cluster.
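The scoping rule can be illustrated with a small filter: given the cluster a task was submitted to, only runners assigned to that cluster are candidates to execute it. The Runner type, the eligible_runners function, and the cluster names below are hypothetical and exist only for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Runner:
    """Hypothetical task runner that is assigned to exactly one cluster."""

    name: str
    cluster: str


def eligible_runners(task_cluster: str, runners: list[Runner]) -> list[str]:
    """Return the names of runners allowed to execute a task from task_cluster."""
    return [r.name for r in runners if r.cluster == task_cluster]


runners = [
    Runner("gpu-1", "gpu"),
    Runner("gpu-2", "gpu"),
    Runner("cpu-1", "default"),
]

# A task submitted to the "gpu" cluster is never picked up by "cpu-1".
gpu_candidates = eligible_runners("gpu", runners)
```

A task submitted to a cluster with no runners has no candidates in this model, so it would wait until a runner joins that cluster.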
Caches
Caches are shared storage that tasks within a single job can write to and read from. They hold intermediate results and share data among tasks, enabling distributed computing and reducing redundant data processing.
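To make the idea concrete, the sketch below passes an intermediate result from one task to another through a cache. InMemoryCache, its put and get methods, and the key "stats/mean" are hypothetical names for this example. A dict only works inside a single process, so a cache shared between task runners on different machines would need a storage backend that all of them can reach.

```python
class InMemoryCache:
    """Hypothetical job-scoped cache backed by a dict (single process only)."""

    def __init__(self) -> None:
        self._data: dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> bytes:
        return self._data[key]


def produce_stats(cache: InMemoryCache) -> None:
    # First task: compute an intermediate result once and store it.
    values = [2.0, 4.0, 6.0]
    mean = sum(values) / len(values)
    cache.put("stats/mean", str(mean).encode())


def consume_stats(cache: InMemoryCache) -> float:
    # Later task in the same job: reuse the result instead of recomputing it.
    return float(cache.get("stats/mean").decode())


job_cache = InMemoryCache()
produce_stats(job_cache)
mean = consume_stats(job_cache)
```

Values are stored as bytes in this sketch so that the same interface could sit in front of a file system or object store without changing the tasks.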
Observability
Observability refers to the feature set in Tilebox Workflows that provides visibility into the execution of tasks and jobs. Tools like tracing and logging allow users to monitor performance, diagnose issues, and gain insights into job operations, enabling efficient troubleshooting and optimization.