Storage Event Triggers
Trigger jobs after objects are created or modified in a storage location
Creating a Storage Event Task
Storage Event Tasks are recurring tasks triggered when objects are created or modified in a storage location. To create a Storage Event task, use tilebox.workflows.recurrent_tasks.StorageEventTask as your task's base class instead of the regular tilebox.workflows.Task.
Storage Locations
Storage Event tasks are triggered when objects are created or modified in a storage location. This location can be a cloud storage bucket or a local file system. Tilebox supports the following storage locations:
Google Cloud Storage
Amazon S3
Local File System
Registering a Storage Location
To make a storage location available within Tilebox workflows, it must be registered first. This involves specifying the location and setting up a notification system that forwards events to Tilebox, enabling task triggering. The setup varies depending on the storage location type.
For example, a GCP storage bucket is integrated by setting up a PubSub Notification with a push subscription. A local file system requires installing a filesystem watcher. To register a storage location with Tilebox, please get in touch.
Listing Available Storage Locations
To list all available storage locations, use the all method on the storage location client.
Reading Files from a Storage Location
Once a storage location is registered, you can read files from it using the read method on the storage client.
The read method instantiates a client for the specific storage location. This requires that the storage location is accessible by a task runner and may require credentials for cloud storage or physical/network access to a locally mounted file system.
To set up authentication and enable access to a GCS storage bucket, check out the Google Client docs for authentication.
Registering a Storage Event Trigger
After implementing a Storage Event task, register it to trigger each time a storage event occurs. This registration submits a new job consisting of a single task instance derived from the registered Storage Event task prototype.
The syntax for specifying glob patterns follows Standard Wildcards. Additionally, you can use ** as a super-asterisk, a matching operator that is not sensitive to slash separators and therefore also matches across directory boundaries.
Here are some examples of valid glob patterns:
Pattern | Matches |
---|---|
*.ext | Any file ending in .ext in the root directory |
**/*.ext | Any file ending in .ext in any subdirectory, but not in the root directory |
**.ext | Any file ending in .ext in any subdirectory, including the root directory |
folder/* | Any file directly in a folder subdirectory |
folder/** | Any file directly or recursively part of a folder subdirectory |
[a-z].txt | Matches a.txt , b.txt , etc. |
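To make the matching rules in the table concrete, here is a minimal sketch of a matcher that behaves as the table describes. It is not Tilebox's actual implementation. The function names glob_to_regex and matches are hypothetical, and the translation rules (a single asterisk stops at slashes, a double asterisk does not) are an assumption derived only from the examples above.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob pattern into a compiled regex (illustrative only)."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if pattern.startswith("**", i):
            # Super-asterisk: matches anything, including slash separators
            parts.append(".*")
            i += 2
        elif ch == "*":
            # Regular asterisk: matches anything except a slash
            parts.append("[^/]*")
            i += 1
        elif ch == "?":
            # Question mark: matches exactly one non-slash character
            parts.append("[^/]")
            i += 1
        elif ch == "[":
            # Character class such as [a-z]; [!abc] negates the class
            end = pattern.find("]", i + 2)
            if end == -1:
                # No closing bracket: treat "[" as a literal character
                parts.append(re.escape(ch))
                i += 1
            else:
                body = pattern[i + 1 : end]
                if body.startswith("!"):
                    body = "^" + body[1:]
                parts.append("[" + body + "]")
                i = end + 1
        else:
            parts.append(re.escape(ch))
            i += 1
    return re.compile("".join(parts))


def matches(pattern: str, object_name: str) -> bool:
    """Return True if the whole object name matches the glob pattern."""
    return glob_to_regex(pattern).fullmatch(object_name) is not None
```

With this sketch, matches("*.ext", "dir/a.ext") is False because the single asterisk cannot cross the slash, while matches("**.ext", "dir/a.ext") is True. The pattern **/*.ext requires at least one slash, which is why it excludes files in the root directory.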
Start a Storage Event Task Runner
With the Storage Event task registered, a job is submitted whenever a storage event occurs. However, unless a task runner is available to execute the Storage Event task, the submitted jobs remain in a task queue. Once an eligible task runner becomes available, all jobs in the queue are executed.
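The queueing behavior described here can be pictured with a small, self-contained simulation. It does not use the Tilebox client. The Job and JobQueue classes and the object names are hypothetical stand-ins, meant only to show that submitted jobs wait until a runner picks them up and are then executed in order.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Job:
    """Hypothetical stand-in for a job submitted by a storage event."""
    object_name: str


@dataclass
class JobQueue:
    """Hypothetical stand-in for the task queue that holds submitted jobs."""
    pending: deque = field(default_factory=deque)

    def submit(self, job: Job) -> None:
        # A storage event submits a job, whether or not a runner is online
        self.pending.append(job)

    def run_available(self, runner: Callable[[Job], None]) -> int:
        # Once a runner is available, it drains every queued job in order
        count = 0
        while self.pending:
            runner(self.pending.popleft())
            count += 1
        return count


queue = JobQueue()

# Two storage events arrive while no task runner is running
queue.submit(Job("scene-1.tif"))
queue.submit(Job("scene-2.tif"))
queued_before = len(queue.pending)

# A runner comes online and executes everything that was waiting
processed = []
executed = queue.run_available(lambda job: processed.append(job.object_name))
```

In this sketch, both jobs remain in pending until run_available is called, after which the queue is empty and processed lists the object names in submission order.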
Triggering an Event
Creating an object in the bucket where the task is registered results in a job being submitted:
Inspecting the task runner output reveals that the job was submitted and the task executed:
Inspecting in the Console
The Tilebox Console provides an easy way to inspect all registered storage event tasks.
Deleting Storage Event triggers
To delete a registered storage event task, use recurrent_tasks.delete. After deletion, the storage event trigger submits no new jobs. Jobs that were already triggered remain queued.
Submitting Storage Event jobs manually
You can submit Storage Event tasks as regular tasks for testing purposes or as part of a larger workflow. To do so, instantiate the task with a specific storage location and object name using the once method.