Async support
Tilebox offers a standard synchronous API by default, but also gives you the option of an async client if you need it.
Why async?
When interacting with external datasets, such as Tilebox datasets, loading data
can often take a little while. One way to speed up this process is to run those requests in parallel. This can be achieved
by multi-threading or multi-processing, but that is not always the easiest approach. An alternative is
to perform data loading tasks in an async manner, leveraging coroutines and asyncio.
Switching to an async datasets client
Typically, all you need to do is swap out the import statement of the Client
and you're good to go. Check out
the example below to see how that works.
Once you have switched to the async client, you can use the async and await keywords to make your code async.
Check out the examples below to see how that works.
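As a minimal sketch of the calling pattern, the snippet below uses a hypothetical `FakeAsyncClient` with a made-up `load` method in place of the real Tilebox client, so it runs with the standard library alone. The import path mentioned in the first comment is an assumption; check the Tilebox reference for the exact module name.

```python
import asyncio

# Assumption: with the real library, the async client is imported from an
# async-specific module (for example tilebox.datasets.aio) instead of
# tilebox.datasets. The class below is only a stand-in, not the Tilebox API.


class FakeAsyncClient:
    """Hypothetical stand-in for an async datasets client."""

    async def load(self, collection: str) -> list[str]:
        # Simulate a network round trip without blocking the event loop.
        await asyncio.sleep(0.05)
        return [f"{collection}-datapoint-{i}" for i in range(3)]


async def main() -> list[str]:
    client = FakeAsyncClient()
    # Calling an async method returns a coroutine; await it to get the result.
    return await client.load("collection-a")


# In a plain script, asyncio.run starts an event loop and runs main() to completion.
datapoints = asyncio.run(main())
print(datapoints)
```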
Jupyter notebooks and similar interactive environments also support asynchronous code execution. You can even use await some_async_call() as the output of a code cell.
Benefits
The main benefit of using an async client is that you can run requests concurrently, which can improve performance. This is especially useful when you are loading data from several different collections. Check out the example below to see how that works.
Example: Fetching data concurrently
The following example fetches data from different collections. In the synchronous example, it fetches the data sequentially, whereas in the async example it fetches the data concurrently. This means that the async approach is faster for such use cases.
The output is shown below. As you can see, the async approach is 5 seconds faster. If you have show_progress enabled, the progress bars are updated concurrently. In this example the second collection contains less data than the first one, so it finishes first.
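The sequential versus concurrent behavior described above can be sketched with the standard library alone. This is not Tilebox code: the collection names, the `load_sync` and `load_async` helpers, and the delays are all hypothetical, and the delays are simulated with `time.sleep` and `asyncio.sleep` rather than real requests.

```python
import asyncio
import time

# Hypothetical collections with simulated load times in seconds (not real data).
SIMULATED_DELAYS = {"collection-a": 0.3, "collection-b": 0.2}


def load_sync(name: str) -> str:
    time.sleep(SIMULATED_DELAYS[name])  # blocks until this request is done
    return f"{name}: loaded"


async def load_async(name: str) -> str:
    await asyncio.sleep(SIMULATED_DELAYS[name])  # lets other coroutines run while waiting
    return f"{name}: loaded"


def fetch_sequentially() -> list[str]:
    # One request after the other: total time is roughly the sum of the delays.
    return [load_sync(name) for name in SIMULATED_DELAYS]


async def fetch_concurrently() -> list[str]:
    # gather starts all coroutines together and returns results in input order.
    return await asyncio.gather(*(load_async(name) for name in SIMULATED_DELAYS))


start = time.perf_counter()
sync_results = fetch_sequentially()
sync_elapsed = time.perf_counter() - start

start = time.perf_counter()
async_results = asyncio.run(fetch_concurrently())
async_elapsed = time.perf_counter() - start

print(f"sequential: {sync_elapsed:.2f}s, concurrent: {async_elapsed:.2f}s")
```

With these simulated delays, the sequential run takes roughly the sum of both waits, while the concurrent run takes roughly as long as the slowest one. The same reasoning applies to real requests as long as most of the time is spent waiting on the network.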
Async workflows
The Tilebox workflows Python client doesn't offer an async client. That's because workflows are already designed to be
executed in a distributed and concurrent fashion, outside of the context of a single async event loop.
Within a single task execution, however, you may still want to use async code to get the benefits of async execution, such
as loading data in parallel. This is straightforward to achieve by wrapping your async code in asyncio.run.
Below is an example of how you can leverage async code within a workflow task.
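A minimal sketch of this pattern follows, assuming a task exposes a synchronous `execute` method. `FetchDataTask`, `load_collection`, and `load_all` are hypothetical names used for illustration; in a real workflow the class would derive from the task base class of the workflows client and the loader would call the async datasets client.

```python
import asyncio


async def load_collection(name: str) -> str:
    # Hypothetical placeholder for an awaitable data-loading call.
    await asyncio.sleep(0.05)
    return f"{name}: loaded"


async def load_all(names: list[str]) -> list[str]:
    # Load all collections concurrently; results keep the input order.
    return await asyncio.gather(*(load_collection(name) for name in names))


class FetchDataTask:
    """Hypothetical task stand-in with a synchronous execute method."""

    def execute(self) -> list[str]:
        # The task body is synchronous, so asyncio.run creates a fresh event
        # loop, runs the coroutine to completion, and closes the loop again.
        return asyncio.run(load_all(["collection-a", "collection-b"]))


results = FetchDataTask().execute()
print(results)
```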
If you encounter an error like RuntimeError: asyncio.run() cannot be called from a running event loop, it means you are trying to start another asyncio event loop (with asyncio.run) from within an already running event loop. One situation where this can easily occur is if you are using asyncio.run in a Jupyter notebook, since Jupyter automatically starts an event loop for you. One way to work around this is to use nest-asyncio.
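The error can be reproduced with the standard library alone, as in the sketch below. It does not use nest-asyncio; it shows the alternative of awaiting the coroutine directly when a loop is already running, which is what a notebook cell can do. `load_data` and `inside_running_loop` are hypothetical names.

```python
import asyncio


async def load_data() -> str:
    # Hypothetical placeholder for an awaitable data-loading call.
    await asyncio.sleep(0)
    return "data"


async def inside_running_loop() -> tuple[str, str]:
    # This coroutine runs inside an event loop, similar to a notebook cell.
    coro = load_data()
    try:
        asyncio.run(coro)  # tries to start a second loop and fails
        message = ""
    except RuntimeError as err:
        coro.close()  # avoid a "coroutine was never awaited" warning
        message = str(err)

    # Inside a running loop, await the coroutine instead of calling asyncio.run.
    result = await load_data()
    return message, result


message, result = asyncio.run(inside_running_loop())
print(message)
print(result)
```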