Loading Time Series Data
Learn how to load data from Time Series Dataset collections.
Overview
This section provides an overview of the API for loading data from a collection. It includes usage examples for many common scenarios.
Method | API Reference | Description |
---|---|---|
collection.load | Loading data | Load data points from a collection. |
collection.find | Loading a data point | Find a specific data point in a collection by its ID. |
Check out the examples below for common scenarios when loading data from collections. The examples assume you have already created a client and accessed a specific dataset collection.
Loading data
To load data points from a dataset collection, use the load method. It requires a time_or_interval parameter that specifies the time or time interval to load.
TimeInterval
To load data for a specific time interval, pass a tuple of the form (start, end) as the time_or_interval parameter. Both start and end must be TimeScalars, which can be datetime objects or strings in ISO 8601 format.
The optional show_progress parameter displays a tqdm progress bar while loading data.
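As a minimal sketch of the call described above, the hypothetical helper below forwards a (start, end) tuple to load and enables the progress bar. The helper name load_interval is an assumption made for illustration, and collection is assumed to be a dataset collection you have already accessed through a client. Only load, its time_or_interval argument, and show_progress come from the text above.

```python
from datetime import datetime


def load_interval(collection, start, end, show_progress=True):
    """Load all data points between start and end.

    `collection` is assumed to be an already accessed dataset collection.
    `start` and `end` are TimeScalars: datetime objects or ISO 8601 strings.
    """
    # The tuple is passed as the time_or_interval argument.
    return collection.load((start, end), show_progress=show_progress)


# Two ways to describe the same interval; either tuple could be
# passed as time_or_interval.
as_datetimes = (datetime(2023, 1, 1), datetime(2023, 2, 1))
as_strings = ("2023-01-01", "2023-02-01")
```

With a real collection, this would be called as load_interval(collection, *as_strings).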
A time interval specified as a tuple is interpreted as a half-closed interval. This means the start time is inclusive and the end time is exclusive. For instance, an end time of 2023-01-01 includes data points up to 2022-12-31 23:59:59.999, but excludes those from 2023-01-01 00:00:00.000. This behavior mimics the Python range function and is useful for chaining time intervals.
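To illustrate why half-closed intervals chain cleanly, here is a small sketch that splits one interval into consecutive chunks using only the standard library. The function name split_interval is hypothetical. Because each chunk ends exactly where the next one begins, no data point would fall into two chunks or be skipped between them.

```python
from datetime import datetime, timedelta


def split_interval(start, end, step):
    """Split [start, end) into consecutive half-closed chunks of at most `step`."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    chunks = []
    chunk_start = start
    while chunk_start < end:
        chunk_end = min(chunk_start + step, end)
        chunks.append((chunk_start, chunk_end))
        chunk_start = chunk_end  # the next chunk starts exactly where this one ended
    return chunks


chunks = split_interval(datetime(2023, 1, 1), datetime(2023, 1, 22), timedelta(days=7))
for chunk_start, chunk_end in chunks:
    # Each (chunk_start, chunk_end) tuple could be passed to load in its own request.
    print(chunk_start.date(), "->", chunk_end.date())
```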
Splitting a large time interval into smaller chunks and loading them in separate requests is typically not necessary, as the datasets client auto-paginates large intervals.
TimeInterval objects
For greater control over the inclusivity of the start and end times, you can pass a TimeInterval dataclass instead of a tuple to the load method. This class lets you specify the start and end times as well as their inclusivity. Here’s an example of creating equivalent TimeInterval objects in two different ways.
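To make the idea of equivalent intervals concrete without depending on the Tilebox package, the sketch below uses a hypothetical stand-in dataclass named IntervalSketch. It is not the library's TimeInterval, and its field names (start_exclusive, end_inclusive) are assumptions. It shows that, at millisecond precision, an exclusive end at 2023-01-01 and an inclusive end one millisecond earlier select the same timestamps.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class IntervalSketch:
    """Hypothetical stand-in for illustration; NOT the Tilebox TimeInterval class."""

    start: datetime
    end: datetime
    start_exclusive: bool = False  # assumed field name
    end_inclusive: bool = False  # assumed field name

    def contains(self, t: datetime) -> bool:
        after_start = t > self.start if self.start_exclusive else t >= self.start
        before_end = t <= self.end if self.end_inclusive else t < self.end
        return after_start and before_end


# Way 1: exclusive end at the start of 2023.
half_closed = IntervalSketch(datetime(2017, 1, 1), datetime(2023, 1, 1))

# Way 2: inclusive end on the last millisecond of 2022.
closed = IntervalSketch(
    datetime(2017, 1, 1),
    datetime(2023, 1, 1) - timedelta(milliseconds=1),
    end_inclusive=True,
)

# At millisecond resolution both intervals select the same timestamps.
last_ms = datetime(2022, 12, 31, 23, 59, 59, 999000)
print(half_closed.contains(last_ms), closed.contains(last_ms))
```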
Time scalars
You can load all points for a specific time by passing a TimeScalar as the time_or_interval parameter to load. A TimeScalar can be a datetime object or a string in ISO 8601 format. When passed to the load method, it retrieves all data points matching the specified time. Note that the time field of data points in a collection may not be unique, so multiple data points could be returned. If you want to fetch only a single data point, use find instead.
Here’s how to load a data point at a specific time from a collection.
Tilebox uses millisecond precision for timestamps. To load all data points for a specific second, use a time interval request instead. Refer to the examples below for details.
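As a sketch of the difference between a scalar request and a one-second request, the hypothetical helper one_second_interval below builds the half-closed interval covering the whole second that contains a given moment. It uses only the standard library. The comments describing what load would return follow from the text above and are not verified against the service.

```python
from datetime import datetime, timedelta


def one_second_interval(moment):
    """Return the half-closed interval [second, second + 1s) containing `moment`."""
    start = moment.replace(microsecond=0)  # truncate to the whole second
    return (start, start + timedelta(seconds=1))


moment = datetime(2023, 5, 1, 13, 45, 33, 250000)
interval = one_second_interval(moment)

# Passing `moment` as time_or_interval would match only data points whose
# time is exactly 13:45:33.250, whereas passing `interval` would cover
# every data point within the second 13:45:33.
print(interval)
```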
The output of the load method is an xarray.Dataset object. To learn more about Xarray, visit the dedicated Xarray page.
Time iterables
You can specify a time interval by passing an iterable of TimeScalars as the time_or_interval parameter. This is especially useful when you want to use the output of a previous load call as the input for another load.
This feature works by constructing a TimeInterval object from the first and last elements of the iterable, making both the start and end time inclusive.
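The behavior described above can be sketched in plain Python. The function interval_from_iterable is hypothetical and only mirrors the description: it takes the first and last elements as given, without sorting, and treats both as inclusive bounds. Raising ValueError on an empty iterable is a choice made for this sketch, not documented behavior.

```python
from datetime import datetime


def interval_from_iterable(times):
    """Return (start, end) taken from the first and last elements of `times`.

    Both ends are meant to be treated as inclusive.
    """
    iterator = iter(times)
    try:
        first = next(iterator)
    except StopIteration:
        raise ValueError("expected at least one time value") from None
    last = first
    for last in iterator:
        pass  # keep advancing; `last` ends up as the final element
    return first, last


times = [datetime(2023, 1, 1), datetime(2023, 1, 2), datetime(2023, 1, 5)]
start, end = interval_from_iterable(times)
print(start, end)
```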
Fetching only metadata
Sometimes it is useful to load only the time series metadata without the actual data fields. To do so, set the skip_data parameter to True when calling load. Here’s an example.
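A minimal sketch of such a call follows. The wrapper name load_metadata_only is hypothetical, and collection is assumed to be a dataset collection you have already accessed. The load method and its skip_data parameter are taken from the text above.

```python
def load_metadata_only(collection, time_or_interval):
    """Load data points for the given time or interval without their data fields."""
    return collection.load(time_or_interval, skip_data=True)


# Example interval for January 2023; with a real collection this would be
# used as: load_metadata_only(collection, january)
january = ("2023-01-01", "2023-02-01")
```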
Empty response
The load method always returns an xarray.Dataset object, even if there are no data points for the specified time. In such cases, the returned dataset is empty and no error is raised.
Timezone handling
When a TimeScalar is specified as a string, the time is treated as UTC. If you want to load data for a specific time in another timezone, use a datetime object. In this case, the Tilebox API converts the datetime to UTC before making the request. The output always contains UTC timestamps, which need to be converted again if a different timezone is required.
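The conversions involved can be sketched with the standard library alone. The example uses a fixed UTC+02:00 offset chosen purely for illustration. It shows the UTC instant that a timezone-aware datetime corresponds to, and how a UTC timestamp from the output could be converted back to local time.

```python
from datetime import datetime, timedelta, timezone

# A local time at UTC+02:00 (fixed offset chosen for illustration).
local_tz = timezone(timedelta(hours=2))
local_time = datetime(2023, 6, 1, 14, 30, tzinfo=local_tz)

# The equivalent UTC instant, which is what the request would be based on.
as_utc = local_time.astimezone(timezone.utc)

# Converting a UTC timestamp from the output back to the local timezone.
back_to_local = as_utc.astimezone(local_tz)

print(as_utc.isoformat())         # 2023-06-01T12:30:00+00:00
print(back_to_local.isoformat())  # 2023-06-01T14:30:00+02:00
```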
Loading a data point by ID
If you know the ID of the data point you want to load, you can use the collection.find method. It always returns a single data point, or raises an exception if no data point with the specified ID exists.
Since find returns only a single data point, the output dataset does not include a time dimension.
You can also set the skip_data parameter when calling find to load only the metadata of the data point, the same as for load.
Possible exceptions
NotFoundError: raised if no data point with the given ID is found in the collection
ValueError: raised if the specified datapoint_id is not a valid UUID
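If IDs come from user input, it can help to check them before calling find. The sketch below is one possible client-side pre-check using Python's uuid module. The names is_valid_uuid and find_checked are hypothetical, and passing datapoint_id positionally to find is an assumption. Note that uuid.UUID is lenient (for example, it accepts 32 hex digits without hyphens), so the service may be stricter than this check.

```python
import uuid


def is_valid_uuid(datapoint_id):
    """Return True if `datapoint_id` parses as a UUID string."""
    try:
        uuid.UUID(str(datapoint_id))
    except ValueError:
        return False
    return True


def find_checked(collection, datapoint_id, skip_data=False):
    """Validate the ID locally, then delegate to collection.find.

    `collection` is assumed to be an already accessed dataset collection.
    """
    if not is_valid_uuid(datapoint_id):
        raise ValueError(f"not a valid UUID: {datapoint_id!r}")
    return collection.find(datapoint_id, skip_data=skip_data)


print(is_valid_uuid("0186d6b6-66cc-fcfd-91df-bbbff72499c3"))
print(is_valid_uuid("not-a-uuid"))
```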