Overview

This section provides an overview of the API for loading data from a collection. It includes usage examples for many common scenarios.

Method           API Reference         Description
collection.load  Loading data          Load data points from a collection.
collection.find  Loading a data point  Find a specific data point in a collection by its ID.

Check out the examples below for common scenarios when loading data from collections. The examples assume you have already created a client and accessed a specific dataset collection.

Loading data

To load data points from a dataset collection, use the load method. It requires a time_or_interval parameter to specify the time or time interval for loading.

TimeInterval

To load data for a specific time interval, use a tuple in the form (start, end) as the time_or_interval parameter. Both start and end must be TimeScalars, which can be datetime objects or strings in ISO 8601 format.

Output
<xarray.Dataset> Size: 725MB
Dimensions:                (time: 1109597, latlon: 2)
Coordinates:
    ingestion_time         (time) datetime64[ns] 9MB 2024-06-21T11:03:33.8524...
    id                     (time) <U36 160MB '01595763-bae7-a646-99a5-d7f40d7...
  * time                   (time) datetime64[ns] 9MB 2017-01-01T00:17:50.8230...
  * latlon                 (latlon) <U9 72B 'latitude' 'longitude'
Data variables: (12/30)
    granule_name           (time) object 9MB 'S1A_IW_RAW__0SSV_20170101T00175...
    processing_level       (time) <U2 9MB 'L0' 'L0' 'L0' 'L0' ... 'L0' 'L0' 'L0'
    satellite              (time) object 9MB 'SENTINEL-1' ... 'SENTINEL-1'
    ...                     ...

The show_progress parameter is optional and can be used to display a tqdm progress bar while loading data.

A time interval specified as a tuple is interpreted as a half-closed interval. This means the start time is inclusive, and the end time is exclusive. For instance, using an end time of 2023-01-01 includes data points up to 2022-12-31 23:59:59.999, but excludes those from 2023-01-01 00:00:00.000. This behavior mimics the Python range function and is useful for chaining time intervals.
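The chaining behavior can be sketched with the standard library alone. This is an illustration of half-closed interval semantics, not Tilebox code: the helpers yearly_intervals and in_half_open are hypothetical names introduced here, and the commented-out loop assumes a collection object as described earlier on this page.

```python
from datetime import datetime, timezone

UTC = timezone.utc


def yearly_intervals(first_year, last_year):
    """Return consecutive (start, end) tuples, one per calendar year."""
    return [
        (datetime(year, 1, 1, tzinfo=UTC), datetime(year + 1, 1, 1, tzinfo=UTC))
        for year in range(first_year, last_year + 1)
    ]


def in_half_open(moment, interval):
    """True if start <= moment < end (start inclusive, end exclusive)."""
    start, end = interval
    return start <= moment < end


intervals = yearly_intervals(2017, 2022)

# A timestamp exactly on a shared boundary belongs to exactly one chunk,
# so chained intervals neither duplicate nor drop data points.
new_year_2018 = datetime(2018, 1, 1, tzinfo=UTC)
matches = [iv for iv in intervals if in_half_open(new_year_2018, iv)]

# Each tuple could then be passed to load, for example:
# for interval in intervals:
#     data = collection.load(interval)
```

Because the end of each tuple equals the start of the next, the six yearly chunks together cover exactly the same range as ("2017-01-01", "2023-01-01").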

Because the end of one half-closed interval can serve as the start of the next, a large time interval can be split into smaller chunks that are loaded in separate requests without duplicating or missing data points. Typically, this is not necessary, as the datasets client auto-paginates large intervals.

TimeInterval objects

For greater control over the inclusivity of the start and end times, you can pass a TimeInterval dataclass instead of a tuple as the time_or_interval parameter of load. This class lets you specify the start and end times as well as their inclusivity. Here’s an example of creating equivalent TimeInterval objects in two different ways.

Output
Inclusivity is indicated by interval notation: ( and [
[2017-01-01T00:00:00.000 UTC, 2023-01-01T00:00:00.000 UTC)
[2017-01-01T00:00:00.000 UTC, 2022-12-31T23:59:59.999 UTC]
They are equivalent: True
[2017-01-01T00:00:00.000 UTC, 2023-01-01T00:00:00.000 UTC)
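To see why the two intervals printed above are equivalent, the following sketch models the idea with a small stand-in dataclass. SketchInterval and its field names (start_exclusive, end_inclusive) are hypothetical and are not the Tilebox TimeInterval class. The sketch assumes millisecond precision, so an inclusive end can be rewritten as an exclusive end one millisecond later.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

MILLISECOND = timedelta(milliseconds=1)
UTC = timezone.utc


@dataclass(frozen=True)
class SketchInterval:
    """Illustrative stand-in, NOT the Tilebox TimeInterval class."""

    start: datetime
    end: datetime
    start_exclusive: bool = False  # hypothetical field name
    end_inclusive: bool = False  # hypothetical field name

    def normalized(self):
        """Return equivalent half-closed [start, end) bounds at millisecond precision."""
        start = self.start + MILLISECOND if self.start_exclusive else self.start
        end = self.end + MILLISECOND if self.end_inclusive else self.end
        return start, end

    def is_equivalent(self, other):
        """Two intervals are equivalent if they cover the same milliseconds."""
        return self.normalized() == other.normalized()


half_closed = SketchInterval(
    datetime(2017, 1, 1, tzinfo=UTC),
    datetime(2023, 1, 1, tzinfo=UTC),
)
closed = SketchInterval(
    datetime(2017, 1, 1, tzinfo=UTC),
    datetime(2022, 12, 31, 23, 59, 59, 999000, tzinfo=UTC),
    end_inclusive=True,
)
equivalent = half_closed.is_equivalent(closed)
```

Normalizing both forms to the same half-closed bounds is one simple way to compare them. The real class may implement the comparison differently.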

Time scalars

You can load all data points for a specific time by passing a TimeScalar as the time_or_interval parameter of load. A TimeScalar can be a datetime object or a string in ISO 8601 format. When passed to the load method, it retrieves all data points matching the specified time. Note that the time field of data points in a collection may not be unique, so multiple data points could be returned. If you want to fetch only a single data point, use find instead.

Here’s how to load a data point at a specific time from a collection.

Output
<xarray.Dataset> Size: 721B
Dimensions:                (time: 1, latlon: 2)
Coordinates:
    ingestion_time         (time) datetime64[ns] 8B 2024-08-01T08:53:08.450499
    id                     (time) <U36 144B '01910b3c-8552-7671-3345-b902cc08...
  * time                   (time) datetime64[ns] 8B 2024-08-01T00:00:01.362000
  * latlon                 (latlon) <U9 72B 'latitude' 'longitude'
Data variables: (12/30)
    granule_name           (time) object 8B 'S1A_IW_RAW__0SDV_20240801T000001...
    processing_level       (time) <U2 8B 'L0'
    satellite              (time) object 8B 'SENTINEL-1'
    ...                     ...

Tilebox uses millisecond precision for timestamps. To load all data points within a specific second, use a time interval request instead of a time scalar. See the time interval section for details.
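As a minimal sketch of such a request, the helper below builds a half-closed interval covering one whole second. The function name second_interval is hypothetical, and the commented-out call assumes a collection object as described earlier on this page.

```python
from datetime import datetime, timedelta, timezone


def second_interval(moment):
    """Return a (start, end) tuple spanning the full second containing moment."""
    start = moment.replace(microsecond=0)
    return start, start + timedelta(seconds=1)


# Timestamp of the data point shown in the output above.
point_time = datetime(2024, 8, 1, 0, 0, 1, 362000, tzinfo=timezone.utc)
start, end = second_interval(point_time)

# The tuple is half-closed: start is included, end is excluded.
# data = collection.load((start, end))
```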

The output of the load method is an xarray.Dataset object. To learn more about Xarray, visit the dedicated Xarray page.

Time iterables

You can specify a time interval by using an iterable of TimeScalars as the time_or_interval parameter. This is especially useful when you want to use the output of a previous load call as input for another load. Here’s how that works.

Output
<xarray.Dataset> Size: 33kB
Dimensions:                (time: 50, latlon: 2)
Coordinates:
    ingestion_time         (time) datetime64[ns] 400B 2024-06-21T11:03:33.852...
    id                     (time) <U36 7kB '01595763-bae7-a646-99a5-d7f40d7e6...
  * time                   (time) datetime64[ns] 400B 2017-01-01T00:17:50.823...
  * latlon                 (latlon) <U9 72B 'latitude' 'longitude'
Data variables: (12/30)
    granule_name           (time) object 400B 'S1A_IW_RAW__0SSV_20170101T0017...
    processing_level       (time) <U2 400B 'L0' 'L0' 'L0' ... 'L0' 'L0' 'L0'
    satellite              (time) object 400B 'SENTINEL-1' ... 'SENTINEL-1'
    ...                     ...

This feature works by constructing a TimeInterval object from the first and last elements of the iterable, making both the start and end time inclusive.
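The following sketch mirrors that description using plain datetime values. It is not the client's implementation: closed_interval_from is a hypothetical helper, and the in-memory list stands in for the data points stored in a collection.

```python
from datetime import datetime, timedelta, timezone


def closed_interval_from(times):
    """Return (first, last) of an iterable of timestamps, both meant as inclusive."""
    ordered = list(times)
    if not ordered:
        raise ValueError("expected at least one timestamp")
    return ordered[0], ordered[-1]


base = datetime(2017, 1, 1, tzinfo=timezone.utc)
stored = [base + timedelta(hours=h) for h in range(6)]  # stand-in for a collection

previous_result = stored[1:4]  # e.g. the time values of an earlier load
start, end = closed_interval_from(previous_result)

# Both bounds are inclusive, so the first and last timestamps are selected again.
selected = [t for t in stored if start <= t <= end]
```

With a half-closed interval the last element would be dropped, which is why both bounds are treated as inclusive here.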

Fetching only metadata

Sometimes, it may be useful to load only the time series metadata without the actual data fields. This can be done by setting the skip_data parameter to True when using load. Here’s an example.

Output
<xarray.Dataset> Size: 160B
Dimensions:         (time: 1)
Coordinates:
    ingestion_time  (time) datetime64[ns] 8B 2024-08-01T08:53:08.450499
    id              (time) <U36 144B '01910b3c-8552-7671-3345-b902cc0813f3'
  * time            (time) datetime64[ns] 8B 2024-08-01T00:00:01.362000
Data variables:
    *empty*

Empty response

The load method always returns an xarray.Dataset object, even if there are no data points for the specified time. In such cases, the returned dataset will be empty, but no error will be raised.

Output
<xarray.Dataset> Size: 0B
Dimensions:  ()
Data variables:
    *empty*

Timezone handling

When a TimeScalar is specified as a string, the time is treated as UTC. If you want to load data for a specific time in another timezone, use a datetime object. In this case, the Tilebox API will convert the datetime to UTC before making the request. The output will always contain UTC timestamps, which will need to be converted again if a different timezone is required.

Output
2017-01-01 11:45:25.679000+09:00
<xarray.Dataset> Size: 725B
Dimensions:                (time: 1, latlon: 2)
Coordinates:
    ingestion_time         (time) datetime64[ns] 8B 2024-06-21T11:03:33.852435
    id                     (time) <U36 144B '015957ea-d82f-e454-34ab-a87603ee...
  * time                   (time) datetime64[ns] 8B 2017-01-01T02:45:25.679000
  * latlon                 (latlon) <U9 72B 'latitude' 'longitude'
Data variables:
    ...
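The conversion itself can be reproduced with the standard library. The sketch below uses a fixed +09:00 offset, which is an assumption chosen to match the output above, and shows that the local time and the UTC time denote the same instant.

```python
from datetime import datetime, timedelta, timezone

# Fixed +09:00 offset, chosen to match the local time printed above.
tokyo = timezone(timedelta(hours=9))

local_time = datetime(2017, 1, 1, 11, 45, 25, 679000, tzinfo=tokyo)
as_utc = local_time.astimezone(timezone.utc)

# A string TimeScalar is treated as UTC, so this string denotes the same instant.
from_string = datetime.fromisoformat("2017-01-01T02:45:25.679").replace(tzinfo=timezone.utc)

# Timestamps in the result are UTC; convert back if local time is needed.
back_to_local = as_utc.astimezone(tokyo)

print(local_time)  # 2017-01-01 11:45:25.679000+09:00
print(as_utc)  # 2017-01-01 02:45:25.679000+00:00
```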

Loading a data point by ID

If you know the ID of the data point you want to load, you can use the collection.find method.

This method always returns a single data point or raises an exception if no data point with the specified ID exists. Here’s how to do this.

Output
<xarray.Dataset> Size: 725B
Dimensions:                (latlon: 2)
Coordinates:
    ingestion_time         datetime64[ns] 8B 2024-08-20T05:53:08.600528
    id                     <U36 144B '01916d89-ba23-64c9-e383-3152644bcbde'
    time                   datetime64[ns] 8B 2024-08-20T02:07:08.323000
  * latlon                 (latlon) <U9 72B 'latitude' 'longitude'
Data variables: (12/30)
    granule_name           object 8B 'S1A_IW_RAW__0SDV_20240820T020708_202408...
    processing_level       <U2 8B 'L0'
    satellite              object 8B 'SENTINEL-1'
    ...                     ...

Since find returns only a single data point, the output dataset does not include a time dimension.

You can also set the skip_data parameter when calling find to load only the metadata of the data point, just as with load.

Possible exceptions

  • NotFoundError: raised if no data point with the given ID is found in the collection
  • ValueError: raised if the specified datapoint_id is not a valid UUID
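If IDs come from user input, it can help to validate them before calling find. The sketch below uses only Python's uuid module; is_valid_uuid is a hypothetical helper and is not part of the Tilebox client. Note that uuid.UUID also accepts a few alternative spellings, such as IDs without hyphens.

```python
import uuid


def is_valid_uuid(datapoint_id):
    """Return True if the string can be parsed as a UUID."""
    try:
        uuid.UUID(datapoint_id)
    except ValueError:
        return False
    return True


known_id = "01916d89-ba23-64c9-e383-3152644bcbde"  # ID from the output above
valid = is_valid_uuid(known_id)
invalid = is_valid_uuid("not-a-uuid")
```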