Overview

Here is a quick overview of the API for loading data from a collection, which is covered on this page. Usage examples for common use cases are provided below.

| Method | API Reference | Description |
| --- | --- | --- |
| collection.load | Loading data | Load data points from a collection. |
| collection.find | Loading a data point | Load a specific data point from a collection. |

Check out the examples below for some common use cases when loading data from collections. The examples assume that you have already created a client and accessed a specific dataset collection.
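
If you have not done that yet, a minimal setup might look like the following sketch. The token, dataset, and collection names shown here are placeholders; substitute your own.

```python
from tilebox.datasets import Client

# create a client authenticated with your API key (placeholder value below)
client = Client(token="YOUR_TILEBOX_API_KEY")

# access a dataset and one of its collections
# (the dataset and collection names are examples; adjust them to your own)
datasets = client.datasets()
dataset = datasets.open_data.copernicus.sentinel1_sar
collection = dataset.collection("S1A_IW_RAW__0S")
```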

Loading data

The load method of a dataset collection object can be used to load data points from a collection. It takes a time_or_interval parameter that defines the time or time interval for which data should be loaded.

Time scalars

One common operation is to load all points for a given time, represented by a TimeScalar. A TimeScalar is either a datetime object or a string in ISO 8601 format. When you pass a TimeScalar to the load method, it loads all data points that match the specified time. Since the time field of data points in a collection is not unique, this can result in many data points being returned. If you only want to load a single data point, use find instead.

Check out the example below to see how to load a data point at a specific time from a collection.
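
A sketch of such a request, using the collection set up earlier and a timestamp matching the output shown below:

```python
# load all data points whose time matches the given TimeScalar
data = collection.load("2024-08-01T00:00:01.362")
print(data)
```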

Output
<xarray.Dataset> Size: 721B
Dimensions:                (time: 1, latlon: 2)
Coordinates:
    ingestion_time         (time) datetime64[ns] 8B 2024-08-01T08:53:08.450499
    id                     (time) <U36 144B '01910b3c-8552-7671-3345-b902cc08...
  * time                   (time) datetime64[ns] 8B 2024-08-01T00:00:01.362000
  * latlon                 (latlon) <U9 72B 'latitude' 'longitude'
Data variables: (12/30)
    granule_name           (time) object 8B 'S1A_IW_RAW__0SDV_20240801T000001...
    processing_level       (time) <U2 8B 'L0'
    satellite              (time) object 8B 'SENTINEL-1'
    flight_direction       (time) <U9 36B 'ASCENDING'
    product_type           (time) object 8B 'IW_RAW__0S'
    copernicus_id          (time) <U36 144B 'aaa523a5-b8c3-4de9-bd11-3d587ac6...
    ...                     ...
    acquisition_mode       (time) <U2 8B 'IW'
    bands                  (time) int64 8B 0
    path                   (time) int64 8B 0
    row                    (time) int64 8B 0
    sun_azimuth            (time) float64 8B 0.0
    sun_elevation          (time) float64 8B 0.0

Tilebox uses millisecond precision for timestamps. If you want to load all data points for a specific second, that is already a time interval request, so take a look at the examples below to learn how to achieve that.

The output of the preceding load method is an xarray.Dataset object. If you are unfamiliar with Xarray, you can find out more about it on the dedicated Xarray page.

Fetching only metadata

For certain use cases it can be useful to only load the time series metadata of data points without loading the actual data fields. This can be achieved by setting the skip_data parameter to True when calling load. Check out the example below to see this in action.
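
For example, a sketch of the same request as before, but with skip_data set:

```python
# load only the time series metadata (time, id, ingestion_time) for the given time
data = collection.load("2024-08-01T00:00:01.362", skip_data=True)
print(data)
```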

Output
<xarray.Dataset> Size: 160B
Dimensions:         (time: 1)
Coordinates:
    ingestion_time  (time) datetime64[ns] 8B 2024-08-01T08:53:08.450499
    id              (time) <U36 144B '01910b3c-8552-7671-3345-b902cc0813f3'
  * time            (time) datetime64[ns] 8B 2024-08-01T00:00:01.362000
Data variables:
    *empty*

Empty response

load always returns an xarray.Dataset object, even if no data points were found for the specified time. In that case the returned dataset is empty, but no error is raised.
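
For instance, requesting a time for which the collection holds no data (the timestamp below is an arbitrary example) yields an empty dataset:

```python
# a time for which the collection contains no data points
data = collection.load("1997-02-06T10:21:00")
print(data)
```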

Output
<xarray.Dataset> Size: 0B
Dimensions:  ()
Data variables:
    *empty*

Timezone handling

Whenever a TimeScalar is specified as a string, the time is interpreted as UTC. If you want to load data for a specific time in a different timezone, you can use a datetime object instead. In that case the Tilebox API internally converts the specified datetime to UTC before making a request. The output also always contains UTC timestamps, which need to be converted manually if you want them in a different timezone.
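
A sketch of such a request, using a timezone-aware datetime in UTC+9 that corresponds to the UTC timestamp shown in the output below:

```python
from datetime import datetime, timedelta, timezone

# 11:45:25.679 in UTC+9 corresponds to 02:45:25.679 in UTC
tokyo_time = datetime(2017, 1, 1, 11, 45, 25, 679000, tzinfo=timezone(timedelta(hours=9)))
print(tokyo_time)

data = collection.load(tokyo_time)
print(data)  # the time coordinate in the output is reported in UTC
```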

Output
2017-01-01 11:45:25.679000+09:00
<xarray.Dataset> Size: 725B
Dimensions:                (time: 1, latlon: 2)
Coordinates:
    ingestion_time         (time) datetime64[ns] 8B 2024-06-21T11:03:33.852435
    id                     (time) <U36 144B '015957ea-d82f-e454-34ab-a87603ee...
  * time                   (time) datetime64[ns] 8B 2017-01-01T02:45:25.679000
  * latlon                 (latlon) <U9 72B 'latitude' 'longitude'
Data variables:
    ...

Time intervals

Another common operation is to load data for a specific time interval. This can be done using the load method of the dataset collection object.

TimeInterval tuples

One common way of achieving this is to use a tuple of the form (start, end) as the time_or_interval parameter. The start and end parameters are again TimeScalars and can be specified as either a datetime object or a string in ISO 8601 format.
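
For example, a sketch of loading all data points between two dates:

```python
# load all data points in the half-closed interval [2017-01-01, 2023-01-01)
interval = ("2017-01-01", "2023-01-01")
data = collection.load(interval, show_progress=True)
print(data)
```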

Output
<xarray.Dataset> Size: 725MB
Dimensions:                (time: 1109597, latlon: 2)
Coordinates:
    ingestion_time         (time) datetime64[ns] 9MB 2024-06-21T11:03:33.8524...
    id                     (time) <U36 160MB '01595763-bae7-a646-99a5-d7f40d7...
  * time                   (time) datetime64[ns] 9MB 2017-01-01T00:17:50.8230...
  * latlon                 (latlon) <U9 72B 'latitude' 'longitude'
Data variables: (12/30)
    granule_name           (time) object 9MB 'S1A_IW_RAW__0SSV_20170101T00175...
    processing_level       (time) <U2 9MB 'L0' 'L0' 'L0' 'L0' ... 'L0' 'L0' 'L0'
    satellite              (time) object 9MB 'SENTINEL-1' ... 'SENTINEL-1'
    flight_direction       (time) <U10 44MB 'ASCENDING' ... 'ASCENDING'
    product_type           (time) object 9MB 'IW_RAW__0S' ... 'IW_RAW__0S'
    copernicus_id          (time) <U36 160MB 'f3f6ec28-0f72-5d28-9e14-93f96b3...
    ...                     ...
    acquisition_mode       (time) <U2 9MB 'IW' 'IW' 'IW' 'IW' ... 'IW' 'IW' 'IW'
    bands                  (time) int64 9MB 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0
    path                   (time) int64 9MB 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0
    row                    (time) int64 9MB 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0
    sun_azimuth            (time) float64 9MB 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0
    sun_elevation          (time) float64 9MB 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0

The show_progress parameter is optional and can be used to display a tqdm progress bar while the data is being loaded.

A time interval specified as a tuple is always interpreted as a half-closed interval. This means that the start time is inclusive and the end time is exclusive. For example, given the preceding end time of 2023-01-01, data points with a time of 2022-12-31 23:59:59.999 would still be included, but data points from 2023-01-01 00:00:00.000 onward would not be part of the result. This mimics the behaviour of the Python range function and is especially useful when chaining time intervals together. For example, the following code fetches the exact same data as the preceding example.
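
A sketch of how such chaining might look, loading the interval year by year and concatenating the results with xarray:

```python
import xarray as xr

chunks = []
for year in range(2017, 2023):
    # each yearly interval is half-closed, so consecutive chunks never overlap
    chunk = collection.load((f"{year}-01-01", f"{year + 1}-01-01"), show_progress=True)
    chunks.append(chunk)

data = xr.concat(chunks, dim="time")
```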

This code example shows a way to manually split up a large time interval into smaller chunks and load the data in separate requests. Typically this is not necessary, as the API automatically splits up large intervals into different requests and paginates them for you. But it demonstrates how the half-closed interval behaviour can be useful, since it guarantees that there are no duplicate data points when chaining requests in that manner.

TimeInterval objects

In case you want more control over whether the start and end time are inclusive or exclusive, you can use an object of the TimeInterval dataclass instead of a tuple as the parameter for load. This class allows you to specify the start and end time as well as whether they are inclusive or exclusive. Check out the example below to see two ways of creating an equivalent TimeInterval object.
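
A sketch of constructing the two equivalent intervals. The import path and the exclusivity field names shown here are assumptions and may differ slightly in your SDK version:

```python
from datetime import datetime

from tilebox.datasets.data import TimeInterval  # assumed import path

# [2017-01-01, 2023-01-01): end exclusive
interval1 = TimeInterval(
    start=datetime(2017, 1, 1),
    end=datetime(2023, 1, 1),
    start_exclusive=False,
    end_inclusive=False,
)

# [2017-01-01, 2022-12-31 23:59:59.999]: end inclusive, one millisecond earlier
interval2 = TimeInterval(
    start=datetime(2017, 1, 1),
    end=datetime(2022, 12, 31, 23, 59, 59, 999000),
    start_exclusive=False,
    end_inclusive=True,
)

print("Notice the different end characters ) and ] in the interval notations below:")
print(interval1)
print(interval2)
print("They are equivalent:", interval1 == interval2)

# either interval can be passed to load
data = collection.load(interval1, show_progress=True)
```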

Output
Notice the different end characters ) and ] in the interval notations below:
[2017-01-01T00:00:00.000 UTC, 2023-01-01T00:00:00.000 UTC)
[2017-01-01T00:00:00.000 UTC, 2022-12-31T23:59:59.999 UTC]
They are equivalent: True

Time iterables

Another way of specifying a time interval when loading data is to use an iterable of TimeScalars as the time_or_interval parameter. This can be especially useful when you want to use the output of a previous call to load as the input for another call. Check out the example below to see how this can be done.
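
A sketch of reusing the time coordinate of a previous result as the input of another load call. This assumes the timestamp values in the returned dataset are accepted as TimeScalars:

```python
# first fetch only the metadata for a large interval
meta_data = collection.load(("2017-01-01", "2023-01-01"), skip_data=True)

# then load the full data for the first 50 data points of that result
first_50_times = meta_data.time[:50]
data = collection.load(first_50_times)
print(data)
```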

Output
<xarray.Dataset> Size: 33kB
Dimensions:                (time: 50, latlon: 2)
Coordinates:
    ingestion_time         (time) datetime64[ns] 400B 2024-06-21T11:03:33.852...
    id                     (time) <U36 7kB '01595763-bae7-a646-99a5-d7f40d7e6...
  * time                   (time) datetime64[ns] 400B 2017-01-01T00:17:50.823...
  * latlon                 (latlon) <U9 72B 'latitude' 'longitude'
Data variables: (12/30)
    granule_name           (time) object 400B 'S1A_IW_RAW__0SSV_20170101T0017...
    processing_level       (time) <U2 400B 'L0' 'L0' 'L0' ... 'L0' 'L0' 'L0'
    satellite              (time) object 400B 'SENTINEL-1' ... 'SENTINEL-1'
    flight_direction       (time) <U10 2kB 'ASCENDING' ... 'DESCENDING'
    product_type           (time) object 400B 'IW_RAW__0S' ... 'IW_RAW__0S'
    copernicus_id          (time) <U36 7kB 'f3f6ec28-0f72-5d28-9e14-93f96b342...
    ...                     ...
    acquisition_mode       (time) <U2 400B 'IW' 'IW' 'IW' ... 'IW' 'IW' 'IW'
    bands                  (time) int64 400B 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0
    path                   (time) int64 400B 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0
    row                    (time) int64 400B 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0
    sun_azimuth            (time) float64 400B 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0
    sun_elevation          (time) float64 400B 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0

This capability is implemented by constructing a TimeInterval object from the first and last elements of the iterable, with both the start and end time inclusive.

Loading a datapoint by ID

In case you already know the ID of the data point you want to load, you can use the find method of a collection instead.

It always returns a single data point, or raises an exception if no data point with the specified ID exists in the collection. Check out the example below to see how this can be done.
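
For example, a sketch using the data point ID shown in the output below:

```python
datapoint_id = "01916d89-ba23-64c9-e383-3152644bcbde"  # ID taken from the output below
data = collection.find(datapoint_id)
print(data)
```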

Output
<xarray.Dataset> Size: 725B
Dimensions:                (latlon: 2)
Coordinates:
    ingestion_time         datetime64[ns] 8B 2024-08-20T05:53:08.600528
    id                     <U36 144B '01916d89-ba23-64c9-e383-3152644bcbde'
    time                   datetime64[ns] 8B 2024-08-20T02:07:08.323000
  * latlon                 (latlon) <U9 72B 'latitude' 'longitude'
Data variables: (12/30)
    granule_name           object 8B 'S1A_IW_RAW__0SDV_20240820T020708_202408...
    processing_level       <U2 8B 'L0'
    satellite              object 8B 'SENTINEL-1'
    flight_direction       <U10 40B 'DESCENDING'
    product_type           object 8B 'IW_RAW__0S'
    copernicus_id          <U36 144B 'e4f25888-825d-4c70-87cd-dd5a5ced7930'
    ...                     ...
    acquisition_mode       <U2 8B 'IW'
    bands                  int64 8B 0
    path                   int64 8B 0
    row                    int64 8B 0
    sun_azimuth            float64 8B 0.0
    sun_elevation          float64 8B 0.0

Since find only ever returns a single data point, there is no time dimension in the output dataset.

You can also specify the skip_data parameter when calling find to only load the metadata of the data point, in the same way as with load.
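
For example, a sketch reusing the same data point ID as above:

```python
# only fetch the time series metadata of the data point, without its data fields
data = collection.find(datapoint_id, skip_data=True)
```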

Possible exceptions

  • NotFoundError: if no data point with the given ID was found in the collection
  • ValueError: if the given datapoint_id is not a valid UUID