Tilebox offers a powerful and flexible querying API to access and filter data from your datasets. When querying, you can filter by time, and for spatio-temporal datasets you can optionally also filter by location in the form of a geometry.

Selecting a collection

Querying is always done at the collection level, so to get started, first select a collection to query.

from tilebox.datasets import Client

client = Client()
sentinel2_msi = client.dataset("open_data.copernicus.sentinel2_msi")
collection = sentinel2_msi.collection("S2A_S2MSI2A")

Querying multiple dataset collections at once is a feature already on our roadmap. If you need this functionality, please get in touch so we can let you know as soon as it is available.

Running a query

To query data points from a dataset collection, use the query method, which is available for both Python and Go.

Below is a simple example of querying all Sentinel-2 S2A_S2MSI2A data for April 2025 over the state of Colorado.

from shapely import Polygon
from tilebox.datasets import Client

client = Client()
sentinel2_msi = client.dataset("open_data.copernicus.sentinel2_msi")
collection = sentinel2_msi.collection("S2A_S2MSI2A")

area = Polygon(  # area roughly covering the state of Colorado
    ((-109.05, 41.00), (-109.05, 37.0), (-102.05, 37.0), (-102.05, 41.00), (-109.05, 41.00)),
)
data = collection.query(
    temporal_extent=("2025-04-01", "2025-05-01"),
    spatial_extent=area,
    show_progress=True,
)

To learn more about how to specify filters to narrow down the query results, check out the following sections about filtering by time, by geometry or by datapoint ID.

Automatic pagination

Querying large datasets can return a large number of data points. In those cases, Tilebox automatically handles pagination for you by sending paginated requests to the server.

When using the Python SDK in an interactive notebook environment, you can also display a progress bar to keep track of the query's progress by setting the show_progress parameter to True.
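Conceptually, the SDK keeps requesting the next page of results until the server reports there are none left, then stitches all pages into a single result. The following is a simplified pure-Python sketch of that loop; fetch_page is a hypothetical stand-in for a single paginated server request, not part of the actual Tilebox API:

```python
def fetch_page(cursor):
    # Hypothetical stand-in for one paginated server request.
    # Returns (datapoints, next_cursor), where next_cursor is None
    # once the last page has been reached.
    pages = {
        None: ([1, 2, 3], "page-2"),
        "page-2": ([4, 5], None),
    }
    return pages[cursor]

def query_all():
    # Automatic pagination: follow next_cursor until exhausted,
    # accumulating all pages into one combined result.
    results, cursor = [], None
    while True:
        datapoints, cursor = fetch_page(cursor)
        results.extend(datapoints)
        if cursor is None:
            return results

print(query_all())  # [1, 2, 3, 4, 5]
```

In practice you never write this loop yourself; calling query is enough, and pagination happens transparently behind the scenes.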

Skipping data fields

Sometimes, only the ID or timestamp associated with a datapoint is required. In those cases, you can speed up querying by skipping the download of all dataset fields except time, id, and ingestion_time by setting the skip_data parameter to True.

For example, when checking how many datapoints exist in a given time interval, you can use skip_data=True to avoid loading the data fields.

interval = ("2023-01-01", "2023-02-01")
data = collection.query(temporal_extent=interval, skip_data=True)
print(f"Found {data.sizes['time']} data points.")

Output

<xarray.Dataset> Size: 160B
Dimensions:         (time: 1)
Coordinates:
    ingestion_time  (time) datetime64[ns] 8B 2024-08-01T08:53:08.450499
    id              (time) <U36 144B '01910b3c-8552-7671-3345-b902cc0813f3'
  * time            (time) datetime64[ns] 8B 2024-08-01T00:00:01.362000
Data variables:
    *empty*

Empty response

A query does not raise an error if no data points match the specified filters; instead, an empty result is returned.

In Python, this is an empty xarray.Dataset object; in Go, an empty slice of datapoints. To check for an empty response, coerce the result to a boolean in Python, or check the length of the slice in Go.

timestamp_with_no_data_points = "1997-02-06 10:21:00.000"
data = collection.query(temporal_extent=timestamp_with_no_data_points)
if not data:
    print("No data points found")

Output

No data points found