Collections group data points within a dataset. They help represent logical groupings of data points that are commonly queried together. For example, if your dataset includes data from a specific instrument on different satellites, you can group the data points from each satellite into a collection.

Overview

This section gives a quick overview of the API for listing, creating, and accessing collections. The table below summarizes the available methods; usage examples follow.

Method                              Description
dataset.collections                 List all available collections for a dataset.
dataset.create_collection           Create a collection in a dataset.
dataset.get_or_create_collection    Get a collection, create it if it doesn’t exist.
dataset.collection                  Access an individual collection by its name.
collection.info                     Request information about a collection.

The examples below cover common use cases for working with collections. They assume that you have already created a client and listed the available datasets.

from tilebox.datasets import Client

client = Client()
datasets = client.datasets()

Listing collections

To list the collections for a dataset, use the collections method on the dataset object.

dataset = datasets.open_data.copernicus.landsat8_oli_tirs
collections = dataset.collections()
print(collections)
Output
{'L1GT': Collection L1GT: [2013-03-25T12:08:43.699 UTC, 2024-08-19T12:57:32.456 UTC] (154288 data points),
 'L1T': Collection L1T: [2013-03-26T09:33:19.763 UTC, 2020-08-24T03:21:50.000 UTC] (87958 data points),
 'L1TP': Collection L1TP: [2013-03-24T00:25:55.457 UTC, 2024-08-19T12:58:20.229 UTC] (322041 data points),
 'L2SP': Collection L2SP: [2015-01-01T07:53:35.391 UTC, 2024-08-12T12:52:03.243 UTC] (191110 data points)}

dataset.collections() returns a dictionary mapping collection names to their corresponding collection objects. Each collection has a unique name within its dataset.
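Because the result is an ordinary dictionary keyed by collection name, the usual dictionary operations apply to it. The following is a minimal sketch that uses placeholder strings in place of real collection objects so that it runs without a client; only the keys mirror the output shown above.

```python
# Stand-in for the dictionary returned by dataset.collections().
# The values are placeholder strings, not real collection objects.
collections = {
    "L1GT": "<collection L1GT>",
    "L1T": "<collection L1T>",
    "L1TP": "<collection L1TP>",
    "L2SP": "<collection L2SP>",
}

# Collection names are the dictionary keys.
names = sorted(collections)
print(names)

# A membership test before lookup avoids a KeyError for unknown names.
has_l2sp = "L2SP" in collections
has_sat_x = "Sat-X" in collections

# .get() returns None instead of raising when the name is missing.
missing = collections.get("Sat-X")
```

Looking up an unknown name with collections["Sat-X"] raises a plain KeyError, since at that point you are indexing a dictionary rather than calling the API.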

Creating collections

To create a collection in a dataset, use dataset.create_collection(). This method returns the created collection object.

collection = dataset.create_collection("My-collection")

Alternatively, you can use dataset.get_or_create_collection() to get a collection by its name. If the collection does not exist, it will be created.

collection = dataset.get_or_create_collection("My-collection")
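The get-or-create behavior described above means the call can be repeated safely, for example at the top of a script that runs more than once. The sketch below illustrates that pattern with a toy in-memory class. InMemoryDataset is a hypothetical stand-in written for this example and is not how the library implements the method.

```python
class InMemoryDataset:
    """Hypothetical stand-in for a dataset; keeps collections in a dict."""

    def __init__(self):
        self._collections = {}

    def get_or_create_collection(self, name):
        # Return the existing collection if the name is already known,
        # otherwise register a new one under that name and return it.
        if name not in self._collections:
            self._collections[name] = {"name": name}
        return self._collections[name]


dataset = InMemoryDataset()

# Calling the method twice with the same name yields the same collection
# and does not create a duplicate.
first = dataset.get_or_create_collection("My-collection")
second = dataset.get_or_create_collection("My-collection")
same_object = first is second
collection_count = len(dataset._collections)
```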

Accessing individual collections

After listing a dataset's collections with dataset.collections(), you can look up a specific collection by name in the resulting dictionary. Use collection.info() to get its details: name, availability (the time range covered), and data point count.

collections = dataset.collections()
terrain_correction = collections["L1GT"]
collection_info = terrain_correction.info()
print(collection_info)
Output
L1GT: [2013-03-25T12:08:43.699 UTC, 2024-08-19T12:57:32.456 UTC] (154288 data points)

You can also access a specific collection directly with dataset.collection(), without having to list all collections first.

terrain_correction = dataset.collection("L1GT")
collection_info = terrain_correction.info()
print(collection_info)
Output
L1GT: [2013-03-25T12:08:43.699 UTC, 2024-08-19T12:57:32.456 UTC] (154288 data points)

Errors you may encounter

NotFoundError

If you attempt to access a collection with a non-existent name, a NotFoundError is raised. For example:

dataset.collection("Sat-X").info() # raises NotFoundError: 'No such collection Sat-X'
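If a missing collection is an expected case rather than a bug, you can catch the error and fall back to a default. The sketch below is one way to do that. The NotFoundError class defined here is a stand-in, because the library's import path for it is not shown on this page, and the helper collection_info_or_none and the fake classes are hypothetical names that exist only so the example runs without a client. In real code, pass the library's own exception class as not_found.

```python
class NotFoundError(Exception):
    """Stand-in for the library's NotFoundError (assumption for this sketch)."""


def collection_info_or_none(dataset, name, not_found=NotFoundError):
    """Return dataset.collection(name).info(), or None if the name is unknown."""
    try:
        # Both calls sit inside the try block, so it does not matter
        # which of the two raises the error.
        return dataset.collection(name).info()
    except not_found:
        return None


class _FakeCollection:
    """Fake collection used only to exercise the helper."""

    def __init__(self, name):
        self.name = name

    def info(self):
        return f"{self.name}: (info)"


class _FakeDataset:
    """Fake dataset that raises NotFoundError for unknown names."""

    def __init__(self, names):
        self._names = set(names)

    def collection(self, name):
        if name not in self._names:
            raise NotFoundError(f"No such collection {name}")
        return _FakeCollection(name)


fake = _FakeDataset(["L1GT"])
found = collection_info_or_none(fake, "L1GT")
absent = collection_info_or_none(fake, "Sat-X")
```

Here found holds the info for the existing collection and absent is None. Other exceptions are deliberately not caught, so unrelated failures still surface.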
