Walk through the full process of ingesting GeoParquet data into a Tilebox timeseries dataset, from downloading source files to previewing the results.
This guide is also available as an interactive Google Colab notebook.
This page guides you through the process of ingesting data into a Tilebox dataset. Starting from an existing
dataset available as a file in the GeoParquet format, you’ll ingest that data into Tilebox as a Timeseries dataset.
This example uses a couple of Python packages for reading Parquet files and for visualizing the dataset. Install the
required packages using your preferred package manager. For new projects, Tilebox recommends using uv.
Now you’ll create a Spatio-temporal dataset with the same schema as the given MODIS dataset.
To do so, open the Tilebox Console, navigate to My Datasets, and click Create Dataset. Then select
Spatio-temporal Dataset as the dataset type.
For more information on creating a dataset, see the Creating a dataset guide for
step-by-step instructions.
Now, to match the given MODIS dataset, you’ll specify the following fields:
| Field | Type | Note |
| --- | --- | --- |
| granule_name | string | MODIS granule name |
| end_time | Timestamp | Measurement end time |
| horizontal_tile_number | int64 | Horizontal MODIS tile number (0-35) |
| vertical_tile_number | int64 | Vertical MODIS tile number (0-17) |
| tile_id | int64 | MODIS tile ID |
| file_size | uint64 | File size of the product in bytes |
| checksum | string | Hash checksum of the file |
| checksum_type | string | Checksum algorithm (MD5 / CKSUM) |
| day_night_flag | int64 | Day / Night / Both |
| browse_granule_id | string | Optional granule ID for browsing |
| published_at | Timestamp | The time the product was published |
In the console, this will look like the following:
Your newly created dataset is now available and can be accessed from Python. To do so, you’ll need the dataset slug,
which was assigned automatically based on the specified code_name. To find the slug, navigate to the dataset overview
in the console.
You can now instantiate the dataset client and access the dataset.
```python
from tilebox.datasets import Client

client = Client()
dataset = client.dataset("tilebox.modis")  # replace with your dataset slug
```
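Before ingesting, it can help to confirm that the source data carries every field specified for the dataset. The following is a minimal sketch using only the standard library. The names `EXPECTED_FIELDS` and `missing_fields` are hypothetical helpers introduced here for illustration and are not part of the Tilebox client. The field names and types are copied from the field list in this guide.

```python
# Hypothetical helper, not part of the Tilebox client.
# Field names and types are copied from the field list in this guide.
EXPECTED_FIELDS = {
    "granule_name": "string",
    "end_time": "Timestamp",
    "horizontal_tile_number": "int64",
    "vertical_tile_number": "int64",
    "tile_id": "int64",
    "file_size": "uint64",
    "checksum": "string",
    "checksum_type": "string",
    "day_night_flag": "int64",
    "browse_granule_id": "string",
    "published_at": "Timestamp",
}


def missing_fields(columns):
    """Return the expected field names that are absent from `columns`, sorted."""
    present = set(columns)
    return sorted(name for name in EXPECTED_FIELDS if name not in present)


if __name__ == "__main__":
    # Example: a source table that only carries two of the expected columns.
    source_columns = ["granule_name", "end_time"]
    for name in missing_fields(source_columns):
        print(f"missing field: {name} ({EXPECTED_FIELDS[name]})")
```

If the source file is loaded into a data frame, passing its column names to `missing_fields` should return an empty list when all fields are present. Extra columns in the source are ignored by this check, and it compares names only, not types.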