Ingesting data
Learn how to ingest an existing dataset into Tilebox
This page guides you through ingesting data into a Tilebox dataset. Starting from an existing dataset available as a file in the GeoParquet format, we’ll walk through ingesting that data into Tilebox as a Timeseries dataset.
Related documentation
Datasets
Learn about Tilebox datasets and how to use them.
Ingest
Learn how to ingest data into a Tilebox dataset.
Downloading the example dataset
The dataset used in this example is available as a GeoParquet file. You can download it from here: modis_MCD12Q1.geoparquet.
Installing the necessary packages
This example uses a couple of Python packages for reading Parquet files and for visualizing the dataset. Install the required packages using your preferred package manager. For new projects, we recommend using uv.
Reading and previewing the dataset
The dataset is available as a GeoParquet file. You can read it using the geopandas.read_parquet function.
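As a minimal sketch of this step, the helper below wraps geopandas.read_parquet with an early check that the file exists. The helper name `read_modis` is hypothetical, and the sketch assumes the file was downloaded into the current working directory under its original name.

```python
from pathlib import Path


def read_modis(path="modis_MCD12Q1.geoparquet"):
    """Load the example GeoParquet file into a GeoDataFrame."""
    file = Path(path)
    if not file.is_file():
        # Fail early with a clear message if the download step was skipped.
        raise FileNotFoundError(f"{file} not found; download the example dataset first")

    # Imported here so the path check above works even before geopandas is installed.
    import geopandas as gpd

    return gpd.read_parquet(file)


# Usage, once the file is downloaded:
#   modis_data = read_modis()
#   modis_data.head(5)  # preview the first five rows
```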
Exploring it visually
GeoPandas comes with a built-in explorer for inspecting the dataset visually.
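One way to open the explorer is to call `explore` on the geometry column. The sketch below assumes the optional plotting dependencies (folium, matplotlib and mapclassify) are installed; the helper name and the map size are illustrative choices, not values required by the dataset.

```python
def explore_footprints(frame, width=800, height=600):
    """Return an interactive map of the geometries in `frame`."""
    # `frame` is expected to be a GeoDataFrame, such as the one read from the
    # GeoParquet file. Only the geometry column is plotted.
    return frame.geometry.explore(width=width, height=height)


# Usage in a notebook, where the returned map renders inline:
#   explore_footprints(modis_data)
```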


Create a Tilebox dataset
Now we’ll create a Timeseries dataset with the same schema as the given MODIS dataset.
To do so, we’ll use the Tilebox Console, navigate to My Datasets and click Create Dataset. We then select Timeseries Dataset as the dataset type.
For more information on creating a dataset, see the Creating a dataset guide for step-by-step instructions.
Now, to match the given MODIS dataset, we’ll specify the following fields:
Field | Type | Note |
---|---|---|
granule_name | string | MODIS granule name |
geometry | Geometry | Tile boundary coordinates of the granule |
end_time | Timestamp | Measurement end time |
horizontal_tile_number | int64 | Horizontal MODIS tile number (0-35) |
vertical_tile_number | int64 | Vertical MODIS tile number (0-17) |
tile_id | int64 | MODIS tile ID |
file_size | uint64 | File size of the product in bytes |
checksum | string | Hash checksum of the file |
checksum_type | string | Checksum algorithm (MD5 / CKSUM) |
day_night_flag | int64 | Day / Night / Both |
browse_granule_id | string | Optional granule ID for browsing |
published_at | Timestamp | The time the product was published |
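Before creating the dataset, it can help to confirm that the file actually contains every field listed in the table. The following is a small sketch with a hypothetical helper, `missing_fields`. It assumes the GeoParquet columns carry the same names as the fields, and it checks names only, not types.

```python
# Field names and types copied from the table above.
EXPECTED_FIELDS = {
    "granule_name": "string",
    "geometry": "Geometry",
    "end_time": "Timestamp",
    "horizontal_tile_number": "int64",
    "vertical_tile_number": "int64",
    "tile_id": "int64",
    "file_size": "uint64",
    "checksum": "string",
    "checksum_type": "string",
    "day_night_flag": "int64",
    "browse_granule_id": "string",
    "published_at": "Timestamp",
}


def missing_fields(columns):
    """Return the expected field names absent from `columns`, in table order."""
    present = set(columns)
    return [name for name in EXPECTED_FIELDS if name not in present]


# Usage with the frame read from the GeoParquet file:
#   missing_fields(modis_data.columns)
```

An empty list means every field in the table has a matching column.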
In the console, this will look like the following:


Access the dataset from Python
Our newly created dataset is now available. Let’s access it from Python. For this, we’ll need to know the dataset slug, which was assigned automatically based on the specified code_name. To find out the slug, navigate to the dataset overview in the console.


We can now instantiate the dataset client and access the dataset.
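A minimal sketch of this step, assuming the tilebox-datasets package is installed and that a client created without arguments reads the API key from the TILEBOX_API_KEY environment variable. The helper `open_dataset` and the slug in the usage comment are placeholders.

```python
def open_dataset(slug, client=None):
    """Return the dataset for `slug`, creating a default client if none is given."""
    if client is None:
        # Imported lazily so the helper can also be tried with a stand-in client.
        from tilebox.datasets import Client

        # Assumption: with no arguments, the client picks up the API key
        # from the TILEBOX_API_KEY environment variable.
        client = Client()
    return client.dataset(slug)


# Usage (replace the placeholder with the slug shown in the console):
#   dataset = open_dataset("my_org.modis")
```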
Create a collection
Next, we’ll create a collection to insert our data into.
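This could look as follows, assuming `dataset` is the object returned by the dataset client. The collection name "MCD12Q1" is an illustrative choice taken from the file name, and the wrapper function is hypothetical.

```python
def open_collection(dataset, name="MCD12Q1"):
    """Fetch the named collection, creating it first if it does not exist yet."""
    # get_or_create_collection makes the step safe to re-run: a second call
    # returns the existing collection instead of failing.
    return dataset.get_or_create_collection(name)


# Usage:
#   collection = open_collection(dataset)
```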
Ingest the data
Now, we’ll finally ingest the MODIS data into the collection.
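A sketch of the ingestion call, assuming `collection` is the collection created above and `frame` is the GeoDataFrame read from the GeoParquet file. The sketch assumes `ingest` returns one ID per ingested datapoint; the helper name is hypothetical.

```python
def ingest_frame(collection, frame):
    """Ingest every row of `frame` and return how many datapoints were written."""
    datapoint_ids = collection.ingest(frame)
    return len(datapoint_ids)


# Usage:
#   count = ingest_frame(collection, modis_data)
#   print(f"Successfully ingested {count} datapoints!")
```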
Query the newly ingested data
We can now query the newly ingested data. Let’s query a subset of the data for a specific time range.
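For example, a time range query might be sketched like this. The start and end dates are illustrative defaults, and the helper name is hypothetical; the sketch assumes the collection's `query` method accepts a `temporal_extent` given as a (start, end) pair.

```python
def query_range(collection, start="2015-01-01", end="2020-01-01"):
    """Query the datapoints whose time falls within the given range."""
    return collection.query(temporal_extent=(start, end))


# Usage:
#   data = query_range(collection)
```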
Since the data is now stored directly in the Tilebox dataset, you can query and access it from anywhere.
For more information on accessing and querying data, check out querying data.
View the data in the console
You can also view your data in the Console by navigating to the dataset, selecting the collection, and then clicking on one of the data points.


Next steps
Congrats! You’ve successfully ingested data into Tilebox. You can now explore the data in the console and use it for further processing and analysis.