> ## Documentation Index
> Fetch the complete documentation index at: https://docs.tilebox.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Ingesting data

> Walk through the full process of ingesting GeoParquet data into a Tilebox timeseries dataset, from downloading source files to previewing the results.

<Columns cols={1}>
  <Card title="Open as Notebook" icon="book" href="https://colab.research.google.com/drive/1QS-srlWPMJg4csc0ycn36yCX9U6mvIpW" horizontal>
    This guide is also available as a Google Colab notebook. Click here for an interactive version.
  </Card>
</Columns>

This page guides you through the process of ingesting data into a Tilebox dataset. Starting from an existing
dataset available as file in the [GeoParquet](https://geoparquet.org/) format, you'll go through the process of
ingesting that data into Tilebox as a [Timeseries](/datasets/types/timeseries) dataset.

<Note>
  If you have your data in a different format, check out the [Ingesting from common file formats](/guides/datasets/ingest-format) examples of how to ingest it.
</Note>

## Related documentation

<Columns cols={2}>
  <Card title="Datasets" icon="database" href="/datasets/concepts/datasets" horizontal>
    Learn about Tilebox datasets and how to use them.
  </Card>

  <Card title="Ingest" icon="up-from-bracket" href="/datasets/ingest" horizontal>
    Learn how to ingest data into a Tilebox dataset.
  </Card>
</Columns>

## Downloading the example dataset

The dataset used in this example is available as a [GeoParquet](https://geoparquet.org/) file. You can download it
from here: [modis\_MCD12Q1.geoparquet](https://storage.googleapis.com/tbx-web-assets-2bad228/docs/data-samples/modis_MCD12Q1.geoparquet).

## Installing the necessary packages

This example uses a couple of python packages for reading parquet files and for visualizing the dataset. Install the
required packages using your preferred package manager. For new projects, Tilebox recommend using [uv](https://docs.astral.sh/uv/).

<CodeGroup>
  ```bash uv theme={"system"}
  uv add tilebox-datasets geopandas lonboard
  ```

  ```bash pip theme={"system"}
  pip install tilebox-datasets geopandas lonboard
  ```

  ```bash poetry theme={"system"}
  poetry add tilebox-datasets="*" geopandas="*" lonboard="*"
  ```

  ```bash pipenv theme={"system"}
  pipenv install tilebox-datasets geopandas lonboard
  ```
</CodeGroup>

## Reading and previewing the dataset

The dataset is available as a [GeoParquet](https://geoparquet.org/) file. You can read it using the `geopandas.read_parquet` function.

<CodeGroup>
  ```python Python theme={"system"}
  import geopandas as gpd

  modis_data = gpd.read_parquet("modis_MCD12Q1.geoparquet")
  modis_data.head(5)
  ```
</CodeGroup>

<CodeGroup>
  ```plaintext Output theme={"system"}
               time                  end_time                                   granule_name                                           geometry  horizontal_tile_number  vertical_tile_number   tile_id  file_size    checksum checksum_type day_night_flag browse_granule_id                     published_at
  0 2001-01-01 00:00:00+00:00 2001-12-31 23:59:59+00:00  MCD12Q1.A2001001.h00v08.061.2022146024956.hdf  POLYGON ((-180 10, -180 0, -170 0, -172.62252 ...                       0                     8  51000008     275957   941243048         CKSUM            Day              None 2022-06-23 10:54:43.824000+00:00
  1 2001-01-01 00:00:00+00:00 2001-12-31 23:59:59+00:00  MCD12Q1.A2001001.h00v09.061.2022146024922.hdf  POLYGON ((-180 0, -180 -10, -172.62252 -10, -1...                       0                     9  51000009     285389  3014510714         CKSUM            Day              None 2022-06-23 10:54:44.697000+00:00
  2 2001-01-01 00:00:00+00:00 2001-12-31 23:59:59+00:00  MCD12Q1.A2001001.h00v10.061.2022146032851.hdf  POLYGON ((-180 -10, -180 -20, -180 -20, -172.6...                       0                    10  51000010     358728  2908215698         CKSUM            Day              None 2022-06-23 10:54:44.669000+00:00
  3 2001-01-01 00:00:00+00:00 2001-12-31 23:59:59+00:00  MCD12Q1.A2001001.h01v08.061.2022146025203.hdf  POLYGON ((-172.62252 10, -170 0, -160 0, -162....                       1                     8  51001008     146979  1397661843         CKSUM            Day              None 2022-06-23 10:54:44.309000+00:00
  4 2001-01-01 00:00:00+00:00 2001-12-31 23:59:59+00:00  MCD12Q1.A2001001.h01v09.061.2022146025902.hdf  POLYGON ((-170 0, -172.62252 -10, -162.46826 -...                       1                     9  51001009     148935  2314263965         CKSUM            Day              None 2022-06-23 10:54:44.023000+00:00
  ```
</CodeGroup>

## Exploring it visually

Geopandas comes with a built in explorer to visually explore the dataset.

<CodeGroup>
  ```python Python theme={"system"}
  from lonboard import viz

  viz(modis_data, map_kwargs={"show_tooltip": True})
  ```
</CodeGroup>

<Frame>
  <img src="https://mintcdn.com/tilebox/9yPiIuCV-2WPK6fa/assets/guides/ingest/modis-explore-light.png?fit=max&auto=format&n=9yPiIuCV-2WPK6fa&q=85&s=9bad3180c201356d51c9b3cefe1f7216" alt="Explore the MODIS dataset" className="dark:hidden" width="799" height="513" data-path="assets/guides/ingest/modis-explore-light.png" />

  <img src="https://mintcdn.com/tilebox/9yPiIuCV-2WPK6fa/assets/guides/ingest/modis-explore-dark.png?fit=max&auto=format&n=9yPiIuCV-2WPK6fa&q=85&s=8c0424f670139f2f2b986b72a5a7de0b" alt="Explore the MODIS dataset" className="hidden dark:block" width="796" height="510" data-path="assets/guides/ingest/modis-explore-dark.png" />
</Frame>

## Create a Tilebox dataset

Now you'll create a [Spatio-temporal](/datasets/types/spatiotemporal) dataset with the same schema as the given MODIS dataset.
To do so, you'll use the [Tilebox Console](/console), navigate to `My Datasets` and click `Create Dataset`. Then select
`Spatio-temporal Dataset` as the dataset type.

<Tip>
  For more information on creating a dataset, check out the [Creating a dataset](/guides/datasets/create) guide for a
  Step by step guide.
</Tip>

Now, to match the given MODIS dataset, you'll specify the following fields:

| Field                    | Type      | Note                                |
| ------------------------ | --------- | ----------------------------------- |
| `granule_name`           | string    | MODIS granule name                  |
| `end_time`               | Timestamp | Measurement end time                |
| `horizontal_tile_number` | int64     | Horizontal modis tile number (0-35) |
| `vertical_tile_number`   | int64     | Vertical modis tile number (0-17)   |
| `tile_id`                | int64     | Modis Tile ID                       |
| `file_size`              | uint64    | File size of the product in bytes   |
| `checksum`               | string    | Hash checksum of the file           |
| `checksum_type`          | string    | Checksum algorithm (MD5 / CKSUM)    |
| `day_night_flag`         | int64     | Day / Night / Both                  |
| `browse_granule_id`      | string    | Optional granule ID for browsing    |
| `published_at`           | Timestamp | The time the product was published  |

In the console, this will look like the following:

<Frame>
  <img src="https://mintcdn.com/tilebox/9yPiIuCV-2WPK6fa/assets/guides/ingest/dataset-schema-light.png?fit=max&auto=format&n=9yPiIuCV-2WPK6fa&q=85&s=6d38b72505f6f29580fc517df80a479c" alt="Tilebox Console" className="dark:hidden" width="1235" height="1098" data-path="assets/guides/ingest/dataset-schema-light.png" />

  <img src="https://mintcdn.com/tilebox/9yPiIuCV-2WPK6fa/assets/guides/ingest/dataset-schema-dark.png?fit=max&auto=format&n=9yPiIuCV-2WPK6fa&q=85&s=10b622043bade0b3c5c309aa2b3c9786" alt="Tilebox Console" className="hidden dark:block" width="1235" height="1098" data-path="assets/guides/ingest/dataset-schema-dark.png" />
</Frame>

## Access the dataset from Python

Your newly created dataset is now available. You can access it from Python. For this, you'll need to know the dataset slug,
which was assigned automatically based on the specified `code_name`. To find out the slug, navigate to the dataset overview
in the console.

<Frame>
  <img src="https://mintcdn.com/tilebox/9yPiIuCV-2WPK6fa/assets/guides/ingest/dataset-slug-light.png?fit=max&auto=format&n=9yPiIuCV-2WPK6fa&q=85&s=d08f1e675a0eb63b338bb22070db0f55" alt="Dataset slug display in the console" className="dark:hidden" width="797" height="425" data-path="assets/guides/ingest/dataset-slug-light.png" />

  <img src="https://mintcdn.com/tilebox/9yPiIuCV-2WPK6fa/assets/guides/ingest/dataset-slug-dark.png?fit=max&auto=format&n=9yPiIuCV-2WPK6fa&q=85&s=b532f9de43737bec1610510158e6cff4" alt="Dataset slug display in the console" className="hidden dark:block" width="797" height="416" data-path="assets/guides/ingest/dataset-slug-dark.png" />
</Frame>

You can now instantiate the dataset client and access the dataset.

<CodeGroup>
  ```python Python theme={"system"}
  from tilebox.datasets import Client

  client = Client()
  dataset = client.dataset("tilebox.modis")  # replace with your dataset slug
  ```
</CodeGroup>

## Create a collection

Next, you'll create a collection to insert your data into.

<CodeGroup>
  ```python Python theme={"system"}
  collection = dataset.get_or_create_collection("MCD12Q1")
  ```
</CodeGroup>

## Ingest the data

Now, you'll finally ingest the MODIS data into the collection.

<CodeGroup>
  ```python Python theme={"system"}
  datapoint_ids = collection.ingest(modis_data)
  print(f"Successfully ingested {len(datapoint_ids)} datapoints!")
  ```
</CodeGroup>

<CodeGroup>
  ```plaintext Output theme={"system"}
  Successfully ingested 7245 datapoints!
  ```
</CodeGroup>

## Query the newly ingested data

You can now query the newly ingested data. You can query a subset of the data for a specific time range.

<Tip>
  Since the data is now stored directly in the Tilebox dataset, you can query and access it from anywhere.
</Tip>

<CodeGroup>
  ```python Python theme={"system"}
  from shapely import Polygon

  area = Polygon(  # area roughly covering the US
      ((-124.45, 49.19), (-120.88, 29.31), (-66.87, 24.77), (-65.34, 47.84), (-124.45, 49.19)),
  )

  data = collection.query(
      temporal_extent=("2015-01-01", "2020-01-01"),
      spatial_extent=area
  )
  data
  ```
</CodeGroup>

<CodeGroup>
  ```plaintext Output theme={"system"}
  <xarray.Dataset> Size: 28kB
  Dimensions:                 (time: 110)
  Coordinates:
    * time                    (time) datetime64[ns] 880B 2015-01-01 ... 2019-01-01
  Data variables: (12/14)
      id                      (time) <U36 16kB '014aa2ca-b000-100a-dd34-fae14c5...
      ingestion_time          (time) datetime64[ns] 880B 2025-07-09T09:21:34.70...
      geometry                (time) object 880B POLYGON ((-160 60, -124.457906...
      granule_name            (time) object 880B 'MCD12Q1.A2015001.h10v03.061.2...
      end_time                (time) datetime64[ns] 880B 2015-12-31T23:59:59 .....
      horizontal_tile_number  (time) int64 880B 10 13 12 11 11 10 ... 8 8 10 7 15
      ...                      ...
      file_size               (time) uint64 880B 11719196 10878403 ... 263319
      checksum                (time) object 880B '2878522088' ... '190901039'
      checksum_type           (time) object 880B 'CKSUM' 'CKSUM' ... 'CKSUM'
      day_night_flag          (time) object 880B 'Day' 'Day' 'Day' ... 'Day' 'Day'
      browse_granule_id       (time) object 880B 'UR:10:DsShESDTUR:UR:15:DsShSc...
      published_at            (time) datetime64[ns] 880B 2022-06-15T01:26:58.61...
  ```
</CodeGroup>

<Tip>
  For more information on accessing and querying data, check out [querying data](/datasets/query/querying-data).
</Tip>

## View the data in the console

You can also view your data in the Console, by navigate to the dataset, selecting the collection and then clicking
on one of the data points.

<Frame>
  <img src="https://mintcdn.com/tilebox/9yPiIuCV-2WPK6fa/assets/guides/ingest/explorer-light.png?fit=max&auto=format&n=9yPiIuCV-2WPK6fa&q=85&s=9bd06383be8cd0217ec159d4ab08d194" alt="Explore the MODIS dataset" className="dark:hidden" width="1408" height="897" data-path="assets/guides/ingest/explorer-light.png" />

  <img src="https://mintcdn.com/tilebox/9yPiIuCV-2WPK6fa/assets/guides/ingest/explorer-dark.png?fit=max&auto=format&n=9yPiIuCV-2WPK6fa&q=85&s=9d23be33d9798266c05c0c288cdae4ff" alt="Explore the MODIS dataset" className="hidden dark:block" width="1397" height="911" data-path="assets/guides/ingest/explorer-dark.png" />
</Frame>

## Next steps

Congrats. You've successfully ingested data into Tilebox. You can now explore the data in the console and use it for
further processing and analysis.

<Columns cols={2}>
  <Card title="Query Data" icon="server" href="/datasets/query/querying-data" horizontal>
    Learn all about [querying your newly created dataset](https://docs.tilebox.com/datasets/query)
  </Card>

  <Card title="Dataset Types" icon="puzzle" href="/datasets/types/timeseries" horizontal>
    Explore the different dataset types available in Tilebox
  </Card>

  <Card title="Open Data" icon="star" href="/datasets/open-data" horizontal>
    Check out a growing number of publicly available open data datasets on Tilebox
  </Card>
</Columns>
