Introduction
In this section we showcase how the AI agent helps maintain an open lakehouse stack specifically for GIS datasets. The agent is capable of ingesting, processing, and visualizing raster and vector data in a variety of formats, including COG GeoTIFF, RaQuet and GeoParquet.GIS-specific Dataset formats
The following video provides an overview of the plethora of dataformats we need to be able to use in GIS applications.Raster Data in the Open Lakehouse
RaQuet (pronounced “rock-et”) is CARTO’s new specification for storing raster data directly in Apache Parquet, the ubiquitous columnar format in modern data warehouses: A RaQuet row is a single tile block identified by a Web Mercator z/x/y quadbin in a block column. The format includes per band binary blobs eg. columns band_1, band_2, … hold row-major-order pixel data (optionally gzip-compressed). Metadata are stored in the file itself - a special “block = 0x00” row stores JSON metadata (coverage area, resolution, data type), much like GeoParquet. The format allows in-memory databases such as DuckDB to use SELECT specific tiles or bands, join against vector tables, and leverage Parquet pruning & predicate pushdown.DuckDB Compute Engine
Predicate Pushdown Example
Parquet files are divided into row groups, and each row group carries metadata (min/max statistics) for every column. When you run a query with aWHERE clause, DuckDB will:
- Read only the metadata for the relevant column(s) from each row group.
- Use your filter predicate (e.g.
block = 12345) to decide which row groups could contain matching rows. - Skip scanning any row group whose metadata shows it can’t match.
- Only actually read & decode the matching row groups.
raquet.parquet with these columns:
block(an integer quadbin ID)band_1,band_2, … (binary or numeric arrays)
block = 120124381. That’s predicate pushdown.
Partition Pruning Example
If you organize your files in a directory structure that encodes a “partition” column—e.g.:WHERE block = … filter before even opening those files, and skip entire directories that don’t match. That’s partition pruning.
For example, if your tiles are laid out as:
block=120124381/ folder, because it infers from the path that block there ≠ your filter value. That’s partition pruning—whole files (or directories) are skipped up-front.
Data Visualization


