Zarr Import API
The Zarr Import API lets you import your own Zarr data and access it just like standard platform datasets, provided the following conditions are met:
- Your raster data is stored in the Zarr format in your own S3 bucket in a supported region.
- Your Zarr data conforms to the data constraints.
- The bucket's permissions are configured so that the service can read the data.
Data Constraints
Since Zarr is a generic data format, there are additional constraints that must be met in order to ingest the data into the system:
- Data must be stored as a single Zarr group that contains coordinate arrays and data arrays.
- Data array names should be valid JavaScript identifiers so they can be safely used in evalscripts; valid identifiers are case-sensitive, can contain Unicode letters, $, _, and digits (0-9), but may not start with a digit, and should not be one of the reserved JavaScript keywords.
- Data arrays must have two or three dimensions. There must be exactly two spatial coordinate arrays, named either `x` and `y` or `lat` and `lon`, and an optional `time` coordinate array.
- Data arrays must be stored in row-major order (`"order": "C"`, that is, the last dimension varies fastest). The ordering of the dimensions has to be `[time, lat, lon]` or `[time, y, x]` for 3 dimensional data, and `[lat, lon]` or `[y, x]` for 2 dimensional data.
- All data, including the spatial and the optional `time` coordinate arrays, must consist of 32-bit or 64-bit integers and floats (Zarr data types `u4`, `i4`, `i8`, `f4`, `f8`).
- The chunk size in the two spatial dimensions must be less than or equal to 3072, but does not need to be the same for all data arrays.
- For 3 dimensional data:
  - the `time` array must include the `units` attribute in its `.zattrs`, which has to be in the format `<unit> since <instant>`, where supported units are days/hours/minutes/seconds/millis/micros/nanos and `instant` should either be in the ISO8601 format or follow the definition of the `time:units` field of the CF time coordinate convention. For example, unix epoch could be encoded as `seconds since 1970-01-01 00:00:00`.
  - the chunk size in the time dimension must be the same for all data arrays and must be less than or equal to 50.
- Data must not cross either of the two poles.
- Data must use an equidistant spatial grid, that is, the two spatial coordinate arrays must be equidistant. The time coordinate array can be non-equidistant.
- The projection needs to be one of: WGS84 (EPSG:4326), WebMercator (EPSG:3857), any UTM zone (EPSG:32601-32660, 32701-32760), or Europe LAEA (EPSG:3035).
- Subgroups within the Zarr group will be ignored, but may be ingested separately.
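
Several of the constraints above can be checked locally before a collection is created. The following is a minimal sketch, assuming Zarr v2 metadata where each array's `.zarray` file has already been parsed into a dictionary. The function names, the partial reserved-word list, and the tolerance used for the equidistance check are illustrative assumptions, not part of the service.

```python
import math
import re

# Zarr data types accepted per the list above (byte-order prefix removed).
ALLOWED_DTYPES = {"u4", "i4", "i8", "f4", "f8"}
MAX_SPATIAL_CHUNK = 3072
MAX_TIME_CHUNK = 50

# Partial list of reserved JavaScript words, for illustration only.
JS_RESERVED = {"break", "case", "class", "const", "delete", "for", "function",
               "if", "in", "new", "return", "this", "var", "while", "with"}

# Approximation of a JavaScript identifier: starts with a letter, $ or _,
# then letters, digits, $ or _ (Python's \w is slightly broader than JS).
IDENTIFIER = re.compile(r"(?:[^\W\d]|\$)[\w$]*")


def check_array(name, zarray):
    """Return a list of constraint violations for one data array.

    `zarray` is the parsed content of the array's .zarray file.
    """
    problems = []
    if not IDENTIFIER.fullmatch(name) or name in JS_RESERVED:
        problems.append(f"{name!r} is not a safe JavaScript identifier")
    if zarray["dtype"].lstrip("<>|=") not in ALLOWED_DTYPES:
        problems.append(f"unsupported dtype {zarray['dtype']}")
    if zarray["order"] != "C":
        problems.append("array is not stored in row-major (C) order")
    if len(zarray["shape"]) not in (2, 3):
        problems.append("array must have two or three dimensions")
    chunks = zarray["chunks"]
    if any(c > MAX_SPATIAL_CHUNK for c in chunks[-2:]):
        problems.append("spatial chunk size exceeds 3072")
    if len(chunks) == 3 and chunks[0] > MAX_TIME_CHUNK:
        problems.append("time chunk size exceeds 50")
    return problems


def is_equidistant(coords, rel_tol=1e-6):
    """Check that consecutive coordinate values are evenly spaced.

    The tolerance is an assumption; the service may use a different one.
    """
    steps = [b - a for a, b in zip(coords, coords[1:])]
    return all(math.isclose(s, steps[0], rel_tol=rel_tol) for s in steps)
```

An empty list from `check_array` only means that none of these particular checks failed. The requirement that the time chunk size is identical across all data arrays needs a comparison between arrays and is not covered here.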
Please refer to the Zarr specification for an explanation of the various Zarr format properties.
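
To illustrate the `<unit> since <instant>` convention required for the `time` array, here is a minimal sketch that converts raw time values into timestamps. It assumes an ISO 8601 instant that Python's `datetime.fromisoformat` can parse and treats an instant without a time zone as UTC. Both are assumptions of this example, and the service's own parser may accept more forms.

```python
from datetime import datetime, timedelta, timezone

# Microseconds per supported unit. Nanoseconds are finer than timedelta's
# microsecond resolution, so they are rounded in this sketch.
UNIT_MICROS = {
    "days": 86_400_000_000,
    "hours": 3_600_000_000,
    "minutes": 60_000_000,
    "seconds": 1_000_000,
    "millis": 1_000,
    "micros": 1,
    "nanos": 0.001,
}


def decode_time(units, values):
    """Convert raw time coordinate values to timezone-aware datetimes."""
    unit, sep, instant = units.partition(" since ")
    if not sep or unit not in UNIT_MICROS:
        raise ValueError(f"unsupported units attribute: {units!r}")
    start = datetime.fromisoformat(instant.strip())
    if start.tzinfo is None:
        # Assumption: an instant without a time zone is interpreted as UTC.
        start = start.replace(tzinfo=timezone.utc)
    return [start + timedelta(microseconds=v * UNIT_MICROS[unit]) for v in values]


# Two values, one day apart, relative to the unix epoch.
times = decode_time("seconds since 1970-01-01 00:00:00", [0, 86400])
```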
Rate Limiting
The Zarr Import API follows the general rate limiting policies described in Rate Limiting.
Zarr Deployment
Zarr is available on AWS (2 regions). The Zarr Import API endpoint depends on the chosen deployment as specified in the table below.
The bucket where data is stored MUST be in the same region as the endpoint region you will use.
| Deployment | API endpoint | Region |
|---|---|---|
| AWS EU (Frankfurt) | https://services.sentinel-hub.com/zarr/v1 | eu-central-1 |
| AWS US (Oregon) | https://services-uswest2.sentinel-hub.com/zarr/v1 | us-west-2 |
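
Because the endpoint is determined by the bucket region, a client can select it with a simple lookup. The sketch below copies the two rows of the table above; the function name is a hypothetical helper.

```python
# Endpoint per AWS region, copied from the deployment table above.
ZARR_ENDPOINTS = {
    "eu-central-1": "https://services.sentinel-hub.com/zarr/v1",
    "us-west-2": "https://services-uswest2.sentinel-hub.com/zarr/v1",
}


def endpoint_for_bucket_region(region):
    """Return the Zarr Import API endpoint matching the bucket's region."""
    try:
        return ZARR_ENDPOINTS[region]
    except KeyError:
        raise ValueError(f"no Zarr deployment in region {region!r}") from None
```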
AWS Bucket Settings
Bucket region
The bucket containing your Zarr data needs to be in the same region as the Zarr deployment you will use.
Bucket settings
As with other APIs, your AWS bucket needs to be configured to allow access from the system.
To do this, update your bucket policy to include the following statement (do not forget to replace <bucket_name> with your actual bucket name):
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Sentinel Hub permissions",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::614251495211:root"
      },
      "Action": ["s3:GetBucketLocation", "s3:ListBucket", "s3:GetObject"],
      "Resource": ["arn:aws:s3:::<bucket_name>", "arn:aws:s3:::<bucket_name>/*"]
    }
  ]
}
```
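
If you manage bucket policies from a script, the statement can be generated with the bucket name filled in. This is a minimal sketch using only the standard library; `my-zarr-bucket` is a placeholder name and `bucket_policy` is a hypothetical helper.

```python
import json

# Principal taken from the policy statement above.
SERVICE_PRINCIPAL = "arn:aws:iam::614251495211:root"


def bucket_policy(bucket_name):
    """Build the read-only bucket policy for the given bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Sentinel Hub permissions",
                "Effect": "Allow",
                "Principal": {"AWS": SERVICE_PRINCIPAL},
                "Action": ["s3:GetBucketLocation", "s3:ListBucket", "s3:GetObject"],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
            }
        ],
    }


# "my-zarr-bucket" is a placeholder; replace it with your bucket name.
policy_json = json.dumps(bucket_policy("my-zarr-bucket"), indent=2)
print(policy_json)
```

The resulting JSON can be pasted into the bucket policy editor or applied with your usual tooling. Note that setting a bucket policy replaces the existing one, so if the bucket already has a policy, merge this statement into it instead.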
Creating Zarr Collections
Each Zarr collection will correspond to a single Zarr group. When creating a collection, you need to provide:
- the S3 bucket where your data is located
- the path in the bucket where the Zarr group resides, that is, the directory containing the `.zgroup` file
- the Coordinate Reference System (CRS) in which your data is defined
- a name for the collection
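
The four inputs listed above can be gathered and sanity-checked before a request is sent. The sketch below checks the EPSG code against the supported projections from the data constraints and collects the values in a dictionary. The dictionary keys and the CRS string format are hypothetical placeholders, not the actual request schema; consult the API reference for the real field names.

```python
def is_supported_epsg(code):
    """True for WGS84, WebMercator, Europe LAEA, or any UTM zone."""
    return (
        code in (4326, 3857, 3035)
        or 32601 <= code <= 32660
        or 32701 <= code <= 32760
    )


def collection_parameters(name, bucket, path, epsg):
    """Collect the inputs needed to create a Zarr collection.

    The key names and the "EPSG:<code>" string are hypothetical and may
    differ from the fields expected by the Zarr Import API.
    """
    if not is_supported_epsg(epsg):
        raise ValueError(f"EPSG:{epsg} is not a supported projection")
    return {
        "name": name,
        "bucket": bucket,
        "path": path,
        "crs": f"EPSG:{epsg}",
    }
```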
Ingesting the Arrays
After a collection is created, the ingestion will start automatically. The service will try to ingest every data array found in the group in the given S3 bucket and path. If the Zarr data does not fulfill any of the above constraints, the ingestion will either fail entirely or the offending data arrays will be skipped.
The Zarr service automatically names the collection arrays after the data arrays of the Zarr group, that is, after the array folder names.
For example, for a Zarr group that contains the array folders B1 and B2, the resulting arrays will be named B1 and B2.
The no data value will be read from data arrays' metadata, that is, from the fill_value property inside the array's .zarray file.
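
To preview which array names and no data values the ingestion is likely to pick up, you can inspect the `.zarray` files yourself. The sketch below works on an in-memory mapping from object key to file content, so fetching the objects from S3 is left out. The helper name and the decision to skip the coordinate arrays are assumptions of this example.

```python
import json

# Coordinate array names from the data constraints; skipped in this sketch.
COORDINATE_NAMES = {"x", "y", "lat", "lon", "time"}


def data_arrays(objects, group_path):
    """Map data array names to their fill_value.

    `objects` maps object keys to file contents, as they might be read
    from the bucket. Only arrays directly inside the group are considered.
    """
    prefix = group_path.rstrip("/") + "/"
    suffix = "/.zarray"
    result = {}
    for key, body in objects.items():
        if not (key.startswith(prefix) and key.endswith(suffix)):
            continue
        name = key[len(prefix):-len(suffix)]
        # Skip empty names, arrays nested in subgroups, and coordinate arrays.
        if not name or "/" in name or name in COORDINATE_NAMES:
            continue
        result[name] = json.loads(body).get("fill_value")
    return result


example = {
    "data/group.zarr/.zgroup": '{"zarr_format": 2}',
    "data/group.zarr/B1/.zarray": '{"fill_value": -9999}',
    "data/group.zarr/B2/.zarray": '{"fill_value": 0}',
    "data/group.zarr/time/.zarray": '{"fill_value": null}',
    "data/group.zarr/sub/B3/.zarray": '{"fill_value": 1}',
}
arrays = data_arrays(example, "data/group.zarr")
```

For the example mapping, `arrays` contains only `B1` and `B2`: the `time` coordinate array and the array inside the `sub` subgroup are skipped.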
Querying Ingestion Status
Querying a collection will return the status of the ingestion as well as an error message if something went wrong.
If the returned status is INGESTED, you can start using your new Zarr data with our services.
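
Since ingestion runs asynchronously, a client typically polls the collection until the status changes. The following is a minimal sketch of such a loop. The HTTP call is injected as `fetch_status`, and the response keys `status` and `error` are hypothetical; only the `INGESTED` value comes from the description above.

```python
import time


def wait_for_ingestion(fetch_status, interval_s=10.0, timeout_s=600.0, sleep=time.sleep):
    """Poll until the collection reports INGESTED.

    `fetch_status` is a caller-supplied function that queries the collection
    and returns a dict. The keys "status" and "error" are assumptions about
    the response shape, not the documented schema.
    """
    waited = 0.0
    while True:
        info = fetch_status()
        if info.get("status") == "INGESTED":
            return info
        if info.get("error"):
            raise RuntimeError(f"ingestion failed: {info['error']}")
        if waited >= timeout_s:
            raise TimeoutError("ingestion did not finish in time")
        sleep(interval_s)
        waited += interval_s
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.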
Accessing Zarr Data
After you create a Zarr collection and arrays are ingested, you can access your data using the Processing API, just like standard platform datasets. You need your collection ID, which can be obtained from your Data Collections app or via the Zarr Import API.
Data Type Identifier
Use zarr-<collectionId> as the value of the input.data.type parameter in your Processing API requests. For example, set it to zarr-123e4567-e89b-12d3-a456-426614174000 for a Zarr collection with id 123e4567-e89b-12d3-a456-426614174000. Each Zarr collection in Planet Insights Platform contains data from a single Zarr group.
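
As a small illustration, the identifier can be derived from the collection ID and placed in the request body. This sketch assumes `input.data` is a list of data source objects, as in other Processing API requests, and leaves out the bounds, output, and evalscript that a complete request also needs.

```python
def zarr_data_type(collection_id):
    """Build the input.data.type value for a Zarr collection."""
    return f"zarr-{collection_id}"


def request_input(collection_id):
    """Return the data source part of a Processing API request body.

    Only the data type is filled in; bounds, output, and evalscript are
    omitted from this sketch.
    """
    return {"input": {"data": [{"type": zarr_data_type(collection_id)}]}}


body = request_input("123e4567-e89b-12d3-a456-426614174000")
```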
Request Resolution Limit
The maximum meters per pixel limit is set by the service and is approximately three times the resolution of the actual ingested data.
Filtering Options
mosaickingOrder
Sets the sensing time order of preference.
| Value | Description |
|---|---|
| mostRecent | For SIMPLE mosaicking, the values for the most recent sensing time will be returned. For ORBIT and TILE mosaicking, samples in the evalscript will have values sorted by descending sensing times. |
| leastRecent | Same as mostRecent but in reverse order. |
Mosaicking works differently for Zarr collections than for other collections; see Data Mask and Mosaicking.
Processing Options
| Parameter | Description | Values | Default |
|---|---|---|---|
| upsampling | Interpolation when requested resolution > source resolution | NEAREST: nearest neighbor interpolation; BILINEAR: bilinear interpolation; BICUBIC: bicubic interpolation | NEAREST |
| downsampling | Interpolation when requested resolution < source resolution | NEAREST: nearest neighbor interpolation; BILINEAR: bilinear interpolation; BICUBIC: bicubic interpolation | NEAREST |
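
The filtering and processing options can be validated before they are added to a request. In the sketch below, the allowed values come from the tables above, while the `dataFilter` and `processing` key names follow the usual Processing API layout and should be treated as assumptions here.

```python
MOSAICKING_ORDERS = {"mostRecent", "leastRecent"}
INTERPOLATIONS = {"NEAREST", "BILINEAR", "BICUBIC"}


def zarr_data_options(mosaicking_order=None, upsampling="NEAREST", downsampling="NEAREST"):
    """Build the option blocks of one data source entry.

    The "dataFilter" and "processing" keys are assumed from the general
    Processing API request layout.
    """
    if upsampling not in INTERPOLATIONS or downsampling not in INTERPOLATIONS:
        raise ValueError("interpolation must be NEAREST, BILINEAR or BICUBIC")
    options = {"processing": {"upsampling": upsampling, "downsampling": downsampling}}
    if mosaicking_order is not None:
        if mosaicking_order not in MOSAICKING_ORDERS:
            raise ValueError("mosaickingOrder must be mostRecent or leastRecent")
        options["dataFilter"] = {"mosaickingOrder": mosaicking_order}
    return options


options = zarr_data_options(mosaicking_order="leastRecent", upsampling="BILINEAR")
```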
Available Data Arrays
The available arrays are the ones that the Zarr group contains. To find the available array names for your evalscript, list the ingested array names within the Zarr group using the Zarr Import API.
Data Mask and Mosaicking
A Zarr collection only contains data arrays of a single Zarr group. Zarr metadata does not contain cover geometries (other than the envelope), so all data arrays are considered to cover the full Zarr envelope. The value of dataMask is always 1 inside the Zarr envelope and 0 outside. Areas for which an array has no data chunks are filled with the no data value.
Consequently, mosaicking works differently than for other collections:
- Timeless (two-dimensional) Zarrs contain data for a single (unspecified) sensing time. This data will be returned for both `SIMPLE` and `TILE` mosaicking.
- Three-dimensional Zarrs contain data for multiple sensing times. The data returned will be:
  - For `SIMPLE` mosaicking, only the data for a single sensing time. The data is considered to cover the full Zarr envelope, so there are no missing areas where data from other sensing times would be mosaicked in.
  - For `TILE` mosaicking, an array of tiles corresponding to sensing times.

`ORBIT` mosaicking is not supported because Zarrs do not contain orbit metadata.
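
The selection behavior described above can be modeled with a small function. This is an illustrative sketch of the rules as stated, not the service's implementation: it ignores the request time range and assumes sensing times are given as sortable values such as ISO 8601 strings.

```python
def select_sensing_times(times, mosaicking, order="mostRecent"):
    """Return the sensing times whose data would be passed to the evalscript.

    `times` is a list of sortable sensing times, or None for a timeless
    (two-dimensional) Zarr, which has a single unspecified sensing time.
    """
    if mosaicking not in ("SIMPLE", "TILE"):
        raise ValueError(f"{mosaicking} mosaicking is not supported for Zarr")
    if times is None:
        return [None]  # the same data is returned for SIMPLE and TILE
    ordered = sorted(times, reverse=(order == "mostRecent"))
    if mosaicking == "SIMPLE":
        return ordered[:1]  # one sensing time covers the whole envelope
    return ordered  # TILE: one tile per sensing time


times = ["2020-01-01", "2021-01-01", "2019-01-01"]
simple = select_sensing_times(times, "SIMPLE")
tiles = select_sensing_times(times, "TILE", order="leastRecent")
```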
Units
The only units available are digital numbers (DN), so any unit conversions, if necessary, are the responsibility of your evalscript.
Reingesting the Zarr collection
When a Zarr collection is reingested, data that has already been ingested cannot be changed, but new chunks can be added to the existing data arrays and the time coordinate array can be expanded accordingly.