Zarr Import API

The Zarr Import API lets you import your own Zarr data and access it just like any other data, provided that you meet the following conditions:

Store your raster data in the Zarr format in your own S3 bucket in a supported region.
Make sure the Zarr data conforms to the data constraints.
Configure the bucket's permissions so that we can read the data.

Data Constraints

Since Zarr is a generic data format, there are additional constraints that must be met in order to ingest the data into the system:

Data must be stored as a single Zarr group that contains coordinate arrays and data arrays.
Data array names should be valid JavaScript identifiers so they can be safely used in evalscripts; valid identifiers are case-sensitive, can contain Unicode letters, $, _, and digits (0-9), but may not start with a digit, and should not be one of the reserved JavaScript keywords.
Data arrays must have two or three dimensions. There must be exactly two spatial coordinate arrays, named either x and y or lat and lon, and an optional time coordinate array.
Data arrays must be stored in row-major order ("order": "C", that is, the last dimension varies fastest). The dimensions must be ordered [time, lat, lon] or [time, y, x] for 3-dimensional data, and [lat, lon] or [y, x] for 2-dimensional data.
All data, including the spatial and the optional time coordinate arrays, must consist of 32-bit or 64-bit integers and floats (Zarr data types u4, i4, i8, f4, f8).
The chunk size in the two spatial dimensions must be less than or equal to 3072, but does not need to be the same for all data arrays.
For 3-dimensional data:
- the time array must include the units attribute in its .zattrs, in the format <unit> since <instant>, where the supported units are days/hours/minutes/seconds/millis/micros/nanos and the instant is either in ISO 8601 format or follows the definition of the time:units field of the CF time coordinate convention. For example, the Unix epoch could be encoded as seconds since 1970-01-01 00:00:00.
- the chunk size in the time dimension must be the same for all data arrays and must be less than or equal to 50.
Data must not cross either of the two poles.
Data must use an equidistant spatial grid, that is, the two spatial coordinate arrays must be equidistant. The time coordinate array can be non-equidistant.
The projection needs to be one of: WGS84 (EPSG:4326), WebMercator (EPSG:3857), any UTM zone (EPSG:32601-32660, 32701-32760), or Europe LAEA (EPSG:3035).
Subgroups within the Zarr group will be ignored, but may be ingested separately.
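
Several of the per-array constraints above can be checked locally before ingestion. The following is a minimal sketch, not the service's own validation: it assumes Zarr v2 metadata (the parsed content of an array's .zarray file), and the function name check_array, the identifier regular expression, and the partial reserved-word list are illustrative assumptions.

```python
import re

# Approximation of the JavaScript identifier rule: letters, $, _ and digits,
# not starting with a digit. Python's \w is not identical to the JavaScript
# definition, so treat this as a first-pass filter only.
_IDENTIFIER = re.compile(r"(?:[^\W\d]|\$)(?:\w|\$)*")
# Partial list of reserved JavaScript words (assumption: extend as needed).
_RESERVED = {"break", "case", "catch", "class", "const", "continue", "default",
             "delete", "do", "else", "export", "for", "function", "if", "import",
             "in", "new", "return", "switch", "this", "var", "void", "while"}
_ALLOWED_DTYPES = {"u4", "i4", "i8", "f4", "f8"}
MAX_SPATIAL_CHUNK = 3072
MAX_TIME_CHUNK = 50


def check_array(name, zarray):
    """Return a list of constraint violations for one data array.

    `zarray` is the parsed content of the array's .zarray file (Zarr v2).
    """
    problems = []
    if not _IDENTIFIER.fullmatch(name) or name in _RESERVED:
        problems.append(f"{name!r} is not a safe JavaScript identifier")
    # Zarr v2 dtypes look like "<f4": a byte-order character, then the type code.
    if zarray["dtype"][1:] not in _ALLOWED_DTYPES:
        problems.append(f"unsupported dtype {zarray['dtype']}")
    if zarray["order"] != "C":
        problems.append("array is not stored in row-major (C) order")
    chunks = zarray["chunks"]
    if len(chunks) not in (2, 3):
        problems.append("array must have two or three dimensions")
        return problems
    # The last two dimensions are the spatial ones ([y, x] or [lat, lon]).
    if any(size > MAX_SPATIAL_CHUNK for size in chunks[-2:]):
        problems.append("spatial chunk size exceeds 3072")
    if len(chunks) == 3 and chunks[0] > MAX_TIME_CHUNK:
        problems.append("time chunk size exceeds 50")
    return problems
```

The sketch looks at one array at a time, so it does not verify that the time chunk size is identical across all data arrays; that comparison would need to be done over the whole group.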

Please refer to the Zarr specification for an explanation of the various Zarr format properties.
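
The coordinate-array constraints (the `<unit> since <instant>` time encoding and the equidistant spatial grid) can be illustrated with a short sketch. The helper names decode_time and is_equidistant, the relative tolerance, and the choice to treat instants without an offset as UTC are assumptions made for this example; the service may apply different rules.

```python
import math
import re
from datetime import datetime, timedelta, timezone

# Microseconds per supported unit.
_UNIT_US = {"days": 86_400_000_000, "hours": 3_600_000_000,
            "minutes": 60_000_000, "seconds": 1_000_000,
            "millis": 1_000, "micros": 1, "nanos": 0.001}


def decode_time(value, units):
    """Convert a numeric time coordinate using a '<unit> since <instant>' string."""
    match = re.fullmatch(r"\s*(\w+)\s+since\s+(.+?)\s*", units)
    if match is None or match.group(1) not in _UNIT_US:
        raise ValueError(f"unsupported units attribute: {units!r}")
    origin = datetime.fromisoformat(match.group(2))
    if origin.tzinfo is None:
        # Assumption: instants without an offset are interpreted as UTC.
        origin = origin.replace(tzinfo=timezone.utc)
    return origin + timedelta(microseconds=value * _UNIT_US[match.group(1)])


def is_equidistant(coords, rel_tol=1e-6):
    """True if consecutive coordinate values differ by a constant non-zero step."""
    steps = [b - a for a, b in zip(coords, coords[1:])]
    if not steps or steps[0] == 0:
        return False
    return all(math.isclose(step, steps[0], rel_tol=rel_tol) for step in steps)
```

For example, decode_time(86400, "seconds since 1970-01-01 00:00:00") yields midnight on 2 January 1970 (UTC), and is_equidistant works for both ascending and descending coordinate arrays.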

Zarr Deployment

The Zarr Import API is available in two AWS regions. The API endpoint depends on the chosen deployment, as specified in the table below.

Note: The bucket where the data is stored MUST be in the same region as the endpoint you will use.

Deployment	API endpoint	Region
AWS EU (Frankfurt)	https://services.sentinel-hub.com/api/v1/zarr	eu-central-1
AWS US (Oregon)	https://services-uswest2.sentinel-hub.com/api/v1/zarr	us-west-2
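
Because the endpoint is determined by the bucket's region, client code can derive one from the other. A minimal sketch based on the table above; the names ZARR_ENDPOINTS and endpoint_for_bucket_region are illustrative and not part of the API.

```python
# Endpoints per AWS region, copied from the deployment table.
ZARR_ENDPOINTS = {
    "eu-central-1": "https://services.sentinel-hub.com/api/v1/zarr",
    "us-west-2": "https://services-uswest2.sentinel-hub.com/api/v1/zarr",
}


def endpoint_for_bucket_region(bucket_region):
    """Return the Zarr Import API endpoint matching the bucket's region."""
    try:
        return ZARR_ENDPOINTS[bucket_region]
    except KeyError:
        supported = ", ".join(sorted(ZARR_ENDPOINTS))
        raise ValueError(
            f"no Zarr deployment in region {bucket_region!r}; supported: {supported}"
        ) from None
```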

AWS Bucket Settings

Bucket region

The bucket containing your Zarr data needs to be in the same region as the Zarr deployment you will use.

Bucket settings

As with other APIs, your AWS bucket needs to be configured to allow access from the system. To do this, update your bucket policy to include the following statement (do not forget to replace <bucket_name> with your actual bucket name):

JSON

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Sentinel Hub permissions",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::614251495211:root"
      },
      "Action": ["s3:GetBucketLocation", "s3:ListBucket", "s3:GetObject"],
      "Resource": ["arn:aws:s3:::<bucket_name>", "arn:aws:s3:::<bucket_name>/*"]
    }
  ]
}
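
To avoid editing the placeholder by hand, the same policy can be generated for a given bucket name. This is a small sketch using only the Python standard library; the function name build_bucket_policy and the bucket name in the example are placeholders.

```python
import json


def build_bucket_policy(bucket_name):
    """Return the bucket policy shown above with <bucket_name> filled in."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Sentinel Hub permissions",
                "Effect": "Allow",
                "Principal": {"AWS": "arn:aws:iam::614251495211:root"},
                "Action": ["s3:GetBucketLocation", "s3:ListBucket", "s3:GetObject"],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
            }
        ],
    }


if __name__ == "__main__":
    # "my-zarr-bucket" is a placeholder bucket name.
    print(json.dumps(build_bucket_policy("my-zarr-bucket"), indent=2))
```

If you apply the result programmatically, for instance with boto3's put_bucket_policy(Bucket=..., Policy=json.dumps(policy)), keep in mind that this call replaces the whole bucket policy, so a bucket that already has a policy needs the statement merged into it instead.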

Creating Zarr Collections

Each Zarr collection will correspond to a single Zarr group. When creating a collection, you need to provide:

the S3 bucket where your data is located
the path in the bucket where the Zarr group resides, that is, the directory containing the .zgroup file
the Coordinate Reference System (CRS) in which your data is defined
a name for the collection
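
The four inputs above could be assembled as follows. This is only a sketch: the request field names (name, s3Bucket, path, crs) and the OGC URI form of the CRS are assumptions made for illustration, so check the API reference for the actual request schema. The EPSG check mirrors the list of supported projections in the data constraints.

```python
def is_supported_epsg(code):
    """True if the EPSG code is one of the projections listed in the constraints."""
    return (
        code in (4326, 3857, 3035)
        or 32601 <= code <= 32660  # UTM zones, northern hemisphere
        or 32701 <= code <= 32760  # UTM zones, southern hemisphere
    )


def build_collection_request(name, bucket, path, epsg):
    """Assemble a collection-creation payload (hypothetical field names)."""
    if not is_supported_epsg(epsg):
        raise ValueError(f"EPSG:{epsg} is not a supported projection")
    return {
        "name": name,
        "s3Bucket": bucket,  # assumed field name
        "path": path,  # directory that contains the .zgroup file
        "crs": f"http://www.opengis.net/def/crs/EPSG/0/{epsg}",  # assumed CRS format
    }
```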

Ingesting the Arrays

After a collection is created, the ingestion will start automatically. The service will try to ingest every data array found in the group in the given S3 bucket and path. If the Zarr data violates any of the above constraints, the ingestion will either fail entirely or the offending data arrays will be skipped.

The Zarr service automatically configures collection bands named after the data arrays of the Zarr group, that is, after the folder names of the arrays. For example, for a Zarr group that contains B1 and B2 array folders, the resulting bands will be named B1 and B2.

The no data value will be read from data arrays' metadata, that is, from the fill_value property inside the array's .zarray file.
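
To preview which band names and no data values to expect, the same information can be read from the object keys and the .zarray files. A minimal sketch under a few assumptions: the keys come from an S3 listing you have already made, the data follows the Zarr v2 layout, and the coordinate array names to exclude (including time) are guesses. The helper names band_names and no_data_value are illustrative.

```python
import json
from pathlib import PurePosixPath


def band_names(keys, group_path, coordinate_names=("x", "y", "lat", "lon", "time")):
    """Names of data arrays stored directly inside the group.

    `keys` are object keys from the bucket; arrays in subgroups are ignored.
    """
    group = PurePosixPath(group_path)
    names = set()
    for key in keys:
        path = PurePosixPath(key)
        # An array folder sits directly under the group and contains .zarray.
        if path.name == ".zarray" and path.parent.parent == group:
            names.add(path.parent.name)
    return sorted(names - set(coordinate_names))


def no_data_value(zarray_text):
    """Read fill_value from the text of a .zarray file."""
    value = json.loads(zarray_text)["fill_value"]
    # Zarr v2 stores NaN and infinities as the strings "NaN", "Infinity", "-Infinity".
    if isinstance(value, str):
        return float(value)
    return value
```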

Querying Ingestion Status

Querying a collection will return the status of the ingestion, as well as an error message if something went wrong. If the returned status is INGESTED, you can start using your new Zarr data with our services.
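
Since ingestion runs asynchronously, a client will typically poll the collection until the status becomes INGESTED. The sketch below keeps the HTTP call out of scope: fetch_status is a caller-supplied function (hypothetical) that queries the collection and returns a (status, error_message) pair. Treating any error message as a terminal failure, as well as the polling interval and timeout, are assumptions of this example.

```python
import time


def wait_until_ingested(fetch_status, interval_s=10.0, timeout_s=600.0, sleep=time.sleep):
    """Poll until the collection reports INGESTED.

    `fetch_status` is a caller-supplied callable returning (status, error_message).
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status, error = fetch_status()
        if status == "INGESTED":
            return
        if error:
            # Assumption: an error message means the ingestion will not recover.
            raise RuntimeError(f"ingestion failed ({status}): {error}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"collection still in status {status!r} after {timeout_s} s")
        sleep(interval_s)
```

Passing the sleep function as a parameter makes the loop easy to exercise in tests without real delays.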

Reingesting the Zarr collection

When reingesting a Zarr collection, data that has already been ingested cannot be changed, but new chunks can be added to the existing data arrays, and the time coordinate array can be extended accordingly.
