Zarr Import API
The Zarr Import API lets you import your own Zarr data and access it like any other data, provided that the following conditions are met:
- Your raster data is stored in the Zarr format in your own S3 bucket in a supported region.
- The Zarr data conforms to the data constraints below.
- The bucket's permissions are configured so that we can read the data.
Data Constraints
Since Zarr is a generic data format, there are additional constraints that must be met in order to ingest the data into the system:
- Data must be stored as a single Zarr group that contains coordinate arrays and data arrays.
- Data array names should be valid JavaScript identifiers so they can be safely used in evalscripts; valid identifiers are case-sensitive, can contain Unicode letters, $, _, and digits (0-9), but may not start with a digit, and should not be one of the reserved JavaScript keywords.
- Data arrays must have two or three dimensions. There must be exactly two spatial coordinate arrays, named either x and y or lat and lon, and an optional time coordinate array.
- Data arrays must be stored in row-major order ("order": "C", that is, the last dimension varies fastest). The ordering of the dimensions has to be [time, lat, lon] or [time, y, x] for 3-dimensional data, and [lat, lon] or [y, x] for 2-dimensional data.
- All data, including the spatial and the optional time coordinate arrays, must consist of 32-bit or 64-bit integers and floats (Zarr data types u4, i4, i8, f4, f8).
- The chunk size in the two spatial dimensions must be less than or equal to 3072, but does not need to be the same for all data arrays.
- For 3-dimensional data:
  - the time array must include the units attribute in its .zattrs, which has to be in the format <unit> since <instant>, where the supported units are days/hours/minutes/seconds/millis/micros/nanos and instant should either be in ISO 8601 format or follow the definition of the time:units field of the CF time coordinate convention. For example, the Unix epoch could be encoded as seconds since 1970-01-01 00:00:00.
  - the chunk size in the time dimension must be the same for all data arrays and must be less than or equal to 50.
- Data must not cross either of the two poles.
- Data must use an equidistant spatial grid, that is, the two spatial coordinate arrays must be equidistant. The time coordinate array can be non-equidistant.
- The projection needs to be one of: WGS84 (EPSG:4326), WebMercator (EPSG:3857), any UTM zone (EPSG:32601-32660, 32701-32760), or Europe LAEA (EPSG:3035).
- Subgroups within the Zarr group will be ignored, but may be ingested separately.
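Two of the constraints above, the equidistant spatial grid and the format of the time units attribute, can be checked locally before upload. The following is a minimal sketch, not the service's own validation: the function names are hypothetical, and the relative tolerance used for the spacing comparison is an assumption, since the tolerance the service applies is not stated here.

```python
import math
import re

# Units listed in the constraints above.
SUPPORTED_UNITS = {"days", "hours", "minutes", "seconds", "millis", "micros", "nanos"}


def is_equidistant(values, rel_tol=1e-6):
    """Return True if consecutive values differ by one constant, non-zero step.

    rel_tol is an assumed tolerance for floating-point noise.
    """
    if len(values) < 2:
        return True
    step = values[1] - values[0]
    if step == 0:
        return False
    return all(
        math.isclose(b - a, step, rel_tol=rel_tol)
        for a, b in zip(values, values[1:])
    )


def parse_time_units(units):
    """Split '<unit> since <instant>' into (unit, instant).

    Raises ValueError if the string is malformed or the unit is unsupported.
    The instant itself is returned as text and is not parsed here.
    """
    match = re.fullmatch(r"\s*(\w+)\s+since\s+(.+?)\s*", units)
    if match is None:
        raise ValueError(f"expected '<unit> since <instant>', got {units!r}")
    unit, instant = match.group(1), match.group(2)
    if unit not in SUPPORTED_UNITS:
        raise ValueError(f"unsupported time unit: {unit!r}")
    return unit, instant
```

For example, `parse_time_units("seconds since 1970-01-01 00:00:00")` returns the unit `seconds` and the instant `1970-01-01 00:00:00`, while a `lat` array such as `[0.0, 0.5, 1.0, 1.5]` passes `is_equidistant`.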
Please refer to the Zarr specification for an explanation of the various Zarr format properties.
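Several of the constraints map directly onto fields of an array's .zarray metadata. The sketch below checks one data array against them. It assumes Zarr v2 metadata with a simple dtype string such as "<f4"; the function name is hypothetical, the identifier check covers only ASCII names (the full JavaScript rule also allows Unicode letters), and the reserved-word set is a common subset that may not be exhaustive.

```python
import re

MAX_SPATIAL_CHUNK = 3072
MAX_TIME_CHUNK = 50
ALLOWED_TYPES = {"u4", "i4", "i8", "f4", "f8"}

# ASCII-only approximation of a JavaScript identifier.
JS_IDENTIFIER = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")

# Common JavaScript reserved words; this set may not be exhaustive.
JS_RESERVED = {
    "break", "case", "catch", "class", "const", "continue", "debugger",
    "default", "delete", "do", "else", "enum", "export", "extends", "false",
    "finally", "for", "function", "if", "import", "in", "instanceof", "let",
    "new", "null", "return", "static", "super", "switch", "this", "throw",
    "true", "try", "typeof", "var", "void", "while", "with", "yield",
}


def check_array(name, zarray):
    """Return a list of constraint violations for one data array.

    `zarray` is the parsed content of the array's .zarray file.
    """
    problems = []
    if not JS_IDENTIFIER.fullmatch(name) or name in JS_RESERVED:
        problems.append(f"{name!r} is not a safe JavaScript identifier")
    chunks = zarray["chunks"]
    if len(zarray["shape"]) not in (2, 3):
        problems.append("array must have two or three dimensions")
    if zarray["order"] != "C":
        problems.append("array must be stored in row-major order (order C)")
    # A Zarr v2 dtype string is a byte-order character followed by a type code.
    if zarray["dtype"][1:] not in ALLOWED_TYPES:
        problems.append(f"unsupported data type {zarray['dtype']!r}")
    # The spatial dimensions are the last two in [time, y, x] or [y, x].
    if any(c > MAX_SPATIAL_CHUNK for c in chunks[-2:]):
        problems.append(f"spatial chunk size exceeds {MAX_SPATIAL_CHUNK}")
    if len(chunks) == 3 and chunks[0] > MAX_TIME_CHUNK:
        problems.append(f"time chunk size exceeds {MAX_TIME_CHUNK}")
    return problems
```

An empty list means the array passed these particular checks. The requirement that the time chunk size be identical across all data arrays needs a comparison between arrays and is not covered by this per-array function.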
Zarr Deployment
The Zarr Import API is available in two AWS regions. The API endpoint depends on the chosen deployment as specified in the table below.
The bucket where data is stored MUST be in the same region as the endpoint region you will use.
| Deployment | API endpoint | Region |
|---|---|---|
| AWS EU (Frankfurt) | https://services.sentinel-hub.com/api/v1/zarr | eu-central-1 |
| AWS US (Oregon) | https://services-uswest2.sentinel-hub.com/api/v1/zarr | us-west-2 |
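Because the bucket region determines the endpoint, the table above can be kept as a small lookup in client code. This is an illustrative sketch; the function name is hypothetical and only the two regions listed in the table are covered.

```python
# Endpoints copied from the deployment table above, keyed by AWS region.
ZARR_ENDPOINTS = {
    "eu-central-1": "https://services.sentinel-hub.com/api/v1/zarr",
    "us-west-2": "https://services-uswest2.sentinel-hub.com/api/v1/zarr",
}


def endpoint_for_bucket_region(region):
    """Return the Zarr Import API endpoint matching the bucket's region."""
    try:
        return ZARR_ENDPOINTS[region]
    except KeyError:
        supported = ", ".join(sorted(ZARR_ENDPOINTS))
        raise ValueError(
            f"no Zarr deployment in region {region!r}; supported: {supported}"
        ) from None
```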
AWS Bucket Settings
Bucket region
The bucket containing your Zarr data needs to be in the same region as the Zarr deployment you will use.
Bucket settings
As with other APIs, your AWS bucket needs to be configured to allow access from the system.
To do this, update your bucket policy to include the following statement (do not forget to replace <bucket_name> with your actual bucket name):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Sentinel Hub permissions",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::614251495211:root"
      },
      "Action": ["s3:GetBucketLocation", "s3:ListBucket", "s3:GetObject"],
      "Resource": ["arn:aws:s3:::<bucket_name>", "arn:aws:s3:::<bucket_name>/*"]
    }
  ]
}
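To avoid editing the placeholder by hand, the policy can be generated for a given bucket name. The sketch below reproduces the statement above; the function name and the example bucket name `my-zarr-bucket` are hypothetical.

```python
import json

# Principal taken from the policy statement above.
SENTINEL_HUB_PRINCIPAL = "arn:aws:iam::614251495211:root"


def build_bucket_policy(bucket_name):
    """Return the bucket policy above with <bucket_name> filled in."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    statement = {
        "Sid": "Sentinel Hub permissions",
        "Effect": "Allow",
        "Principal": {"AWS": SENTINEL_HUB_PRINCIPAL},
        "Action": ["s3:GetBucketLocation", "s3:ListBucket", "s3:GetObject"],
        # The bucket itself (for listing) and every object in it (for reading).
        "Resource": [bucket_arn, f"{bucket_arn}/*"],
    }
    return {"Version": "2012-10-17", "Statement": [statement]}


# "my-zarr-bucket" is a placeholder bucket name.
print(json.dumps(build_bucket_policy("my-zarr-bucket"), indent=2))
```

If the bucket already has a policy, the statement should be merged into its existing Statement list rather than replacing the whole document.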
Creating Zarr Collections
Each Zarr collection will correspond to a single Zarr group. When creating a collection, you need to provide:
- the S3 bucket where your data is located
- the path in the bucket where the Zarr group resides, that is, the directory containing the .zgroup file
- the Coordinate Reference System (CRS) in which your data is defined
- a name for the collection
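The four inputs above can be gathered into a request body before calling the API. The exact request schema is not given in this section, so the field names (`name`, `s3Bucket`, `path`, `crs`) and the URL form of the CRS value in this sketch are assumptions to verify against the API reference. Only the list of supported EPSG codes comes from the data constraints.

```python
def is_supported_epsg(code):
    """Check an EPSG code against the projections listed in the constraints."""
    return (
        code in (4326, 3857, 3035)
        or 32601 <= code <= 32660  # UTM zones, northern hemisphere
        or 32701 <= code <= 32760  # UTM zones, southern hemisphere
    )


def build_collection_payload(name, bucket, path, epsg):
    """Assemble a collection-creation body (field names are assumptions)."""
    if not is_supported_epsg(epsg):
        raise ValueError(f"EPSG:{epsg} is not a supported projection")
    return {
        "name": name,
        "s3Bucket": bucket,  # assumed field name
        "path": path,        # directory containing the .zgroup file
        # Assumed CRS encoding; check the API reference for the real format.
        "crs": f"http://www.opengis.net/def/crs/EPSG/0/{epsg}",
    }
```

Validating the EPSG code on the client side gives an immediate error for unsupported projections instead of a failed ingestion later.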
Ingesting the Arrays
After a collection is created, the ingestion will start automatically. The service will try to ingest every data array found in the group in the given S3 bucket and path. If the Zarr data violates any of the constraints above, the ingestion will either fail entirely or the offending data arrays will be skipped.
The Zarr service automatically configures collection bands named after the data arrays of the Zarr group, that is, the folder names of the arrays.
For example, for a Zarr group that contains B1 and B2 array folders, the resulting bands will be named B1 and B2.
The no data value will be read from data arrays' metadata, that is, from the fill_value property inside the array's .zarray file.
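To preview which bands and no data values an ingestion is likely to produce, the same metadata can be read from a local copy of the group. This sketch assumes a Zarr v2 directory layout on the local filesystem. The function name is hypothetical, and skipping the coordinate arrays is an assumption about which arrays become bands.

```python
import json
from pathlib import Path

# Coordinate array names from the data constraints; assumed not to become bands.
COORDINATE_NAMES = {"x", "y", "lat", "lon", "time"}


def list_bands(group_dir):
    """Map each data array folder name to the fill_value in its .zarray file."""
    bands = {}
    for child in sorted(Path(group_dir).iterdir()):
        zarray_file = child / ".zarray"
        # Skip plain files (such as .zgroup) and folders that are not arrays.
        if not child.is_dir() or not zarray_file.is_file():
            continue
        if child.name in COORDINATE_NAMES:
            continue
        metadata = json.loads(zarray_file.read_text())
        bands[child.name] = metadata.get("fill_value")
    return bands
```

For a group containing B1 and B2 array folders, the returned dictionary has the keys B1 and B2, each mapped to that array's fill_value.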
Querying Ingestion Status
Querying a collection will return the status of the ingestion as well as an error message if something went wrong.
If the returned status is INGESTED, you can start using your new Zarr data with our services.
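Since ingestion runs asynchronously, a client typically polls the collection until it reports INGESTED. The sketch below shows one way to do that. It takes a caller-supplied `fetch_collection` callable because the request itself is not described here, and the response field names `status` and `error` are assumptions; only the INGESTED value comes from the text above.

```python
import time


def wait_for_ingestion(fetch_collection, interval_s=30.0, max_attempts=60,
                       sleep=time.sleep):
    """Poll until the collection reports INGESTED, then return it.

    fetch_collection: callable returning the collection as a dict.
    Raises RuntimeError if an error message is present, and TimeoutError
    if the status is still not INGESTED after max_attempts polls.
    """
    for _ in range(max_attempts):
        collection = fetch_collection()
        if collection.get("status") == "INGESTED":  # assumed field name
            return collection
        error = collection.get("error")  # assumed field name
        if error:
            raise RuntimeError(f"ingestion failed: {error}")
        sleep(interval_s)
    raise TimeoutError("collection was not ingested within the polling budget")
```

Passing `sleep` as a parameter keeps the function easy to test without real delays.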
Reingesting the Zarr collection
When a Zarr collection is reingested, data that has already been ingested cannot be changed, but new chunks can be added to the existing data arrays and the time coordinate array can be extended accordingly.
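Before triggering a reingestion, it can help to confirm that the update only adds chunks. The following sketch compares two snapshots of a data array, each a mapping from chunk key to a content fingerprint such as an S3 ETag. The function name and the snapshot format are hypothetical, and how the service itself detects changes is not specified here.

```python
def compare_chunks(before, after):
    """Classify chunk keys as added, changed, or removed between snapshots.

    before, after: dicts mapping a chunk key (for example "0.0.0") to a
    fingerprint of the chunk's content.
    """
    added = sorted(set(after) - set(before))
    removed = sorted(set(before) - set(after))
    changed = sorted(
        key for key in before if key in after and before[key] != after[key]
    )
    return {"added": added, "changed": changed, "removed": removed}


def is_append_only(before, after):
    """True if every previously existing chunk is still present and unchanged."""
    diff = compare_chunks(before, after)
    return not diff["changed"] and not diff["removed"]
```

An update where `is_append_only` returns False touches data that was already ingested, which the rule above does not allow.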