Batch Statistical API

Requires Enterprise License

The Batch Statistical API is only available to users on enterprise plans. If you do not have an enterprise plan and would like to try it out, contact us or upgrade.

The Batch Statistical API enables you to request statistics in the same way as the Statistical API, but for multiple polygons at once and/or over longer aggregation periods. A typical use case is calculating statistics for all parcels in a country.

Similar to the Batch Processing API, this is an asynchronous REST service: results are not returned immediately in the response to your request but are delivered to your object storage, which must be specified in the request.

You can find more details about the API in the API Reference or in the workflow examples.

Deployments

Deployment           API endpoint                                                 Region
AWS EU (Frankfurt)   https://services.sentinel-hub.com/api/v1/statistics/batch    eu-central-1

Data Sources Restrictions

All data sources must be from the same deployment as the one where the request is made.

Workflow

The Batch Statistical API workflow in many ways resembles the Batch Processing API workflow. Available actions and statuses are:

  • user's actions: ANALYSE, START and STOP.
  • request statuses: CREATED, ANALYSING, ANALYSIS_DONE, STOPPED, PROCESSING, DONE, and FAILED.

The Batch Statistical API comes with a set of REST actions that support the execution of various steps in the workflow. The diagram below shows all possible statuses of the Batch Statistical request and users' actions that trigger transitions among them.

The workflow starts when you post a new Batch Statistical request. In this step, the system:

  • creates a new Batch Statistical request with status CREATED,
  • validates your input (not the evalscript),
  • returns the overview of the created request.

You can then decide to either request an additional analysis of the request or start the processing. When an additional analysis is requested:

  • the status of the request changes to ANALYSING,
  • the evalscript is validated,
  • when the analysis finishes, the status of the request changes to ANALYSIS_DONE.

If you choose to start processing directly, the system still executes the analysis, but when the analysis is done, it automatically starts processing. This is not explicitly shown in the diagram to keep it simple.

When you start the processing:

  • the status of the request changes to PROCESSING (this may take a while),
  • the processing starts,
  • spent processing units are billed periodically.

When the processing finishes, the status of the request changes to DONE.
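
As an illustration, the sketch below walks through this lifecycle with Python and the third-party requests library. The base URL comes from the Deployments table above; the /analyse, /start, and status-polling paths, the response field names, and the OAuth bearer token handling are assumptions made for the sketch, so check the API Reference for the exact calls.

import time

import requests  # third-party HTTP client

BASE = "https://services.sentinel-hub.com/api/v1/statistics/batch"
headers = {"Authorization": "Bearer <access-token>"}  # OAuth token for your account

payload = {}  # fill in with your Batch Statistical request body (see the sections below)

# 1. post a new request -> status CREATED
created = requests.post(BASE, json=payload, headers=headers).json()
request_id = created["id"]  # assumed field name

# 2. optionally run the analysis first -> ANALYSING, then ANALYSIS_DONE
requests.post(f"{BASE}/{request_id}/analyse", headers=headers)
while requests.get(f"{BASE}/{request_id}", headers=headers).json()["status"] != "ANALYSIS_DONE":
    time.sleep(10)

# 3. start the processing -> PROCESSING, then poll until a terminal status
requests.post(f"{BASE}/{request_id}/start", headers=headers)
while True:
    status = requests.get(f"{BASE}/{request_id}", headers=headers).json()["status"]
    if status in ("DONE", "FAILED", "STOPPED"):
        break
    time.sleep(30)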

Stopping the request

A request might be stopped for the following reasons:

  • it is requested by a user (user action)
  • the user runs out of processing units (see the chapter below)
  • something is wrong with the processing of the request

A user may stop the request in the following states: ANALYSING, ANALYSIS_DONE, and PROCESSING. However:

  • if the status is ANALYSING, the analysis will complete
  • if the status is PROCESSING, all features (polygons) that have been processed or are being processed at that moment are charged for
  • you are not allowed to restart the request for the next 30 minutes

The service itself may also stop a request when the processing of many features repeatedly fails. The stoppedStatusReason of such a request will be UNHEALTHY. This can happen if the service is unstable or if something is wrong with the request. In the former case, the request should eventually be restarted by our team.

Processing unit costs

To create, analyse, or start a request, the user must have at least 1,000 processing units available in their account. If a user's available processing units drop below 1,000 while the request is being processed, the request is automatically stopped and cannot be restarted for the next 60 minutes. It is therefore highly recommended to start a request with a sufficient reserve.

More information about batch statistical costs is available here.

Automatic deletion of stale data

Stale (inactive) requests will be deleted after a specific period of inactivity, depending on their status:

  • requests with status CREATED are deleted after 7 days of inactivity
  • requests with status FAILED are deleted after 15 days of inactivity
  • all other requests are deleted after 30 days of inactivity

Note that only the requests themselves are deleted; their results (the created statistics) remain under your control in your object storage bucket.

Input Polygons as a GeoPackage File

The Batch Statistical API accepts a GeoPackage file containing features (polygons) as an input. The GeoPackage must be stored in your object storage and must be accessible to the service for reading (find more details about this in the object storage configuration section below). In a batch statistical request, the input GeoPackage is specified by setting the path to the .gpkg file in the input.features.s3 or input.features.gs parameter.

All features (polygons) in an input GeoPackage must be in the same CRS, and it must be a CRS we support. A GeoPackage can contain at most 700,000 features.
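
For illustration, here is a minimal sketch that prepares such a GeoPackage with the third-party geopandas library; the file name, geometries, and the optional identifier column (described under Processing Results below) are only examples.

import geopandas as gpd
from shapely.geometry import box

# two example parcel polygons, all in a single CRS
gdf = gpd.GeoDataFrame(
    {
        "identifier": ["parcel-A", "parcel-B"],  # optional string column, see Processing Results
        "geometry": [box(13.0, 45.0, 13.1, 45.1), box(13.2, 45.0, 13.3, 45.1)],
    },
    crs="EPSG:4326",
)

# write the GeoPackage, then upload it to your object storage bucket
gdf.to_file("parcels.gpkg", driver="GPKG")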

Evalscript and Batch Statistical API

The evalscript specifics described for the Statistical API also apply to the Batch Statistical API.

Evalscripts smaller than 32 KB can be provided directly in a batch statistical request under the evalscript parameter. If your evalscript exceeds this limit, you can store it in your object storage bucket and reference it in a batch statistical request under the evalscriptReference parameter.
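
As a sketch, the snippet below chooses between the two options based on the 32 KB limit; the exact shape of the evalscriptReference.s3 object is an assumption here (it is expected to mirror the S3 configuration described below), so check the API Reference.

# read the evalscript and decide how to pass it to the service
evalscript = open("evalscript.js", encoding="utf-8").read()

if len(evalscript.encode("utf-8")) < 32 * 1024:
    script_part = {"evalscript": evalscript}  # small enough to inline in the request
else:
    # stored in your bucket instead; access is configured as described in the
    # object storage sections below (assumed shape)
    script_part = {"evalscriptReference": {"s3": {"url": "s3://my-bucket/evalscript.js"}}}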

Processing Results

Outputs of a Batch Statistical API request are JSON files stored in your object storage. Each .json file will contain the requested statistics for one feature (polygon) in the provided GeoPackage. You can connect statistics in a JSON file with the corresponding feature (polygon) in the GeoPackage based on:

  • the id of a feature from the GeoPackage is used as the name of the JSON file (for example, 1.json, 2.json) and is available in the JSON file as the id property, OR
  • a custom column identifier of type string can be added to the GeoPackage, and its value will be available in the JSON file as the identifier property

The outputs will be stored in the bucket and folder specified by the output.s3.url or output.gs.url parameter of the batch statistical request. The URL supports templating with placeholders such as <REQUEST_ID>, <ID>, and <IDENTIFIER> to customize the output file organization (see the API Reference for details). The outputs will be available in a sub-folder named after the ID of your request (for example, s3://{bucket}/{my-folder}/db7de265-dfd4-4dc0-bc82-74866078a5ce).
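
As an example, the sketch below uses the third-party boto3 library to collect the per-feature result files from such an output sub-folder and key them by identifier (falling back to id); the bucket name and prefix are placeholders, and AWS credentials are assumed to be configured in your environment.

import json

import boto3

s3 = boto3.client("s3")
bucket = "my-bucket"
prefix = "my-folder/db7de265-dfd4-4dc0-bc82-74866078a5ce/"  # sub-folder named after the request ID

stats_by_feature = {}
for page in s3.get_paginator("list_objects_v2").paginate(Bucket=bucket, Prefix=prefix):
    for obj in page.get("Contents", []):
        body = s3.get_object(Bucket=bucket, Key=obj["Key"])["Body"].read()
        result = json.loads(body)
        # "id" matches the GeoPackage feature id; "identifier" is the optional custom column
        stats_by_feature[result.get("identifier", result["id"])] = result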

Object Storage Configuration

The Batch Statistical API requires access to object storage for reading input data and storing processing results. We support two object storage providers:

  • Amazon S3
  • Google Cloud Storage (GCS)

Supported Use Cases

Object storage is used for:

  • Reading GeoPackage files with input features (polygons)
  • Reading evalscript files (optional, evalscripts can also be provided directly in the request)
  • Uploading processing results (JSON files with statistics)

You can use a single bucket for all three purposes or a different bucket for each.

AWS S3 Configuration

The Batch Statistical API supports two authentication methods for AWS S3. We recommend using the IAM Assume Role method for enhanced security and fine-grained access control.

Authentication Methods

Option 1: IAM Assume Role (Recommended)

The IAM Assume Role method provides better security by allowing temporary credentials and fine-grained access control without exposing long-term credentials.

To use this method, provide the ARN of an IAM role that has access to your S3 bucket:

{
  "output": {
    "s3": {
      "url": "s3://{bucket}/{key}",
      "region": "{region}",
      "iamRoleARN": "{IAM-role-ARN}"
    }
  }
}

Setup Steps:

  1. Create an IAM Policy for S3 Access

Create a policy that grants the necessary permissions to your S3 bucket:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:ListBucket"
      ],
      "Resource": ["arn:aws:s3:::{bucket}", "arn:aws:s3:::{bucket}/*"]
    }
  ]
}
  2. Create an IAM Role
  • In the AWS IAM console, create a new role
  • Choose "AWS account" as the trusted entity type
  • Select "Another AWS account" and enter account ID: 614251495211
  • Attach the policy created in step 1
  • Note the Role ARN for use in your API requests
  3. Configure Trust Relationship (Optional but Recommended)

For additional security, modify the role's trust policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::614251495211:root"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "{domain-account-id}"
        },
        "StringLike": {
          "sts:RoleSessionName": "sentinelhub"
        }
      }
    }
  ]
}

Replace {domain-account-id} with your domain account ID from the Dashboard.

Option 2: Access Key & Secret Key

Alternatively, you can provide AWS access credentials directly:

{
  "output": {
    "s3": {
      "url": "s3://{bucket}/{key}",
      "accessKey": "{access-key}",
      "secretAccessKey": "{secret-access-key}",
      "region": "{region}"
    }
  }
}

The access key and secret must be linked to an IAM user with the following permissions on your S3 bucket:

  • s3:GetObject
  • s3:PutObject
  • s3:DeleteObject
  • s3:ListBucket

To create access keys, see the AWS documentation on programmatic access.

Using S3 Configuration in Requests

The S3 configuration can be used in:

  • input.features.s3 - to specify the bucket where the GeoPackage file is available (required)
  • evalscriptReference.s3 - to specify the bucket where the evalscript .js file is available (optional)
  • output.s3 - to specify the bucket where the results will be stored (required)

Check the Batch Statistical API reference for more information.
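
To illustrate how these three pieces fit together, here is a skeleton of a request body (written as a Python dictionary) that uses S3 throughout; the bucket names, keys, and role ARN are placeholders, and the aggregation and other Statistical API parameters are omitted, so see the API Reference for the full request schema.

batch_statistical_request = {
    "input": {
        "features": {
            "s3": {
                "url": "s3://my-input-bucket/parcels.gpkg",  # required: input GeoPackage
                "iamRoleARN": "arn:aws:iam::{account-id}:role/{role-name}",
            }
        }
    },
    "evalscriptReference": {
        "s3": {
            "url": "s3://my-input-bucket/evalscript.js",  # optional: evalscript stored in a bucket
            "iamRoleARN": "arn:aws:iam::{account-id}:role/{role-name}",
        }
    },
    "output": {
        "s3": {
            "url": "s3://my-output-bucket/results",  # required: where results are delivered
            "region": "eu-central-1",
            "iamRoleARN": "arn:aws:iam::{account-id}:role/{role-name}",
        }
    },
    # aggregation and other Statistical API parameters omitted; see the API Reference
}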

Google Cloud Storage Configuration

Google Cloud Storage is supported for GeoPackage input, evalscript input, and output delivery. Authentication requires a service account with base64-encoded credentials.

Preparing Credentials

  1. Download your service account credentials in JSON format (not P12)
  2. Encode them as a base64 string:
cat my_creds.json | base64

Using GCS for Input

To read a GeoPackage or evalscript from Google Cloud Storage:

{
  "input": {
    "features": {
      "gs": {
        "url": "gs://{bucket}/{key}",
        "credentials": "{base64-encoded-credentials}"
      }
    }
  }
}

Using GCS for Output

To deliver results to Google Cloud Storage:

{
  "output": {
    "gs": {
      "url": "gs://{bucket}/{key}",
      "credentials": "{base64-encoded-credentials}"
    }
  }
}

Required GCS Permissions

The service account must have the following permissions on the specified bucket:

  • storage.objects.create
  • storage.objects.get
  • storage.objects.delete
  • storage.objects.list

These permissions can be granted through IAM roles such as Storage Object Admin or custom roles. If possible, restrict access to the specific delivery path within the bucket for enhanced security.

Cross-Cloud and Cross-Region Support

The Batch Statistical API provides complete flexibility in choosing storage locations for both input and output. Surcharges apply based on where your processing results are delivered (output storage location).

Storage Configuration Options

The table below shows output storage options for each deployment and their associated costs. Surcharges apply only to the volume of output data transferred to your storage.

Important: Input and output storage can be configured independently - you can mix and match any combination. For example, you can read input from GCS and write output to S3, or read from S3 in one region and write to S3 in another region. Input storage location does not affect costs.

Deployment           Region         Output Storage Location   Additional PU Cost
AWS EU (Frankfurt)   eu-central-1   S3 eu-central-1           None
AWS EU (Frankfurt)   eu-central-1   S3 (any other region)     0.03 PU/MB
AWS EU (Frankfurt)   eu-central-1   Google Cloud Storage      0.1 PU/MB

Output Data Transfer Surcharges Summary:

  • Cross-region (same cloud): 0.03 PU per MB
  • Cross-cloud: 0.1 PU per MB
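
For example, delivering 2,000 MB of results from the Frankfurt deployment to an S3 bucket in another region would add roughly 2,000 × 0.03 = 60 PUs, delivering the same results to Google Cloud Storage would add 2,000 × 0.1 = 200 PUs, and delivery to S3 eu-central-1 would add nothing.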

Important Notes:

  • Surcharges apply only to output data transfer (processing results)
  • Input location (GeoPackage and evalscript files) does not affect costs
  • When using an AWS S3 bucket in a different region than the deployment, specify the region parameter in your request:
{
  "output": {
    "s3": {
      "url": "s3://{bucket}/{key}",
      "region": "{region}",
      "iamRoleARN": "{IAM-role-ARN}"
    }
  }
}