SuperRes Overview
Planet SuperRes uses an Enhanced Super-Resolution Generative Adversarial Network (ESRGAN), trained on more than 120,000 pairs of SkySat and PlanetScope imagery with a perceptual loss technique, to produce imagery that looks natural and remains faithful to the ground truth. We include a confidence layer to help assess the accuracy of each generated pixel by evaluating various sources of error.
Planet SuperRes is currently available as SuperRes Mosaics and SuperRes PlanetScope Scenes; both share the model features described below. To learn more about Planet SuperRes, visit Planet University - Introduction to Planet SuperRes.

Vinkeveen-Utrecht, Netherlands SuperRes Visual Mosaic November 2025
Product Specifications
The Training Dataset
The training dataset is the foundation of Planet Visual SuperRes. It consists of 120,476 high-quality image pairs, each comprising a 3 m PlanetScope scene and a 0.5 m SkySat scene of the same location. Dataset quality is ensured through strict curation standards:
- Temporal consistency: Every pair was captured within a 12-hour window to minimize change on the ground between acquisitions.
- Geometric precision: Sub-pixel co-registration was achieved via phase correlation; pairs with a phase correlation error exceeding 0.2 were discarded.
- Radiometric normalization: Per-pixel normalization aligns the SkySat spectral response to PlanetScope, preventing radiometric artifacts in the model output.
- Clarity: A minimum 95% clear-sky requirement was applied to both images in every pair, eliminating cloud contamination.
The dataset spans diverse geographies, seasons, and land cover types — providing the broad training signal needed for reliable performance at a global scale. It is split into 100,476 training pairs, 10,000 validation pairs, and a held-out test set of 10,000 pairs.
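The curation standards above can be expressed as a simple pair filter. This is an illustrative sketch only: the 12-hour window, 0.2 phase-correlation threshold, and 95% clear-sky floor come from the criteria listed above, but the pair record fields (`skysat_time`, `phase_corr_error`, etc.) are hypothetical names, not Planet's actual pipeline schema.

```python
from datetime import datetime, timedelta

def passes_curation(pair: dict) -> bool:
    """Return True if a SkySat/PlanetScope pair meets the curation criteria."""
    dt = abs(pair["skysat_time"] - pair["planetscope_time"])
    return (
        dt <= timedelta(hours=12)            # temporal consistency
        and pair["phase_corr_error"] <= 0.2  # geometric precision (sub-pixel)
        and pair["clear_sky_pct"] >= 95.0    # clarity: min % over both images
    )

# Example pair: captured ~75 minutes apart, well co-registered, nearly cloud-free
pair = {
    "skysat_time": datetime(2025, 10, 1, 10, 30),
    "planetscope_time": datetime(2025, 10, 1, 9, 15),
    "phase_corr_error": 0.08,
    "clear_sky_pct": 98.5,
}
```

Here `clear_sky_pct` is taken as the minimum clear-sky percentage across the two images, so a single cloudy scene disqualifies the pair.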
Model Architecture and Training
Planet Visual SuperRes uses an Enhanced Super-Resolution Generative Adversarial Network (ESRGAN), a leading generative AI approach for single-image super-resolution.

SuperRes Visual Mosaic, October 2025
Key architectural elements include:
- Residual-in-Residual Dense Blocks (RRDBs): Replace traditional residual blocks with denser connections for improved feature representation and training stability.
- Perceptual Loss Function: Combines pixel-wise loss with perceptual loss derived from a pre-trained deep network, training the model to produce outputs that look sharp and natural to the human eye — not just pixel-accurate.
- Adversarial Training: A discriminator network enforces realistic textures and fine details, enabling sharper outputs than traditional bicubic or convolutional methods.
The production 1.5x model was further optimized for deployment: input channels were reduced from 8 to 4 bands with no performance loss, iterative pruning reduced the computational footprint, and the final model was quantized to half-precision (float16) for efficient inference at scale.
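The half-precision step can be illustrated in isolation. The sketch below uses NumPy on a randomly initialized tensor standing in for one generator layer; the shape and values are illustrative, not taken from the production model. It shows the two properties that motivate float16 deployment: the memory footprint halves, while the per-weight rounding error stays small.

```python
import numpy as np

# Hypothetical weight tensor standing in for one convolutional layer
# of the SuperRes generator (64 filters, 64 input channels, 3x3 kernels).
rng = np.random.default_rng(0)
weights_fp32 = rng.standard_normal((64, 64, 3, 3)).astype(np.float32)

# Half-precision quantization, as applied to the production 1.5x model:
weights_fp16 = weights_fp32.astype(np.float16)

memory_saving = weights_fp32.nbytes / weights_fp16.nbytes  # 2x smaller
max_abs_error = float(
    np.abs(weights_fp32 - weights_fp16.astype(np.float32)).max()
)
```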
The Confidence Layer
The Confidence Layer is an auxiliary output channel added to the ESRGAN generator network. It is trained concurrently with the primary super-resolution task to predict the expected Mean Absolute Error (MAE) between the model's output and ground truth image, using an L1 loss function. At inference time, the layer produces a per-pixel MAE estimate from the low-resolution input alone with no reference high-resolution image required. The raw MAE output is rescaled to a 0–100 confidence score:
- Confidence = 100 × (1 − min(MAE / 2500, 1))
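The rescaling above is a direct computation. A minimal sketch, using the constants from the formula (the function name is ours):

```python
def confidence_score(mae: float) -> float:
    """Rescale a predicted per-pixel MAE to a 0-100 confidence score.

    An MAE of 0 maps to confidence 100; an MAE of 2500 or more maps to 0.
    """
    return 100.0 * (1.0 - min(mae / 2500.0, 1.0))
```

In production this is applied per pixel to the MAE channel emitted by the generator, so the output is a confidence raster aligned with the super-resolved image.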

SuperRes Visual Mosaic, October 2025
Reliable Performance
The production 1.5x model was evaluated against the 10,000-image held-out test set. Our primary metric is Learned Perceptual Image Patch Similarity (LPIPS), which evaluates image similarity based on deep network feature maps (AlexNet) and aligns most closely with human visual perception. We report 1 − LPIPS so that higher scores indicate better perceptual quality.
| Metric | Score |
|---|---|
| 1 − Learned Perceptual Image Patch Similarity (LPIPS) | 0.961 |
| Peak Signal-to-Noise Ratio (PSNR) | 33.53 |
| Structural Similarity Index Measure (SSIM) | 0.876 |
| Confidence Layer Accuracy* | 0.993 |
*Confidence Layer Accuracy provides a direct comparison between the ground truth MAE and our predicted MAE.
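PSNR and SSIM are standard full-reference quality metrics. As a worked illustration of the former, the sketch below computes PSNR from the mean squared error between two pixel sequences, assuming 8-bit imagery (peak value 255); the inputs are toy values, not SuperRes data.

```python
import math

def psnr(reference, output, max_value=255.0):
    """Peak Signal-to-Noise Ratio between two equal-length pixel sequences."""
    mse = sum((r - o) ** 2 for r, o in zip(reference, output)) / len(reference)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Toy example: two pixels off by 2 and 4 gray levels -> MSE = 10
score = psnr([50, 50], [52, 54])
```

Higher is better; the 33.53 dB reported above is the mean over the 10,000-image test set.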
In a world where the gap between coverage and clarity has constrained what satellite imagery can deliver, Planet Visual SuperRes offers something new: the persistent, near-daily global reach of PlanetScope, with sharper imagery that makes features easier to see, characterize, and monitor at scale. With a purpose-built training dataset, a model tuned for human visual perception, and a built-in Confidence Layer that provides per-pixel transparency, Planet SuperRes gives analysts a clearer view of the world — and more time to focus on what they see in it.
Disclaimer
Neural networks, including the processes used to create SuperRes Mosaics, are a form of artificial intelligence (AI). While the per-pixel confidence layer (noted above) is intended to help Licensee assess the accuracy of each pixel of SuperRes, the SuperRes neural network may generate incorrect, incomplete, or misleading information, and may fail to identify (or may hallucinate) features or objects depicted. Planet does not warrant or guarantee the accuracy, completeness, or suitability of SuperRes Mosaics. Licensee is solely responsible for reviewing, validating, and approving any SuperRes pixel, particularly before Licensee takes action based on such SuperRes pixel, including, without limitation, visual comparison of each pixel with the corresponding standard visual mosaic.