Exploring Satellite Embeddings with Google’s Alpha Earth

Introduction

Google’s Alpha Earth project represents a breakthrough in satellite imagery analysis by providing pre-trained deep learning embeddings of Earth’s surface. These embeddings capture rich semantic information about land use, vegetation, urban development, and environmental changes in a compact vector representation.

What are Satellite Embeddings?

Satellite embeddings are high-dimensional vector representations of satellite imagery patches that encode semantic meaning. Unlike raw RGB bands, these embeddings capture:

  • Land use patterns: Urban, agricultural, forest, water bodies
  • Temporal changes: Seasonal variations, development, deforestation
  • Spectral signatures: Beyond visible light, incorporating multispectral data
  • Spatial relationships: Context-aware features that understand neighboring regions
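
To make "encode semantic meaning" concrete, here is a minimal sketch of how two embedding vectors are usually compared. The four-dimensional vectors and their land-cover labels are invented for illustration; real embeddings have more dimensions and come from the model, not from hand-typed numbers.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 4-dimensional embeddings (illustrative values only).
urban_a = [0.8, 0.1, -0.5, 0.2]
urban_b = [0.7, 0.2, -0.6, 0.1]
cropland = [-0.3, 0.9, 0.2, -0.1]

sim_urban = cosine_similarity(urban_a, urban_b)
sim_mixed = cosine_similarity(urban_a, cropland)
print(f"urban vs urban:    {sim_urban:.3f}")
print(f"urban vs cropland: {sim_mixed:.3f}")
```

Vectors for similar surfaces point in nearly the same direction (similarity close to 1), while dissimilar surfaces score much lower. That property is what makes embeddings convenient for search, clustering, and change detection.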

The Alpha Earth Dataset

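Google’s Alpha Earth embeddings are produced by a deep learning model that draws on multiple Earth observation sources, including Sentinel-2 and Landsat optical imagery and Sentinel-1 radar, rather than on a single sensor. Each pixel is summarized as a 64-dimensional vector. Key features: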

  • Global coverage: Annual embeddings from 2017-2024
  • 10m resolution: High-resolution analysis capabilities
  • Multi-temporal: Track changes over time
  • Ready-to-use: Pre-processed embeddings via Google Earth Engine
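
The Earth Engine catalog entry for this dataset describes each pixel as a 64-dimensional vector stored in bands named A00 through A63. The sketch below generates those band names and the annual date windows for the years listed above. The helper name year_window is made up for this example, and nothing here contacts Earth Engine.

```python
# Band names follow the pattern A00, A01, ..., A63 (64 dimensions).
N_DIMS = 64
all_bands = [f"A{i:02d}" for i in range(N_DIMS)]

# Annual coverage stated above: 2017 through 2024, inclusive.
FIRST_YEAR, LAST_YEAR = 2017, 2024
available_years = list(range(FIRST_YEAR, LAST_YEAR + 1))

def year_window(year):
    """Return (start, end) date strings for one year; end is exclusive."""
    if year not in available_years:
        raise ValueError(f"{year} is outside {FIRST_YEAR}-{LAST_YEAR}")
    return f"{year}-01-01", f"{year + 1}-01-01"

print(all_bands[:3], "...", all_bands[-1])
print(year_window(2020))
```

A pair like this could be passed to filterDate on the image collection to select a single year, as an alternative to filtering by system:index.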

In this notebook, we’ll explore these powerful embeddings by analyzing temporal changes in Delhi, India’s capital region, demonstrating how satellite embeddings can reveal urban dynamics and environmental patterns over time.

Prerequisites and Setup

Before running this notebook, ensure you have:

  1. Google Earth Engine Access: Sign up at earthengine.google.com
  2. Required packages: pip install earthengine-api geemap
  3. Authentication: Run the authentication cell below and follow the prompts

Note: If you encounter authentication errors, make sure you have a Google Cloud project associated with your Earth Engine account. This may require setting up billing for some users.

import ee
import geemap
import datetime

# Initialize Earth Engine with your project
PROJECT_ID = 'ee-nipunbatra0'

try:
    ee.Initialize(project=PROJECT_ID)
    print(f"✅ Earth Engine initialized with project: {PROJECT_ID}")
except Exception as e:
    print(f"❌ Earth Engine initialization failed: {e}")
    print("\n🔧 Troubleshooting steps:")
    print("1. Ensure you have run: ee.Authenticate()")
    print("2. Verify project access at https://console.cloud.google.com/")
    print("3. Check Earth Engine API is enabled for your project")
    raise e

# Define Delhi boundary
try:
    delhi = ee.FeatureCollection("FAO/GAUL/2015/level1") \
               .filter(ee.Filter.eq("ADM1_NAME", "Delhi")) \
               .geometry()
    
    # Load satellite embeddings collection
    embeddings = ee.ImageCollection("GOOGLE/SATELLITE_EMBEDDING/V1/ANNUAL") \
                    .filterBounds(delhi)
    
    bands = ["A00", "A01", "A02"]  # Use first 3 embedding dimensions as RGB
    
    # Create mapping between image IDs and years
    ids   = embeddings.aggregate_array("system:index").getInfo()
    times = embeddings.aggregate_array("system:time_start").getInfo()
    # system:time_start is in epoch milliseconds; use a timezone-aware conversion
    years = [datetime.datetime.fromtimestamp(t / 1000, tz=datetime.timezone.utc).year for t in times]
    
    index_year_map = dict(zip(ids, years))
    print(f"📊 Found {len(years)} years of data: {sorted(years)}")
    print(f"🗺️ Data loaded successfully for Delhi region")
    
except Exception as e:
    print(f"Data loading failed: {e}")
    print("This usually means the Earth Engine project setup is incomplete.")
    print("Please ensure you have a properly configured Google Cloud project.")
    raise  # later cells depend on delhi, embeddings, bands and years
✅ Earth Engine initialized with project: ee-nipunbatra0
📊 Found 8 years of data: [2017, 2018, 2019, 2020, 2021, 2022, 2023, 2024]
🗺️ Data loaded successfully for Delhi region

The code above loads the satellite embeddings collection and examines the available years. We can see we have annual data from 2017 to 2024, providing 8 years of temporal coverage for analysis.

Key observations:

  • Complete temporal coverage: Annual embeddings from 2017-2024
  • Consistent availability: All years present in the dataset
  • Unique identifiers: Each image has a system index for precise filtering
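
Earth Engine stores system:time_start as milliseconds since the Unix epoch. The conversion to calendar years can be sketched without an Earth Engine session. The timestamps and index strings below are hand-written stand-ins for what aggregate_array(...).getInfo() would return, not values read from the collection.

```python
from datetime import datetime, timezone

def ms_to_year(ms):
    """Convert epoch milliseconds to a UTC calendar year."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc).year

# Stand-in values: midnight UTC on 1 January of 2017, 2018 and 2024.
sample_times = [1483228800000, 1514764800000, 1704067200000]
sample_ids = ["tile_2017", "tile_2018", "tile_2024"]  # hypothetical index strings

sample_years = [ms_to_year(t) for t in sample_times]
sample_map = dict(zip(sample_ids, sample_years))
print(sample_map)
```

Passing an explicit UTC timezone keeps the result independent of the machine's local time zone, which matters for timestamps that fall close to a year boundary.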

Visualization Preparation

Study Area: Delhi, India

Delhi serves as an excellent case study for satellite embedding analysis due to its:

  • Rapid urbanization: One of the world’s fastest-growing megacities
  • Diverse land use: Mix of urban, agricultural, and green spaces
  • Environmental challenges: Air quality, water resources, urban heat
  • Geographic significance: Capital territory with distinct administrative boundaries

Delhi’s National Capital Territory covers approximately 1,484 km² and has experienced dramatic changes over the past decade, making it ideal for temporal analysis using satellite embeddings.

Setup and Data Loading

delhi_outline = ee.Image().byte().paint(
    featureCollection=ee.FeatureCollection(delhi),
    color=1,
    width=3
)

def add_overlay(img):
    # visualize embeddings as RGB
    emb_vis = img.select(bands).unitScale(-1, 1).clip(delhi).visualize(
        bands=bands, min=0, max=1
    )
    # blend with outline (use palette instead of color)
    return emb_vis.blend(delhi_outline.visualize(palette=['white'])).set(
        {"system:time_start": img.get("system:time_start")}
    )

embeddings_overlay = embeddings.map(add_overlay)

This preprocessing step creates visualizable RGB representations from the high-dimensional embeddings. The key steps:

  1. Band Selection: We use the first 3 of the 64 embedding dimensions (A00, A01, A02) as RGB channels. The choice is arbitrary; any three dimensions give a valid false-color view
  2. Normalization: Scale values from [-1,1] to [0,1] for proper visualization
  3. Clipping: Restrict the image to Delhi’s boundaries
  4. Boundary Overlay: Blend Delhi’s administrative boundary on top as a white outline for geographic reference
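
The normalization step above is plain linear rescaling. Here is a sketch of the arithmetic that unitScale(-1, 1) applies to each pixel value, written as an ordinary Python function. The Earth Engine documentation states that unitScale does not clamp values outside the input range, and this sketch mirrors that behavior.

```python
def unit_scale(value, low=-1.0, high=1.0):
    """Linearly map [low, high] onto [0, 1]; values outside are not clamped."""
    return (value - low) / (high - low)

for v in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(f"{v:+.1f} -> {unit_scale(v):.2f}")
```

An embedding value of 0 therefore lands at mid-gray (0.5) in each channel, and the min=0, max=1 visualization range covers the full rescaled interval.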

Interactive Map Visualization

def normalize(img):
    return img.select(bands).unitScale(-1, 1).clip(delhi)

Map = geemap.Map(center=[28.61, 77.23], zoom=9)
Map.add_basemap("SATELLITE")  # Google Satellite

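for idx, yr in index_year_map.items():
    # .first() returns a server-side ee.Image object, which is always truthy
    # in Python, so a client-side "if img:" test would never skip a layer.
    img = embeddings.filter(ee.Filter.eq("system:index", idx)).first()
    Map.addLayer(normalize(img), {"bands": bands, "min": 0, "max": 1}, f"Embedding {yr}")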

Map.addLayer(delhi, {}, "Delhi boundary")
Map

The interactive map above allows you to explore each year’s embeddings by toggling layers. Notice how different areas show varying colors across years, indicating changes in land use, vegetation, or urban development patterns.

Temporal Animation

Now let’s create an animated visualization to see changes over time more clearly:

# Clip collection to Delhi + normalize and add visualization parameters
embeddings_vis = embeddings.map(lambda img: normalize(img).visualize(
    bands=bands, min=0, max=1
))

gif_params = {
    "region": delhi,
    "dimensions": 1080,            # higher resolution
    "framesPerSecond": 2,
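    # The text options below are meant to label each frame with its year.
    # They may not be applied by download_ee_video itself; if the labels are
    # missing, geemap.add_text_to_gif can draw them on the saved GIF.
    "titles": [str(y) for y in years],
    "fontSize": 40,
    "fontColor": "white",
    "progressBarColor": "blue"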
}

gif_path = "delhi_embeddings.gif"
geemap.download_ee_video(embeddings_vis, gif_params, gif_path)

gif_path = "delhi_embeddings.gif"
mp4_path = "delhi_embeddings.mp4"

geemap.gif_to_mp4(gif_path, mp4_path)
print("Saved:", mp4_path)
Generating URL...
Downloading GIF image from https://earthengine.googleapis.com/v1/projects/ee-nipunbatra0/videoThumbnails/3eb896b75ef500a8f5046ac2ae9b56f3-a0da9b5d2b8f754f584e5eb4298e4459:getPixels
Please wait ...
The GIF image has been saved to: /Users/nipun/git/blog/posts/delhi_embeddings.gif
Saved: delhi_embeddings.mp4
from IPython.display import Video

Video("delhi_embeddings.mp4")

Analysis: What the Embeddings Reveal

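The temporal animation above suggests several patterns in Delhi’s satellite embeddings from 2017-2024. Keep in mind that it shows only three of the 64 embedding dimensions, so the points below are visual impressions rather than measured results: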

Key Observations

  1. Urban Expansion Patterns:
    • Notice the gradual changes in peripheral areas, indicating new construction and urban sprawl
    • The embeddings capture subtle changes in building density and infrastructure development
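  2. Seasonal and Vegetation Cycles:
    • Each embedding summarizes a full calendar year, so year-to-year color variations can reflect differences in that year’s vegetation condition and seasonal cycle rather than a single season
    • Monsoon impacts on agricultural areas around Delhi may show up as shifts in the embedding space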
  3. Infrastructure Development:
    • Major infrastructure projects (metro extensions, highways, new residential areas) create distinct embedding signatures
    • The model learns to differentiate between different types of urban development
  4. Environmental Changes:
    • Air quality variations may influence the spectral signatures captured in embeddings
    • Changes in water bodies and green spaces are clearly distinguishable

Technical Insights

The satellite embeddings demonstrate their power by:

  • Semantic Understanding: Unlike raw spectral bands, embeddings understand contextual meaning
  • Change Detection: Subtle changes invisible to human eyes are amplified in embedding space
  • Multi-scale Patterns: From individual buildings to city-wide land use changes
  • Temporal Consistency: Embeddings maintain consistent representation across years
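
The change-detection idea can be sketched numerically. The dataset description says the embedding vectors are unit length, so the dot product of one pixel's vectors from two years equals their cosine similarity, and one minus that value is a simple change score. The vectors below are invented, and to_unit and change_score are names chosen for this example. In Earth Engine, a comparable per-pixel dot product could be formed with img_a.multiply(img_b).reduce(ee.Reducer.sum()), though that is not run here.

```python
import math

def to_unit(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def change_score(vec_a, vec_b):
    """1 - dot product of two unit vectors: 0 means identical direction."""
    return 1.0 - sum(a * b for a, b in zip(vec_a, vec_b))

# Invented embeddings for one pixel in two different years.
pixel_2017 = to_unit([0.6, -0.2, 0.7, 0.1])
pixel_2024_stable = to_unit([0.6, -0.2, 0.7, 0.1])
pixel_2024_changed = to_unit([-0.1, 0.8, 0.2, 0.5])

stable = change_score(pixel_2017, pixel_2024_stable)
changed = change_score(pixel_2017, pixel_2024_changed)
print(f"unchanged pixel: {stable:.3f}")
print(f"changed pixel:   {changed:.3f}")
```

A score like this uses all dimensions at once, unlike the three-band RGB view, so it is a more reliable basis for deciding where land cover actually changed.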

This analysis showcases how pre-trained satellite embeddings can democratize remote sensing analysis, making sophisticated Earth observation accessible without requiring deep learning expertise.

Conclusion and Future Directions

This exploration of Google’s Alpha Earth satellite embeddings demonstrates the transformative potential of pre-trained deep learning models for Earth observation. By analyzing Delhi’s temporal changes from 2017-2024, we’ve shown how embeddings can:

  • Simplify Complex Analysis: No need for spectral band expertise or custom model training
  • Reveal Hidden Patterns: Detect subtle changes invisible in traditional imagery
  • Enable Rapid Prototyping: Quick insights for urban planning and environmental monitoring
  • Democratize Remote Sensing: Make satellite analysis accessible to non-experts

Next Steps

  1. Quantitative Analysis: Compute embedding similarity metrics to measure change rates
  2. Multi-city Comparison: Extend analysis to other rapidly developing urban areas
  3. Temporal Clustering: Group years by similarity to identify distinct development phases
  4. Ground Truth Validation: Correlate embedding changes with known infrastructure projects
  5. Real-time Monitoring: Set up automated change detection pipelines
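
The quantitative analysis and temporal clustering items can be prototyped on a handful of numbers before touching Earth Engine. The sketch below takes one hypothetical mean embedding per year (invented three-dimensional values, not real Delhi data), measures year-to-year change as one minus cosine similarity, and starts a new phase whenever the change exceeds a threshold. The 0.05 threshold and all names are assumptions for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical per-year mean embeddings (illustrative only).
yearly = {
    2017: [0.90, 0.10, 0.05],
    2018: [0.88, 0.12, 0.06],
    2019: [0.50, 0.60, 0.10],
    2020: [0.48, 0.62, 0.12],
}

def split_into_phases(series, threshold=0.05):
    """Group consecutive years; start a new group when change > threshold."""
    ordered = sorted(series)
    phases = [[ordered[0]]]
    for prev, curr in zip(ordered, ordered[1:]):
        change = 1.0 - cosine(series[prev], series[curr])
        if change > threshold:
            phases.append([curr])
        else:
            phases[-1].append(curr)
    return phases

phases = split_into_phases(yearly)
print(phases)
```

With real data, the per-year vectors would come from a regional reduction over the embedding image for each year, and the threshold would need to be calibrated against known events.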

Applications

This approach has immediate applications in:

  • Urban Planning: Track development patterns and inform policy decisions
  • Environmental Monitoring: Detect deforestation, water body changes, agricultural shifts
  • Disaster Response: Rapid damage assessment using pre/post event embeddings
  • Climate Research: Long-term land use change analysis for climate impact studies

The combination of Google Earth Engine’s computational power and Alpha Earth’s semantic embeddings opens new possibilities for intelligent Earth observation at global scale.


This notebook demonstrates the power of modern satellite AI for understanding our changing planet. The embeddings capture not just what we see, but what the data means - transforming pixels into insights.