edmt.workflow.connector

Module Contents

edmt.workflow.connector.get_satellite_collection(product, satellite=None, start_date=None, end_date=None)

Retrieve an Earth Engine ImageCollection and metadata for a supported environmental data product.

This function serves as a unified entry point to access preconfigured satellite or gridded datasets (e.g., NDVI, LST, precipitation) by delegating to specialized internal pipelines. It returns both the date-filtered image collection and a dictionary of processing parameters.

Parameters

product : str

Name of the environmental data product (case-insensitive; aliases are normalized internally). Supported values include:
  • Vegetation indices: "NDVI", "EVI"
  • Land Surface Temperature: "LST"
  • Precipitation: "CHIRPS"

satellite : str, optional

Satellite platform or sensor (e.g., "MODIS", "LANDSAT", "SENTINEL2"). Required for vegetation and LST products; ignored for grid-based products like CHIRPS.

start_date : str, optional

Start date in 'YYYY-MM-DD' format. Required if end_date is provided.

end_date : str, optional

End date in 'YYYY-MM-DD' format. Required if start_date is provided.

Returns

tuple[ee.ImageCollection, dict]
  • ImageCollection: Filtered and preprocessed Earth Engine image collection.

  • meta: Dictionary containing:
    • Product-specific scaling factors, band names, and units

    • Input arguments: "product", "satellite", "start_date", "end_date"

Raises

ValueError

If product is not supported or if required arguments are missing for the selected pipeline.

Notes

  • Date filtering is applied during collection construction.

  • No cloud masking, quality filtering, or unit conversion is performed beyond what is defined in the underlying pipeline.

  • All collections preserve native temporal and spatial metadata for downstream use.
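The argument rules described above (case-insensitive product names, a satellite required only for sensor-based products, and start and end dates supplied together) can be sketched in plain Python. This is an illustrative approximation, not the library's implementation: `normalize_request`, `SENSOR_PRODUCTS`, and `GRIDDED_PRODUCTS` are hypothetical names, and alias handling is omitted.

```python
from typing import Optional

# Hypothetical groupings inferred from the parameter descriptions above.
SENSOR_PRODUCTS = {"NDVI", "EVI", "LST"}   # a satellite is required
GRIDDED_PRODUCTS = {"CHIRPS"}              # any satellite argument is ignored


def normalize_request(product: str,
                      satellite: Optional[str] = None,
                      start_date: Optional[str] = None,
                      end_date: Optional[str] = None) -> dict:
    """Validate the arguments and return them in normalized form."""
    name = product.strip().upper()
    if name not in SENSOR_PRODUCTS | GRIDDED_PRODUCTS:
        raise ValueError(f"Unsupported product: {product!r}")
    if name in SENSOR_PRODUCTS and satellite is None:
        raise ValueError(f"{name} requires a satellite")
    # The two dates must be given together or not at all.
    if (start_date is None) != (end_date is None):
        raise ValueError("start_date and end_date must be given together")
    return {
        "product": name,
        "satellite": satellite.upper() if name in SENSOR_PRODUCTS else None,
        "start_date": start_date,
        "end_date": end_date,
    }


request = normalize_request("ndvi", "modis", "2023-01-01", "2023-12-31")
```

In this sketch a call such as `normalize_request("LST")` raises `ValueError`, mirroring the documented behavior when a required argument is missing.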

edmt.workflow.connector.compute_period_feature(product: str, start: ee.Date, collection: ee.ImageCollection, geometry: ee.Geometry, frequency: str, meta: Dict[str, Any], scale: int | None = None) -> ee.Feature

Compute spatial summary statistics for a given environmental product over a time period and region of interest, returning an Earth Engine Feature.

This function aggregates images in the input collection over a temporal window (defined by start and frequency), computes statistics over geometry, and packages results into a feature with standardized properties—including product-specific metadata from meta.

Parameters

product : str

Environmental product name (e.g., "NDVI", "LST", "CHIRPS"). Used to select appropriate statistic computation logic.

start : ee.Date

Start date of the aggregation period.

collection : ee.ImageCollection

Pre-filtered ImageCollection containing the relevant band(s).

geometry : ee.Geometry

Region of interest for spatial reduction.

frequency : {"daily", "weekly", "monthly", "yearly"}

Temporal interval defining the period length. Determines end date.

meta : dict

Metadata dictionary (typically from get_satellite_collection) containing at least:
  • "scale_m": default spatial resolution (in meters)
  • "band": primary band name (e.g., "NDVI")
  • "unit": measurement unit (e.g., "NDVI", "°C", "mm")

Additional keys may be used by _compute.

scale : int, optional

Spatial resolution (in meters) for reduction. If omitted, defaults to meta["scale_m"].

Returns

ee.Feature

A feature with no geometry and the following properties:
  • "date": Period start formatted as "YYYY-MM-dd"
  • "product": Uppercase product name
  • "band": Band name used (from meta)
  • "unit": Unit of measurement (from meta)
  • "n_images": Number of images in the period
  • Statistic keys (e.g., "mean", "median", "min", "max"); values are null if no data

Raises

ValueError

If scale is missing and meta["scale_m"] is not present or invalid.

Notes

  • For empty periods (no images), returns a feature with "n_images": 0 and all statistics as null.

  • Geometry is reprojected to the image’s native CRS before reduction for accuracy.

  • Designed for use in time-series generation (e.g., mapping over date sequences).
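Two behaviors documented above lend themselves to a short plain-Python sketch: the fallback from scale to meta["scale_m"], and the properties returned for a period with no images. The helper names `resolve_scale` and `empty_period_properties` are hypothetical, and the statistic keys are assumed from the "e.g." list in the Returns section; the actual keys depend on the product.

```python
from typing import Any, Dict, Optional, Sequence


def resolve_scale(scale: Optional[int], meta: Dict[str, Any]) -> int:
    """Return the explicit scale, else fall back to meta['scale_m']."""
    if scale is not None:
        return scale
    default = meta.get("scale_m")
    if not isinstance(default, (int, float)) or default <= 0:
        raise ValueError("scale not given and meta['scale_m'] is missing or invalid")
    return int(default)


def empty_period_properties(
    product: str,
    date: str,
    meta: Dict[str, Any],
    stats: Sequence[str] = ("mean", "median", "min", "max"),  # assumed keys
) -> Dict[str, Any]:
    """Properties for a period with no images: n_images is 0, statistics are None."""
    props: Dict[str, Any] = {
        "date": date,
        "product": product.upper(),
        "band": meta["band"],
        "unit": meta["unit"],
        "n_images": 0,
    }
    props.update({name: None for name in stats})
    return props


meta = {"scale_m": 250, "band": "NDVI", "unit": "NDVI"}  # illustrative values
scale = resolve_scale(None, meta)
props = empty_period_properties("ndvi", "2023-01-01", meta)
```

Python's `None` stands in here for the null values that Earth Engine would report for an empty period.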

edmt.workflow.connector.ComputeTimeseries(product: str, start_date: str, end_date: str, frequency: str, roi_gdf: geopandas.GeoDataFrame, satellite: str | None = None, scale: int | None = None) -> pandas.DataFrame

Generate a time series of environmental metrics (e.g., NDVI, LST, precipitation) over a region of interest.

This function retrieves satellite or gridded data for a specified product, aggregates it over regular time intervals (daily, weekly, monthly, or yearly), computes spatial statistics, and returns results as a pandas DataFrame with standardized columns.

Parameters

product : str

Environmental product to retrieve. Supported values include:
  • "NDVI", "EVI" (vegetation indices)
  • "LST" (Land Surface Temperature)
  • "CHIRPS" (precipitation)

start_date : str

Start date of the time series in 'YYYY-MM-DD' format.

end_date : str

End date of the time series in 'YYYY-MM-DD' format.

frequency : {"daily", "weekly", "monthly", "yearly"}

Temporal aggregation interval.

roi_gdf : geopandas.GeoDataFrame

Region of interest as a GeoDataFrame containing Polygon or MultiPolygon geometries. Must be provided; cannot be None.

satellite : str, optional

Satellite platform (e.g., "MODIS", "LANDSAT", "SENTINEL2"). Required for vegetation and LST products. Ignored for grid-based products like CHIRPS.

scale : int, optional

Spatial resolution (in meters) for reduction. If omitted, a product- and sensor-appropriate default is used (e.g., 500 m for MODIS LST, 10 m for Sentinel-2).

Returns

pd.DataFrame

A DataFrame with one row per time period, containing:
  • "date": Period start as "YYYY-MM-dd"
  • "product": Uppercase product name
  • "satellite": Satellite name (if applicable)
  • Statistic columns (e.g., "mean", "median", "ndvi", "precipitation_mm")
  • "n_images": Number of source images used per period
  • "unit": Measurement unit (e.g., "NDVI", "°C", "mm")
  • "month" (optional): Full month name (e.g., "January") if frequency="monthly"

Rows with all-null statistics are removed based on product-specific logic.

Raises

ValueError

If roi_gdf is not provided.

Notes

  • For MODIS products, the ROI geometry is reprojected to the native sinusoidal projection to ensure accurate spatial reduction.

  • The collection is pre-filtered to the ROI using filterBounds for performance.

  • Time periods are generated using calendar-aware intervals (not fixed day counts).

  • Missing or invalid data points are filtered out post-reduction based on the primary metric:
    • LST: removes rows where "mean" is null

    • CHIRPS: removes rows where "precipitation_mm" is null

    • Vegetation indices: removes rows where the index column ("ndvi", "evi") is null

  • Requires an initialized Earth Engine session (ee.Initialize()); ensured via ee_initialized().
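The post-reduction filtering described in the notes can be illustrated without Earth Engine. The sketch below works on plain dictionaries rather than a DataFrame; `PRIMARY_METRIC` and `drop_empty_rows` are hypothetical names, and the sample values are made up for illustration.

```python
from typing import Any, Dict, List

# Hypothetical lookup of the column that decides whether a row is kept,
# following the product-specific rules listed in the notes.
PRIMARY_METRIC = {
    "LST": "mean",
    "CHIRPS": "precipitation_mm",
    "NDVI": "ndvi",
    "EVI": "evi",
}


def drop_empty_rows(rows: List[Dict[str, Any]], product: str) -> List[Dict[str, Any]]:
    """Keep only rows whose primary metric for the product is not null."""
    key = PRIMARY_METRIC[product.upper()]
    return [row for row in rows if row.get(key) is not None]


rows = [
    {"date": "2023-01-01", "ndvi": 0.41, "n_images": 2},
    {"date": "2023-02-01", "ndvi": None, "n_images": 0},
    {"date": "2023-03-01", "ndvi": 0.47, "n_images": 2},
]
kept = drop_empty_rows(rows, "ndvi")
```

On a pandas DataFrame the same effect is commonly achieved with `df.dropna(subset=[key])`; whether ComputeTimeseries uses that call is not stated here.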

edmt.workflow.connector.CompositeImage(product: str, start_date: str, end_date: str, satellite: str | None = None, roi_gdf: geopandas.GeoDataFrame | None = None, reducer: str = 'mean') -> ee.Image

Generate a single composite Earth Engine image by aggregating a time series of environmental data.

This function retrieves a filtered ImageCollection for the specified product and time range, applies a temporal reducer (e.g., mean, median), and optionally clips the result to a region of interest.

Parameters

product : str

Environmental product name (e.g., "NDVI", "LST", "CHIRPS", "EVI"). Case-insensitive; normalized internally.

start_date : str

Start date in 'YYYY-MM-DD' format.

end_date : str

End date in 'YYYY-MM-DD' format.

satellite : str, optional

Satellite platform (e.g., "MODIS", "LANDSAT", "SENTINEL2"). Required for sensor-based products; ignored for gridded datasets like CHIRPS.

roi_gdf : geopandas.GeoDataFrame, optional

Region of interest as a GeoDataFrame. If provided, the output image is clipped to this geometry.

reducer : {"mean", "median", "min", "max"}, optional

Temporal aggregation method applied across the time series (default: "mean").

Returns

ee.Image

A single-band (or multi-band) composite image with:
  • Band name(s) preserved from the source collection (e.g., "NDVI", "LST_C")
  • Properties including:
    • "product": normalized product name
    • "satellite": satellite used (if applicable)
    • "start_date", "end_date": time range
    • "reducer": aggregation method
    • "unit": measurement unit (e.g., "NDVI", "°C", "mm")

Notes

  • Delegates product-specific compositing logic to an internal helper.

  • For MODIS, the ROI geometry is reprojected to the native projection before clipping (if roi_gdf is provided).

  • The input collection is pre-filtered to the ROI using filterBounds for efficiency.

  • No cloud masking or quality filtering is applied beyond what is defined in get_satellite_collection.

  • Requires an initialized Earth Engine session (ee.Initialize()); ensured via ee_initialized().
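To make the idea of a temporal reducer concrete, the sketch below applies a reducer pixel by pixel to a small stack of grids in plain Python. It is a conceptual stand-in, not how the library or Earth Engine computes composites: `composite`, `REDUCERS`, and the sample values are all hypothetical.

```python
import statistics
from typing import Callable, Dict, List

Grid = List[List[float]]

# Reducer names mirror the documented options for the reducer parameter.
REDUCERS: Dict[str, Callable[[List[float]], float]] = {
    "mean": statistics.fmean,
    "median": statistics.median,
    "min": min,
    "max": max,
}


def composite(stack: List[Grid], reducer: str = "mean") -> Grid:
    """Reduce a time-ordered stack of equally sized grids pixel by pixel."""
    reduce_fn = REDUCERS[reducer]
    n_rows, n_cols = len(stack[0]), len(stack[0][0])
    return [
        [reduce_fn([grid[r][c] for grid in stack]) for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Three 2x2 "images" taken at different times (made-up values).
stack = [
    [[0.2, 0.4], [0.6, 0.8]],
    [[0.4, 0.4], [0.2, 0.6]],
    [[0.6, 0.1], [0.4, 0.7]],
]
median_image = composite(stack, "median")
max_image = composite(stack, "max")
```

Earth Engine exposes equivalent per-pixel operations as `ee.ImageCollection.mean()`, `.median()`, `.min()`, and `.max()`; whether CompositeImage calls these directly is not stated in this reference.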

edmt.workflow.connector.CollectionImage(product: str, start_date: str, end_date: str, frequency: edmt.workflow.builder.Frequency = 'monthly', satellite: str | None = None, roi_gdf: geopandas.GeoDataFrame | None = None, reducer: edmt.workflow.builder.ReducerName = 'mean') -> ee.ImageCollection

Generate an Earth Engine ImageCollection of temporally aggregated composites over regular intervals.

This function divides the input date range into periods, aggregates imagery within each period using a specified reducer, and returns a time-series ImageCollection suitable for animation, charting, or further analysis.

Parameters

product : str

Environmental data product (case-insensitive; normalized internally). Supported values include:
  • Vegetation: "NDVI", "EVI"
  • Temperature: "LST"
  • Precipitation: "CHIRPS"

start_date : str

Start date in 'YYYY-MM-DD' format.

end_date : str

End date in 'YYYY-MM-DD' format.

frequency : {"daily", "weekly", "monthly", "yearly"}, optional

Temporal interval for compositing (default: "monthly").

satellite : str, optional

Satellite platform (e.g., "MODIS", "LANDSAT", "SENTINEL2"). Required for sensor-based products; ignored for gridded datasets like CHIRPS.

roi_gdf : geopandas.GeoDataFrame, optional

Region of interest as a GeoDataFrame. If provided, the collection is filtered to this region and output images are clipped to it.

reducer : {"mean", "median", "min", "max", "sum"}, optional

Temporal aggregation method (default: "mean"). For "CHIRPS", "sum" is also allowed (for total precipitation); other products support only the statistical reducers.

Returns

ee.ImageCollection

An ImageCollection where each image:
  • Represents one time period (e.g., January 2023)
  • Contains band(s) named per the source product
  • Has properties:
    • "system:time_start": period start (milliseconds since Unix epoch)
    • "period_start": formatted as "YYYY-MM-dd"
    • "product", "satellite", "frequency", "reducer", "unit"
    • "month": full month name (e.g., "January"), for monthly frequency only

Raises

ValueError

If an unsupported reducer is specified for the given product (e.g., "sum" for NDVI).
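The reducer rule can be sketched as a small validation step. This is a minimal illustration under the assumption that "sum" is accepted only for CHIRPS, as the parameter description states; `check_reducer`, `STATISTICAL_REDUCERS`, and `SUM_PRODUCTS` are hypothetical names.

```python
STATISTICAL_REDUCERS = {"mean", "median", "min", "max"}
SUM_PRODUCTS = {"CHIRPS"}  # products for which a period total is meaningful


def check_reducer(product: str, reducer: str) -> str:
    """Return the reducer if it is allowed for the product, else raise ValueError."""
    name = product.upper()
    allowed = set(STATISTICAL_REDUCERS)
    if name in SUM_PRODUCTS:
        allowed.add("sum")
    if reducer not in allowed:
        raise ValueError(
            f"Reducer {reducer!r} is not supported for {name}; "
            f"choose one of {sorted(allowed)}"
        )
    return reducer


chirps_reducer = check_reducer("chirps", "sum")
```

With these rules, `check_reducer("NDVI", "sum")` raises `ValueError`, matching the documented example.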

Notes

  • For MODIS vegetation products, the ROI geometry is reprojected to the native sinusoidal projection before filtering and clipping to ensure spatial accuracy.

  • The collection is pre-filtered to the ROI (if provided) for performance.

  • Time periods are generated using calendar-aware intervals (not fixed day counts).

  • CHIRPS supports "sum" to compute total precipitation over the period; all other products use pixel-wise statistics (mean, median, etc.).

  • Requires an initialized Earth Engine session (ee.Initialize()); ensured via ee_initialized().
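The difference between calendar-aware intervals and fixed day counts can be shown with the standard library alone. The sketch below is an assumption-laden illustration: `period_starts` and `add_months` are hypothetical helpers, the end date is treated as exclusive, and days are clamped to the month length; the library may handle these edge cases differently.

```python
import calendar
from datetime import date, timedelta
from typing import List


def add_months(d: date, months: int) -> date:
    """Shift a date by whole calendar months, clamping the day to the month length."""
    index = d.year * 12 + (d.month - 1) + months
    year, month_zero_based = divmod(index, 12)
    month = month_zero_based + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def period_starts(start_date: str, end_date: str, frequency: str) -> List[date]:
    """List period start dates from start_date up to, but excluding, end_date."""
    if frequency not in {"daily", "weekly", "monthly", "yearly"}:
        raise ValueError(f"Unsupported frequency: {frequency!r}")
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    starts: List[date] = []
    step = 0
    while True:
        # Each period start is computed from the original start date so that
        # day clamping in short months does not accumulate.
        if frequency == "daily":
            current = start + timedelta(days=step)
        elif frequency == "weekly":
            current = start + timedelta(weeks=step)
        elif frequency == "monthly":
            current = add_months(start, step)
        else:  # yearly
            current = add_months(start, 12 * step)
        if current >= end:
            break
        starts.append(current)
        step += 1
    return starts


monthly = period_starts("2023-01-01", "2023-04-01", "monthly")
```

Here the three monthly periods begin on the first of January, February, and March, so they span 31, 28, and 31 days respectively rather than a fixed 30.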

edmt.workflow.connector.ee_to_points(image: ee.Image, scale: int = 30, num_pixels: int = 5000) -> geopandas.GeoDataFrame

Sample pixel values from an Earth Engine image and return them as a GeoDataFrame.

This function extracts a uniform random subset of pixels from the input ee.Image at a specified spatial resolution. Each sampled pixel is converted to a point geometry with its corresponding band values stored as attributes. The resulting data is downloaded synchronously and formatted as a geopandas.GeoDataFrame with WGS84 (EPSG:4326) projection.

Args:
    image (ee.Image): The input Earth Engine image to sample. Must be a valid, initialized Earth Engine image object.
    scale (int, optional): The nominal scale in meters at which to sample the image. Defaults to 30. Should closely match the native resolution of the target bands for accurate value extraction.
    num_pixels (int, optional): The maximum number of pixels to sample. Defaults to 5000. Earth Engine will return up to this number (or fewer if the image contains fewer valid/unmasked pixels).

Returns:
    gpd.GeoDataFrame: A GeoDataFrame where each row represents a sampled pixel. Columns include the point geometry (named geometry) and one column per image band containing the sampled values. The coordinate reference system (CRS) is explicitly set to EPSG:4326.

Raises:
    ee.EEException: If the image is invalid, the scale is unsupported, Earth Engine computation times out, or the response payload exceeds the getInfo() limit.

Example:
>>> import ee
>>> import geopandas as gpd
>>> from edmt.workflow.connector import ee_to_points
>>> ee.Initialize()
>>> img = ee.Image('COPERNICUS/S2_SR/20230615T123456').select(['B4', 'B8'])
>>> gdf = ee_to_points(img, scale=10, num_pixels=1000)
>>> print(gdf.head())
>>> print(gdf.crs)  # EPSG:4326
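The conversion from sampled features to tabular rows can be sketched offline. The example below assumes the sampled pixels arrive as a GeoJSON-style FeatureCollection dictionary with Point geometries; `features_to_records` is a hypothetical helper, and the coordinates and band values are made up for illustration.

```python
from typing import Any, Dict, List


def features_to_records(feature_collection: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten GeoJSON point features into one record per sampled pixel."""
    records = []
    for feature in feature_collection["features"]:
        # GeoJSON stores point coordinates as [longitude, latitude].
        lon, lat = feature["geometry"]["coordinates"]
        record = dict(feature["properties"])  # one key per image band
        record["lon"] = lon
        record["lat"] = lat
        records.append(record)
    return records


sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [36.82, -1.29]},
            "properties": {"B4": 812, "B8": 2950},
        },
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [36.83, -1.30]},
            "properties": {"B4": 790, "B8": 3104},
        },
    ],
}
records = features_to_records(sample)
```

With geopandas available, `gpd.GeoDataFrame.from_features(sample["features"], crs="EPSG:4326")` builds a GeoDataFrame from the same structure; whether ee_to_points takes this route internally is not stated here.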