Coordinate Bounds Calculation#

Overview#

Coordinate bounds (e.g., lat_bnds, lon_bnds, plev_bnds, depth_bnds) are required for CMIP compliance. They define the edges of grid cells and vertical levels, and are essential for proper data interpretation, especially for:

  • Calculating accurate spatial averages

  • Determining grid cell areas and volumes

  • Ensuring proper regridding and interpolation

  • Defining vertical level boundaries for atmospheric and oceanic data

  • Meeting CF conventions and CMIP standards

Automatic Bounds Calculation#

As of the latest version, pymorize automatically calculates coordinate bounds if they are not present in your grid file. This feature is integrated into the setgrid function and works seamlessly with your existing workflows.

How It Works#

When you use the setgrid function with a grid file that doesn’t contain bounds variables:

  1. The function detects missing lat_bnds and lon_bnds

  2. Automatically calculates bounds from coordinate values

  3. Adds the bounds to your dataset

  4. Sets the appropriate bounds attribute on coordinates

Algorithm#

The bounds calculation uses a robust algorithm that:

  • For interior cells: Uses midpoints between adjacent coordinate values

  • For edge cells: Extrapolates using the same spacing as adjacent cells

  • Ensures continuity: No gaps between cells (upper bound of cell i equals lower bound of cell i+1)

  • Works for both: Regular and irregular grids

Example Usage#

import xarray as xr
from pycmor.std_lib.setgrid import setgrid

# Your data
data = xr.DataArray(
    np.random.rand(10, 100),
    dims=["time", "ncells"],
    coords={"time": np.arange(10)},
    name="temperature"
)

# Rule with grid file (even if it doesn't have bounds)
rule = {"grid_file": "path/to/grid.nc"}

# Apply setgrid - bounds will be calculated automatically if missing
result = setgrid(data, rule)

# Result now has lat_bnds and lon_bnds
print("lat_bnds" in result.data_vars)  # True
print("lon_bnds" in result.data_vars)  # True

Manual Bounds Calculation#

You can also calculate bounds manually using the bounds module:

from pycmor.std_lib.bounds import add_bounds_to_grid, calculate_bounds_1d
import xarray as xr
import numpy as np

# For a simple 1D coordinate
lat = xr.DataArray(
    np.linspace(-90, 90, 180),
    dims=["lat"],
    attrs={"long_name": "latitude", "units": "degrees_north"}
)

lat_bnds = calculate_bounds_1d(lat)
print(lat_bnds.shape)  # (180, 2)

# For a complete grid dataset
grid = xr.open_dataset("grid.nc")
grid_with_bounds = add_bounds_to_grid(grid)

Vertical Bounds Calculation#

Similar to CDO’s genlevelbounds operator, pymorize can automatically calculate bounds for vertical coordinates such as pressure levels, depth, and height.

Overview#

Vertical bounds are required for:

  • Atmospheric pressure levels (plev, plev19, etc.)

  • Ocean depth levels (depth)

  • Height coordinates (height, altitude)

The add_vertical_bounds function automatically detects common vertical coordinate names and calculates appropriate bounds.

Note

Automatic in Default Pipeline: As of the latest version, vertical bounds calculation is automatically included in the DefaultPipeline. If you’re using the default pipeline, vertical bounds will be added automatically to any dataset with vertical coordinates—no additional configuration needed!

Usage in Default Pipeline#

The add_vertical_bounds step is automatically included in the DefaultPipeline (step 3, after variable extraction). This means:

  • All datasets processed with the default pipeline automatically get vertical bounds

  • No manual configuration required

  • Only applies to datasets with vertical coordinates (non-intrusive)

  • Preserves existing bounds if already present

from pycmor.core.pipeline import DefaultPipeline

# The default pipeline includes add_vertical_bounds automatically
pipeline = DefaultPipeline()

# Process your data - vertical bounds added automatically if applicable
result = pipeline.run(data, rule_spec)

Usage in Custom Pipelines#

You can also explicitly add add_vertical_bounds to custom pipelines:

from pycmor.std_lib import add_vertical_bounds

# In your pipeline configuration
pipeline = [
    "load_data",
    "get_variable",
    "add_vertical_bounds",  # Add this step
    "convert_units",
    "time_average",
    # ... other steps
]

Standalone Usage#

You can also use it directly on datasets:

from pycmor.std_lib.bounds import add_vertical_bounds
import xarray as xr
import numpy as np

# Dataset with pressure levels
ds = xr.Dataset({
    'ta': (['time', 'plev', 'lat', 'lon'], np.random.rand(10, 8, 5, 6)),
}, coords={
    'plev': [100000, 92500, 85000, 70000, 60000, 50000, 40000, 30000],
    'lat': np.linspace(-90, 90, 5),
    'lon': np.linspace(0, 360, 6),
})

# Add vertical bounds
ds_with_bounds = add_vertical_bounds(ds)

# Now ds_with_bounds contains 'plev_bnds'
print(ds_with_bounds['plev_bnds'])

Supported Vertical Coordinates#

The function automatically detects these coordinate names:

  • Pressure levels: plev, plev19, plev8, plev7, plev4, plev3, lev, level, pressure

  • Depth: depth

  • Height: height, alt, altitude

Example with Ocean Depth#

# Ocean data with depth levels
ds = xr.Dataset({
    'thetao': (['time', 'depth', 'lat', 'lon'], np.random.rand(10, 8, 5, 6)),
}, coords={
    'depth': [0, 10, 20, 50, 100, 200, 500, 1000],
    'lat': np.linspace(-90, 90, 5),
    'lon': np.linspace(0, 360, 6),
})
ds['depth'].attrs['units'] = 'm'
ds['depth'].attrs['positive'] = 'down'

# Add depth bounds
ds_with_bounds = add_vertical_bounds(ds)

# Bounds are calculated automatically
print(ds_with_bounds['depth_bnds'].values[:3])
# Output: [[-5.  5.], [ 5. 15.], [15. 35.]]

CMIP Compliance#

According to CMIP6/CMIP7 coordinate specifications:

  • Both latitude and longitude have "must_have_bounds": "yes"

  • Bounds are required for proper data interpretation

  • The bounds should follow CF conventions

With automatic bounds calculation, pymorize ensures your data meets these requirements even if your original grid files don’t include pre-computed bounds.

Technical Details#

Bounds Format#

Bounds are stored as 2D arrays with shape (n, 2) where:

  • n is the number of coordinate points

  • First column ([:, 0]) contains lower bounds

  • Second column ([:, 1]) contains upper bounds

Coordinate Attributes#

When bounds are added, the coordinate variable gets a bounds attribute:

lat.attrs["bounds"] = "lat_bnds"

This follows CF conventions and allows tools to automatically discover the bounds.

Supported Grids#

  • 1D regular grids: Evenly spaced coordinates (e.g., lat = [-90, -89, ..., 90])

  • 1D irregular grids: Unevenly spaced coordinates (e.g., lat = [10, 15, 25, 45, 50])

  • 2D unstructured grids: Limited support (pre-computed bounds recommended)

Limitations#

For 2D unstructured grids (e.g., FESOM meshes), automatic bounds calculation is simplified and may not be accurate. For these cases, it’s recommended to:

  1. Pre-compute bounds using your model’s grid generation tools

  2. Include bounds in your grid files

  3. The existing bounds will be preserved and used

See Also#