Calculating derived variables

Calculations are used to derive a variable from one or more inputs, to resample a variable to a new frequency, or more generally to modify a variable so that it fully matches the corresponding definition in a CMOR table.

How calculations work

Calculations are defined in the mapping file under the field of the same name. The calculation string is evaluated literally by the tool using the Python eval() function. As an example, a simple calculation could be summing a variable across all its vertical levels:

mrso;fld_s08i223;var[0].sum(dim='depth')

var represents the list of input variables; in this case there is only one, which is var[0] in the calculation string. Here the calculation is very simple and can be fully defined in the mapping itself. If the calculation is more complex, it is easier to use a pre-defined function, for example:

hus24;fld_s00i010 fld_s00i408;plevinterp(var[0], var[1], 24)

Here plevinterp is called to interpolate specific humidity from model levels to pressure levels. This function takes three input arguments: the variable to interpolate, the pressure at model levels and the number of pressure levels, which corresponds to a specific definition of the pressure levels coordinate. Already available functions are listed below.
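The mechanism described above can be sketched as follows. This is a minimal illustration, not the tool's actual code: parse_mapping_row, evaluate and scale are hypothetical helpers, the row is assumed to contain only the three fields shown in the examples above, and plain Python lists stand in for the xarray variables so the sketch runs without extra dependencies.

```python
def parse_mapping_row(row):
    """Split a 'name;inputs;calculation' row into its three fields.

    Assumes the shortened three-field form used in the examples above.
    """
    cmor_name, inputs, calculation = row.split(";", 2)
    return {
        "cmor_name": cmor_name,
        "inputs": inputs.split(),     # one or more input variable names
        "calculation": calculation,   # string later passed to eval()
    }


def evaluate(calculation, var, functions=None):
    """Evaluate a calculation string with `var` bound to the input list.

    `functions` optionally exposes pre-defined functions by name.
    """
    namespace = {"var": var}
    namespace.update(functions or {})
    return eval(calculation, {}, namespace)


row = parse_mapping_row(
    "hus24;fld_s00i010 fld_s00i408;plevinterp(var[0], var[1], 24)"
)
print(row["inputs"])       # ['fld_s00i010', 'fld_s00i408']
print(row["calculation"])  # plevinterp(var[0], var[1], 24)

# Stand-in data: a single input variable holding three vertical levels.
levels = [1.0, 2.0, 3.0]
total = evaluate("sum(var[0])", [levels])
print(total)  # 6.0


# A hypothetical pre-defined function made available to the calculation.
def scale(values, factor):
    return [v * factor for v in values]


scaled = evaluate("scale(var[0], 2)", [levels], {"scale": scale})
print(scaled)  # [2.0, 4.0, 6.0]
```

Because eval() executes whatever expression it is given, a calculation string should only come from a mapping file that is trusted.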

Note

When more than one variable is used as input, if the variables are not all in the same file, more than one file pattern can be specified in the mapping row.

Resample

If a variable is available in the raw model output but not at the desired frequency, the tool will check whether a higher frequency is available to be resampled. For example, if a user is interested in daily surface temperature but this is available only as hourly data, during the mop setup phase the tool will add a resample attribute with value ‘D’ to the variable and this will be used as the argument for the resample function. The statistic used by the function is chosen based on the timeshot attribute, so if a variable is defined as a maximum, minimum or sum, that statistic is used in the resample instead of the mean.
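The choice of statistic described above could look roughly like the sketch below. The helper names and the timeshot labels ("mean", "maximum", "minimum", "sum") are assumptions made for illustration; the labels actually used by the tool may differ.

```python
def choose_stat(timeshot):
    """Pick the reducing statistic for a resample from a timeshot label."""
    label = timeshot.lower()
    for stat in ("max", "min", "sum"):
        if stat in label:
            return stat
    return "mean"  # default when no other statistic is named


def resample_settings(timeshot, resample="D"):
    """Bundle the resample frequency and statistic for one variable."""
    return {"resample": resample, "stats": choose_stat(timeshot)}


print(resample_settings("mean"))     # {'resample': 'D', 'stats': 'mean'}
print(resample_settings("maximum"))  # {'resample': 'D', 'stats': 'max'}
```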

Contributing

TBA

Available functions

Atmosphere and aerosol

Ocean

SeaIce

Land

mopper.calc_land.average_tile(var, tilefrac=None, lfrac=1, landfrac=None, lev=None)

Returns variable averaged over grid-cell, counting only specific tile/s and land fraction when suitable.

For example: nLitter is nitrogen mass in litter and should be calculated only over land fraction and each tile type will have different amounts of litter.

average = sum_over_tiles(N amount on tile * tilefrac) * landfrac

Parameters:
  • var (Xarray DataArray) – Variable to process, defined over tiles

  • tilefrac (Xarray DataArray, optional) – Variable defining tiles’ fractions (default is None); if None, it is read from the ancil file

  • lfrac (int, optional) – Controls if landfrac is considered (1) or not (0) (default 1)

  • landfrac (Xarray DataArray) – Variable defining land fraction (default is None); if None, it is read from the ancil file

  • lev (str) – Name of pseudo level to add to output array (default is None)

Returns:

vout – averaged input variable

Return type:

Xarray DataArray
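The averaging formula given above can be checked with a small numerical sketch. NumPy arrays stand in for the xarray variables, average_over_tiles is a hypothetical helper rather than the library function, and the array layout (tiles along the first axis, grid cells along the second) is an assumption.

```python
import numpy as np


def average_over_tiles(var, tilefrac, landfrac, lfrac=1):
    """Weight each tile by its fraction, sum over tiles, apply land fraction."""
    avg = (var * tilefrac).sum(axis=0)  # sum over the tile axis
    if lfrac == 1:
        avg = avg * landfrac            # count only the land part of the cell
    return avg


# Two tiles (rows) and two grid cells (columns).
var = np.array([[2.0, 4.0],
                [6.0, 8.0]])
tilefrac = np.array([[0.5, 0.25],
                     [0.5, 0.75]])
landfrac = np.array([1.0, 0.5])

# cell 0: (2*0.5 + 6*0.5) * 1.0 = 4.0
# cell 1: (4*0.25 + 8*0.75) * 0.5 = 3.5
with_land = average_over_tiles(var, tilefrac, landfrac)

# With lfrac=0 the land fraction is ignored: 4.0 and 7.0.
without_land = average_over_tiles(var, tilefrac, landfrac, lfrac=0)
```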

mopper.calc_land.calc_landcover(ctx, var, model)

Returns land cover fraction variable

Parameters:
  • ctx (click context obj) – Dictionary including ‘cmor’ settings and attributes for experiment

  • var (list(xarray.DataArray)) – List of input variables to sum

  • model (str) – Name of land surface model to retrieve land tiles definitions

Returns:

vout – Land cover fraction variable

Return type:

xarray.DataArray

mopper.calc_land.calc_topsoil(ctx, soilvar)

Returns the variable over the first 10cm of soil.

Parameters:
  • ctx (click context) – Includes obj dict with ‘cmor’ settings, exp attributes

  • soilvar (Xarray DataArray) – Soil moisture over soil levels

Returns:

topsoil – Variable defined on top 10cm of soil

Return type:

Xarray DataArray

mopper.calc_land.extract_tilefrac(ctx, tilefrac, tilenum, landfrac=None, lev=None)

Calculates the land fraction of a specific type: crops, grass, etc.

Parameters:
  • ctx (click context) – Includes obj dict with ‘cmor’ settings, exp attributes

  • tilefrac (Xarray DataArray) – Tile fraction variable

  • tilenum (Int or [Int]) – The number, or list of numbers, indicating the tile

  • landfrac (Xarray DataArray) – Land fraction variable; if None (default), it is read from the ancil file

  • lev (str) – name of pseudo level to add to output array (default is None)

Returns:

vout – land fraction of object

Return type:

Xarray DataArray

Raises:

Exception – tile number must be an integer or list
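As a rough illustration of the behaviour documented above, the sketch below selects one or several tiles, sums their fractions and scales the result by the land fraction. It is not the library implementation: select_tile_fraction is a hypothetical helper, NumPy arrays stand in for the xarray variables, tile numbers are assumed to be 1-based, and multiplying by the land fraction is an assumption about how the final value is obtained.

```python
import numpy as np


def select_tile_fraction(tilefrac, tilenum, landfrac):
    """Return the grid-cell fraction covered by the requested tile(s)."""
    if isinstance(tilenum, int):
        tiles = [tilenum]
    elif isinstance(tilenum, list):
        tiles = tilenum
    else:
        raise Exception("tile number must be an integer or list")
    idx = [t - 1 for t in tiles]  # assumed 1-based tile numbers
    return tilefrac[idx].sum(axis=0) * landfrac


# Three tiles (rows) and two grid cells (columns).
tilefrac = np.array([[0.2, 0.5],
                     [0.3, 0.25],
                     [0.5, 0.25]])
landfrac = np.array([1.0, 0.5])

single = select_tile_fraction(tilefrac, 2, landfrac)         # tile 2 only
combined = select_tile_fraction(tilefrac, [1, 3], landfrac)  # tiles 1 and 3
```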

mopper.calc_land.landuse_frac(ctx, var, landfrac=None, nwd=0, tiles='cmip6')

Defines new tile fraction variables in which the original model tiles are re-organised into 4 super-categories:

0 - psl Primary and secondary land (includes forest, grasslands, and bare ground) (1,2,3,4,5,6,7,11,14) or (6,7,11,14?) if nwd is true. Possibly excluding barren soil is an error?

1 - pst Pastureland (includes managed pastureland and rangeland) (2) or (7) if nwd

2 - crp Cropland (9) or (7) if nwd

3 - Urban settlement (15) or (14) if nwd is true??

Tiles in CABLE:

  1. Evergreen Needleleaf
  2. Evergreen Broadleaf
  3. Deciduous Needleleaf
  4. Deciduous Broadleaf
  5. Shrub
  6. C3 Grassland
  7. C4 Grassland
  8. Tundra
  9. C3 Cropland
  10. C4 Cropland
  11. Wetland
  12. empty
  13. empty
  14. Barren
  15. Urban
  16. Lakes
  17. Ice

NB this is currently hardcoded for the above definitions, but the output could potentially depend on different categories and on the land model used.

Parameters:
  • var (Xarray DataArray) – Tile variable

  • landfrac (Xarray DataArray) – Land fraction variable; if None (default), it is read from the ancil file

  • nwd (int) – Indicates if only non-woody categories (1) or all (0 - default) should be used

  • tiles (str) – Tiles definition to use for landUse dimension, default is ‘cmip6’

Returns:

vout – Input tile variable redefined over 4 super-categories

Return type:

Xarray DataArray
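The regrouping into super-categories can be sketched as below for the default case (nwd=0), using the tile numbers exactly as listed in the description above. This is an illustration under assumptions, not the library code: regroup_tiles and the "urban" key are hypothetical names, NumPy arrays stand in for the xarray variables, and scaling by the land fraction is assumed.

```python
import numpy as np

# Tile numbers (1-based) copied from the description above for nwd=0.
LANDUSE_TILES = {
    "psl": [1, 2, 3, 4, 5, 6, 7, 11, 14],
    "pst": [2],
    "crp": [9],
    "urban": [15],  # hypothetical key; the description gives no short name
}


def regroup_tiles(tilefrac, landfrac, groups=LANDUSE_TILES):
    """Sum tile fractions within each super-category, scaled by land fraction."""
    out = []
    for tiles in groups.values():
        idx = [t - 1 for t in tiles]
        out.append(tilefrac[idx].sum(axis=0) * landfrac)
    return np.stack(out)  # shape: (number of categories, number of cells)


# 17 tiles, one grid cell: forest, cropland, urban and lake fractions.
tilefrac = np.zeros((17, 1))
tilefrac[0, 0] = 0.4   # tile 1, Evergreen Needleleaf
tilefrac[8, 0] = 0.3   # tile 9, C3 Cropland
tilefrac[14, 0] = 0.2  # tile 15, Urban
tilefrac[15, 0] = 0.1  # tile 16, Lakes (not part of any category)
landfrac = np.array([0.5])

landuse = regroup_tiles(tilefrac, landfrac)
```

In this example the four categories come out as 0.2, 0.0, 0.15 and 0.1 of the grid cell; the lake tile is not assigned to any of them.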

Other

mopper.calc_utils.K_degC(ctx, var, inverse=False)

Converts temperature from K to degC, or from degC to K if inverse is True.

Parameters:
  • ctx (click context) – Includes obj dict with ‘cmor’ settings, exp attributes

  • var (Xarray DataArray) – temperature array

  • inverse (bool, optional) – If True, convert from degC to K (default is False)

Returns:

vout – temperature array in degrees Celsius or Kelvin if inverse is True

Return type:

Xarray DataArray

mopper.calc_utils.add_axis(var, name, value)

Returns the same variable with an extra singleton axis added

Parameters:
  • var (Xarray DataArray) – Variable to modify

  • name (str) – cmor name for axis

  • value (float) – value of the new singleton dimension

Returns:

var – Same variable with added axis at start

Return type:

Xarray DataArray

mopper.calc_utils.get_coords(ctx, coords)

Get lat/lon and their boundaries from ancil file

Parameters:
  • ctx (click context) – Includes obj dict with ‘cmor’ settings, exp attributes

  • coords (list) – List of coordinates retrieved from variable encoding

mopper.calc_utils.sum_vars(ctx, varlist)

Returns sum of all variables in list

Parameters:
  • varlist (list(xarray.DataArray)) – Variables to sum

Returns:

varout – Sum of input variables

Return type:

xarray.DataArray

mopper.calc_utils.time_resample(ctx, var, rfrq, tdim, orig_tshot, sample='down', stats='mean')

Resamples the input variable to the specified frequency using specified statistic.

Resample is used with the options origin = ‘start_day’ and closed = ‘right’. This puts the time label at the start of the interval, and an offset is then applied to get a centered time label. Valid rfrq labels are described here: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#period-aliases

Parameters:
  • ctx (click context) – Includes obj dict with ‘cmor’ settings, exp attributes

  • var (xarray.DataArray) – Variable to resample.

  • rfrq (str) – Resample frequency; see above for valid inputs.

  • tdim (str) – The name of the time dimension

  • orig_tshot (str) – original timeshot of input variable

  • sample (str) – The type of resampling to perform. Valid inputs are ‘up’ for upsampling or ‘down’ for downsampling. (default down)

  • stats (str) – The reducing function to follow resample: mean, min, max, sum. (default mean)

Returns:

vout – The resampled variable.

Return type:

xarray.DataArray or xarray.Dataset

Raises:
  • ValueError – If the input variable is not a valid Xarray object.

  • ValueError – If the sample parameter is not ‘up’ or ‘down’.
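The effect of the resample options described above can be illustrated with a short downsampling sketch. It uses a pandas Series as a stand-in for the xarray variable (xarray's resample exposes options with the same names), resample_down is a hypothetical helper rather than the library function, and the 12-hour shift used to centre daily labels is an assumption about how the offset is chosen.

```python
import pandas as pd


def resample_down(series, rfrq, stats="mean", label_shift=None):
    """Downsample `series` to `rfrq` and optionally shift the time labels."""
    resampler = series.resample(rfrq, origin="start_day", closed="right")
    out = getattr(resampler, stats)()  # e.g. resampler.mean()
    if label_shift is not None:
        # Labels sit at the start of each interval; shift them if requested.
        out.index = out.index + label_shift
    return out


# Two days of hourly values stamped at the end of each hour (01:00 ... 00:00).
times = pd.date_range("2000-01-01 01:00", periods=48, freq=pd.Timedelta(hours=1))
hourly = pd.Series([float(i) for i in range(48)], index=times)

# Daily mean with labels moved from 00:00 to 12:00 (centre of the day).
daily_mean = resample_down(hourly, "D", label_shift=pd.Timedelta(hours=12))

# Daily maximum, keeping the labels at the start of each interval.
daily_max = resample_down(hourly, "D", stats="max")
```

With closed = ‘right’, the value stamped at midnight is counted in the day that ends at that time, so each daily interval here contains exactly 24 hourly values.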