Calculating derived variables
Calculations are used to derive a variable from one or more inputs, to resample a variable to a new frequency, or more generally to modify a variable so that it fully matches the corresponding definition in a CMOR table.
How calculations work
Calculations are defined in the mapping file under the field of the same name. The tool evaluates the calculation string literally, using the Python eval() function. As an example, a simple calculation could be summing a variable across all its vertical levels:
mrso;fld_s08i223;var[0].sum(dim='depth')
var represents the list of input variables; in this case there is only one, which is var[0] in the calculation string. This calculation is simple enough to be fully defined in the mapping itself. If the calculation is more complex, it is easier to use a pre-defined function, for example:
hus24;fld_s00i010 fld_s00i408;plevinterp(var[0], var[1], 24)
Here plevinterp is called to interpolate specific humidity from model levels to pressure levels. The function takes three input arguments: the variable to interpolate, the pressure at model levels and the number of pressure levels, which corresponds to a specific definition of the pressure levels coordinate. The functions already available are listed below.
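The evaluation step described above can be illustrated with a minimal sketch. It is not the tool's own code: parse_row, evaluate and ToyArray are hypothetical names, ToyArray is a dependency-free stand-in for an xarray DataArray, and the three-field row layout is assumed from the rows shown in this section. Emptying the builtins passed to eval() is an illustrative choice here, not something the tool is known to do, and it is not a security boundary.

```python
from dataclasses import dataclass


@dataclass
class ToyArray:
    """Hypothetical stand-in for an xarray.DataArray with one named dimension."""
    dim: str
    values: list

    def sum(self, dim):
        # Mimic the DataArray.sum(dim=...) call used in the mapping row
        if dim != self.dim:
            raise ValueError(f"unknown dimension {dim!r}")
        return sum(self.values)


def parse_row(row):
    """Split a 'name;inputs;calculation' row into its three fields (assumed layout)."""
    name, inputs, calculation = row.split(";", 2)
    return name, inputs.split(), calculation


def evaluate(calculation, var):
    """Evaluate the calculation string with only `var` visible to the expression."""
    return eval(calculation, {"__builtins__": {}}, {"var": var})


name, inputs, calculation = parse_row("mrso;fld_s08i223;var[0].sum(dim='depth')")
soil_moisture = ToyArray(dim="depth", values=[1.5, 2.0, 0.5])
result = evaluate(calculation, [soil_moisture])
print(name, inputs, result)
```

Because the string is evaluated as written, a typo in the mapping row only surfaces as an exception when the calculation runs, which is one reason to move anything non-trivial into a named function.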
Note
When more than one variable is used as input, if the variables are not all in the same file, more than one file pattern can be specified in the mapping row.
Resample
If a variable is available in the raw model output but not at the desired frequency, the tool checks whether a higher frequency is available that can be resampled. For example, if a user is interested in daily surface temperature but this is available only as hourly data, during the mop setup phase the tool will add a resample attribute with value ‘D’ to the variable, and this will be used as the argument for the resample function. The statistic used by the function is chosen based on the timeshot attribute: if a variable is defined as a maximum, minimum or sum, that statistic is used in the resample instead of the mean.
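As a rough illustration of the idea, the sketch below downsamples hourly values to daily ones and picks the statistic from how the variable is defined. It uses only the standard library so that it runs without xarray; daily_resample and the STATS lookup are hypothetical, the data are synthetic, and the grouping is by calendar day, which ignores the interval and label options that the real time_resample function applies.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical lookup: the statistics named in the text, with the mean as fallback
STATS = {"maximum": max, "minimum": min, "sum": sum}


def daily_resample(times, values, definition="mean"):
    """Group (time, value) pairs by calendar day and reduce each group."""
    reducer = STATS.get(definition, mean)
    groups = defaultdict(list)
    for t, v in zip(times, values):
        groups[t.date()].append(v)
    return {day: reducer(vals) for day, vals in sorted(groups.items())}


# Two days of synthetic hourly surface temperature (K)
start = datetime(2000, 1, 1)
times = [start + timedelta(hours=h) for h in range(48)]
temps = [280.0 + (h % 24) for h in range(48)]

daily_mean = daily_resample(times, temps)
daily_max = daily_resample(times, temps, "maximum")
print(daily_mean)
print(daily_max)
```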
Contributing
TBA
Available functions
Atmosphere and aerosol
Ocean
SeaIce
Land
- mopper.calc_land.average_tile(var, tilefrac=None, lfrac=1, landfrac=None, lev=None)
Returns variable averaged over grid-cell, counting only specific tile/s and land fraction when suitable.
For example: nLitter is nitrogen mass in litter and should be calculated only over the land fraction, and each tile type will have a different amount of litter.
average = sum_over_tiles(N amount on tile * tilefrac) * landfrac
- Parameters:
var (Xarray DataArray) – Variable to process, defined over tiles
tilefrac (Xarray DataArray, optional) – Variable defining tiles’ fractions (default is None); if None, read from ancil file
lfrac (int, optional) – Controls if landfrac is considered (1) or not (0) (default 1)
landfrac (Xarray DataArray) – Variable defining land fraction (default is None); if None, read from ancil file
lev (str) – Name of pseudo level to add to output array (default is None)
- Returns:
vout – averaged input variable
- Return type:
Xarray DataArray
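The weighting in average_tile can be checked by hand for a single grid cell. The sketch below is a plain-Python illustration of the formula quoted in the description, not the function's implementation: average_over_tiles is a hypothetical name, the numbers are made up, and passing land_fraction=1.0 is used here to mimic the case where the land fraction is not considered.

```python
def average_over_tiles(tile_values, tile_fractions, land_fraction=1.0):
    """sum_over_tiles(value on tile * tilefrac) * landfrac, for one grid cell."""
    if len(tile_values) != len(tile_fractions):
        raise ValueError("one fraction is needed per tile")
    weighted = sum(v * f for v, f in zip(tile_values, tile_fractions))
    return weighted * land_fraction


# Made-up values: two tiles covering half of the land each, cell is 50% land
n_litter = [2.0, 4.0]
tilefrac = [0.5, 0.5]
landfrac = 0.5

cell_mean = average_over_tiles(n_litter, tilefrac, landfrac)
land_only_mean = average_over_tiles(n_litter, tilefrac)
print(cell_mean, land_only_mean)
```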
- mopper.calc_land.calc_landcover(ctx, var, model)
Returns land cover fraction variable
- Parameters:
ctx (click context obj) – Dictionary including ‘cmor’ settings and attributes for experiment
var (list(xarray.DataArray)) – List of input variables to sum
model (str) – Name of land surface model to retrieve land tiles definitions
- Returns:
vout – Land cover fraction variable
- Return type:
xarray.DataArray
- mopper.calc_land.calc_topsoil(ctx, soilvar)
Returns the variable over the first 10cm of soil.
- Parameters:
ctx (click context) – Includes obj dict with ‘cmor’ settings, exp attributes
soilvar (Xarray DataArray) – Soil moisture over soil levels
- Returns:
topsoil – Variable defined on top 10cm of soil
- Return type:
Xarray DataArray
- mopper.calc_land.extract_tilefrac(ctx, tilefrac, tilenum, landfrac=None, lev=None)
Calculates the land fraction of a specific type: crops, grass, etc.
- Parameters:
ctx (click context) – Includes obj dict with ‘cmor’ settings, exp attributes
tilefrac (Xarray DataArray) – Variable defining tiles’ fractions
tilenum (Int or [Int]) – The number or numbers indicating the tile
landfrac (Xarray DataArray) – Land fraction variable; if None (default), read from ancil file
lev (str) – name of pseudo level to add to output array (default is None)
- Returns:
vout – land fraction of object
- Return type:
Xarray DataArray
- Raises:
Exception – tile number must be an integer or list
- mopper.calc_land.landuse_frac(ctx, var, landfrac=None, nwd=0, tiles='cmip6')
Defines new tile fractions variables where original model tiles are re-organised in 4 super-categories:
- 0 - psl Primary and secondary land (includes forest, grasslands, and bare ground) (1,2,3,4,5,6,7,11,14) or (6,7,11,14?) if nwd is true. Possibly excluding barren soil is an error?
- 1 - pst Pastureland (includes managed pastureland and rangeland) (2) or (7) if nwd
- 2 - crp Cropland (9) or (7) if nwd
- 3 - Urban settlement (15) or (14) if nwd is true??
Tiles in CABLE:
1. Evergreen Needleleaf
2. Evergreen Broadleaf
3. Deciduous Needleleaf
4. Deciduous Broadleaf
5. Shrub
6. C3 Grassland
7. C4 Grassland
8. Tundra
9. C3 Cropland
10. C4 Cropland
11. Wetland
12. empty
13. empty
14. Barren
15. Urban
16. Lakes
17. Ice
NB this is currently hardcoded for above definitions, but potentially output could depend on different categories and land model used.
- Parameters:
var (Xarray DataArray) – Tile variable
landfrac (Xarray DataArray) – Land fraction variable; if None (default), read from ancil file
nwd (int) – Indicates if only non-woody categories (1) or all (0 - default) should be used
tiles (str) – Tiles definition to use for landUse dimension, default is ‘cmip6’
- Returns:
vout – Input tile variable redefined over 4 super-categories
- Return type:
Xarray DataArray
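To make the regrouping in landuse_frac concrete, here is a small sketch that sums tile fractions into the four super-categories for the default case (nwd=0). The tile numbers are copied as they appear in the description above, including the entries the description itself flags as uncertain, so tile 2 ends up in two groups. regroup, the dictionary layout and the "urban" key are hypothetical, and the land fraction is left out for brevity.

```python
# Tile numbers per super-category for nwd=0, copied from the description above
SUPER_CATEGORIES = {
    "psl": (1, 2, 3, 4, 5, 6, 7, 11, 14),
    "pst": (2,),
    "crp": (9,),
    "urban": (15,),  # hypothetical label; the description gives no short name
}


def regroup(tile_fractions):
    """Sum per-tile fractions (keyed by 1-based tile number) into super-categories."""
    return {
        name: sum(tile_fractions.get(tile, 0.0) for tile in tiles)
        for name, tiles in SUPER_CATEGORIES.items()
    }


# Made-up fractions for one grid cell; tile 16 (Lakes) belongs to no category
fractions = {1: 0.25, 6: 0.25, 9: 0.125, 15: 0.125, 16: 0.25}
grouped = regroup(fractions)
print(grouped)
```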
Other
- mopper.calc_utils.K_degC(ctx, var, inverse=False)
Converts temperature from/to K to/from degC.
- Parameters:
ctx (click context) – Includes obj dict with ‘cmor’ settings, exp attributes
var (Xarray DataArray) – temperature array
- Returns:
vout – temperature array in degrees Celsius or Kelvin if inverse is True
- Return type:
Xarray DataArray
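The conversion itself is a fixed offset of 273.15. A minimal scalar sketch, assuming the inverse flag switches the direction as described above (k_degc is a hypothetical stand-in that omits the ctx argument and works on plain numbers rather than arrays):

```python
KELVIN_OFFSET = 273.15


def k_degc(value, inverse=False):
    """Kelvin to degC by default; degC to Kelvin when inverse is True."""
    if inverse:
        return value + KELVIN_OFFSET
    return value - KELVIN_OFFSET


freezing_degc = k_degc(273.15)
room_kelvin = k_degc(25.0, inverse=True)
print(freezing_degc, room_kelvin)
```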
- mopper.calc_utils.add_axis(var, name, value)
Returns the same variable with an extra singleton axis added
- Parameters:
var (Xarray DataArray) – Variable to modify
name (str) – cmor name for axis
value (float) – value of the new singleton dimension
- Returns:
var – Same variable with added axis at start
- Return type:
Xarray DataArray
- mopper.calc_utils.get_coords(ctx, coords)
Gets lat/lon and their boundaries from ancil file
- Parameters:
ctx (click context) – Includes obj dict with ‘cmor’ settings, exp attributes
coords (list) – List of coordinates retrieved from variable encoding
- mopper.calc_utils.sum_vars(ctx, varlist)
Returns sum of all variables in list
- Parameters:
varlist (list(xarray.DataArray)) – Variables to sum
- Returns:
varout – Sum of input variables
- Return type:
xarray.DataArray
- mopper.calc_utils.time_resample(ctx, var, rfrq, tdim, orig_tshot, sample='down', stats='mean')
Resamples the input variable to the specified frequency using specified statistic.
Resample is used with the options origin = ‘start_day’ and closed = ‘right’. This puts the time label at the start of the interval, and an offset is applied to get a centered time label. The valid rfrq labels are described here: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#period-aliases
- Parameters:
ctx (click context) – Includes obj dict with ‘cmor’ settings, exp attributes
var (xarray.DataArray) – Variable to resample.
rfrq (str) – Resample frequency; see above for valid inputs.
tdim (str) – The name of the time dimension
orig_tshot (str) – original timeshot of input variable
sample (str) – The type of resampling to perform. Valid inputs are ‘up’ for upsampling or ‘down’ for downsampling. (default down)
stats (str) – The reducing function to follow resample: mean, min, max, sum. (default mean)
- Returns:
vout – The resampled variable.
- Return type:
xarray.DataArray or xarray.Dataset
- Raises:
ValueError – If the input variable is not a valid Xarray object.
ValueError – If the sample parameter is not ‘up’ or ‘down’.