Merged
Changes from 31 commits
Commits
56 commits
5ae3e93
Make CMOR tables configurable through new configuration system
bouweandela Jan 19, 2026
aaaeb9e
Fix another test
bouweandela Jan 19, 2026
ea4aa6a
Add docstrings and type hints
bouweandela Jan 23, 2026
64f365b
Gracefully handle missing out_name in CMIP5-style CMOR tables and upd…
bouweandela Jan 23, 2026
1b58631
Merge branch 'main' into new-config-cmor
bouweandela Jan 23, 2026
7e74268
Merge branch 'main' of github.com:ESMValGroup/ESMValCore into new-con…
bouweandela Jan 26, 2026
d676fbd
Remove config_developer_file from default configuration
bouweandela Jan 26, 2026
d0090d4
Use explicit configuration instead of YAML anchors
bouweandela Jan 26, 2026
d11aa25
Merge branch 'main' into new-config-cmor
valeriupredoi Jan 28, 2026
33ce271
Small improvements
bouweandela Jan 28, 2026
ed35a66
Do not load CMOR tables from built-in config-developer.yml prior to l…
bouweandela Jan 28, 2026
4e11d10
Add note to docs about what to do if CMOR_TABLES is empty
bouweandela Jan 28, 2026
ab823bd
Add upgrade instructions for users moving away from config-developer.yml
bouweandela Jan 29, 2026
d08827f
Fix no CMOR table case and update more docs
bouweandela Jan 29, 2026
01129d4
Fix documentation build
bouweandela Jan 29, 2026
f80b1a4
Further docs improvements
bouweandela Jan 29, 2026
d14d09f
Add test
bouweandela Jan 29, 2026
e2dfa5d
Improve test coverage
bouweandela Jan 29, 2026
74f27cc
Test failure to load cmor tables cases
bouweandela Jan 29, 2026
0cba683
More tests
bouweandela Jan 29, 2026
adff349
Add another test and update to pytest style tests
bouweandela Jan 29, 2026
067c6f0
Add more tests
bouweandela Jan 29, 2026
565849e
Merge branch 'main' of github.com:ESMValGroup/ESMValCore into new-con…
bouweandela Jan 29, 2026
69487f0
Add more tests
bouweandela Jan 29, 2026
8c13364
Add copilot instructions
bouweandela Jan 30, 2026
dd2f560
Implement suggestions from code review
bouweandela Jan 30, 2026
08924a9
Address more review comments
bouweandela Jan 30, 2026
7a6c11a
Fix typo in link
bouweandela Jan 30, 2026
aff84c8
Deprecate esmvalcore.cmor.table.CMOR_TABLES because of #2954
bouweandela Jan 30, 2026
4199ca5
Small improvements
bouweandela Jan 30, 2026
fda2000
Improve AI instructions
bouweandela Jan 30, 2026
eb36404
Merge branch 'main' into new-config-cmor
bouweandela Jan 30, 2026
bf8d6e5
Merge branch 'main' into new-config-cmor
valeriupredoi Feb 2, 2026
5519c3f
Improve type hint
bouweandela Feb 2, 2026
a18ee08
Add punctuation
bouweandela Feb 4, 2026
2b711c9
Clarify data requirements for non-cmorized data
bouweandela Feb 4, 2026
e0edb85
Spelling
bouweandela Feb 4, 2026
e502d7c
Merge branch 'main' into new-config-cmor
valeriupredoi Feb 4, 2026
daec426
Add examples to docstrings
bouweandela Feb 5, 2026
fea3efd
Add link for branded variable background
bouweandela Feb 5, 2026
403bd62
Fix docs
bouweandela Feb 5, 2026
739fe73
Better error messages
bouweandela Feb 5, 2026
5a1852a
Add tests
bouweandela Feb 5, 2026
8c0cbbd
Separate custom tables and reduce race conditions in tests
bouweandela Feb 5, 2026
ef94629
Merge branch 'main' into new-config-cmor
bouweandela Feb 5, 2026
b3de2a8
Restore custom tables test from config-developer and add another patc…
bouweandela Feb 6, 2026
e67aea3
Add another deprecation warning
bouweandela Feb 6, 2026
1dec95f
Ensure config_developer_file populates esmvalcore.cmor.table.CMOR_TAB…
bouweandela Feb 6, 2026
610c23d
Fix support for legacy data sources without custom config-developer.yml
bouweandela Feb 6, 2026
b5064fd
Improve test coverage and fix bug
bouweandela Feb 6, 2026
eb3306c
Merge branch 'main' of github.com:ESMValGroup/ESMValCore into new-con…
bouweandela Feb 9, 2026
81317b1
Update siextent units in CMIP6 custom table
bouweandela Feb 9, 2026
a982de2
Improve docs with suggestions by @schlunma
bouweandela Feb 9, 2026
5a2147d
Remove impractical suggestion on how to organize data
bouweandela Feb 9, 2026
2ee9aad
Make defaults/cmor_tables.yml a link
bouweandela Feb 9, 2026
b43a466
Add note about which files are read
bouweandela Feb 11, 2026
1 change: 1 addition & 0 deletions .github/instructions/*.instructions.md
4 changes: 4 additions & 0 deletions .gitignore
@@ -109,3 +109,7 @@ doc/_sidebar.rst.inc

# ESMF log files
*.ESMF_LogFile


#Ignore vscode AI rules
.github/instructions/codacy.instructions.md
2 changes: 1 addition & 1 deletion doc/changelog.rst
@@ -813,7 +813,7 @@ Bug fixes
~~~~~~~~~

- Respect ``ignore_warnings`` settings from the project configuration in config-developer.yml in :func:`esmvalcore.dataset.Dataset.load` (:pull:`2046`) by :user:`schlunma`
- Fixed usage of custom location for :ref:`custom CMOR tables <custom_cmor_tables>` (:pull:`2052`) by :user:`schlunma`
- Fixed usage of custom location for custom CMOR tables (:pull:`2052`) by :user:`schlunma`
- Fix issue with writing index.html when :ref:`running a recipe <running>` with ``--resume-from`` (:pull:`2055`) by :user:`bouweandela`
- Fixed bug in ICON CMORizer that led to shifted time coordinates (:pull:`2038`) by :user:`schlunma`
- Include ``-`` in allowed characters for bibtex references (:pull:`2097`) by :user:`alistairsellar`
202 changes: 112 additions & 90 deletions doc/develop/fixing_data.rst
@@ -316,6 +316,16 @@ missing coordinate you can create a fix for this model:
Customizing checker strictness
==============================

The baseline case for ESMValCore input data is fully
:ref:`CMOR compliant <cmor_tables>` data and this is checked when data is
loaded by :meth:`esmvalcore.dataset.Dataset.load`.

However, it is possible to disable these checks completely by
:ref:`configuring the project <cmor_table_configuration>` so it uses the
:class:`esmvalcore.cmor.table.NoInfo` CMOR table,
or to adjust the strictness of the checks using the :ref:`configuration option
<config_options>` ``check_level``.

The data checker classifies its issues using four different levels of
severity. From highest to lowest:

@@ -345,6 +355,13 @@ below from the lowest level of strictness to the highest:
strictness. Mostly useful for checking datasets that you have produced, to
be sure that future users will not be distracted by inoffensive warnings.

.. warning::

While it is possible to work with datasets that are not described in a CMOR
table or only partially follow the CMOR standards, the
:ref:`preprocessor functions <preprocessor>` and
:ref:`diagnostics <esmvaltool:recipes>` have been designed to work with
CMORized data and may not work as expected with non-CMORized data.

.. _add_new_fix_native_datasets:

@@ -357,14 +374,23 @@ under project ``native6``.

.. _add_new_fix_native_datasets_config:

Configuration
-------------
CMOR Table Configuration
------------------------

An example of a CMOR table configuration for projects used for native datasets is given in

.. literalinclude:: ../configurations/defaults/cmor_tables.yml
:language: yaml
:caption: Example native data format projects in ``defaults/cmor_tables.yml``
:prepend: projects:
:start-at: # Observational and reanalysis data that can be read in its native format by ESMValCore.
:end-before: CESM:

An example of a configuration in ``config-developer.yml`` for projects used for
native datasets is given :ref:`here <configure_native_models>`.
Make sure to use the option ``cmor_strict: false`` for these projects if you
want to make use of :ref:`custom_cmor_tables`.
This allows reading arbitrary variables from native datasets.
The option ``strict: false`` is convenient for these projects, if you
want to make use of the feature that looks in all tables instead of
only the one specified by the ``mip`` facet in the recipe or
:class:`~esmvalcore.dataset.Dataset`: like this, a custom variable only needs to
be defined for a single ``mip`` and can then be used with all ``mip`` s.
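
The effect of ``strict: false`` can be illustrated with a minimal Python sketch. This is not ESMValCore's implementation: the in-memory mapping ``TABLES`` and the helper ``find_variable`` are hypothetical names introduced only to show the difference between looking in a single ``mip`` table and falling back to all tables.

```python
from typing import Optional

# Hypothetical in-memory CMOR tables: {mip: {short_name: variable info}}.
# Real tables are loaded from files; this mapping is an assumption.
TABLES: dict[str, dict[str, dict[str, str]]] = {
    "Amon": {"tas": {"units": "K"}},
    "custom": {"my_var": {"units": "1"}},
}


def find_variable(
    mip: str, short_name: str, strict: bool = True
) -> Optional[dict[str, str]]:
    """Look up a variable, optionally falling back to all tables."""
    # First try the table named by the ``mip`` facet.
    table = TABLES.get(mip, {})
    if short_name in table:
        return table[short_name]
    # In strict mode, only the requested table is consulted.
    if strict:
        return None
    # In non-strict mode, search every other table as well.
    for other in TABLES.values():
        if short_name in other:
            return other[short_name]
    return None


strict_result = find_variable("Amon", "my_var", strict=True)
relaxed_result = find_variable("Amon", "my_var", strict=False)
```

In this sketch, ``my_var`` is defined for a single table only, yet the non-strict lookup finds it when any other ``mip`` is requested.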

.. _add_new_fix_native_datasets_locate_data:

@@ -373,89 +399,85 @@ Locate data

To allow ESMValCore to locate the data files, use the following steps:

- If you want to use the ``native6`` project (recommended for datasets whose
input files can be easily moved to the usual ``native6`` directory
structure given by the :ref:`configuration option <config_options>`
``rootpath``; this is usually the case for native reanalysis/observational
datasets):

The entry ``native6`` of ``config-developer.yml`` should be complemented
with sub-entries for ``input_dir`` and ``input_file`` that go under a new
key representing the data organization (such as ``MY_DATA_ORG``), and
these sub-entries can use an arbitrary list of ``{placeholders}``.
Example :

.. code-block:: yaml

native6:
...
input_dir:
default: 'Tier{tier}/{dataset}/{version}/{frequency}/{short_name}'
MY_DATA_ORG: '{dataset}/{exp}/{simulation}/{version}/{type}'
input_file:
default: '*.nc'
MY_DATA_ORG: '{simulation}_*.nc'
...

To find your native data (e.g., called ``MYDATA``) that is for example
located in ``{rootpath}/MYDATA/amip/run1/42-0/atm/run1_1979.nc``
(``{rootpath}`` is ESMValTool's ``rootpath`` :ref:`configuration option
<config_options>` for the project ``native6``), use the following dataset
entry in your recipe

.. code-block:: yaml

datasets:
- {project: native6, dataset: MYDATA, exp: amip, simulation: run1, version: 42-0, type: atm}

and make sure to use the following :ref:`configuration option
<config_options>` ``drs``:

.. code-block:: yaml

drs:
native6: MY_DATA_ORG

- If you want to use a dedicated project for your native dataset
(recommended for datasets for which you cannot control the location of the
input files; this is usually the case for native model output):

A new entry for the project needs to be added to ``config-developer.yml``.
For example, for the ICON model, create a new project ``ICON``:

.. code-block:: yaml

ICON:
...
input_dir:
default:
- '{exp}'
- '{exp}/outdata'
- '{exp}/output'
input_file:
default: '{exp}_{var_type}*.nc'
...

To find your ICON data that is for example located in files like
``{rootpath}/amip/amip_atm_2d_ml_20000101T000000Z.nc`` (``{rootpath}`` is
ESMValCore's :ref:`configuration option <config_options>` ``rootpath`` for
the project ``ICON``), use the following dataset entry in your recipe:

.. code-block:: yaml

datasets:
- {project: ICON, dataset: ICON, exp: amip}

Please note the duplication of the name ``ICON`` in ``project`` and
``dataset``, which is necessary to comply with ESMValTool's data finding
and CMORizing functionalities.
For other native models, ``dataset`` could also refer to a subversion of
the model.
Note that it is possible to predefine facets via :ref:`extra facets
<add_new_fix_native_datasets_extra_facets>`.
In this ICON example, the facet ``var_type`` is :download:`predefined
</../esmvalcore/config/configurations/defaults/extra_facets_icon.yml>`
for many variables.
- If you want to use the ``native6`` project, recommended for datasets whose
input files can be easily moved to the usual ``native6`` directory
structure given by

.. literalinclude:: ../configurations/data-local-esmvaltool.yml
:language: yaml
:caption: ``native6`` standard directory organization in ``data-local-esmvaltool.yml``
:end-before: # Data that has been CMORized by ESMValTool according to the CMIP6 standard.

this is preferred. This is usually the case for native reanalysis/observational
datasets.

- If moving the data into a particular directory structure is not possible,
the ``data`` entry of the ``native6`` project could be complemented
with another data source that goes under a new
key representing the data organization (such as ``MY_DATA_ORG``), and
these sub-entries can use an arbitrary list of ``{placeholders}``.

Example:

.. code-block:: yaml

projects:
native6:
data:
MY_DATA_ORG:
type: esmvalcore.io.local.LocalDataSource
rootpath: /path/to/data
dirname_template: "{dataset}/{exp}/{simulation}/{version}/{type}"
filename_template: '{simulation}_*.nc'

would allow the tool to find your native data (e.g., a ``dataset`` called ``MYDATA``)
that is for example located in ``/path/to/data/MYDATA/amip/run1/42-0/atm/run1_1979.nc``
if you use the following dataset entry in your recipe

.. code-block:: yaml

datasets:
- {project: native6, dataset: MYDATA, exp: amip, simulation: run1, version: 42-0, type: atm}

- If you want to use a dedicated project for your native dataset
(this is usually the case for native model output):

A new entry for the project needs to be added under :ref:`config-projects`.
For example, for the ICON model, create a new project ``ICON`` and define
its data sources:

.. literalinclude:: ../configurations/data-native-icon.yml
:language: yaml
:caption: ``ICON`` standard directory organization in ``data-native-icon.yml``

and a CMOR table configuration:

.. literalinclude:: ../configurations/defaults/cmor_tables.yml
:language: yaml
:caption: ``ICON`` CMOR table configuration from ``defaults/cmor_tables.yml``
:prepend: projects:
:start-at: ICON:
:end-at: strict: false

To find your ICON data that is for example located in files like
``~/climate_data/amip/amip_atm_2d_ml_20000101T000000Z.nc``, use the following
dataset entry in your recipe:

.. code-block:: yaml

datasets:
- {project: ICON, dataset: ICON, exp: amip}

Please note the duplication of the name ``ICON`` in ``project`` and
``dataset``, which is necessary to comply with ESMValTool's data finding
and CMORizing functionalities.
For other native models, ``dataset`` could also refer to a subversion of
the model.
Note that it is possible to predefine facets via :ref:`extra facets
<add_new_fix_native_datasets_extra_facets>`.
In this ICON example, the facet ``var_type`` is :download:`predefined
</../esmvalcore/config/configurations/defaults/extra_facets_icon.yml>`
for many variables.
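
How the ``{placeholders}`` in the ``MY_DATA_ORG`` example above relate to the file path can be sketched in plain Python. This is a simplified illustration under stated assumptions, not how ESMValCore resolves data sources internally: the helper ``expand`` is hypothetical, and the sketch only uses ``str.format`` and ``fnmatch`` from the standard library together with the template strings and recipe facets from the example.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def expand(template: str, facets: dict[str, str]) -> str:
    """Fill the {placeholders} of a template with facet values."""
    # Facets that do not occur in the template are simply ignored.
    return template.format(**facets)


# Facets taken from the recipe entry in the MY_DATA_ORG example.
facets = {
    "project": "native6",
    "dataset": "MYDATA",
    "exp": "amip",
    "simulation": "run1",
    "version": "42-0",
    "type": "atm",
}

rootpath = PurePosixPath("/path/to/data")
dirname = expand("{dataset}/{exp}/{simulation}/{version}/{type}", facets)
pattern = expand("{simulation}_*.nc", facets)

# The example file from the text should live in the expanded directory
# and match the expanded filename pattern.
candidate = PurePosixPath(
    "/path/to/data/MYDATA/amip/run1/42-0/atm/run1_1979.nc"
)
matches = candidate.parent == rootpath / dirname and fnmatch(
    candidate.name, pattern
)
```

Under these assumptions, every placeholder used in ``dirname_template`` or ``filename_template`` needs a matching facet in the recipe entry, which is why the example entry lists ``exp``, ``simulation``, ``version`` and ``type`` explicitly.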

.. _add_new_fix_native_datasets_fix_data:
