Using CleF - Climate Finder to discover ESGF data at NCI#
Paola Petrelli, CLEX CMS
This blog is the first of three showing examples of how to use the CleF (Climate Finder) Python module to search for ESGF data on the NCI server.
Currently the tool is set up for CMIP5, CMIP6 and CORDEX data published by the ESGF.
CleF is currently installed in the CMS conda module analysis3, and analysis3-unstable for the latest version. This is managed by the CMS and is available simply by running
module use /g/data3/hh5/public/modules
module load conda/analysis3
You need to be a member of hh5 to use the modules, and of one of the CMIP projects (oi10, rr3, fs38, al33) to access the data and the clef database.
This blog covers the basic usage of the command line options.
In the next blog we’ll cover how to:
save the query results as a csv file
get a summary of available data
run more complex queries
get extra information from the ESDOC documentation, such as errata and citations
In the third blog we will cover how to import and use clef in your own python code.
Let’s start!
Command syntax#
# run this if you haven't done so already in the terminal
#!module use /g/data3/hh5/public/modules
#!module load conda/analysis3-unstable
# using unstable guarantees that the latest features are all available, but clef is also available in stable
!clef
Usage: clef [OPTIONS] COMMAND [ARGS]...
Options:
--remote returns only ESGF search results
--local returns only local files matching arguments in local database
--missing returns only missing files matching ESGF search
--request send NCI request to download missing files matching ESGF search
--debug Show debug info
--help Show this message and exit.
Commands:
cmip5 Search ESGF and local database for CMIP5 files Constraints can be...
cmip6 Search ESGF and local database for CMIP6 files Constraints can be...
cordex Search ESGF and local database for CORDEX files.
ds Search local database for non-ESGF datasets
Simply running the command clef with no arguments shows the help message and then exits; it is equivalent to
clef --help
We can see that there are currently four sub-commands: ds to query non-ESGF collections, and one for each ESGF dataset: cmip5, cmip6 and cordex.
There are also six different options that can be passed before the sub-commands; one we have already seen is --help. The others are used to modify how the tool will deal with the main query output. We will have a look at them and at ds later.
Let’s start by querying some CMIP5 data. To see what we can pass to the cmip5 sub-command, we can simply run it with its --help option.
CMIP5#
!clef cmip5 --help
Usage: clef cmip5 [OPTIONS] [QUERY]...
Search ESGF and local database for CMIP5 files
Constraints can be specified multiple times, in which case they are
combined using OR: -v tas -v tasmin will return anything matching
variable = 'tas' or variable = 'tasmin'. The --latest flag will check ESGF
for the latest version available, this is the default behaviour
Options:
-e, --experiment x CMIP5 experiment: piControl, rcp85, amip ...
--experiment_family [Atmos-only|Control|Decadal|ESM|Historical|Idealized|Paleo|RCP]
CMIP5 experiment family: Decadal, RCP ...
-m, --model x CMIP5 model acronym: ACCESS1.3, MIROC5 ...
-t, --table, --mip [Amon|Omon|OImon|LImon|Lmon|6hrPlev|6hrLev|3hr|Oclim|Oyr|aero|cfOff|cfSites|cfMon|cfDay|cf3hr|day|fx|grids]
-v, --variable x Variable name as shown in filanames: tas,
pr, sic ...
-en, --ensemble, --member TEXT CMIP5 ensemble member: r#i#p#
--frequency [mon|day|3hr|6hr|fx|yr|monClim|subhr]
--realm [atmos|ocean|land|landIce|seaIce|aerosol|atmosChem|ocnBgchem]
--and [variable|experiment|cmor_table|realm|time_frequency|model|ensemble]
Attributes for which we want to add AND
filter, i.e. `--and variable` to apply to
variable values
--institution TEXT Modelling group institution id: MIROC, IPSL,
MRI ...
--cf_standard_name TEXT CF variable standard_name, use instead of
variable constraint
--latest / --all-versions Return only the latest version or all of
them. Default: --latest
--replica / --no-replica Return both original files and replicas.
Default: --no-replica
--distrib / --no-distrib Distribute search across all ESGF nodes.
Default: --distrib
--csv / --no-csv Send output to csv file including extra
information. Default: --no-csv
--stats / --no-stats Write summary of query results, works only
with --local option. Default: --no-stats
--debug / --no-debug Show debug output. Default: --no-debug
--help Show this message and exit.
Passing arguments and options#
The --help output shows all the constraints we can pass to the tool. There are also some additional options which can change the way we run our query; for the moment we can ignore these and use their default values.
Some of the constraints can be passed using an abbreviation, like -v instead of --variable. This is handy once you are more familiar with the tool.
The same option can have more than one name: for example, --ensemble can also be passed as --member, because the terminology has changed between CMIP5 and CMIP6.
You can pass as many constraints as you want, and pass the same constraint more than once. Let’s see what happens, though, if we do not pass any constraints.
!clef cmip5
ERROR: Too many results (3781387), try limiting your search https://esgf.nci.org.au/search/esgf-nci?query=&type=File&distrib=True&replica=False&latest=True&project=CMIP5
!clef cmip5 --variable tasmin --experiment historical --table day --ensemble r2i1p1s
ERROR: No matches found on ESGF, check at https://esgf.nci.org.au/search/esgf-nci?query=&type=File&distrib=True&replica=False&latest=True&project=CMIP5&ensemble=r2i1p1s&experiment=historical&cmor_table=day&variable=tasmin
Oops, that wasn’t reasonable! I misspelled the ensemble: “r2i1p1s” does not exist, and the tool is telling me it cannot find any matches.
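The URL printed in the error message is a handy way to double-check which constraints were actually sent to the ESGF. As a small sketch using only the Python standard library (the helper name `esgf_constraints` is ours, not part of clef), we can pull the constraints back out of that URL:

```python
from urllib.parse import urlparse, parse_qs

# URL copied from the error message above
url = (
    "https://esgf.nci.org.au/search/esgf-nci?query=&type=File&distrib=True"
    "&replica=False&latest=True&project=CMIP5&ensemble=r2i1p1s"
    "&experiment=historical&cmor_table=day&variable=tasmin"
)


def esgf_constraints(search_url):
    """Return the non-empty query parameters of a search URL as a dict."""
    params = parse_qs(urlparse(search_url).query)
    # parse_qs returns a list per key; keep single values as plain strings
    return {
        key: values[0] if len(values) == 1 else values
        for key, values in params.items()
    }


constraints = esgf_constraints(url)
for key in ("ensemble", "experiment", "cmor_table", "variable"):
    print(f"{key}: {constraints[key]}")
```

Seeing `ensemble: r2i1p1s` spelled out on its own line makes the typo easier to spot.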
!clef cmip5 --variable tasmin --experiment historical --table days --ensemble r2i1p1
Usage: clef cmip5 [OPTIONS] [QUERY]...
Try 'clef cmip5 --help' for help.
Error: Invalid value for '--table' / '--mip' / '-t': invalid choice: days. (choose from Amon, Omon, OImon, LImon, Lmon, 6hrPlev, 6hrLev, 3hr, Oclim, Oyr, aero, cfOff, cfSites, cfMon, cfDay, cf3hr, day, fx, grids)
I made another spelling mistake. In this case the tool knows that I passed a wrong value and lists all the available options for the CMOR table. Eventually we are aiming to validate all the arguments we can, although for some it is not possible to list all the possible values (ensemble, for example).
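If you build clef queries from a script, you may want to catch this kind of typo before calling the tool. The following is only a sketch of the idea, not how clef validates its arguments: the list of tables is copied from the error message above, while the `check_table` helper and the "did you mean" suggestion (via the standard `difflib` module) are our own additions.

```python
from difflib import get_close_matches

# Valid CMIP5 CMOR tables, copied from the error message above
CMIP5_TABLES = [
    "Amon", "Omon", "OImon", "LImon", "Lmon", "6hrPlev", "6hrLev", "3hr",
    "Oclim", "Oyr", "aero", "cfOff", "cfSites", "cfMon", "cfDay", "cf3hr",
    "day", "fx", "grids",
]


def check_table(value):
    """Return (value, []) if valid, otherwise (None, close matches)."""
    if value in CMIP5_TABLES:
        return value, []
    return None, get_close_matches(value, CMIP5_TABLES)


table, suggestions = check_table("days")
if table is None:
    print(f"'days' is not a valid table, did you mean: {suggestions}?")
```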
!clef cmip5 --variable tasmin --experiment historical --table day --ensemble r2i1p1
/g/data/al33/replicas/CMIP5/combined/CCCma/CanCM4/historical/day/atmos/day/r2i1p1/v20120207/tasmin/
/g/data/al33/replicas/CMIP5/combined/CCCma/CanCM4/historical/day/atmos/day/r2i1p1/v20120612/tasmin/
/g/data/al33/replicas/CMIP5/combined/CCCma/CanESM2/historical/day/atmos/day/r2i1p1/v20120410/tasmin/
/g/data/al33/replicas/CMIP5/combined/CNRM-CERFACS/CNRM-CM5/historical/day/atmos/day/r2i1p1/v20120703/tasmin/
/g/data/al33/replicas/CMIP5/combined/IPSL/IPSL-CM5A-LR/historical/day/atmos/day/r2i1p1/v20130506/tasmin/
/g/data/al33/replicas/CMIP5/combined/IPSL/IPSL-CM5A-MR/historical/day/atmos/day/r2i1p1/v20130506/tasmin/
/g/data/al33/replicas/CMIP5/combined/LASG-IAP/FGOALS-s2/historical/day/atmos/day/r2i1p1/v20161204/tasmin/
/g/data/al33/replicas/CMIP5/combined/MIROC/MIROC-ESM/historical/day/atmos/day/r2i1p1/v20120710/tasmin/
/g/data/al33/replicas/CMIP5/combined/MIROC/MIROC4h/historical/day/atmos/day/r2i1p1/v20120628/tasmin/
/g/data/al33/replicas/CMIP5/combined/MIROC/MIROC5/historical/day/atmos/day/r2i1p1/v20120710/tasmin/
/g/data/al33/replicas/CMIP5/combined/MOHC/HadCM3/historical/day/atmos/day/r2i1p1/v20140110/tasmin/
/g/data/al33/replicas/CMIP5/combined/MOHC/HadGEM2-CC/historical/day/atmos/day/r2i1p1/v20111129/tasmin/
/g/data/al33/replicas/CMIP5/combined/MOHC/HadGEM2-ES/historical/day/atmos/day/r2i1p1/v20110418/tasmin/
/g/data/al33/replicas/CMIP5/combined/MPI-M/MPI-ESM-LR/historical/day/atmos/day/r2i1p1/v20111006/tasmin/
/g/data/al33/replicas/CMIP5/combined/MPI-M/MPI-ESM-MR/historical/day/atmos/day/r2i1p1/v20120503/tasmin/
/g/data/al33/replicas/CMIP5/combined/MPI-M/MPI-ESM-P/historical/day/atmos/day/r2i1p1/v20120315/tasmin/
/g/data/al33/replicas/CMIP5/combined/MRI/MRI-CGCM3/historical/day/atmos/day/r2i1p1/v20120701/tasmin/
/g/data/al33/replicas/CMIP5/combined/NCC/NorESM1-M/historical/day/atmos/day/r2i1p1/v20110901/tasmin/
/g/data/al33/replicas/CMIP5/combined/NOAA-GFDL/GFDL-CM3/historical/day/atmos/day/r2i1p1/v20120227/tasmin/
/g/data/rr3/publications/CMIP5/output1/CSIRO-QCCCE/CSIRO-Mk3-6-0/historical/day/atmos/day/r2i1p1/files/tasmin_20110518/
Everything available on ESGF is also available locally
The tool first searches the ESGF for all the files that match the constraints we passed. It then looks for these files locally and, if it finds them, returns their path on raijin.
For all the files it can’t find locally, the tool checks an NCI table listing the downloads NCI is working on. Finally it lists missing datasets which are in the download queue, followed by the datasets that are not available locally and that no one has requested yet.
The tool lists the dataset paths and dataset_ids. We used to have a --format file option, but this has been removed in the most recent versions.
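The three-step logic described above can be pictured with plain Python sets. This is a conceptual sketch only, not clef's implementation: the function name and the dataset ids below are hypothetical.

```python
def classify_datasets(remote_ids, local_ids, queued_ids):
    """Split the ESGF matches into local, queued and missing datasets."""
    remote = set(remote_ids)
    # 1. datasets found on the ESGF that are also available locally
    local = remote & set(local_ids)
    # 2. datasets not available locally but already in the download queue
    queued = (remote - local) & set(queued_ids)
    # 3. everything else: not local and not requested yet
    missing = remote - local - queued
    return {
        "local": sorted(local),
        "queued": sorted(queued),
        "missing": sorted(missing),
    }


# Hypothetical dataset ids, for illustration only
remote = ["model-a.tas.v1", "model-b.tas.v1", "model-c.tas.v1"]
local = ["model-a.tas.v1"]
queued = ["model-b.tas.v1"]

result = classify_datasets(remote, local, queued)
for status, ids in result.items():
    print(status, ids)
```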
The query by default returns the latest available version. What if we want to have a look at all the available versions?
!clef cmip5 --variable clivi --experiment historical --table Amon -m ACCESS1.0 --all-versions
/g/data/rr3/publications/CMIP5/output1/CSIRO-BOM/ACCESS1-0/historical/mon/atmos/Amon/r1i1p1/files/clivi_20120115/
/g/data/rr3/publications/CMIP5/output1/CSIRO-BOM/ACCESS1-0/historical/mon/atmos/Amon/r1i1p1/files/clivi_20120727/
/g/data/rr3/publications/CMIP5/output1/CSIRO-BOM/ACCESS1-0/historical/mon/atmos/Amon/r3i1p1/files/clivi_20140402/
Everything available on ESGF is also available locally
The option --all-versions is the reverse of --latest, which is the default, so we get a list of all available versions.
Since all the ACCESS1.0 data is available on NCI (which is the authoritative source for the ACCESS models), the tool shouldn’t find any missing datasets. If it does, please let us know about it.
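To see what "latest" means for the listing above, here is a small sketch that keeps only the newest version of each dataset. It assumes the rr3 layout shown in the output, where the last directory is `<variable>_<YYYYMMDD>`; it would not work on the al33 paths, and it is not how clef itself decides which version is the latest (clef relies on the ESGF for that). The function name is ours.

```python
# Paths copied from the --all-versions output above
paths = [
    "/g/data/rr3/publications/CMIP5/output1/CSIRO-BOM/ACCESS1-0/historical/mon/atmos/Amon/r1i1p1/files/clivi_20120115/",
    "/g/data/rr3/publications/CMIP5/output1/CSIRO-BOM/ACCESS1-0/historical/mon/atmos/Amon/r1i1p1/files/clivi_20120727/",
    "/g/data/rr3/publications/CMIP5/output1/CSIRO-BOM/ACCESS1-0/historical/mon/atmos/Amon/r3i1p1/files/clivi_20140402/",
]


def latest_only(all_paths):
    """Keep the newest version for each (parent directory, variable) pair."""
    newest = {}
    for path in all_paths:
        parent, _, last = path.rstrip("/").rpartition("/")
        variable, _, version = last.rpartition("_")
        key = (parent, variable)
        # YYYYMMDD strings sort chronologically, so a string comparison works
        if key not in newest or version > newest[key][0]:
            newest[key] = (version, path)
    return sorted(path for _, path in newest.values())


latest = latest_only(paths)
for path in latest:
    print(path)
```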
CMIP6#
!clef cmip6 --help
Usage: clef cmip6 [OPTIONS] [QUERY]...
Search ESGF and local database for CMIP6 files Constraints can be
specified multiple times, in which case they are combined using OR: -v
tas -v tasmin will return anything matching variable = 'tas' or variable =
'tasmin'. The --latest flag will check ESGF for the latest version
available, this is the default behaviour
Options:
-mip, --activity [AerChemMIP|C4MIP|CDRMIP|CFMIP|CMIP|CORDEX|DAMIP|DCPP|DynVarMIP|FAFMIP|GMMIP|GeoMIP|HighResMIP|ISMIP6|LS3MIP|LUMIP|OMIP|PAMIP|PMIP|RFMIP|SIMIP|ScenarioMIP|VIACSAB|VolMIP]
-e, --experiment x CMIP6 experiment, list of available depends
on activity
--source_type [AER|AGCM|AOGCM|BGC|CHEM|ISM|LAND|OGCM|RAD|SLAB]
-t, --table x CMIP6 CMOR table: Amon, SIday, Oday ...
-m, --model, --source_id x CMIP6 model id: GFDL-AM4, CNRM-CM6-1 ...
-v, --variable x CMIP6 variable name as in filenames
-mi, --member TEXT CMIP6 member id: <sub-exp-id>-r#i#p#f#
-g, --grid, --grid_label TEXT CMIP6 grid label: i.e. gn for the model
native grid
-nr, --resolution, --nominal_resolution TEXT
Approximate resolution: '250 km', pass in
quotes
--frequency [1hr|1hrCM|1hrPt|3hr|3hrPt|6hr|6hrPt|day|dec|fx|mon|monC|monPt|subhrPt|yr|yrPt]
--realm [aerosol|atmos|atmosChem|land|landIce|ocean|ocnBgchem|seaIce]
-se, --sub_experiment_id TEXT Only available for hindcast and forecast
experiments: sYYYY
-vl, --variant_label TEXT Indicates a model variant: r#i#p#f#
--and [variable_id|experiment_id|table_id|realm|frequency|member_id|source_id|source_type|activity_id|grid|grid_label|nominal_resolution|sub_experiment_id]
Attributes for which we want to add AND
filter, i.e. `--and variable_id` to apply to
variable values
--institution TEXT Modelling group institution id: IPSL, NOAA-
GFDL ...
--cf_standard_name TEXT CF variable standard_name, use instead of
variable constraint
--latest / --all-versions Return only the latest version or all of
them. Default: --latest
--replica / --no-replica Return both original files and replicas.
Default: --no-replica
--distrib / --no-distrib Distribute search across all ESGF nodes.
Default: --distrib
--csv / --no-csv Send output to csv file including extra
information. Default: --no-csv
--stats / --no-stats Write summary of query results, works only
with --local option. Default: --no-stats
--debug / --no-debug Show debug output. Default: --no-debug
--help Show this message and exit.
The cmip6 sub-command works in the same way but some constraints are different. As well as changes in terminology, CMIP6 has more attributes (facets) that can be used to select the data.
Examples of these are the activity, which groups experiments, the resolution, which is an approximation of the actual resolution, and the grid.
CORDEX#
!clef cordex --help
Usage: clef cordex [OPTIONS] [QUERY]...
Search ESGF and local database for CORDEX files.
Constraints can be specified multiple times, in which case they are
combined using OR: -v tas -v tasmin will return anything matching
variable = 'tas' or variable = 'tasmin'. The --latest flag will check ESGF
for the latest version available, this is the default behaviour NB. for
CORDEX data associated to CMIP6 use the cmip6 command with CORDEX as
activity_id
Options:
--latest / --all-versions Return only the latest version or all of
them. Default: --latest
--replica / --no-replica Return both original files and replicas.
Default: --no-replica
--distrib / --no-distrib Distribute search across all ESGF nodes.
Default: --distrib
--csv / --no-csv Send output to csv file including extra
information. Works only with --local and
--remote. Default: --no-csv
--stats / --no-stats Write summary of query results. Works only
with --local and --remote. Default: --no-
stats
--debug / --no-debug Show debug output. Default: --no-debug
-d, --domain FACET CORDEX region name
-e, --experiment FACET CMIP5 experiment of driving GCM or
'evaluation' for re-analysis
-dmod, --driving_model FACET Model/analysis used to drive the model (eg.
ECMWFERAINT)
-m, --rcm_name FACET Identifier of the CORDEX Regional Climate
Model
-rcmv, --rcm_version FACET Identifier for reruns with perturbed
parameters or smaller RCM release upgrades
-v, --variable FACET Variable name in file
-f, --time_frequency FACET Output frequency indicator
-en, --ensemble FACET Ensemble member of the driving GCM
-vrs, --version FACET Data publication version
-cf, --cf_standard_name FACET CF-Conventions name of the variable
-ef, --experiment_family FACET Experiment family: All, Historical, RCP
-inst, --institute FACET identifier for the institution that is
responsible for the scientific aspects of
the CORDEX simulation
--and [domain|experiment|driving_model|rcm_name|rcm_version|variable|time_frequency|ensemble|version|cf_standard_name|experiment_family|institute]
Attributes for which we want to add AND
filter, i.e. -v tasmin -v tasmax --and
variable will return only model/ensemble
that have both
--help Show this message and exit.
Again, cordex works in the same way but some constraints are specific to its experiment design.
These are the CORDEX domain, rcm_name for the regional model, rcm_version and the driving_model.
CORDEX also doesn’t use tables, so you always have to use -f/--time_frequency to select different timesteps.
!clef cordex -v tas -e historical -dmod CSIRO-BOM-ACCESS1-3 -en r1i1p1 -f mon
/g/data/rr3/publications/CORDEX/output/AUS-44/UNSW/CSIRO-BOM-ACCESS1-3/historical/r1i1p1/UNSW-WRF360J/v1/mon/tas/latest/
/g/data/rr3/publications/CORDEX/output/AUS-44/UNSW/CSIRO-BOM-ACCESS1-3/historical/r1i1p1/UNSW-WRF360K/v1/mon/tas/latest/
/g/data/rr3/publications/CORDEX/output/AUS-44/UNSW/CSIRO-BOM-ACCESS1-3/historical/r1i1p1/UNSW-WRF360L/v1/mon/tas/latest/
/g/data/rr3/publications/CORDEX/output/AUS-44i/UNSW/CSIRO-BOM-ACCESS1-3/historical/r1i1p1/UNSW-WRF360J/v1/mon/tas/latest/
/g/data/rr3/publications/CORDEX/output/AUS-44i/UNSW/CSIRO-BOM-ACCESS1-3/historical/r1i1p1/UNSW-WRF360K/v1/mon/tas/latest/
/g/data/rr3/publications/CORDEX/output/AUS-44i/UNSW/CSIRO-BOM-ACCESS1-3/historical/r1i1p1/UNSW-WRF360L/v1/mon/tas/latest/
Everything available on ESGF is also available locally
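The CORDEX-specific constraints are easy to recognise in the returned paths. As a sketch, assuming the rr3 directory layout shown in the output above (the facet names are taken from the cordex help message, the helper itself is hypothetical), we can split a path back into its facets:

```python
# Order of the facets in the directory tree after ".../CORDEX/output/",
# as read from the paths listed above
CORDEX_FACETS = [
    "domain", "institute", "driving_model", "experiment", "ensemble",
    "rcm_name", "rcm_version", "time_frequency", "variable",
]


def parse_cordex_path(path):
    """Map the directories of a local CORDEX path to facet names."""
    parts = path.strip("/").split("/")
    start = parts.index("output") + 1
    values = parts[start:start + len(CORDEX_FACETS)]
    if len(values) != len(CORDEX_FACETS):
        raise ValueError(f"Unexpected CORDEX path: {path}")
    return dict(zip(CORDEX_FACETS, values))


path = (
    "/g/data/rr3/publications/CORDEX/output/AUS-44/UNSW/CSIRO-BOM-ACCESS1-3/"
    "historical/r1i1p1/UNSW-WRF360J/v1/mon/tas/latest/"
)
facets = parse_cordex_path(path)
for name, value in facets.items():
    print(f"{name}: {value}")
```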
Controlling the output: clef options#
!clef --local cmip6 -e 1pctCO2 -t Amon -v tasmax -v tasmin -g gr
/g/data/oi10/replicas/CMIP6/CMIP/CNRM-CERFACS/CNRM-CM6-1-HR/1pctCO2/r1i1p1f2/Amon/tasmax/gr/v20191021
/g/data/oi10/replicas/CMIP6/CMIP/CNRM-CERFACS/CNRM-CM6-1/1pctCO2/r1i1p1f2/Amon/tasmax/gr/v20180626
/g/data/oi10/replicas/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/1pctCO2/r10i1p1f2/Amon/tasmax/gr/v20200529
/g/data/oi10/replicas/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/1pctCO2/r1i1p1f2/Amon/tasmax/gr/v20181018
/g/data/oi10/replicas/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/1pctCO2/r2i1p1f2/Amon/tasmax/gr/v20181031
/g/data/oi10/replicas/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/1pctCO2/r3i1p1f2/Amon/tasmax/gr/v20181107
/g/data/oi10/replicas/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/1pctCO2/r4i1p1f2/Amon/tasmax/gr/v20190328
/g/data/oi10/replicas/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/1pctCO2/r5i1p1f2/Amon/tasmax/gr/v20200529
/g/data/oi10/replicas/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/1pctCO2/r6i1p1f2/Amon/tasmax/gr/v20200529
/g/data/oi10/replicas/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/1pctCO2/r7i1p1f2/Amon/tasmax/gr/v20200529
/g/data/oi10/replicas/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/1pctCO2/r8i1p1f2/Amon/tasmax/gr/v20200529
/g/data/oi10/replicas/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/1pctCO2/r9i1p1f2/Amon/tasmax/gr/v20200529
/g/data/oi10/replicas/CMIP6/CMIP/EC-Earth-Consortium/EC-Earth3-Veg/1pctCO2/r1i1p1f1/Amon/tasmax/gr/v20190702
/g/data/oi10/replicas/CMIP6/CMIP/EC-Earth-Consortium/EC-Earth3-Veg/1pctCO2/r1i1p1f1/Amon/tasmax/gr/v20200325
/g/data/oi10/replicas/CMIP6/CMIP/EC-Earth-Consortium/EC-Earth3/1pctCO2/r3i1p1f1/Amon/tasmax/gr/v20191114
/g/data/oi10/replicas/CMIP6/CMIP/EC-Earth-Consortium/EC-Earth3/1pctCO2/r3i1p1f1/Amon/tasmax/gr/v20200727
/g/data/oi10/replicas/CMIP6/CMIP/IPSL/IPSL-CM6A-LR/1pctCO2/r1i1p1f1/Amon/tasmax/gr/v20180727
/g/data/oi10/replicas/CMIP6/CMIP/THU/CIESM/1pctCO2/r1i1p1f1/Amon/tasmax/gr/v20200417
/g/data/oi10/replicas/CMIP6/CMIP/CNRM-CERFACS/CNRM-CM6-1-HR/1pctCO2/r1i1p1f2/Amon/tasmin/gr/v20191021
/g/data/oi10/replicas/CMIP6/CMIP/CNRM-CERFACS/CNRM-CM6-1/1pctCO2/r1i1p1f2/Amon/tasmin/gr/v20180626
/g/data/oi10/replicas/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/1pctCO2/r10i1p1f2/Amon/tasmin/gr/v20200529
/g/data/oi10/replicas/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/1pctCO2/r1i1p1f2/Amon/tasmin/gr/v20181018
/g/data/oi10/replicas/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/1pctCO2/r2i1p1f2/Amon/tasmin/gr/v20181031
/g/data/oi10/replicas/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/1pctCO2/r3i1p1f2/Amon/tasmin/gr/v20181107
/g/data/oi10/replicas/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/1pctCO2/r4i1p1f2/Amon/tasmin/gr/v20190328
/g/data/oi10/replicas/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/1pctCO2/r5i1p1f2/Amon/tasmin/gr/v20200529
/g/data/oi10/replicas/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/1pctCO2/r6i1p1f2/Amon/tasmin/gr/v20200529
/g/data/oi10/replicas/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/1pctCO2/r7i1p1f2/Amon/tasmin/gr/v20200529
/g/data/oi10/replicas/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/1pctCO2/r8i1p1f2/Amon/tasmin/gr/v20200529
/g/data/oi10/replicas/CMIP6/CMIP/CNRM-CERFACS/CNRM-ESM2-1/1pctCO2/r9i1p1f2/Amon/tasmin/gr/v20200529
/g/data/oi10/replicas/CMIP6/CMIP/EC-Earth-Consortium/EC-Earth3-Veg/1pctCO2/r1i1p1f1/Amon/tasmin/gr/v20190702
/g/data/oi10/replicas/CMIP6/CMIP/EC-Earth-Consortium/EC-Earth3-Veg/1pctCO2/r1i1p1f1/Amon/tasmin/gr/v20200325
/g/data/oi10/replicas/CMIP6/CMIP/EC-Earth-Consortium/EC-Earth3/1pctCO2/r3i1p1f1/Amon/tasmin/gr/v20191114
/g/data/oi10/replicas/CMIP6/CMIP/EC-Earth-Consortium/EC-Earth3/1pctCO2/r3i1p1f1/Amon/tasmin/gr/v20200727
/g/data/oi10/replicas/CMIP6/CMIP/IPSL/IPSL-CM6A-LR/1pctCO2/r1i1p1f1/Amon/tasmin/gr/v20180727
/g/data/oi10/replicas/CMIP6/CMIP/THU/CIESM/1pctCO2/r1i1p1f1/Amon/tasmin/gr/v20200417
In this example we used the --local option of the main command clef to get only the paths of the matching local data as output.
Note also that:
we are using abbreviations for the options where available;
we are passing the variable option -v twice;
we used the CMIP6-specific option -g/--grid to search for all data that is not on the model native grid. This label doesn’t indicate a grid common to all the CMIP6 output, only to the model itself; the same is true for member_id and other attributes.
--local actually executes the query directly on the NCI clef.nci.org.au database, which is different from the default query, where the search is executed first on the ESGF and its results are then matched locally.
In the example above the final result is exactly the same, whichever way we perform the query. This way of searching can give you more results if a node is offline or if a version has been unpublished from the ESGF but is still available locally.
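A local path and an ESGF dataset_id carry the same information. As a sketch, assuming the oi10 replica layout shown in the output above (the function name is ours, and this is not necessarily how clef does the matching), a CMIP6 path can be turned into the corresponding dataset_id like this:

```python
def cmip6_path_to_dataset_id(path):
    """Join the directories from 'CMIP6' onwards with dots."""
    parts = path.rstrip("/").split("/")
    start = parts.index("CMIP6")
    return ".".join(parts[start:])


# Path copied from the --local output above
path = (
    "/g/data/oi10/replicas/CMIP6/CMIP/CNRM-CERFACS/CNRM-CM6-1-HR/"
    "1pctCO2/r1i1p1f2/Amon/tasmin/gr/v20191021"
)
dataset_id = cmip6_path_to_dataset_id(path)
print(dataset_id)
```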
!clef --missing cmip6 -e 1pctCO2 -v clw -v clwvi -t Amon -g gr
Available on ESGF but not locally:
CMIP6.CMIP.CAS.FGOALS-f3-L.1pctCO2.r1i1p1f1.Amon.clw.gr.v20200620
CMIP6.CMIP.CAS.FGOALS-f3-L.1pctCO2.r1i1p1f1.Amon.clwvi.gr.v20200620
CMIP6.CMIP.CAS.FGOALS-f3-L.1pctCO2.r2i1p1f1.Amon.clw.gr.v20200620
CMIP6.CMIP.CAS.FGOALS-f3-L.1pctCO2.r2i1p1f1.Amon.clwvi.gr.v20200620
CMIP6.CMIP.CAS.FGOALS-f3-L.1pctCO2.r3i1p1f1.Amon.clw.gr.v20200620
CMIP6.CMIP.CAS.FGOALS-f3-L.1pctCO2.r3i1p1f1.Amon.clwvi.gr.v20200620
CMIP6.CMIP.CNRM-CERFACS.CNRM-CM6-1.1pctCO2.r1i1p1f2.Amon.clw.gr.v20180626
CMIP6.CMIP.CNRM-CERFACS.CNRM-CM6-1.1pctCO2.r1i1p1f2.Amon.clwvi.gr.v20180626
CMIP6.CMIP.CNRM-CERFACS.CNRM-CM6-1-HR.1pctCO2.r1i1p1f2.Amon.clw.gr.v20191021
CMIP6.CMIP.CNRM-CERFACS.CNRM-CM6-1-HR.1pctCO2.r1i1p1f2.Amon.clwvi.gr.v20191021
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r10i1p1f2.Amon.clw.gr.v20200529
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r10i1p1f2.Amon.clwvi.gr.v20200529
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r1i1p1f2.Amon.clw.gr.v20181018
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r1i1p1f2.Amon.clwvi.gr.v20181018
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r2i1p1f2.Amon.clw.gr.v20181031
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r2i1p1f2.Amon.clwvi.gr.v20181031
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r3i1p1f2.Amon.clw.gr.v20181107
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r3i1p1f2.Amon.clwvi.gr.v20181107
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r4i1p1f2.Amon.clw.gr.v20190328
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r4i1p1f2.Amon.clwvi.gr.v20190328
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r5i1p1f2.Amon.clw.gr.v20200529
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r5i1p1f2.Amon.clwvi.gr.v20200529
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r6i1p1f2.Amon.clw.gr.v20200529
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r6i1p1f2.Amon.clwvi.gr.v20200529
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r7i1p1f2.Amon.clw.gr.v20200529
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r7i1p1f2.Amon.clwvi.gr.v20200529
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r8i1p1f2.Amon.clw.gr.v20200529
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r8i1p1f2.Amon.clwvi.gr.v20200529
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r9i1p1f2.Amon.clw.gr.v20200529
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r9i1p1f2.Amon.clwvi.gr.v20200529
CMIP6.CMIP.E3SM-Project.E3SM-1-0.1pctCO2.r1i1p1f1.Amon.clw.gr.v20190718
CMIP6.CMIP.E3SM-Project.E3SM-1-0.1pctCO2.r1i1p1f1.Amon.clwvi.gr.v20190718
CMIP6.CMIP.EC-Earth-Consortium.EC-Earth3.1pctCO2.r3i1p1f1.Amon.clwvi.gr.v20200727
CMIP6.CMIP.EC-Earth-Consortium.EC-Earth3-Veg.1pctCO2.r1i1p1f1.Amon.clwvi.gr.v20200325
CMIP6.CMIP.IPSL.IPSL-CM6A-LR.1pctCO2.r1i1p1f1.Amon.clw.gr.v20180727
CMIP6.CMIP.IPSL.IPSL-CM6A-LR.1pctCO2.r1i1p1f1.Amon.clwvi.gr.v20180727
CMIP6.CMIP.NIMS-KMA.KACE-1-0-G.1pctCO2.r1i1p1f1.Amon.clw.gr.v20190916
CMIP6.CMIP.NIMS-KMA.KACE-1-0-G.1pctCO2.r1i1p1f1.Amon.clwvi.gr.v20190916
CMIP6.CMIP.THU.CIESM.1pctCO2.r1i1p1f1.Amon.clw.gr.v20200417
CMIP6.CMIP.THU.CIESM.1pctCO2.r1i1p1f1.Amon.clwvi.gr.v20200417
This time we used the --missing option, and the tool returned only the results matching the constraints that are available on the ESGF but not locally (we changed variables to make sure we got some missing data back).
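A long list of missing datasets is easier to read once it is grouped. The sketch below uses a few of the dataset_ids from the output above and groups the missing variables by model; the positions of the model and variable in the dataset_id are read off the listing, and the helper name is hypothetical.

```python
from collections import defaultdict

# A few of the dataset_ids copied from the --missing output above
missing = [
    "CMIP6.CMIP.E3SM-Project.E3SM-1-0.1pctCO2.r1i1p1f1.Amon.clw.gr.v20190718",
    "CMIP6.CMIP.E3SM-Project.E3SM-1-0.1pctCO2.r1i1p1f1.Amon.clwvi.gr.v20190718",
    "CMIP6.CMIP.EC-Earth-Consortium.EC-Earth3.1pctCO2.r3i1p1f1.Amon.clwvi.gr.v20200727",
    "CMIP6.CMIP.THU.CIESM.1pctCO2.r1i1p1f1.Amon.clw.gr.v20200417",
    "CMIP6.CMIP.THU.CIESM.1pctCO2.r1i1p1f1.Amon.clwvi.gr.v20200417",
]


def variables_by_model(dataset_ids):
    """Group the missing variables by model (4th field of the dataset_id)."""
    grouped = defaultdict(set)
    for dataset_id in dataset_ids:
        fields = dataset_id.split(".")
        model, variable = fields[3], fields[7]
        grouped[model].add(variable)
    return {model: sorted(variables) for model, variables in grouped.items()}


summary = variables_by_model(missing)
for model, variables in summary.items():
    print(model, variables)
```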
!clef --remote cmip6 -e 1pctCO2 -v tasmin -t Amon -g gr
CMIP6.CMIP.CNRM-CERFACS.CNRM-CM6-1-HR.1pctCO2.r1i1p1f2.Amon.tasmin.gr.v20191021
CMIP6.CMIP.CNRM-CERFACS.CNRM-CM6-1.1pctCO2.r1i1p1f2.Amon.tasmin.gr.v20180626
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r10i1p1f2.Amon.tasmin.gr.v20200529
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r1i1p1f2.Amon.tasmin.gr.v20181018
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r2i1p1f2.Amon.tasmin.gr.v20181031
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r3i1p1f2.Amon.tasmin.gr.v20181107
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r4i1p1f2.Amon.tasmin.gr.v20190328
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r5i1p1f2.Amon.tasmin.gr.v20200529
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r6i1p1f2.Amon.tasmin.gr.v20200529
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r7i1p1f2.Amon.tasmin.gr.v20200529
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r8i1p1f2.Amon.tasmin.gr.v20200529
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r9i1p1f2.Amon.tasmin.gr.v20200529
CMIP6.CMIP.EC-Earth-Consortium.EC-Earth3-Veg.1pctCO2.r1i1p1f1.Amon.tasmin.gr.v20200325
CMIP6.CMIP.EC-Earth-Consortium.EC-Earth3.1pctCO2.r3i1p1f1.Amon.tasmin.gr.v20200727
CMIP6.CMIP.IPSL.IPSL-CM6A-LR.1pctCO2.r1i1p1f1.Amon.tasmin.gr.v20180727
CMIP6.CMIP.NIMS-KMA.KACE-1-0-G.1pctCO2.r1i1p1f1.Amon.tasmin.gr.v20200115
CMIP6.CMIP.THU.CIESM.1pctCO2.r1i1p1f1.Amon.tasmin.gr.v20200417
The --remote option returns the dataset_ids of the data matching the constraints, regardless of whether they are available locally or not.
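Each dataset_id is simply the list of CMIP6 facets joined by dots. As a minimal sketch (the facet names follow the CMIP6 data reference syntax, the helper is our own), a dataset_id can be split into a dictionary that is easier to filter on:

```python
CMIP6_FACETS = (
    "mip_era", "activity_id", "institution_id", "source_id", "experiment_id",
    "member_id", "table_id", "variable_id", "grid_label", "version",
)


def parse_dataset_id(dataset_id):
    """Split a CMIP6 dataset_id into a {facet: value} dictionary."""
    values = dataset_id.split(".")
    if len(values) != len(CMIP6_FACETS):
        raise ValueError(f"Unexpected CMIP6 dataset_id: {dataset_id}")
    return dict(zip(CMIP6_FACETS, values))


# dataset_id copied from the --remote output above
facets = parse_dataset_id(
    "CMIP6.CMIP.CNRM-CERFACS.CNRM-CM6-1-HR.1pctCO2.r1i1p1f2.Amon.tasmin.gr.v20191021"
)
print(facets["source_id"], facets["member_id"], facets["version"])
```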
Please note that --local, --remote and --missing, together with --request, which we will look at next, are all options of the main command clef and they need to come before any sub-command.
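If you assemble clef commands in a script, a small helper can make sure the main options always end up before the sub-command. This is only a sketch: `build_clef_command` is a hypothetical helper, not part of clef, and it only builds the command without running it.

```python
import shlex

# Options of the main clef command that control the output
GLOBAL_OPTIONS = {"--local", "--remote", "--missing", "--request"}


def build_clef_command(subcommand, constraints, global_option=None):
    """Build a clef command line with the main option before the sub-command.

    constraints maps a flag to a list of values, so the same flag can be
    repeated, e.g. {"-v": ["tasmax", "tasmin"]}.
    """
    command = ["clef"]
    if global_option is not None:
        if global_option not in GLOBAL_OPTIONS:
            raise ValueError(f"Unknown clef option: {global_option}")
        command.append(global_option)
    command.append(subcommand)
    for flag, values in constraints.items():
        for value in values:
            command += [flag, value]
    return command


command = build_clef_command(
    "cmip6",
    {"-e": ["1pctCO2"], "-t": ["Amon"], "-v": ["tasmax", "tasmin"], "-g": ["gr"]},
    global_option="--local",
)
print(shlex.join(command))
```

The resulting list can be passed as is to `subprocess.run` on a system where clef is available.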
Requesting new data#
What should we do if we find out there is some data we are interested in that has not been downloaded or requested yet?
This is a complex data collection, so NCI, in consultation with the community, decided the best way to manage it was to have one point of reference. Part of this agreement is that NCI will download the files and update the database that clef is interrogating. After consultation with the community a priority list was decided, and NCI has started downloading anything that falls into it as soon as it becomes available.
Users can then request from the NCI helpdesk other combinations of variables, experiments etc. that do not fall into this list.
The list is available from the NCI climate confluence website:
Even without consulting the list you can use clef, as we demonstrated above, to search for a particular dataset. If it is not queued or downloaded already, clef will give you the option to request it from NCI.
Let’s see how it works.
%%bash
clef --request cmip6 -e 1pctCO2 -v clw -v clwvi -t Amon -g gr
no
Available on ESGF but not locally:
CMIP6.CMIP.CAS.FGOALS-f3-L.1pctCO2.r1i1p1f1.Amon.clw.gr.v20200620
CMIP6.CMIP.CAS.FGOALS-f3-L.1pctCO2.r1i1p1f1.Amon.clwvi.gr.v20200620
CMIP6.CMIP.CAS.FGOALS-f3-L.1pctCO2.r2i1p1f1.Amon.clw.gr.v20200620
CMIP6.CMIP.CAS.FGOALS-f3-L.1pctCO2.r2i1p1f1.Amon.clwvi.gr.v20200620
CMIP6.CMIP.CAS.FGOALS-f3-L.1pctCO2.r3i1p1f1.Amon.clw.gr.v20200620
CMIP6.CMIP.CAS.FGOALS-f3-L.1pctCO2.r3i1p1f1.Amon.clwvi.gr.v20200620
CMIP6.CMIP.CNRM-CERFACS.CNRM-CM6-1.1pctCO2.r1i1p1f2.Amon.clw.gr.v20180626
CMIP6.CMIP.CNRM-CERFACS.CNRM-CM6-1.1pctCO2.r1i1p1f2.Amon.clwvi.gr.v20180626
CMIP6.CMIP.CNRM-CERFACS.CNRM-CM6-1-HR.1pctCO2.r1i1p1f2.Amon.clw.gr.v20191021
CMIP6.CMIP.CNRM-CERFACS.CNRM-CM6-1-HR.1pctCO2.r1i1p1f2.Amon.clwvi.gr.v20191021
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r10i1p1f2.Amon.clw.gr.v20200529
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r10i1p1f2.Amon.clwvi.gr.v20200529
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r1i1p1f2.Amon.clw.gr.v20181018
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r1i1p1f2.Amon.clwvi.gr.v20181018
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r2i1p1f2.Amon.clw.gr.v20181031
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r2i1p1f2.Amon.clwvi.gr.v20181031
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r3i1p1f2.Amon.clw.gr.v20181107
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r3i1p1f2.Amon.clwvi.gr.v20181107
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r4i1p1f2.Amon.clw.gr.v20190328
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r4i1p1f2.Amon.clwvi.gr.v20190328
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r5i1p1f2.Amon.clw.gr.v20200529
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r5i1p1f2.Amon.clwvi.gr.v20200529
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r6i1p1f2.Amon.clw.gr.v20200529
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r6i1p1f2.Amon.clwvi.gr.v20200529
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r7i1p1f2.Amon.clw.gr.v20200529
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r7i1p1f2.Amon.clwvi.gr.v20200529
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r8i1p1f2.Amon.clw.gr.v20200529
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r8i1p1f2.Amon.clwvi.gr.v20200529
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r9i1p1f2.Amon.clw.gr.v20200529
CMIP6.CMIP.CNRM-CERFACS.CNRM-ESM2-1.1pctCO2.r9i1p1f2.Amon.clwvi.gr.v20200529
CMIP6.CMIP.E3SM-Project.E3SM-1-0.1pctCO2.r1i1p1f1.Amon.clw.gr.v20190718
CMIP6.CMIP.E3SM-Project.E3SM-1-0.1pctCO2.r1i1p1f1.Amon.clwvi.gr.v20190718
CMIP6.CMIP.EC-Earth-Consortium.EC-Earth3.1pctCO2.r3i1p1f1.Amon.clwvi.gr.v20200727
CMIP6.CMIP.EC-Earth-Consortium.EC-Earth3-Veg.1pctCO2.r1i1p1f1.Amon.clwvi.gr.v20200325
CMIP6.CMIP.IPSL.IPSL-CM6A-LR.1pctCO2.r1i1p1f1.Amon.clw.gr.v20180727
CMIP6.CMIP.IPSL.IPSL-CM6A-LR.1pctCO2.r1i1p1f1.Amon.clwvi.gr.v20180727
CMIP6.CMIP.NIMS-KMA.KACE-1-0-G.1pctCO2.r1i1p1f1.Amon.clw.gr.v20190916
CMIP6.CMIP.NIMS-KMA.KACE-1-0-G.1pctCO2.r1i1p1f1.Amon.clwvi.gr.v20190916
CMIP6.CMIP.THU.CIESM.1pctCO2.r1i1p1f1.Amon.clw.gr.v20200417
CMIP6.CMIP.THU.CIESM.1pctCO2.r1i1p1f1.Amon.clwvi.gr.v20200417
Finished writing file: CMIP6_pxp581_20200924T094632.txt
Do you want to proceed with request for missing files? (N/Y)
No is default
Your request has been saved in
/home/581/pxp581/clef/docs/CMIP6_pxp581_20200924T094632.txt
You can use this file to request the data via the NCI helpdesk: help@nci.org.au or https://help.nci.org.au.
We ran the same query, which gave us 40 missing datasets as a result, but this time we used the --request option after clef.
The tool executes the query remotely, then looks for matches locally and on the NCI download list. Having found none, it gives us the option of putting in a request.
It will accept any of the following as a positive answer:
Y YES y yes
With anything else, or if you don’t pass anything, it will assume you don’t want to put in a request.
It still saved the request in a file we can use later.
!head -n 4 CMIP6_*.txt
dataset_id=CMIP6.CMIP.CAS.FGOALS-f3-L.1pctCO2.r1i1p1f1.Amon.clw.gr.v20200620
dataset_id=CMIP6.CMIP.CAS.FGOALS-f3-L.1pctCO2.r1i1p1f1.Amon.clwvi.gr.v20200620
dataset_id=CMIP6.CMIP.CAS.FGOALS-f3-L.1pctCO2.r2i1p1f1.Amon.clw.gr.v20200620
dataset_id=CMIP6.CMIP.CAS.FGOALS-f3-L.1pctCO2.r2i1p1f1.Amon.clwvi.gr.v20200620
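Since the request file is plain text with one `dataset_id=...` entry per line, it is easy to read back, for example to check what you are about to request. A minimal sketch follows; the helper and the temporary file name are hypothetical, and the two sample lines are copied from the output above.

```python
import tempfile
from pathlib import Path


def read_request_file(path):
    """Return the dataset_ids listed in a clef request file."""
    dataset_ids = []
    for line in Path(path).read_text().splitlines():
        key, separator, value = line.strip().partition("=")
        # keep only well-formed "dataset_id=<id>" lines
        if separator and key == "dataset_id":
            dataset_ids.append(value)
    return dataset_ids


sample = (
    "dataset_id=CMIP6.CMIP.CAS.FGOALS-f3-L.1pctCO2.r1i1p1f1.Amon.clw.gr.v20200620\n"
    "dataset_id=CMIP6.CMIP.CAS.FGOALS-f3-L.1pctCO2.r1i1p1f1.Amon.clwvi.gr.v20200620\n"
)

# Write the sample to a temporary file so the example runs anywhere
with tempfile.TemporaryDirectory() as tmp:
    request_file = Path(tmp) / "CMIP6_request_example.txt"
    request_file.write_text(sample)
    dataset_ids = read_request_file(request_file)

print(len(dataset_ids), "datasets in the request")
```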
If I had answered yes, the tool would have sent an e-mail to the NCI helpdesk with the text file attached. NCI can pass that file as input to their download tool and queue your request.
NB: if you are running clef from gadi you cannot send an e-mail, so in that case the tool will skip the question and just remind you to send an e-mail to the NCI helpdesk yourself to finalise the request.