20 May 2019

ERA5 1979 onward becoming available in the RDA

CISL DECS is processing ERA5, 1979 onward, into netCDF-4/HDF5 files containing single-parameter 0.25° gridded time series. Please refer to ds633.0,

ERA5 Reanalysis (0.25 Degree Latitude-Longitude Grid)

for further information and availability.

Please note: DECS is producing a CF 1.6 compliant netCDF-4/HDF5 version of ERA5 for the CISL RDA at NCAR. The netCDF-4/HDF5 version is the de facto RDA ERA5 online data format. The GRIB1 data format is only available via NCAR's High Performance Storage System (HPSS). There is a one-to-one correspondence between the netCDF-4/HDF5 and GRIB1 files, with as much GRIB1 metadata as possible incorporated into the attributes of the netCDF-4/HDF5 counterpart.
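
For users who want to see how that metadata is carried, here is a minimal sketch using the netCDF4 Python package; the filename is a hypothetical placeholder for any ds633.0 file you have downloaded.

# Minimal sketch: list the global and per-variable attributes of an ERA5
# netCDF-4/HDF5 file.  "era5_example.nc" is a hypothetical placeholder for
# any ds633.0 file downloaded from the RDA; requires the netCDF4 package.
from netCDF4 import Dataset

with Dataset("era5_example.nc") as nc:
    print(nc.ncattrs())                 # global attributes
    for name, var in nc.variables.items():
        print(name, var.ncattrs())      # per-variable attributes (the GRIB1 metadata is carried in the attributes)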

06 December 2018

NCAR RDA at AGU 2018 Fall Meeting

Sophie Hou, Tom Cram, and Doug Schuster will be representing the RDA at the AGU 2018 fall meeting.  A schedule of their events is provided below:

02 April 2018

Notice: new Globus endpoints

The RDA data collections were recently moved to a new filesystem (and new directory paths), and as a result of the move, the two shared Globus endpoints providing access to RDA data products were rendered obsolete.  To remedy this, we have created two new shared endpoints and have transferred all active Globus shares to the new endpoints.

The caveat to this change is that the new endpoints have new endpoint IDs, which will affect users who automate Globus transfers from RDA shared endpoints in their workflow scripts.  If you have automated transfers set up, please update your scripts to use the new endpoint IDs listed below (a minimal scripting sketch follows the list):

Endpoint name: NCAR RDA Dataset Archive (rda#datashare)
Old endpoint ID: db57de42-6d04-11e5-ba46-22000b92c6ec
New endpoint ID: 2869611a-36aa-11e8-b95e-0ac6873fc732

Endpoint name: NCAR RDA Data Requests (rda#data_request)
Old endpoint ID: d20e610e-6d04-11e5-ba46-22000b92c6ec
New endpoint ID: 68823254-36aa-11e8-b95e-0ac6873fc732
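
For those updating automated workflows, here is a minimal sketch of what the change might look like in a script built on the Globus Python SDK. The destination endpoint ID and paths are hypothetical placeholders, and your own authentication flow is assumed.

# Minimal sketch (Globus Python SDK assumed): point an automated transfer at
# the new RDA shared endpoint ID.  The destination endpoint and paths are
# hypothetical placeholders; 'tc' is a globus_sdk.TransferClient that you have
# already authorized through your own auth flow.
import globus_sdk

RDA_DATASHARE = "2869611a-36aa-11e8-b95e-0ac6873fc732"   # new NCAR RDA Dataset Archive ID
DEST_ENDPOINT = "your-destination-endpoint-id"           # hypothetical placeholder

def submit_rda_transfer(tc, source_path, dest_path):
    tdata = globus_sdk.TransferData(tc, RDA_DATASHARE, DEST_ENDPOINT,
                                    label="RDA data transfer")
    tdata.add_item(source_path, dest_path)
    return tc.submit_transfer(tdata)["task_id"]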

Please e-mail us at rdahelp@ucar.edu if you have questions.


19 March 2018

Alternate access to NCEP data products through NCEP Data Portals

The RDA republishes many NCEP data products.  Users with urgent data needs during extended RDA downtimes may obtain the data directly from NCEP servers.

The 10-day archive of GFS products (including GDAS/FNL analysis grids) can be found at the real-time site http://www.nco.ncep.noaa.gov/pmb/products/gfs/

GDAS data up to roughly 9 months old may be found at this online archive: https://nomads.ncdc.noaa.gov/data/gdas/ (which may be decommissioned soon).

THREDDS users can use https://www.ncei.noaa.gov/thredds/model/model.html

Older data can be requested from this long-term archive https://www.ncdc.noaa.gov/data-access/model-data

Some of the data is available for immediate download.  For example, you can obtain GFS analysis files from ftp://nomads.ncdc.noaa.gov/GFS/analysis_only/201803/20180315/
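
If you want to script that download, here is a minimal sketch using Python's standard ftplib against the directory above; this server is outside the RDA's control and may not remain available.

# Minimal sketch: list (and optionally fetch) GFS analysis files from the
# NOMADS FTP directory referenced above, using only the standard library.
# Availability of this server is not guaranteed.
from ftplib import FTP

with FTP("nomads.ncdc.noaa.gov") as ftp:
    ftp.login()                                   # anonymous login
    ftp.cwd("GFS/analysis_only/201803/20180315")
    names = ftp.nlst()                            # list the available files
    print(names)
    # To download one of the listed files:
    # with open(names[0], "wb") as f:
    #     ftp.retrbinary("RETR " + names[0], f.write)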

Note that data read from tape is organized and delivered in daily (01Jan2001–21Feb2012) or hourly (13Feb2012–present) tar files that can be very large and include things that you may not need.  For instance, one 6-hourly GDAS tar file is 47 GB and includes the initial hour and all forecast hours for global 0.25, 0.5, and 1.0 degree resolution grids.

https://www.ncdc.noaa.gov/data-access/model-data/model-datasets/global-data-assimilation-system-gdas

Subscribe to this blog via RSS or follow us by email (see right column for widget) to get the latest updates.

08 December 2017

NCAR data experts seek input at AGU fall meeting for summer 2018 workshop on digital data repository service

NCAR data management experts will engage the community in a poster session on December 11 at the AGU 2017 Fall Meeting to help shape a 2018 National Science Foundation workshop focused on developing requirements and expectations for a prospective new digital repository service.
The envisioned Geoscience Digital Data Resource and Repository Service (GeoDaRRS) would complement existing NSF-funded data facilities by providing data management planning support resources for the general community and repository services for researchers who have data that do not fit in any existing repository. GeoDaRRS would support NSF-funded researchers in meeting data archiving requirements set by the NSF and publishers for geosciences, thereby ensuring the availability of digital data for use and reuse in scientific research going forward.
See this abstract on the AGU Fall Meeting site for more information.
For additional information on the workshop, please see the workshop website.

19 July 2017

Accessing RDA OPeNDAP endpoints with authentication

Update:
As an alternative to the steps below, it may be simpler to specify the username and password within the URL to satisfy the authentication.

For example,
https://USERNAME%40DOMAIN:PASSWORD@rda.ucar.edu/thredds/dodsC/files/g/ds083.3/2016/201602/gdas1.fnl0p25.2016020218.f00.grib2

For me, it might look like,

https://rpconroy%40ucar.edu:MySuperSecretPassword@rda.ucar.edu/thredds/dodsC/files/g/ds083.3/2016/201602/gdas1.fnl0p25.2016020218.f00.grib2

Note the %40 in place of the '@' in the email.
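
If you build these URLs in a script, here is a minimal sketch (Python standard library only, hypothetical credentials) that percent-encodes the '@' in the username for you.

# Minimal sketch: build an authenticated OPeNDAP URL, percent-encoding the
# '@' in the e-mail-style username as %40.  The credentials below are hypothetical.
from urllib.parse import quote

username = "user@example.com"        # your RDA login (hypothetical)
password = "MySuperSecretPassword"   # your RDA password (hypothetical)
path = "thredds/dodsC/files/g/ds083.3/2016/201602/gdas1.fnl0p25.2016020218.f00.grib2"

url = f"https://{quote(username, safe='')}:{quote(password, safe='')}@rda.ucar.edu/{path}"
print(url)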


All RDA OPeNDAP-supported datasets found under https://rda.ucar.edu/thredds now require user authentication for access.  Here are details on how to configure your system so that most applications, including NCL, will work properly with the authentication step (these instructions are based on what is provided by the hdfeos group):

1. RDA Registration and Setting Up Cookies

NCAR RDA OPeNDAP server access requires user registration and cookies. If you do not have an RDA login, please register first and have your RDA username and password ready.

Once you have your RDA login username and password, and have verified that they work by logging into https://rda.ucar.edu, you need to set up cookies to access the data.  We highlight the key steps here.

To set up cookies properly, you will need the following 3 files in your home directory (e.g., /home/rdahelp on Linux or /Users/rdahelp on Mac assuming that rdahelp is your UNIX system's login name).
  1. /home/rdahelp/.netrc
  2. /home/rdahelp/.rda_cookies
  3. /home/rdahelp/.dodsrc
1.1 CREATE .NETRC FILE FOR LOGIN / PASSWORD
The first file, .netrc, should contain your RDA login and password. For example, if your username is rdahelp and your password is 1234abcd, the file should contain the following line.
machine rda.ucar.edu login rdahelp password 1234abcd
Edit the above line to match your own username and password. Since the file contains your password, make sure that others cannot read it by changing its permissions:


$chmod go-rwx /home/rdahelp/.netrc
1.2 CREATE .RDA_COOKIES FILE USING CURL OR WGET
The second file, .rda_cookies, should be created automatically by either the wget or curl command.
If you prefer to use wget, issue the following command to create the cookie file.
$wget --load-cookies ~/.rda_cookies --save-cookies ~/.rda_cookies --auth-no-challenge=on --keep-session-cookies https://rda.ucar.edu/thredds/dodsC/files/g/ds083.2/grib1/2000/2000.02/fnl_20000201_06_00.grib1.dds -O test.dds
If you prefer to use curl, issue the following command to create the cookie file.
$curl -n -c ~/.rda_cookies -b ~/.rda_cookies -L -g --url https://rda.ucar.edu/thredds/dodsC/files/g/ds083.2/grib1/2000/2000.02/fnl_20000201_06_00.grib1.dds -o test.dods
If the above command succeeds, you will see the following output when you check the content of the test.dods file.
$more test.dods
Dataset {
    Int32 LatLon_Projection;
    Float32 lat[lat = 181];
    Float32 lon[lon = 360];
    ...
} files/g/ds083.2/grib1/2000/2000.02/fnl_20000201_06_00.grib1
If you don't see the above output, your login/password or system setup is not working properly with the NCAR RDA OPeNDAP server, and you cannot access data until this step works. If you do see the expected output, you can delete the temporary test.dods file.
1.3 CREATE .DODSRC FILE FOR .NETRC AND .RDA_COOKIES FILES
The final step is to create a .dodsrc file, if you don't already have one, and add the following lines:
HTTP.COOKIEJAR=/home/rdahelp/.rda_cookies
HTTP.NETRC=/home/rdahelp/.netrc

2. Test NCL open and read of OPeNDAP data (NCL version 6.4.0 is required)

The following shows how to test whether NCL can open and read an OPeNDAP file after the above steps have been completed.
$ ncl
ncl 0> f = addfile("https://rda.ucar.edu/thredds/dodsC/files/g/ds083.2/grib1/2000/2000.02/fnl_20000201_06_00.grib1","r")
ncl 1> print(f)
Variable: f
Type: file
filename:  fnl_20000201_06_00
path:      https://rda.ucar.edu/thredds/dodsC/files/g/ds083.2/grib1/2000/2000.02/fnl_20000201_06_00.grib1
file global attributes:
    Originating_or_generating_Center : US National Weather Service, National Centres for Environmental Prediction (NCEP)
    Originating_or_generating_Subcenter : 0
    GRIB_table_version : 0,2
    ... etc
In the above session, the URL points to the remote FNL file being served by OPeNDAP. Use the print command to check whether the URL was opened successfully and to see what variables are available inside the file.
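
NCL is not the only client that can use this setup. As a rough sketch, tools built on the netCDF-C OPeNDAP client (for example the netCDF4 Python package) should also pick up the .dodsrc/.netrc/.rda_cookies configuration above, so a quick test might look like the following.

# Rough sketch: open the same remote FNL file with the netCDF4 Python package,
# which relies on the netCDF-C OPeNDAP client and should honor the .dodsrc,
# .netrc, and .rda_cookies files configured above.
from netCDF4 import Dataset

url = ("https://rda.ucar.edu/thredds/dodsC/files/g/ds083.2/"
       "grib1/2000/2000.02/fnl_20000201_06_00.grib1")
nc = Dataset(url)                  # open the remote file via OPeNDAP
print(list(nc.variables.keys()))   # list the variables available in the file
nc.close()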

26 April 2017

Transferring custom file lists with Globus

The RDA has supported Globus file transfers for over two years, enabling users to transfer data files from RDA datasets to a destination endpoint using the Globus data transfer service (see http://ncarrda.blogspot.com/2015/06/transferring-rda-data-with-globus.html for details).  Globus transfers are an efficient, reliable, and secure way to move large amounts of data from one endpoint to another.

But what if you're interested in a smaller or specific collection of files within a dataset?  In this scenario, the user would navigate to the Globus web app transfer interface and then browse through the files on the RDA shared endpoint to locate the files of interest.  This workflow, however, is not ideal from a search and discovery perspective, since users don't have the convenience of filtering the file sets via faceted browsing or viewing the supporting metadata alongside each file listed on the Globus web interface.

The standard Globus web app transfer interface.  In this workflow, users must browse through the full set of files within a RDA dataset to locate specific files to be transferred.

The good news: RDA users can now select and initiate Globus transfers of custom file sets directly from the RDA web interface without having to search for the files on the Globus transfer interface.  This method of transferring data uses the Globus Browse Endpoint helper API, and the only step the user must perform on the Globus web app is selecting the destination endpoint that will receive the data files.  The advantage of this approach is that users can make use of the faceted search and discovery tools on the RDA website and avoid the bottleneck of making "blind" searches of the files on the Globus web interface.

How to make a Globus transfer of custom file sets


Starting from any RDA dataset page, go to the 'Data Access' tab, then select the 'Web file listing' link associated with any of the data products listed.  At this point, you may choose either the 'Faceted browse' or 'Complete file list' option, and then follow the instructions to locate your files of interest.

File lists procured using the 'Complete file list' (left) and 'Faceted browse' (right) search functions on the RDA website. Once a collection of files has been selected, the user initiates a Globus transfer by selecting the 'Globus' button.

To initiate a Globus transfer, select the files you wish to download, then select the 'Globus download' button.  This will redirect you to the Globus web app where you will authenticate with Globus and select the destination endpoint that will receive the data files.  After a destination endpoint has been chosen, select the 'Submit' button.  You will then be redirected back to the RDA website, at which point the data transfer will be submitted on your behalf.  A status message displaying the status of your data transfer will appear, and a notification e-mail will be sent to your RDA user e-mail address when the transfer has successfully completed.

The browse endpoint interface on the Globus web app, where users select the destination endpoint.

As a reminder, RDA users can log into the Globus web app with their RDA user e-mail login and password credentials; this is the recommended method for logging into Globus.  To do this, simply choose the 'NCAR RDA' organizational login on the Globus login page.

23 March 2017

Integrating RDA data processing capabilities into a workflow

Do you have a workflow that leverages repeated subset requests from an RDA dataset or would you like to avoid continually filling out web forms? Would you like to download data once a request is ready without checking your email?  The RDA provides the capability for users to submit subset requests and download data programmatically through the external applications API.  The API provides users with the following capabilities:

  • Get a summary of datasets that have subsetting available
  • Get a list of parameters available for subsetting by dataset
  • Submit a temporal, spatial, parameter subset request from a dataset
  • Check on data request processing status
  • Get a list of output files found in a completed data request (supports data transfer)
  • Set up a Globus endpoint share for a completed data request
  • Purge data request

A complete description of the API's capabilities is available here.

A couple of example applications that utilize the capabilities provided through the API have been developed in Python.


  • rdams-client.py:  The rdams-client Python utility can be run by registered RDA users to get the parameters available for subsetting by dataset, to submit subset requests on select gridded datasets, to check on the processing status of any subset request, and to download completed request output files to a local system.  This utility may be useful to users who want to integrate data subset request submission and download into a broader workflow. Download and execute './rdams-client.py -help' for more details (make sure to set 'rdams-client.py' as executable).  For additional information, see the CISL rdams documentation page.
  • rda-request-manager.py:  The rda-request-manager Python utility can be run by registered RDA users to check RDA data request processing status, download completed request output files to a local system, and purge request files from the RDA data server.  This utility may be useful to users who submit any type of request through the web interface (subset, file format conversion, or restaging of data from tape for download) and would like to create a cron job that checks on request processing status and downloads requests as they become available, instead of waiting for email notifications and downloading data through the web interface.  Download and execute './rda-request-manager.py -help' for additional information (make sure to set 'rda-request-manager.py' as executable).
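
For users who would rather call the API directly than use the utilities above, the general shape of a polling workflow is sketched below. The endpoint paths, JSON fields, and credentials are hypothetical placeholders only; consult the API description linked above for the actual routes and payloads.

# Rough sketch of a polling workflow against the RDA external applications API,
# using the 'requests' package.  The base URL, endpoint paths, and JSON fields
# below are hypothetical placeholders -- see the API documentation for the real routes.
import time
import requests

BASE = "https://rda.ucar.edu/apps"               # hypothetical base URL
AUTH = ("you@example.com", "your-rda-password")  # your RDA credentials (hypothetical)

def wait_for_request(request_id, poll_seconds=300):
    """Poll a submitted subset request until it completes, then return its file list."""
    while True:
        status = requests.get(f"{BASE}/request/{request_id}", auth=AUTH).json()
        if status.get("state") == "completed":   # hypothetical field and value
            return requests.get(f"{BASE}/request/{request_id}/filelist", auth=AUTH).json()
        time.sleep(poll_seconds)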

08 March 2017

Disk Cleanup & Faster Data Transfers

Our transfer disks have been averaging 90% full. Occasionally, they have been so full that processing of new data orders has had to be halted until more space becomes available.

In the interest of keeping your data requests flowing, we ask that you purge your data to free up disk space after you have downloaded it. Just click on the 'Purge request' button to let us know you are done.

Click on 'Purge request' after you download your data.
Please request data in chunks that you can reasonably download within the default 5-day expiration limit.

Users experience faster data transfer speeds using Globus.

A few RDA users work in regions where internet speeds are especially slow. Just let us know and we can extend the 5-day expiration limit for you.  Also, consider having the data shipped to you on a hard drive.

03 March 2017

The RDA User Dashboard

One of the "perks" that you get when you register with the RDA (besides the primary one - access to data) is that everything related to your data access activities can be presented to you in a convenient dashboard. To access the dashboard, sign in to the RDA and then click the "dashboard" link at the top of the page next to your username:


In your dashboard, you will see several sections, as shown below:
I'll go through these sections in detail, one-by-one.

User Profile
Click the "Edit/Change Profile" button to open your user profile and make changes to it. You can change your email address, password, and affiliation, and you can gain access to restricted data (if you meet the requirements for access).

Bookmarked Datasets
In this section, you will see a list of all of the datasets that you have bookmarked. You can click a dataset title in the list to go directly to that dataset.

You can bookmark a dataset by clicking the hollow star next to the dataset ID (under the title) on a dataset description page:

Once bookmarked, the star will change to solid gold and the bookmark will appear in your list of bookmarked datasets. You can remove a bookmark either from the dashboard or by clicking the gold star on the dataset description page.

Customized Data Requests
Customized data requests are special requests for data that you make. These may be subsets of the data, staging of files that only reside in our tape archive, or requests to convert from the native data format to some other format. You can manage various aspects of your data requests from the dashboard:

  • Click the calendar to change the "Available Until" date. By default, data from special requests are purged (removed) from our server 5 days after they are prepared. You can extend the date by as much as two weeks, and you may request multiple extensions.
  • Click the "X" if you have finished downloading your data and you want to remove the request from your list (this will also remove the data from our server).
  • Click the envelope if you have any questions about your request. This will allow you to send a message to the dataset specialist.
You can navigate to your data request by clicking either the request number or the status (if the status is "Completed").


Globus Endpoints Shared With You
If you have requested any Globus transfers, for whole datasets and/or your customized data requests, then the shares will appear in this list. To start a data transfer, you can navigate to Globus by clicking the URL for a particular share. You can delete shares that you no longer need by clicking the "X".

Customized OPeNDAP Aggregations
Any customized OPeNDAP aggregations that you have created will appear here. You can copy and paste the URL of your aggregation into your OPeNDAP aware tools and scripts. You also can manage your aggregations:

  • Click the magnifying glass to see more detailed information about your aggregation
  • Click the "X" to delete the aggregation from your list (this will also remove the aggregation from our server)
  • Click the blue circle arrow to extend the expiration of your aggregation. Note that this option will only be available if the current date is within 7 days of the aggregation expiration. Aggregations, by default, expire 28 days after they are created, unless you request an extension. You may request multiple extensions.


Data Citation
We strongly encourage users to cite their data source(s) in published literature. Two key components of a data citation are the dataset title and the access date. You can use our data citation tool to get a citation for your data that you can use in your publication:
To begin, you need to either know the dataset that you want to cite, or the approximate date that you downloaded data from us.

If you know the dataset, click the "By Dataset" tab.

  • You will see a list of all datasets that you have downloaded from the RDA. Click the specific dataset that you want to cite. You will then see a year/month calendar of all of your accesses. If you have accessed the data many times, we suggest choosing the most recent access. Click the month to get a calendar for that month. When you click a specific day, a data citation will be generated.


If you know the approximate date, click the "By Date" tab.

  • You will see a year/month calendar showing when and how many times you downloaded data from the RDA. Click the month in which you downloaded the data that you want to cite. You will then see a list of datasets that you downloaded in that month. Hopefully you will recognize the dataset that you want to cite from this list. Click the specific dataset to get a calendar for the month, and once you click a specific day, a data citation will be generated.

Over time as we develop new services, you will see new sections appear in the dashboard. So when you need more time to download a dataset subset that you have requested, or you want to jump quickly to a dataset that you have previously used (and bookmarked), remember the RDA user dashboard.