26 April 2017

Transferring custom file lists with Globus

The RDA has supported Globus file transfers for over two years, allowing users to transfer data files from RDA datasets to a destination endpoint of their choice using the Globus data transfer service (see http://ncarrda.blogspot.com/2015/06/transferring-rda-data-with-globus.html for details).  Globus is an efficient, reliable, and secure way to move large amounts of data from one endpoint to another.

But what if you're interested in a smaller or specific collection of files within a dataset?  In that scenario, you would navigate to the Globus web app transfer interface and browse through the files on the RDA shared endpoint to locate the files of interest.  This workflow is not ideal for search and discovery, however, because the Globus web interface offers no faceted browsing to filter the file sets and shows no supporting metadata alongside each file.

The standard Globus web app transfer interface.  In this workflow, users must browse through the full set of files within an RDA dataset to locate the specific files to be transferred.

The good news: RDA users can now select and initiate Globus transfers of custom file sets directly from the RDA web interface, without having to search for the files on the Globus transfer interface.  This method uses the Globus Browse Endpoint helper API, and the only step the user must complete on the Globus web app is selecting the destination endpoint that will receive the data files.  The advantage of this approach is that users can rely on the faceted search and discovery tools on the RDA website and avoid the bottleneck of searching "blind" through the files on the Globus web interface.
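For readers curious about how a website hands off to the Globus web app in this way, here is a minimal sketch of building a Browse Endpoint helper link.  It is not the RDA's actual implementation: the helper page address and the query parameter names (method, action, cancelurl, label, folderlimit, filelimit) are written from memory of the Globus helper page documentation and should be treated as assumptions, and the example.org addresses are hypothetical placeholders.

```python
from urllib.parse import urlencode

# Assumed address of the Globus Browse Endpoint helper page; verify it
# against the current Globus documentation before relying on it.
BROWSE_ENDPOINT_URL = "https://www.globus.org/app/browse-endpoint"


def build_browse_endpoint_url(callback_url, cancel_url, label):
    """Return a helper-page link that asks the user to pick a destination."""
    params = {
        "method": "POST",         # how Globus sends the selection back
        "action": callback_url,   # hypothetical page that receives the selection
        "cancelurl": cancel_url,  # where the user lands after cancelling
        "label": label,           # suggested label for the transfer
        "folderlimit": 1,         # the user picks exactly one destination folder
        "filelimit": 0,           # no individual files are picked at the destination
    }
    return BROWSE_ENDPOINT_URL + "?" + urlencode(params)


url = build_browse_endpoint_url(
    callback_url="https://example.org/globus/callback",
    cancel_url="https://example.org/datasets",
    label="RDA custom file list",
)
print(url)
```

The point of the design is that the website only has to remember which files were selected; Globus handles authentication and endpoint browsing, then returns the user's choice to the callback address.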

How to make a Globus transfer of custom file sets


Starting from any RDA dataset page, go to the 'Data Access' tab, then select the 'Web file listing' link associated with any of the data products listed.  At this point, you may choose either the 'Faceted browse' or 'Complete file list' option, and then follow the instructions to locate your files of interest.

File lists obtained using the 'Complete file list' (left) and 'Faceted browse' search function (right) on the RDA website.  Once a collection of files has been selected, the user initiates a Globus transfer by selecting the 'Globus' button.

To initiate a Globus transfer, select the files you wish to download, then select the 'Globus download' button.  This will redirect you to the Globus web app, where you will authenticate with Globus and select the destination endpoint that will receive the data files.  After choosing a destination endpoint, select the 'Submit' button.  You will then be redirected back to the RDA website, at which point the data transfer will be submitted on your behalf.  A message displaying the status of your data transfer will appear, and a notification e-mail will be sent to your RDA user e-mail address when the transfer has completed successfully.

The browse endpoint interface on the Globus web app, where users select the destination endpoint.
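The step in which the transfer is "submitted on your behalf" can be pictured roughly as follows.  This is only a sketch under stated assumptions, not the RDA's code: the form field names (endpoint_id, path, folder[0], label) and the layout of the transfer document are recalled from the Globus helper page and Transfer API documentation, and the endpoint IDs and file paths are made-up placeholders.  A real submission would also need a submission ID and an authenticated request to Globus, both omitted here.

```python
import posixpath


def build_transfer_document(form, source_endpoint, source_paths):
    """Combine the user's destination choice with the selected file list."""
    # Destination directory = browsed path plus the folder the user picked.
    dest_dir = posixpath.join(form["path"], form.get("folder[0]", ""))
    items = [
        {
            "DATA_TYPE": "transfer_item",
            "source_path": src,
            # Keep each file's name; place it in the chosen directory.
            "destination_path": posixpath.join(dest_dir, posixpath.basename(src)),
            "recursive": False,  # individual files, not directories
        }
        for src in source_paths
    ]
    return {
        "DATA_TYPE": "transfer",
        "source_endpoint": source_endpoint,
        "destination_endpoint": form["endpoint_id"],
        "label": form.get("label", ""),
        "DATA": items,
    }


# Hypothetical values posted back after the user selects a destination.
form = {
    "endpoint_id": "dest-endpoint-uuid",
    "path": "/~/",
    "folder[0]": "rda_data/",
    "label": "RDA custom file list",
}
doc = build_transfer_document(
    form,
    source_endpoint="rda-endpoint-uuid",
    source_paths=["/ds000.0/2017/file_a.grb", "/ds000.0/2017/file_b.grb"],
)
for item in doc["DATA"]:
    print(item["source_path"], "->", item["destination_path"])
```

Because the file list comes from the selection made on the RDA website and only the destination comes from Globus, the user never has to locate the source files on the Globus side.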

As a reminder, RDA users can log into the Globus web app with their RDA user e-mail login and password credentials; this is the recommended method for logging into Globus.  To do this, simply choose the 'NCAR RDA' organizational login on the Globus login page.