Where's my data?

The RDA considers the ecosystem that data reside in when deciding what to archive.  For instance, we archive both NCEP GFS and FNL analyses*; they are part of the NCEP Global Data Assimilation System (GDAS).
Researchers may want to examine the data assimilated into the analyses.  These input data could be useful for future reanalyses, Observing System Simulation Experiments (OSSEs) or regional monitoring.  Thus, we archive several of the major data sources in the GDAS data flow.  You can see the data shortfall in the AIRS 2015 total web server inventory listing.

We pull data from national centers such as NCEP and monitor the data flow.  In late June and early July 2015, the ds735.0 files came in smaller than expected.  After some investigation, we traced the drop to planned network maintenance at NCEP, which reduced the satellite data ingested in the GDAS update cycle.
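
For readers who want to watch file sizes in a similar way, here is a minimal sketch of one possible check.  It is not the RDA's actual monitoring code: the function name `flag_small_files` and the 50% threshold are assumptions made for illustration.  It flags any file whose size falls well below the median of its peers.

```python
from statistics import median


def flag_small_files(sizes, threshold=0.5):
    """Return the names of files that look suspiciously small.

    sizes     -- dict mapping file name to size (any consistent unit)
    threshold -- hypothetical cutoff: flag files smaller than this
                 fraction of the median size
    """
    if not sizes:
        return []
    typical = median(sizes.values())
    # A file counts as "short" when it is below threshold * median.
    return sorted(name for name, size in sizes.items()
                  if size < threshold * typical)
```

The median is used rather than the mean so that a few short files do not drag down the baseline they are compared against.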

The investigation turned up some insights that I would like to share.  Do try this at home.
  1. View the ds735.0 Web Server Holdings for the AIRS subset.  Notice that the files are smaller than expected for June 22 and July 1, 2015 (all dates and times UTC).
  2. Click on the smaller tar files to download them one at a time.  Examination of their contents shows an absence of the 06Z file and a short 12Z file:
    total 1031864
    314608 gdas.airsev.t00z.20150701.bufr
    76416 gdas.airsev.t12z.20150701.bufr
    640840 gdas.airsev.t18z.20150701.bufr
  3. View the NCEP Real Time Data Monitoring System page.
  4. Scroll down to 'Model Data Dump Tables' and select 06z under GDS.
  5. You should see a table labeled 'GDS Dump Data Counts Time Series Plots'. Select airsev and you should see this plot showing the data drops.
  6. Now go back and try 12z and see the missing days for this cycle.
  7. Now select acars (Aircraft Communications Addressing and Reporting System), which is part of ds335.0.  Those with NCAR internal access accounts can view the inventory of corresponding big_endian/gdas.aircar.tHHz.YYYYMMDD.bufr.be files.

    What's causing these data drops?  Leave a comment with your answer.  The first correct answer will get an NCAR photo postcard autographed by the RDA data specialist team.
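
The check in step 2 can also be scripted.  The sketch below is only an illustration: it assumes member names follow the gdas.airsev.tHHz.YYYYMMDD.bufr pattern shown in the listing and that a full day has the four cycles 00, 06, 12 and 18; the function names are hypothetical.

```python
import re
import tarfile

# Assumed set of daily cycles (hours, UTC).
CYCLES = ("00", "06", "12", "18")

# Matches names like gdas.airsev.t12z.20150701.bufr, with or without a path.
NAME_RE = re.compile(r"gdas\.airsev\.t(\d{2})z\.(\d{8})\.bufr$")


def missing_cycles(names):
    """Return the expected cycles that have no matching file name."""
    found = set()
    for name in names:
        match = NAME_RE.search(name)
        if match:
            found.add(match.group(1))
    return [cycle for cycle in CYCLES if cycle not in found]


def member_sizes(tar_path):
    """Map each regular file in a downloaded tar archive to its size in bytes."""
    with tarfile.open(tar_path) as tar:
        return {m.name: m.size for m in tar.getmembers() if m.isfile()}
```

Passing the keys of `member_sizes(...)` to `missing_cycles` reports any absent cycle, and the sizes themselves show which of the remaining files are short.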
* Read What's the difference between GFS and FNL? for background information.

We have a winner in less than 2 hours.  That must have been too easy.  Now, I will send out another NCAR postcard to the first person to explain the periodicity in the Rapid Update ACARS data ingest.

1 comment:

  1. Given the 7-day periodicity of the mins, which occur on Saturday into Sunday, and also the additional min on July 4th, we're just seeing the evidence of less commercial air traffic on Saturdays.

