The MERRA-2 database provides a web service to download the requested information; however, this system is not flexible enough to download large amounts of data. According to their instructions, wget can be used to download the files once a file list is generated. The problem is that for certain cases, like mine, a complete dataset requires around 15k requests (one file per day since 1980). Here I document the procedure I followed to organize and download the data.
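A quick back-of-the-envelope check of that 15k figure, counting the days elapsed since 1980-01-01 (this sketch assumes GNU `date`; on BSD/macOS the `-d` flag differs):

```shell
# Count the days from 1980-01-01 to today: one MERRA-2 file per day
start=$(date -d "1980-01-01" +%s)   # epoch seconds at the start of the record
now=$(date +%s)                     # epoch seconds now
echo $(( (now - start) / 86400 ))   # roughly 15-16 thousand days
```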

Create a script to download and rename the data.

The file list generated by the database contains one URL per line, each line corresponding to one day of data. So, I created a simple script that will do the following:

  1. Download the file.
  2. Rename the file to YYYYMMDD.nc4

The resulting script:



#!/bin/bash
filelist=$1   # path to the file list generated by the database

# Make sure the output directory exists
mkdir -p data

while read -r line; do
    # Extract the date from the URL and use it as the file name
    filename=$(echo "$line" | cut -d "?" -f 2 | cut -d "." -f 6).nc4

    echo "Downloading $filename"
    wget -q --load-cookies ~/.urs_cookies \
	 --save-cookies ~/.urs_cookies \
	 --auth-no-challenge=on \
	 --keep-session-cookies \
	 --content-disposition "$line" -O "data/$filename"

    # Check that the file was actually downloaded
    if [ ! -f "data/$filename" ]; then
	echo "ERROR: $filename not downloaded!!!"
    fi
done < "$filelist"
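To see why the `cut` chain pulls out the date, here is how it behaves on a made-up URL in the style of the MERRA-2 file list (the host and query parameters below are illustrative, not a real request; the dot-field position of the date may vary between collections):

```shell
# Hypothetical MERRA-2 download URL: the date is the 6th dot-separated
# field of the query string that follows the "?"
url='https://example.gesdisc.eosdis.nasa.gov/daac-bin/OTF/HTTP_services.cgi?FILENAME=%2Fdata%2FMERRA2%2FM2T1NXSLV.5.12.4%2F1980%2F01%2FMERRA2_100.tavg1_2d_slv_Nx.19800101.nc4&FORMAT=nc4'

# Keep everything after "?", then take the 6th "."-separated field
echo "$url" | cut -d "?" -f 2 | cut -d "." -f 6
# prints: 19800101
```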

The download process will look like this:

Downloading data