GISS Model E Setup for Linux Systems
This document is written as a supplement to the
GISS Model E 'How To' document.
The additional instructions contained here are to support setup and runs of the model
on the OSC Glenn System using the Portland Group compiler and on a home PC using
MinGW and the GFortran compiler. Running the model in batch mode is also covered
in this document.
Index
OSC Policies and Account Access
MinGW and Linux Documentation
Documentation for the GISS Model E
Useful Unix commands
- tar -xf - Extract files from a tarball
- tar -tvf - View files in a tarball
- gzip -d - Decompress a gzip file
- alias cmrundir="~/ModelE/wocean/cmrun" - This is the location of the fixed files. The directory modelE contains the repository of run decks.
- alias decksdir="~/ModelE/modelE1_pub/decks" - This is the location of the Makefile for generating new run decks.
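The tar and gzip commands above can be tried safely on a throwaway file before touching the model tarballs. The following is a minimal sketch of a round trip in a scratch directory; the names src, demo.txt and demo.tar are made up for this illustration and have nothing to do with Model E.

```shell
# Round trip in a scratch directory. The names src, demo.txt and
# demo.tar are made up for this illustration.
workdir=$(mktemp -d)
(
  cd "$workdir" || exit 1
  mkdir src
  echo "hello" > src/demo.txt
  tar -cf demo.tar src     # build a tarball
  gzip demo.tar            # compress it to demo.tar.gz
  rm -r src                # remove the originals

  gzip -d demo.tar.gz      # decompress back to demo.tar
  tar -tvf demo.tar        # view the files in the tarball
  tar -xf demo.tar         # extract them, recreating src/demo.txt
)
```

With GNU tar, the decompress and extract steps can also be combined as tar -xzf on the .tar.gz file.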
Install MinGW
See the MinGW Getting Started Document.
- Download the MinGW GUI Installer
- Run the installer and it will create the directory C:\MinGW.
Specify that the C++ and Fortran compilers be installed along with MSYS and the MinGW Developer Toolkit.
GFortran is automatically included. If you want G95, it needs to be installed separately.
- Add the following paths to the Windows environment variable PATH: C:\MinGW\bin and C:\MinGW\msys\1.0\bin
- MinGW-W64 is a 64-bit build of MinGW. Try the standard build first to see whether it works. Upon further investigation, the 64-bit version of MinGW is probably too buggy to use.
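After the PATH has been updated, it is worth confirming from the MSYS shell that the tools used later in this document can be found. The following is a minimal sketch; the function name missing_tools is made up for this illustration, and the list of tools is an assumption based on the commands that appear in this document.

```shell
# Print the name of each listed tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Tools used in this document; adjust the list for your system.
missing_tools gfortran gmake tar gzip
```

If nothing is printed, all of the listed tools were found.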
Install GISS Model E files
- Download the following files from http://www.giss.nasa.gov/tools/modelE/: modelE1.tar.gz and fixed.tar.gz.
For now use the IPCC AR4 version (internal version number 3.0, dated Feb. 1, 2004). When the AR5 version
is posted, we can move to it rather than chase a moving target of nightly snapshots.
- Extract the file modelE1.tar.gz in the directory ~/GISSModelE. The ~ is the
home directory on Linux and the root directory on a Windows system.
- The file modelE1.tar.gz adds an additional subdirectory of modelE1_pub. To
clean it up a little, move its files to the GISSModelE directory.
- Create a directory within GISSModelE named bcic. Take the file fixed.tar.gz and extract it in this directory.
- Generate a configuration file. This is done by running the command gmake config in the GISSModelE/decks directory. This process will place a file named .modelErc in your home directory.
- Modify the configuration file ~/.modelErc in the following way. Relative addressing is used to ensure the setup will work in batch mode. When doing interactive mode, make sure you start in the ~/GISSModelE/decks directory. Check the location and version of the netcdf files for your system. The COMPILER variable sets the compiler to the Portland Group. Also, use your own email address.
DECKS_REPOSITORY=../repository/decks
CMRUNDIR=..
GCMSEARCHPATH=../bcic
EXECDIR=../exec
NETCDFHOME=/usr/local/netcdf-3.6.2
SAVEDISK=..
MAILTO=gollmers@cedarville.edu
OVERWRITE=NO
OUTPUT_TO_FILES=YES
VERBOSE_OUTPUT=NO
MP=YES
UMASK=002
COMPILER=PGI
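A quick way to catch a mistyped or missing entry is to check the configuration file for each of the settings listed above. The following is a minimal sketch; the function name check_modelerc is made up for this illustration, and it assumes every setting appears at the start of a line in the KEY=value form shown above.

```shell
# Report any expected key that has no KEY=value line in the given file.
# Returns 0 when all keys are present and 1 otherwise.
check_modelerc() {
  rcfile=$1
  missing=0
  for key in DECKS_REPOSITORY CMRUNDIR GCMSEARCHPATH EXECDIR \
             NETCDFHOME SAVEDISK MAILTO OVERWRITE OUTPUT_TO_FILES \
             VERBOSE_OUTPUT MP UMASK COMPILER; do
    if ! grep -q "^${key}=" "$rcfile"; then
      echo "missing: $key"
      missing=1
    fi
  done
  return $missing
}

# Usage: check_modelerc "$HOME/.modelErc"
```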
Correct an error in the file Rules.make (located in the GISSModelE/model directory) for the Portland Group compiler. This is related to OpenMP compatibility. (OpenMP is an API that allows the program to distribute its computing between multiple processors and threads.) Lines 182 and 183 should be changed as follows:
From:
FFLAGS += -Msmp
LFLAGS += -Msmp
To:
FFLAGS += -mp
LFLAGS += -mp
Correct an error in the file Rules.make so that the NetCDF libraries are linked properly. The core NetCDF library is written in C, so an additional library, netcdff, is needed to map the Fortran calls to the C calls. Even with this correction, there is still a problem with processing files in the NetCDF format, but at least with this change the program will compile and run. Line 267 should be changed as follows:
From:
LIBS += -L$(NETCDFHOME)/lib -lnetcdf
To:
LIBS += -L$(NETCDFHOME)/lib -lnetcdff -lnetcdf
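Both Rules.make corrections can be applied with sed instead of by hand. The following is a minimal sketch; the function name patch_rules_make is made up for this illustration, and it assumes the three lines appear in the file exactly as quoted above, with no leading or trailing whitespace. It matches on content rather than on line numbers, and it keeps a copy of the file with the extension .orig.

```shell
# Apply both edits to a Rules.make-style file, keeping a .orig backup.
# Assumes the target lines look exactly like the "From:" lines above.
patch_rules_make() {
  rules=$1
  cp "$rules" "$rules.orig" || return 1
  sed -e 's|^\([FL]FLAGS += \)-Msmp$|\1-mp|' \
      -e 's|^\(LIBS += -L\$(NETCDFHOME)/lib\) -lnetcdf$|\1 -lnetcdff -lnetcdf|' \
      "$rules.orig" > "$rules"
}

# Usage: patch_rules_make ~/GISSModelE/model/Rules.make
```

Compare the patched file against the .orig copy (for example with diff) before compiling, since a line that differs from the assumed form is left unchanged.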
Add the missing file 1DEC1969.rsfE050AoM20A to the directory bcic. This is needed only if you are using the dynamic ocean model.
Installation changes for MinGW and GFortran environment
- Installing on the MinGW system may have a problem extracting the subdirectory aux.
Windows treats AUX as a reserved device name. Using a different extraction
program than tar, the directory is saved as _aux. It is not possible to rename
the directory; therefore, the code, makefiles and scripts need to be checked for compatibility
with the directory name. The following files need their reference to the aux directory
changed.
Filename:
Here are some comments on compiling Model E with GFortran and Windows
Most of these steps are included in the run section of these instructions. However, some explanation is included here to make sense of the steps performed in the next section. If you are trying to get a compile and run completed, skip these steps and go to the run section of this document.
- Set up a configuration file - This was described in the setup portion of this document. This configuration file (~/.modelErc), which is located in the home directory, defines variables used by script files that compile and run the model. Of primary importance is the location of key directories and which compiler to use.
- Initialize a run deck - This is accomplished using the command gmake rundeck RUN=E001smg. A new file will show up in the decks directory with the name E001smg.R. This file can be changed to use different initialization and boundary condition files from the directory bcic. For the following instructions a dynamic ocean is used. To get this model to run use the command gmake rundeck RUN=E001smg RUNSRC=E001o. This run deck is distinctly different than the default. The initialization file 1DEC1969.rsfE050AoM20A is missing in the directory bcic and must be added. Many other deck files are available. The templates of these files are in the model directory with an extension of .R.
- Modify the run deck - Edit the file E001smg.R (located in the decks directory) to use different initialization and boundary condition files. Also the end of this file indicates how long the model should run and when it should output accumulated statistics. See the Options File for more information. One change that should always be made is to change the line that says POUT to POUT_netcdf.
- Compile the model - While still in the decks directory, run the command gmake gcm RUN=E001smg. If the option VERBOSE_OUTPUT=YES was specified in the configuration file ~/.modelErc, the progress of the compile will be displayed on the screen. This is useful when first working with the model, but once compiles succeed routinely it is too much information. At that point set the configuration file option to VERBOSE_OUTPUT=NO. If a compile succeeds, an executable program named E001smg is generated and the object files from the compile are deleted. However, if the compile does not run to completion, object files are left in the GISSModelE/model directory, where they will interfere with future compiles. To reset the directories for a clean compile, use the command gmake vclean.
- Increase the stack size - The default stack size for the computer is too small for this climate model. Therefore, it needs to be increased. Use the command ulimit -s 32768
- Initialize the model - The command gmake setup RUN=E001smg runs the model through one hour of model time. This ensures that all of the initialization and boundary condition files are present along with any necessary environment variables. If this step is not successful, use the displayed output to diagnose the problem. If too much text is displayed on the terminal, the same information is saved in a text file named E001smg.PRT in the directory GISSModelE/E001smg. During a run this file also contains intermediate statistics about model parameters. If you are successful at this step in the process, you are ready to run the model. If any errors occur beyond this point, they are probably due to instabilities in the model resulting from unexpected boundary or initial conditions.
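The run deck edit described in the list above (changing POUT to POUT_netcdf) can be scripted so it is not forgotten after a new deck is generated. The following is a minimal sketch; the function name use_netcdf_output is made up for this illustration, and it assumes the string POUT does not occur in the deck as part of some other name. A copy of the deck is kept with the extension .bak.

```shell
# Rewrite every POUT in a run deck to POUT_netcdf, keeping a .bak copy.
# The first expression undoes any earlier conversion, so repeating the
# edit never produces POUT_netcdf_netcdf.
use_netcdf_output() {
  deck=$1
  cp "$deck" "$deck.bak" || return 1
  sed -e 's/POUT_netcdf/POUT/g' -e 's/POUT/POUT_netcdf/g' "$deck.bak" > "$deck"
}

# Usage (from the decks directory): use_netcdf_output E001smg.R
```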
Once you have a successful setup, the model can be run interactively or in batch mode. These instructions repeat the setup and initialization steps described above and add instructions for running and stopping the model in order to inspect data from the run. The supercomputing center has a wall time limit of 20 minutes for interactive computing. Here the wall time is counted as the time used by one processor to run a program, so if 2 processors are requested, the 20 minute wall time corresponds to 10 minutes of real time. Since the model will not complete even one month in 20 minutes of wall time, it must be stopped and restarted if you want to operate in interactive mode.
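The wall time arithmetic in the paragraph above is a simple division of the limit by the number of processors. The following is a minimal sketch; the function name real_minutes is made up for this illustration, and shell arithmetic rounds down to whole minutes.

```shell
# Real (elapsed) minutes available when a wall time limit is shared
# across processors. $1 = limit in minutes, $2 = number of processors.
real_minutes() {
  echo $(( $1 / $2 ))
}

real_minutes 20 2   # two processors: prints 10
real_minutes 20 4   # four processors: prints 5
```

With 4 processors the budget is 5 minutes of real time, which is why a 4 processor interactive run needs to be stopped before the 5 minute mark.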
- Go to the directory ~/GISSModelE/decks.
- Make sure Rules.make includes the library flag -lnetcdff and the multiple processor flag (-mp) for PGI. Rules.make is located in the directory GISSModelE/model. (Setup)
- Make sure ~/.modelErc has the right directory for the netcdf libraries. (Setup)
- Change ~/.modelErc to use multiple processors. (Setup)
- Generate the run deck using gmake rundeck RUN=E001smg RUNSRC=E001o. This uses the dynamic ocean model. If you want a prescribed ocean, delete the statement RUNSRC=E001o and it will use the default source instructions. (Initialization)
- Edit E001smg.R to use netcdf by changing POUT to POUT_netcdf. This file should be in the decks directory. (Initialization)
- Compile the program using gmake gcm RUN=E001smg. (Initialization)
- Establish the stack size using ulimit -s 32768. (Initialization)
- Run the setup command gmake setup RUN=E001smg. (Initialization)
- Change the directory to ~/GISSModelE/exec and run the command ./runE E001smg 4. This runs the model using 4 processors.
- Check on the status of the run by issuing the command ps.
- Stop the run before 5 minutes so it terminates gracefully using the command ./sswE E001smg.
- Check the status and data from the run by going to ~/GISSModelE/output/E001smg. Check the file E001smg.PRT to see how far the run has proceeded.
- Restart the run by issuing the command ../../exec/runE E001smg 4.
- Monitor the time and when the run time gets close to 20 minutes issue the command ../../exec/sswE E001smg.
- Generate data files from the accumulated monthly diagnostics using the post-processing program pdE. The command for processing a single month is ../../exec/pdE E001smg DEC1949.accE004smg.
- Continue extending the run so additional months are generated. Check documentation to see how multiple months of data can be processed to generate statistics.
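The initialization commands in the list above can be collected into a small script. The following is a minimal sketch; the names step, build_and_setup and DRY_RUN are made up for this illustration and are not part of Model E. By default it only prints the commands so the sequence can be reviewed. Setting DRY_RUN=0 executes them, which assumes the current directory is the decks directory and that gmake and the model files are installed as described earlier.

```shell
# Print each command by default; set DRY_RUN=0 to actually execute it.
step() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Generate the dynamic ocean run deck, compile, raise the stack limit
# and run the one hour setup. $1 is the run name, such as E001smg.
# Any hand edits to the run deck belong between the first two steps.
build_and_setup() {
  run=$1
  step gmake rundeck RUN="$run" RUNSRC=E001o || return 1
  step gmake gcm RUN="$run" || return 1
  step ulimit -s 32768 || return 1
  step gmake setup RUN="$run" || return 1
}

build_and_setup E001smg
```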
Go through the following steps to make sure the system is set up correctly. If this is successful then follow the next procedure to do the batch run.
- Set up the file structure within the GISSModelE directory as follows:
  - aux
  - CVS
  - doc
  - decks - Location of Makefile and place where gmake rundeck, gmake gcm and gmake setup are run
  - exec - Location of run and stop scripts and place where runE and sswE are run
  - bcic - Place boundary and initial conditions in here
  - E010smg - Compiled model and run results will be saved here. Also run pdE within this as the current directory to generate diagnostics.
  - E2000
  - model - Model fortran code is here
  - prtdag
  - repository
    - decks - Run deck will be saved here
- Set up the file ~/.modelErc in the following way
  - DECKS_REPOSITORY=../repository/decks
  - CMRUNDIR=..
  - GCMSEARCHPATH=../bcic
  - EXECDIR=../exec
  - NETCDFHOME=/usr/local/netcdf-3.6.2
  - SAVEDISK=..
  - MAILTO=gollmers@cedarville.edu
  - OVERWRITE=NO
  - OUTPUT_TO_FILES=YES
  - VERBOSE_OUTPUT=NO
  - MP=YES
  - UMASK=002
  - COMPILER=PGI
- Go to the directory ~/GISSModelE/decks.
- Generate rundeck using gmake rundeck RUN=E010smg RUNSRC=E001o.
- Check E010smg.R to make sure it is the right deck. If the diagnostics are to be saved as NetCDF, POUT needs to be replaced with POUT_netcdf.
- Compile the program using gmake gcm RUN=E010smg.
- Establish the stack size using ulimit -s 32768.
- Run the setup command gmake setup RUN=E010smg.
- Change the directory to ~/modelE/exec and run the command ./runE E010smg 4.
- Check on the status of the run by issuing the command ps.
- Stop the run before 20 minutes so it terminates gracefully using the command ./sswE E010smg.
- Check the status and data from the run by going to ~/modelE/cmrun/E010smg. Check the file E010smg.PRT to see how far the run has proceeded.
- Go back to ~/modelE/exec and restart the run by issuing the command ./runE E010smg 4.
- Monitor the time and when the run time gets close to 20 minutes issue the command ./sswE E010smg.
- Once a month of data is generated, the diagnostic program needs to be run. Generate data files from the accumulated monthly diagnostics using the post-processing program pdE. Change to the directory ~/GISSModelE/E010smg and run the command ../exec/pdE E004smg DEC1900.accE010smg.
- Continue extending the run so additional months are generated. Check documentation to see how multiple months of data can be processed to generate statistics.
Upon successful completion of the procedure given above, follow the next procedure, which uses a batch run script to submit and control the batch job. The interactive mode script runE uses the nohup and nice calls, which interfere with the batch job. Since runE ultimately calls the compiled program E010smg, that program is called directly instead. However, two environment variables need to be defined because runE is not being called. Those definitions show up in the batch script below.
- From the process given immediately above perform steps 1 - 4.
- Edit the run deck E010smg.R by changing the &INPUTZ line involving YEARE. Only run the model for 5 months. The appropriate line is changed to the following:
- YEARE=1901,MONTHE=5,DATEE=1,HOURE=0, KDIAG=0,2,2,9*0,9,
- Perform steps 6 - 8 from the previous process to compile and set up the model run.
- Generate a batch run script. The following script will be saved in file E010smg.pbs
# Give the run a name
#PBS -N E010smg
# Request 3 hours of wall time
#PBS -l walltime=03:00:00
# Use one node with 4 processors
#PBS -l nodes=1:ppn=4
# Request 2 GB of memory for running the model
#PBS -l mem=2GB
# Combine the standard output and error streams
#PBS -j oe
# Send an email when the job ends
#PBS -m e
export MP_SET_NUMTHREADS=4            # Value should match the processor number
export OMP_NUM_THREADS=4              # Value should match the processor number
cd $TMPDIR                            # Change to the batch job directory
mkdir E010smg                         # Create the subdirectory E010smg
cp $HOME/modelE/E010smg/* ./E010smg   # Copy files into that directory
mkdir bcic                            # Create the subdirectory bcic
cp -r $HOME/modelE/bcic/* ./bcic      # Copy files into that directory
cd $TMPDIR/E010smg                    # Change to the directory with the executable code
ulimit -s 32768                       # Set the stack limit to avoid an overflow error
./E010smg > ../E010smg.runoutput      # Run the executable and redirect output to a file
cd $TMPDIR                            # Change back to the batch job directory
cp -r ./* $HOME/cmrun                 # Copy all files from the batch directory to the user account directory
- Set the batch file as executable using chmod 555 E010smg.pbs
- Submit the batch job with the command qsub E010smg.pbs
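Since the two thread variables in the batch script should match the number of processors requested, it can help to check the script before submitting it. The following is a minimal sketch; the function name check_thread_count is made up for this illustration, and it assumes the ppn request and the two export lines are written in the same form as in the script above.

```shell
# Succeed only if ppn and both thread variables in a PBS script agree.
check_thread_count() {
  pbs=$1
  ppn=$(sed -n 's/^#PBS -l nodes=[0-9]*:ppn=\([0-9]*\).*/\1/p' "$pbs")
  omp=$(sed -n 's/^export OMP_NUM_THREADS=\([0-9]*\).*/\1/p' "$pbs")
  mp=$(sed -n 's/^export MP_SET_NUMTHREADS=\([0-9]*\).*/\1/p' "$pbs")
  echo "ppn=$ppn OMP_NUM_THREADS=$omp MP_SET_NUMTHREADS=$mp"
  [ -n "$ppn" ] && [ "$ppn" = "$omp" ] && [ "$ppn" = "$mp" ]
}

# Usage: check_thread_count E010smg.pbs && qsub E010smg.pbs
```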
When the model run is completed, the directory E001smg will contain a number of files. Some are necessary for the model to run, such as E, E001smg, E001smg.exe, E001ln, E001smguln, runtime_opts and flagGoStop. Files that store snapshots of model variables during the run are fort.1, fort.2, fort.8, and fort.99. Information about the run is stored in the files I, Ibp, Iij, and Ijk. Two diagnostic files are also present and have the names ODIfF and snow_debug.
As mentioned previously, error messages and model update information are sent to the screen (in interactive mode) and to the file E001smg.PRT. This file is not only useful for diagnosing why a run may have failed, but it also includes monthly and annual statistics for specified locations and latitudes in the model. When using the dynamic ocean, it also includes statistics from the oceans and the straits between oceans.
The remaining files in this directory come in pairs and should correspond to each month of the model run. If you specify a different save rate, such as daily, you will have a pair of files for each day. The first file of the pair is a restart file and contains the current state of the model variables at the end of the month. This file can be used to start a new model run with changed parameters or to restart a failed run where the output of the fort.1 and fort.2 files was corrupted. A sample file name for the restart file is 1JAN1901.rsfE001smg. The second file of the pair is an accumulated statistics file for model specified variables. These statistics are averages over the current month of the model run. If other statistics are desired for the accumulated statistics, see the How To document. A sample file name for the accumulated statistics file is JAN1901.accE001smg.
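Because the restart and accumulation files come in pairs, comparing their counts is a quick check on how far a run progressed. The following is a minimal sketch; the function names count_suffix and count_outputs are made up for this illustration, and the suffixes follow the sample file names given above.

```shell
# Count the files in directory $1 whose names end in $2.
count_suffix() {
  n=0
  for f in "$1"/*"$2"; do
    if [ -e "$f" ]; then n=$((n + 1)); fi
  done
  echo "$n"
}

# Report how many accumulation (acc) and restart (rsf) files a run
# directory holds. $1 = run directory, $2 = run name.
count_outputs() {
  echo "acc=$(count_suffix "$1" ".acc$2") rsf=$(count_suffix "$1" ".rsf$2")"
}

# Usage: count_outputs ~/GISSModelE/E001smg E001smg
```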
Processing the statistics file for analysis involves the use of the program pdE. This program takes a collection of files and performs averages. The following procedure can be used to generate averages of interest.
- Copy the acc files to the directory ~/modelE/process and go to that directory. Since the environment variables for the model use relative addressing, we need to initiate processing in a directory that is a subdirectory of ~/modelE.
- Run the command ../exec/pdE E010smg *1901.accE001smg (Yearly). The asterisk at the beginning is a wildcard character that can represent any string of characters. Therefore, any file that begins with JAN, FEB, MAR, ... and matches the rest of the character string will be averaged together. Instead of keeping the same data structure as the original accumulation file, the averages are saved in a series of files. Some files correspond to data structures within the original accumulation file and others are additional averages across latitudes or diurnal cycles at specific grid points. The following is a list of the files generated and what appears to be their significance. Each of these files begins with ANN, which stands for annual averages.
- ANN1901.accE001.smg - This file has the same data structure as the individual monthly accumulation files, but averaged over the year.
- ANN1901.ijE001.smg - Data fields in this file are two dimensional, with each point representing a grid point at a specific latitude and longitude.
- ANN1901.ijkE001.smg - Data fields in this file are three dimensional, with each point representing a specific latitude and longitude as well as a level within the atmosphere.
- ANN1901.hdiurnE001.smg - This gives the diurnal cycle for representative grid points within the model. These are hourly values at those respective points.
- ANN1901.diurnE001.smg - This gives the diurnal cycle for representative grid points within the model. These are hourly averages at those respective points.
- ANN1901.jkE001.smg - This file contains two dimensional data fields that correspond to different latitudes and levels within the atmosphere.
- ANN1901.jE001.smg - This file contains one dimensional data fields that correspond to different latitudes.
- ANN1901.icijE001.smg - This file contains two dimensional sea ice data fields that correspond to different latitudes and longitudes.
- ANN1901.ilE001.smg - This file contains two dimensional data fields that correspond to different longitudes and pressure levels at specified latitudes.
- ANN1901.wpE001.smg - This file contains the wave power for zonal and meridional winds at different latitudes and layers within the atmosphere.
- ANN1901.isccpE001.smg - This file contains values corresponding to experimentally measured values from the International Satellite Cloud Climatology Project (ISCCP).
- ANN1901.E001.smg.PRT - This file contains additional calculated statistics and also includes any error messages generated by the script pdE.
- ANN1901.oijE001.smg - This file contains two dimensional data fields for the different ocean layers in the dynamic ocean. If the dynamic ocean is not used, this file will not be generated.
- ANN1901.ojlE001.smg - This file contains two dimensional data fields for different latitudes and ocean levels. If the dynamic ocean is not used, this file will not be generated.
- ANN1901.oilE001.smg - This file contains two dimensional data fields for different longitudes and ocean levels. If the dynamic ocean is not used, this file will not be generated.
- ANN1901.otjE001.smg - This file contains ocean transport of physical properties at different latitudes. If the dynamic ocean is not used, this file will not be generated.
- Repeat this command for the other years over which the model ran.
- Download the annual files to the appropriate directory on a local machine.
- Run the command ../exec/pdE E001smg JAN*.accE001smg (Monthly). Notice the asterisk is used to select all of the January files from different years in the current directory. The same list of files described above will be generated here, but with a prefix of the month followed by the range of years over which the data was averaged. For example a resultant file name would be JAN1901-1906.accE001smg
- Repeat this command for the other months.
- Download the monthly files to the appropriate directory on a local machine.
- Run the command ../exec/pdE E001smg DEC*.accE001smg JAN*.accE001smg FEB*.accE001smg (Seasonal). The resultant average file will be named JFD1901-1906.accE001smg. Notice the first initial of each month is used to name the file. Technically you can string together any series of files for the average; however, I am not sure how the program will name the output file if it does not conform to standard statistics of interest to most climatologists.
- Repeat this command for the other seasons.
- Download the seasonal files to the appropriate directory on a local machine.
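Repeating the monthly command for all twelve months is easy to script. The following is a minimal sketch; the function name monthly_pde_commands is made up for this illustration. It only prints the pdE commands, with the wildcard left unexpanded, so they can be reviewed first. Piping the output to sh from the directory that holds the acc files would then run them.

```shell
# Print one pdE command per calendar month for the given run name.
# The asterisk is printed as-is; it is expanded only when the printed
# command is later run by a shell in the directory with the acc files.
monthly_pde_commands() {
  run=$1
  for month in JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC; do
    echo "../exec/pdE $run ${month}*.acc$run"
  done
}

monthly_pde_commands E001smg
```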
Notice that each time pdE is called it is followed with the string E001smg. The script pdE is really calling the climate model and instructing it to average the statistics. Since the control file E001smg.R specified the output to be POUT_netcdf, the post-processing output files will be in NetCDF format. If the output is specified to be POUT each of these files will be in GISS format, which is the result of a standard Fortran write command.
Note for the Dynamic Ocean - I have not been successful at getting pdE to write correct NetCDF files for dynamic ocean model runs. Therefore, the output command in the control file needs to be POUT. To get the ocean files into NetCDF format, several R scripts have been written. R is an open source statistics package, which is available for Windows, Mac and Linux systems. The following scripts are available for conversion. (Other scripts will be generated in the future as needed.)
- IJ_NetCDF.R - This takes the two dimensional surface parameters from the model and converts them to NetCDF format.
- OIJ_NetCDF.R - This takes the two dimensional ocean layer parameters and converts them to NetCDF format.
- OIL_NetCDF.R - This takes the two dimensional latitude averaged ocean layer parameters and converts them to NetCDF format.