Notes on GISS ModelE1

OSC Policies and Account Access

Documentation on the Model

Atmosphere

Ocean

See OceanModelNotes.html

Implement on OSC

Implement on WinXP

Code Modification

Model Runs

Notebook


Useful Unix commands

tar -xf - Extract files from a tarball
tar -tvf - View files in a tarball
gzip -d - Decompress a gzip file
alias cmrundir="~/ModelE/wocean/cmrun" - This is the location of the fixed input files; its subdirectory modelE contains the repository of run decks.
alias decksdir="~/ModelE/modelE1_pub/decks" - This is the location of the Makefile for generating new run decks.

Work Log

May 10, 2010:

Upload the GISS ModelE1 to the OSC cluster and uncompress using gzip and tar. These files are named modelE1.tar.gz and fixed.tar.gz. The first file is the model and the second is the boundary and initial conditions.

See the documentation provided at GISS ModelE

May 17, 2010:

Placed the initialization files in the subdirectory initialization. Will need to specify a path to this directory when running ModelE.

It looks like the q-flux model in ModelE may not meet the need I have for determining fluxes within the ocean. Initializing the ocean involves the following:

  1. Hold the sea surface temperatures fixed and run the model for a decade. Since the model is assumed to be at equilibrium, the average flux over a year should be zero. Because it is not, the net flux needs to be advected away from each ocean cell to meet that condition, either to neighboring cells or by diffusing down to the deep ocean. This initial run determines the values of the fluxes.
  2. Next, the sea surface temperatures are allowed to change, but the generated flux file is used to simulate energy transport within the ocean layer. This forms a baseline run against which to compare runs with different values.
  3. Finally, change some parameters, forcings, or initializations and see how the model responds to these changes. This is considered a sensitivity study.

May 18, 2010:

To set up ModelE you need to upload the program code and a fixed dataset file. The program code is largely self-contained and locates important files relative to a root directory (by default this is the directory modelE1_pub). The decks directory contains a Makefile. Run the command gmake config in this directory and it will generate a file named .modelErc in the home directory. This file provides paths to directories important to the model run. By default they use a root directory of /u, which is used as a central directory at the GISS site. For my account on glenn.osc.edu, I will set this directory to /nfs/06/ced0013/ModelE/wocean. The location of NetCDF is /usr/local/netcdf-3.6.2. The fixed dataset file must be placed in the directory /nfs/06/ced0013/ModelE/wocean/cmrun.
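
The directory entries in my .modelErc should then end up looking roughly like the sketch below. The variable names are the ones that appear later in this log; the generated file may contain additional settings, so treat this only as a sketch.

	# ~/.modelErc (sketch of the directory-related entries for the wocean setup)
	DECKS_REPOSITORY=/nfs/06/ced0013/ModelE/wocean/cmrun/modelE/decks
	CMRUNDIR=/nfs/06/ced0013/ModelE/wocean/cmrun
	GCMSEARCHPATH=/nfs/06/ced0013/ModelE/wocean/cmrun
	EXECDIR=/nfs/06/ced0013/ModelE/wocean/exec
	SAVEDISK=/nfs/06/ced0013/ModelE/wocean
	NETCDFHOME=/usr/local/netcdf-3.6.2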

The rundeck must be set up. For this first run, use the file E001.R, which uses fixed sea surface temperatures. Start in the directory ~/ModelE/modelE1_pub/decks and run the command gmake rundeck RUN=E001smg. This takes the sample deck file E001.R and copies it to the current directory as well as to the repository /nfs/06/ced0013/ModelE/wocean/cmrun/modelE/decks.

Now compile the program by running the command gmake gcm RUN=E001smg. Upon success, run the first hour setup using the command gmake setup RUN=E001smg. If this is successful, the rest of the run can be made.

Troubleshooting: The compile did not work, evidently because it could not find the compiler. I went back to the file .modelErc, uncommented the compiler option, and changed the value to Portland Group. To set up the deck again use the command gmake rundeck RUN=E001smg OVERWRITE=YES. The rundeck was generated; however, when compiling the model, it gave a new error stating that there is no rule to make a target. Since this did not happen the first time, I assume that some files generated on the first compile are interfering with this compile. Therefore, I will try generating a new deck called E002smg. In order for the E001.R file to be used, the command for generating the deck is now gmake rundeck RUN=E002smg SRC=E001. If SRC is not specified, the default is E001.
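
For reference, the deck and build commands in play at this point (all run from the decks directory):

	cd ~/ModelE/modelE1_pub/decks
	gmake rundeck RUN=E001smg                  # new deck copied from the sample E001.R
	gmake rundeck RUN=E001smg OVERWRITE=YES    # regenerate an existing deck
	gmake rundeck RUN=E002smg SRC=E001         # new deck E002smg, still based on E001.R
	gmake gcm RUN=E002smg                      # compile the model for that deck
	gmake setup RUN=E002smg                    # run the first-hour setup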

Dealing with the following error

	Can't open ./.depend.E004smg: No such file or directory at -e line 1.
	------------          Rebuilding Dependencies             -------------
	running CPP
	Requested target is not supported on Linux
	compiling RES_M12.f ... Requested target is not supported on Linux
	gmake[1]: *** [RES_M12.o] Error 2
	

After several hours of poking around in Makefile and other files in the directory, I came across the file model/Rules.make. This file is called by Makefile and sets a number of compiler flags. Looking through the list I found a conditional for a compiler of PGI (Portland Group). In the file .modelErc it was not clear how the compiler flag should be defined. I assumed it was either Portland or Portland Group. By changing it to PGI, the compile was successful. Now test the compile by using the command gmake setup RUN=E001smg.
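
In other words, the working entry in ~/.modelErc amounts to the line below. I am writing the option as COMPILER here; whatever the exact name of the commented-out option in the generated file is, the value that Rules.make recognizes is PGI.

	COMPILER=PGI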

A successful setup was accomplished. Now the model needs to be run. This particular run lasts for 6 days. It is initiated with the following command from the decks directory: ../exec/runE E001smg. If the run is to be interrupted gracefully, use the command ../exec/sswE E001smg. The run can be resumed by using the command ../exec/runE E001smg again.

May 19, 2010:

The run done at the end of the day yesterday did not complete correctly. After about 20 minutes it ended without notice. Looking through the file wocean/E001smg/nohup.out I saw the following message

	./E001smg: line 32: 29344 Killed                  ./E001smg.exe -i ./$IFILE > $PRTFILE
	./E001smg: line 37: /nfs/06/ced0013/ModelE/wocean/exec/runpmE: No such file or directory
	./E001smg: line 37: exec: /nfs/06/ced0013/ModelE/wocean/exec/runpmE: cannot execute: No such file or directory
	

Looking through the documentation, it seems that the files contained in modelE1_pub/exec should also be present in the directory defined by the parameter EXECDIR in the file .modelErc. As a result of this conflict, I think it would be best to redefine the location of the root directory for the model run. Up to now it has been wocean. I think it should be the directory modelE1_pub. Since this change will involve a little work, it might be good to move the directory modelE1_pub to a location closer to the home directory. Let's start from scratch and set up using the following steps.

  • Move modelE1_pub to the home directory and rename it modelE1
  • Move cmrun to a subdirectory of modelE1
  • Change the directory locations in the file ~/.modelErc
  • Create directory ~/modelE/output for the output files
  • In the directory ~/modelE/decks run the command gmake rundeck RUN=E001smg
  • Now run the command gmake gcm RUN=E001smg to compile the program
  • Run the command gmake setup RUN=E001smg to run the diagnostic first hour
  • Change to the directory ~/modelE/exec and run the command runE E001smg. Hopefully this will end gracefully, unlike last night.

This new run terminated with the following message in nohup.out
	./E001smg: line 32: 14640 Killed                  ./E001smg.exe -i ./$IFILE > $PRTFILE
	

I'm not sure why it used the term killed; however, it cleaned up working files and it left the following three files from the run. It seems that it terminated gracefully. This run took 20 minutes.

  • 1JAN1950.rsfE001smg - Size of 19954612
  • DEC1949.accE001smg - Size of 4792472
  • E001smg.PRT - Size of 769401

An email message was sent that contained some output values from the model, and the output does not terminate cleanly. On the output for Cloud Frequency for 60S to 30S latitude, only 3 values are reported for the 900 level of PRESSTAU before the output is cut off by the message No automatic fixup for return code: 137. A clipping from the email message is as follows:

	 0ISCCP CLOUD FREQUENCY (NTAU,NPRES) % 60S-30S               PARTIAL             
	 ------------------------------------------------------------------------
	   PRESSTAU    0.  1.3  3.6  9.4  23   60   > 
	      90         0.0  0.0  0.0  0.0  0.0  0.0
	     245         0.5  0.3  0.3  0.5  0.9  1.3
	     375         0.3  0.8  1.3  2.8  3.4  3.0
	     500         0.1  0.6  2.2  4.2  3.1  1.0
	     630         0.3  1.9  8.1  8.5  2.0  0.2
	     740         0.2  1.1  2.5  1.6  0.2  0.0
	     900         0.3  2.5  7.4  4.No automatic fixup for return code: 137
	

Try a new run using the multiple processors flag. This flag is located in the file ~/.modelErc. Set it to YES and run the previous series of steps for an identical run.

  1. In the directory ~/modelE/decks run the command gmake rundeck RUN=E002smg
  2. Now run the command gmake gcm RUN=E002smg to compile the program
  3. Run the command gmake setup RUN=E002smg to run the diagnostic first hour
  4. Change to the directory ~/modelE/exec and run the command ./runE E002smg.

Ran into a snag on the compile (step 2). A comment in the file ~/.modelErc indicated that the multiple processor flag is only recognized by SGI and Compaq. The conclusion is that this is not supported in Linux. The compile terminated with the following statement when it was trying to link the executable

	linking executable
	pgf90-Error-Unknown switch: -Msmp
	gmake[1]: *** [/nfs/06/ced0013/modelE/decks/E002smg_bin/E002smg.exe] Error 1
	gmake: *** [gcm] Error 2
	

All the modules seem to compile just fine. I need to track down the switch -Msmp and see if there is an equivalent option supported by the Portland Group compiler. The only change from the previous compile and run would be the incorporation of OpenMP instructions.

Ran the command grep -r Msmp modelE to identify files that contain the string Msmp. The error message line showed up in a hidden file supposedly in ~/modelE/model; however, I could not find it in a normal file list. The other two hits were in the file ~/modelE/model/Rules.make. Scrolling through this file, there are two lines FFLAGS += -Msmp and LFLAGS += -Msmp. These are Fortran compiler and linker flags. There is a comment in this file that these may need to be changed for PGI OpenMP compatibility???. I would agree. From the Portland Group web site it seems the options should be changed to -mp.
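
The change described above amounts to the following edit in the PGI section of model/Rules.make (-mp is the Portland Group OpenMP switch):

	# FFLAGS += -Msmp
	# LFLAGS += -Msmp
	FFLAGS += -mp
	LFLAGS += -mp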

Run gmake clean and gmake vclean to get rid of object files from the failed compile. Now repeat step 2 and follow with steps 3 and 4. Step 2 resulted in a clean compile; however, step 3 resulted in a Segmentation fault. The error message is as follows

	setting up run E002smg
	output files will be saved in /nfs/06/ced0013/modelE/output/E002smg
	using /nfs/06/ced0013/modelE/cmrun/AIC.RES_M12.D771201 for IC only
	using /nfs/06/ced0013/modelE/cmrun/GIC.E046D3M20A.1DEC1955 for IC only
	starting the execution
	current dir is /nfs/06/ced0013/modelE/output/E002smg
	Starting 1st hour in the background.
	
	-bash-3.2$ sh: line 5: 23876 Segmentation fault      ./"E002smg".exe -i I >> E002smg.PRT
	 Problem encountered while running hour 1 :
	cat: error_message: No such file or directory
	 >>>  <<<
	

It is possible the segmentation fault is due to a stack size that is too small. Originally I looked at the stack size and it said unlimited; however, the command ulimit -s reported a stack size of about 10240 kbytes. After raising it with ulimit -s 32768, running step 3 was successful.
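
For later setups the check-and-raise sequence is simply:

	ulimit -s          # reported about 10240 kbytes here, even though it first looked unlimited
	ulimit -s 32768    # raise the stack limit before running gmake setup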

Running the multiple-processor version of the program also resulted in a kill at 20 minutes, in fact at exactly 20 minutes. It may be a limit placed on programs run on glenn.osc.edu to prevent one job from dominating the cluster. The email sent by this run seems to be a portion of the file E002smg.PRT, and an error code of 137 shows up at the end of this run as well. The program was not at the same place; this time it stopped in a series of restart fields for H2O generated by CH4 in the stratosphere. The following is a clip from this email.

E001smg
	total 98956
	-rw-rw-r-- 1 ced0013 PCED0003 19954612 May 19 14:06 1JAN1950.rsfE001smg
	-rw-rw-r-- 1 ced0013 PCED0003  4792472 May 19 14:06 DEC1949.accE001smg
	lrwxrwxrwx 1 ced0013 PCED0003        7 May 19 13:35 E -> E001smg
	-rwxrwxr-x 1 ced0013 PCED0003      838 May 19 13:35 E001smg
	-rwxrwxr-x 1 ced0013 PCED0003 16156865 May 19 13:30 E001smg.exe
	-rwxrwxr-x 1 ced0013 PCED0003     2572 May 19 13:35 E001smgln
	-rw-rw-r-- 1 ced0013 PCED0003   769401 May 19 14:07 E001smg.PRT
	-rwxrwxr-x 1 ced0013 PCED0003      382 May 19 13:35 E001smguln
	-rw-rw-r-- 1 ced0013 PCED0003        9 May 19 13:46 flagGoStop
	-rw-rw-r-- 1 ced0013 PCED0003 29668300 May 19 14:05 fort.1
	-rw-rw-r-- 1 ced0013 PCED0003 29668300 May 19 14:06 fort.2
	-rw-rw-r-- 1 ced0013 PCED0003     3807 May 19 13:47 fort.8
	-rw-rw-r-- 1 ced0013 PCED0003   102400 May 19 13:47 fort.99
	-rw-rw-r-- 1 ced0013 PCED0003      796 May 19 14:07 I
	-rw-rw-r-- 1 ced0013 PCED0003      231 May 19 13:35 Ibp
	-rw-rw-r-- 1 ced0013 PCED0003    16063 May 19 13:35 Iij
	-rw-rw-r-- 1 ced0013 PCED0003     9606 May 19 13:35 Ijk
	-rw------- 1 ced0013 PCED0003       87 May 19 14:07 nohup.out
	-rwxrwxr-x 1 ced0013 PCED0003      165 May 19 13:35 runtime_opts
	
E002smg
	total 74220
	lrwxrwxrwx 1 ced0013 PCED0003        7 May 19 15:11 E -> E002smg
	-rwxrwxr-x 1 ced0013 PCED0003      838 May 19 15:11 E002smg
	-rwxrwxr-x 1 ced0013 PCED0003 16363137 May 19 15:11 E002smg.exe
	-rwxrwxr-x 1 ced0013 PCED0003     2572 May 19 15:11 E002smgln
	-rw-rw-r-- 1 ced0013 PCED0003    27224 May 19 15:37 E002smg.PRT
	-rwxrwxr-x 1 ced0013 PCED0003      382 May 19 15:11 E002smguln
	-rw-rw-r-- 1 ced0013 PCED0003        9 May 19 15:16 flagGoStop
	-rw-rw-r-- 1 ced0013 PCED0003 29668300 May 19 15:36 fort.1
	-rw-rw-r-- 1 ced0013 PCED0003 29668300 May 19 15:36 fort.2
	-rw-rw-r-- 1 ced0013 PCED0003     3807 May 19 15:17 fort.8
	-rw-rw-r-- 1 ced0013 PCED0003   102400 May 19 15:17 fort.99
	-rw-rw-r-- 1 ced0013 PCED0003      797 May 19 15:37 I
	-rw-rw-r-- 1 ced0013 PCED0003      231 May 19 15:11 Ibp
	-rw-rw-r-- 1 ced0013 PCED0003    16063 May 19 15:11 Iij
	-rw-rw-r-- 1 ced0013 PCED0003     9606 May 19 15:11 Ijk
	-rw------- 1 ced0013 PCED0003       87 May 19 15:37 nohup.out
	-rwxrwxr-x 1 ced0013 PCED0003      165 May 19 15:11 runtime_opts
	

The purpose of these files appears to be the following:

  • 1JAN1950.rsfE001smg - Restart file for when the model was stopped.
  • DEC1949.accE001smg - Diagnostic file which contains all the diagnostic information from the previous month.
  • E -> E001smg - A link to the shell script E001smg.
  • E001smg - Shell script to run the model.
  • E001smg.exe - Compiled executable for this model run.
  • E001smgln - Shell script to link references to initialization and boundary condition files.
  • E001smg.PRT - Report log of the E001smg run. Includes actions performed and certain diagnostic values and arrays.
  • E001smguln - Shell script to unlink references to initialization and boundary condition files.
  • flagGoStop - Text flag to indicate whether the model should continue or stop. Probably changed by runE and sswE.
  • fort.1 - Binary data file.
  • fort.2 - Binary data file.
  • fort.8 - Parameter settings for this model run. Shows up again in the file I.
  • fort.99 - Data arrays (text format) for 3 layers for aerosols (sulfate, sea salt, nitrate, and organic).
  • I - Parameter settings for E001smg. At the end it includes the message given by nohup.out. Before this message it indicates that the model Stopped at 1-01-1950. I assume this was stopped to save the data fields for the first year's run. This indicates that 1 month runs in less than 20 minutes with a single processor.
  • Ibp - List of budget pages
  • Iij - Gives a descriptive list for fields shown as maplets, fields shown as 1-pg maps, fields in binary output, and 3-d fields.
  • Ijk - Gives a descriptive list for the JL-fields (86 of them) and the JK-fields (88 of them)
  • nohup.out - Contains the last error message issued by the process.
  • runtime_opts - This is a shell script that looks at the stack size limits.

Running the program pdE on the file DEC1949.accE001smg generated the following files:

  • DEC1949.diurnE001smg
  • DEC1949.E001smg.PRT
  • DEC1949.hdiurnE001smg
  • DEC1949.icijE001smg
  • DEC1949.ijE001smg - Latitude-longitude binary file
  • DEC1949.ijkE001smg
  • DEC1949.ilE001smg - Longitude-height binary file
  • DEC1949.isccpE001smg
  • DEC1949.jE001smg - Zonal budget pages ASCII file
  • DEC1949.jkE001smg - Latitude-height binary file
  • DEC1949.wpE001smg - Wave power binary file

May 20, 2010:

Looking through the documentation, there are two things to try. The first is to use multiple processors: although the code is compiled for OpenMP, the number of processors needs to be specified in the runE command. The second is to have the data saved in netCDF format, which must be specified in the rundeck before compiling the program. A third run will now be initiated using these two changes. The procedure list is as follows:

  1. Start in directory ~/modelE/decks and make the rundeck with the instruction gmake rundeck RUN=E003smg.
  2. Open the rundeck E003smg.R and change the parameter POUT to POUT_netcdf.
  3. Compile the program using the instruction gmake gcm RUN=E003smg.
  4. Run the setup command gmake setup RUN=E003smg. Don't forget to set the stack limit with ulimit -s 32768.
  5. Change to the directory ~/modelE/exec and run the command ./runE E003smg 4.

Got stopped in step 3 (the compile) because the linker could not find the appropriate libraries. This may go back to the problem with compiling CAM 3.0 last year. It may be a matter of compiler flags, because the NetCDF directory is defined in the .modelErc parameter NETCDFHOME. Opened the file ~/modelE/model/Rules.make and looked for netcdf. The options are defined for SGI machines, but not for the Portland Group. The relevant section was changed as follows:

Before
	#
	# Check for extra options specified in modelErc
	#
	
	ifdef NETCDFHOME
	ifeq ($(MACHINE),SGI)
	  LIBS += -L$(NETCDFHOME)/lib64 -lnetcdf
	else
	  LIBS += -L$(NETCDFHOME)/lib -lnetcdf
	endif
	  FFLAGS += -I$(NETCDFHOME)/include
	  INCS += -I $(NETCDFHOME)/include
	endif
	
	#
	# Pattern  rules
	#
	
After
	#
	# Check for extra options specified in modelErc
	#
	
	ifdef NETCDFHOME
	ifeq ($(MACHINE),SGI)
	  LIBS += -L$(NETCDFHOME)/lib64 -lnetcdf
	else
	#  LIBS += -L$(NETCDFHOME)/lib -lnetcdf
	  LIBS += -L$(NETCDFHOME)/lib -netcdf
	endif
	  FFLAGS += -I$(NETCDFHOME)/include
	  INCS += -I $(NETCDFHOME)/include
	endif
	
	#
	# Pattern  rules
	#
	

Can't get the netcdf portion to work yet. Changing the library flags did not work. Also, from the CAM 3.0 issues, it was necessary to run the command module load netcdf, which sets environment variables; this did not fix it either. I may need to contact OSC to resolve this issue. I also tried the multiple-processors option in step 5. It appears to have worked; however, the job was terminated after about 5 minutes since I was using 4 processors. For the time being I am going to run E001smg again and see if it restarts correctly from the restart file. If so, I should be able to get another year of simulation in. Go to the file .modelErc, change multiple processors to NO, and then use the command ./runE E001smg.

Here is some useful information when planning a batch job to run more than 20 minutes

  • 1 month of model time takes 20 minutes of clock time on the computer. When using multiple processors, the time does not divide cleanly by processor count: where a month would be expected to take 5 minutes with 4 processors, it looks like it would take 8.2 minutes.
  • Each month of data puts out two files. The accumulated statistics file is 4,792,472 bytes and the restart file is 19,954,612 bytes.
  • The remaining files in the directory E001smg have a total size of 74,224,000 bytes.
  • The files in the directory cmrun (initialization and boundary files) total 281,708,000 bytes.

May 27, 2010:

Task List of things to work on for ModelE

  • Get a batch file run set up and successfully run.
  • Get a successful compile using NetCDF.
  • Get a compile on a Linux system using gfortran

When running a batch job, a TMP directory needs to be generated. All of the working files need to be copied to this directory, and any data files need to be copied back before the batch job finishes. The following parameters determine the required size of the TMP directory.

  • Run files - 74 MB
  • Initialization - 282 MB
  • Generated Data - 25 MB/month

For a 6-year run (the default for E001) the TMP directory should hold about 2.2 GB, and the run should take about 24 hours of computer time.
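
As a quick sanity check on those numbers, using the per-month figures above (6 years = 72 months):

	echo $(( 74 + 282 + 25*72 )) MB      # 2156 MB, about 2.2 GB of TMP space
	echo $(( 72*20/60 )) hours           # 24 hours of computer time at 20 min/month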

June 1, 2010:

Attempt to compile E003smg.R using POUT_netcdf.f. This failed on May 20th and was never resolved. Tried it again this time and got the same errors. The following are the error messages generated, which are due to the linker not finding the compiled netcdf library.

	compiling POUT_netcdf.f ... Done
	linking executable
	POUT_netcdf.o: In function `ncout_open_out_':
	/nfs/06/ced0013/modelE/model/./POUT_netcdf.F:100: undefined reference to `nf_create_'
	/nfs/06/ced0013/modelE/model/./POUT_netcdf.F:104: undefined reference to `nf_put_att_text_'
	/nfs/06/ced0013/modelE/model/./POUT_netcdf.F:104: undefined reference to `nf_put_att_text_'
	POUT_netcdf.o: In function `ncout_def_dim_out_':
	/nfs/06/ced0013/modelE/model/./POUT_netcdf.F:119: undefined reference to `nf_def_dim_'
	POUT_netcdf.o: In function `ncout_close_out_':
	/nfs/06/ced0013/modelE/model/./POUT_netcdf.F:154: undefined reference to `nf_close_'
	POUT_netcdf.o: In function `wrtgattc_':
	/nfs/06/ced0013/modelE/model/./POUT_netcdf.F:171: undefined reference to `nf_put_att_text_'
	POUT_netcdf.o: In function `defarr_':
	/nfs/06/ced0013/modelE/model/./POUT_netcdf.F:180: undefined reference to `nf_redef_'
	/nfs/06/ced0013/modelE/model/./POUT_netcdf.F:180: undefined reference to `nf_inq_varid_'
	/nfs/06/ced0013/modelE/model/./POUT_netcdf.F:186: undefined reference to `nf_def_var_'
	/nfs/06/ced0013/modelE/model/./POUT_netcdf.F:189: undefined reference to `nf_put_att_text_'
	/nfs/06/ced0013/modelE/model/./POUT_netcdf.F:192: undefined reference to `nf_put_att_text_'
	/nfs/06/ced0013/modelE/model/./POUT_netcdf.F:195: undefined reference to `nf_put_att_real_'
	/nfs/06/ced0013/modelE/model/./POUT_netcdf.F:199: undefined reference to `nf_put_att_real_'
	/nfs/06/ced0013/modelE/model/./POUT_netcdf.F:199: undefined reference to `nf_enddef_'
	POUT_netcdf.o: In function `setup_arrn_':
	/nfs/06/ced0013/modelE/model/./POUT_netcdf.F:262: undefined reference to `nf_inq_varid_'
	POUT_netcdf.o: In function `wrtdarrn_':
	/nfs/06/ced0013/modelE/model/./POUT_netcdf.F:301: undefined reference to `nf_put_vara_double_'
	POUT_netcdf.o: In function `wrtrarrn_':
	/nfs/06/ced0013/modelE/model/./POUT_netcdf.F:318: undefined reference to `nf_put_vara_real_'
	POUT_netcdf.o: In function `wrtiarrn_':
	/nfs/06/ced0013/modelE/model/./POUT_netcdf.F:330: undefined reference to `nf_put_vara_int_'
	POUT_netcdf.o: In function `wrtcarrn_':
	/nfs/06/ced0013/modelE/model/./POUT_netcdf.F:342: undefined reference to `nf_put_vara_text_'
	gmake[1]: *** [/nfs/06/ced0013/modelE/decks/E003smg_bin/E003smg.exe] Error 2
	gmake: *** [gcm] Error 2
	

Check to make sure that the libraries on the OSC computer match the parameter defined in ~/.modelErc. The version of NetCDF checked out. When looking at the environment variables set by module load netcdf, the libraries included both -lnetcdf and -lnetcdff. This second library is not included in the file Rules.make. Editing this file to include -lnetcdff before -lnetcdf resulted in a successful compile. The second library contains the Fortran interface for netCDF: the routines are implemented as C functions and then mapped to Fortran subroutines. The changed section of Rules.make is as follows:

After
	#
	# Check for extra options specified in modelErc
	#
	
	ifdef NETCDFHOME
	ifeq ($(MACHINE),SGI)
	  LIBS += -L$(NETCDFHOME)/lib64 -lnetcdf
	else
	#  LIBS += -L$(NETCDFHOME)/lib -lnetcdf
	  LIBS += -L$(NETCDFHOME)/lib -lnetcdff -lnetcdf
	endif
	  FFLAGS += -I$(NETCDFHOME)/include
	  INCS += -I $(NETCDFHOME)/include
	endif
	

Now recompile E003smg to do netcdf output as well as multiple processors. This will be done interactively for now so I have some results to evaluate. After this is successful, I will move on to defining a batch job. Instructions for setting up E003smg:

  1. Go to the directory ~/modelE/decks.
  2. Make sure Rules.make includes the compiler library -lnetcdff and the multiple-processor flag (-mp) for PGI.
  3. Make sure ~/.modelErc has the right directory for the netcdf libraries.
  4. Change ~/.modelErc to use multiple processors.
  5. Generate rundeck using gmake rundeck RUN=E003smg.
  6. Edit E003smg.R to use netcdf by changing POUT to POUT_netcdf.
  7. Compile the program using gmake gcm RUN=E003smg.
  8. Establish the stack size using ulimit -s 32768.
  9. Run the setup command gmake setup RUN=E003smg.
  10. Change the directory to ~/modelE/exec and run the command ./runE E003smg 4.
  11. Check on the status of the run by issuing the command ps.
  12. Stop the run before 5 minutes so it terminates gracefully using the command ./sswE E003smg.
  13. Check the status and data from the run by going to ~/modelE/output/E003smg. Check the file E003smg.PRT to see how far the run has proceeded.
  14. Restart the run by issuing the command ../../exec/runE E003smg 4.
  15. Monitor the time and when the run time gets close to 20 minutes issue the command ../../exec/sswE E003smg.
  16. Generate data files from the accumulated monthly diagnostics using the post-processing program pdE. The command for processing a single month is ../../exec/pdE E003smg DEC1949.accE003smg.
  17. Continue extending the run so additional months are generated. Check documentation to see how multiple months of data can be processed to generate statistics.

Using the NetCDF output and post-processing the accumulated statistics with pdE, the following files were generated:

  • DEC1949.diurnE001smg.nc
  • DEC1949.E001smg.PRT
  • DEC1949.hdiurnE001smg.nc
  • DEC1949.icijE001smg.nc
  • DEC1949.ijE001smg.nc - Latitude-longitude binary file
  • DEC1949.ijkE001smg.nc
  • DEC1949.ilE001smg.nc - Longitude-height binary file
  • DEC1949.isccpE001smg.nc
  • DEC1949.jE001smg.nc - Zonal budget pages ASCII file
  • DEC1949.jkE001smg.nc - Latitude-height binary file
  • DEC1949.wpE001smg.nc - Wave power binary file

After 3 months of model time on the E003smg run, I did not terminate the run before the OSC 20-minute limit was hit. As a result, the run did not terminate cleanly and will not restart. Not sure how to resolve this. I am subscribing to the mailing list at giss-gcm-users-l@giss.nasa.gov. Hopefully I will get a useful response.

Initial Ocean Model run

I am going to set up a dynamic ocean model run to see what data is saved at the end of each month. The rundeck that has the Russell dynamic ocean is E001o.R. This needs to be specified when setting up the initial run deck. Go through the following procedure to generate run E004smg, which is a modification of the previous run to get a dynamic ocean:

ModelE run procedure on OSC

  1. Go to the directory ~/modelE/decks.
  2. Make sure Rules.make includes the compiler library -lnetcdff and the multiple-processor flag (-mp) for PGI.
  3. Make sure ~/.modelErc has the right directory for the netcdf libraries.
  4. Change ~/.modelErc to use multiple processors.
  5. Generate rundeck using gmake rundeck RUN=E004smg RUNSRC=E001o.
  6. Edit E004smg.R to use netcdf by changing POUT to POUT_netcdf.
  7. Compile the program using gmake gcm RUN=E004smg.
  8. Establish the stack size using ulimit -s 32768.
  9. Run the setup command gmake setup RUN=E004smg.
  10. Change the directory to ~/modelE/exec and run the command ./runE E004smg 4.
  11. Check on the status of the run by issuing the command ps.
  12. Stop the run before 5 minutes so it terminates gracefully using the command ./sswE E004smg.
  13. Check the status and data from the run by going to ~/modelE/output/E004smg. Check the file E004smg.PRT to see how far the run has proceeded.
  14. Restart the run by issuing the command ../../exec/runE E004smg 4.
  15. Monitor the time and when the run time gets close to 20 minutes issue the command ../../exec/sswE E004smg.
  16. Generate data files from the accumulated monthly diagnostics using the post-processing program pdE. The command for processing a single month is ../../exec/pdE E004smg DEC1949.accE004smg.
  17. Continue extending the run so additional months are generated. Check documentation to see how multiple months of data can be processed to generate statistics.

Ran the dynamic ocean run out to March. Download the files and look at them later.

Looking at the data files at home with Panoply, the missing data entries were treated as real values, which messed up the autoscaling in Panoply. I contacted the author (Robert Schmunk) and he said that the data attribute within the NetCDF file should specify missing_value. I did a grep on the data files and found that only the jk files included the missing_value designation; all the other data files were missing it. I thought I would need to change the modelE code to include it; however, I found a simpler solution. There is a collection of netCDF software tools that can be run from a Linux command line, called the netCDF Operators (NCO). I considered installing these at OSC, but the install packages for precompiled code would probably need administrator privileges. Instead I have installed them on my VirtualBox Linux installation. Putting the netCDF files into the shared directory, I can modify them with the following command:

	ncatted -O -a missing_value,,c,f,-1.0e30 inout.nc
	

The utility program is called ncatted, which edits the attributes of a netCDF file. The option -O means overwrite the file. The option -a specifies the attribute to work on, with the following parameters: attribute name, name of the data field to change (left blank here, which applies it to all variables), c to create the attribute if not present, f for floating point, and the value of the attribute. The last argument is the name of the netCDF file to modify. Check the documentation to find other ways this program and the other NCO tools can modify netCDF files.
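
For example, applied to one of the files from the E003smg post-processing (the file name here is just illustrative) and then checked with ncdump:

	ncatted -O -a missing_value,,c,f,-1.0e30 DEC1949.ijE003smg.nc
	ncdump -h DEC1949.ijE003smg.nc | grep missing_value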

June 3, 2010

Made a mistake on run E004smg: it did not include the dynamic ocean model. Not sure where the error occurred. Began a new project, E005smg, using the dynamic ocean model. The model compiled fine; however, it generated errors on the setup. The following is the error message:

	--------------------------------------------------------------
	---------       GCM successfully compiled            ---------
	---------    executable E005smg.exe was created     ---------
	--------------------------------------------------------------
	gmake[1]: warning:  Clock skew detected.  Your build may be incomplete.
	---------       Looks like it was compiled OK        ---------
	----- Saving Rundeck and other info to global repository -----
	---------        Starting setup for E005smg          ---------
	--------------------------------------------------------------
	Using settings from ~/.modelErc
	CMRUNDIR = /nfs/06/ced0013/modelE/cmrun
	EXECDIR = /nfs/06/ced0013/modelE/exec
	GCMSEARCHPATH = /nfs/06/ced0013/modelE/cmrun
	SAVEDISK = /nfs/06/ced0013/modelE/output
	MAILTO = gollmers@cedarville.edu
	UMASK = 002
	setting up run E005smg
	output files will be saved in /nfs/06/ced0013/modelE/output/E005smg
	1DEC1969.rsfE050AoM20A not found in /nfs/06/ced0013/modelE/cmrun
	using /nfs/06/ced0013/modelE/cmrun/1DEC1969.rsfE050AoM20A for IC only
	gmake[1]: *** [setup_script] Error 1
	gmake: *** [setup] Error 2
	

Need to find the initialization file for the ocean. Looking through the run script, it appears that the file E001o.R is set up to run a full ocean based on a spin-up run of 320 years. I would assume this is done to reach some equilibrium state. The restart file for 1DEC1969 is used instead of the initialization files AIC, GIC and OIC. The script E005smg.R was changed to comment out the restart file and to uncomment the definitions of AIC, GIC and OIC. Running setup now resulted in a read error on the file defined for AIC. I assumed it was due to an incompatible model resolution. Looking at the other run scripts to see which AIC initialization file is appropriate, the model resolution turns out to be fine. Looking at the PRT file in the directory E005smg, it indicates that it encountered an error reading the restart file for ISTART=8. Since this ocean rundeck is similar to the script E001M20A.R, I should compare values between the two files. The values for AIC and GIC are correct, but ISTART=2 in that file. I will change the value of ISTART to 2 and see if setup works. I will also change the dates for the simulation so that it runs for 6 years rather than the century generated by default.

Looking at the results from E005smg indicates that the ocean model has problems. The conversion of the accumulated data did not complete cleanly. When looking at oij, there is only one data field (ocean potential temperature) and it has the same value of 9.96921e+36 everywhere. Need to look at the ocean code and see if other fields should be recorded.

June 7, 2010

I want to get the model running on a desktop system. It seems this is best accomplished by using Ubuntu in a virtual machine. The model and generated data can be placed on a virtual drive, which can be easily backed up without affecting the Ubuntu install. From previous estimates, a century run using the 4 x 5 degree model will generate about 30 GB of data. This is not practical for a virtual machine; however, a decade run producing 3 GB of data would be useful for some preliminary runs. To push the limits of a DVD, the drive could be formatted at 4.7 GB; but if 1 kB is taken as 1024 bytes, the size of a DVD is really 4.38 GB. Therefore, the virtual disk will be formatted to 4.38 GB, which gives a model run of 13 years. If the model is installed on the primary virtual drive, then the secondary virtual drive could contain only generated data, not run files and initialization files; this would extend the amount of data to 14 years. It is probably best just to put everything on the virtual data drive. Using VirtualBox the drive was formatted to 4.32 GB, or 4,637,851,648 bytes.

Formatted the virtual drive and installed the model and initialization files. Ran make config to generate the file .modelErc. Edited this file to match the location of the model resources. Installed gfortran.

Will need to change the Rules.make file so that it compiles correctly with gfortran.

June 8, 2010

Begin documenting the Russell ocean model. This document is at OceanModelNotes.html.

The model needs to be documented.

While looking through the ocean model, it seems reasonable to perform a qflux run to see what data fields are reported. There should be a sea surface temperature, but also a mixed layer depth.

Set up run E006smg using the script E001q.R. This is the q-flux model. Follow the ModelE run procedure on OSC described above.

June 22, 2010

Over the last couple of days I have looked at how best to use R for converting initialization files into NetCDF format. I finally got it figured out, but have not gone much further yet.

Today I read through the main loop of modelE. One thing I noticed is a parameter KCOPY. This parameter determines which files are written during execution. The following lists the possible values:

  • KCOPY = 1 Print the acc files
  • KCOPY = 2 Print both the acc and rsf files
  • KCOPY = 3 Print the acc, rsf and oda files. This last option is for when you need the oda files for calculating the diffusion values for the q-flux model.

Have added information to the file OceanModelNotes.html about file formats and the procedure for modifying E001.R to include a dynamic ocean. To test out my procedure I have set up the run E006smg. I modified E006smg.R accordingly and began a test run. The setup ran fine. Now I need to run it out and see if the acc and rsf files contain ocean parameters. The size of fort.1 for this run is 41.2 MB, while a non-dynamic ocean run has a file size of 29 MB; that is 42% bigger. When converting the acc file to NetCDF format, I get an error that the program stopped in POUT_netcdf.f. Looking at the generated files, the file DEC1949.otjE006smg has nothing in it. This may indicate where the routine is failing to write.

It looks like the subroutine io_rsf() handles all of the reading and writing of the acc and rsf files.

Need to look at POUT_netcdf.f to see where the program is failing to either read the acc file or to write the output file.

June 25, 2010

Received the restart file for the dynamic ocean and 20-layer atmosphere model. This includes a more resolved stratosphere, which probably will allow me to include the effects of aerosols better. To generate some usable statistics I will start a seventh run that is similar to E005smg, but will now use the restart file. I will use the run deck E001o.R with modifications to the POUT call.

After running E007smg for a month of model time, I tried to generate the post-processing diagnostics with pdE. As before, I am getting a stopped process in POUT_netcdf.f. It seems that this subroutine does not take the dynamic ocean model into account. It opens the appropriate files, but then it stops.

I generated another run called E008smg. This run is the same as E007smg (using the restart file from the NASA GISS site); however, in this run I use POUT as is. This will generate post-processing files in the GISS format. I should be able to figure out the file format and either use it as is or generate NetCDF files from it. Have run this model out for three months and generated the post-processing diagnostics. Everything seems to be fine.

I did notice on both runs (E007smg and E008smg) there is a problem flagged in the snow routine that the water is not conserved. I assume that the routine corrects for this, but I am not certain.

Look at the message recorded in E007smg.PRT and E008smg.PRT to see where the water conservation problem occurs and see if any corrective measures are made by the routine.

After post processing the E008smg files, I looked at the DEC1900.ojlE008smg file in a hex editor. It is clear now what the GISS file format is. Each record is as follows:

  • Number of bytes to the end of the record - This is an integer value. In this case the hex was 00 00 0c dc, which has a value of 3292.
  • Title of the data field - This is a Character*80 field. It contains the name of the field, the units, the date (month and year), and the run name.
  • Data - In this case the data is assumed to be Real*4. The length of this data is the record length minus the title length, which here comes to 3212. Since this data is Real*4, this corresponds to 803 data values. I am not sure what this size corresponds to; I will need to look at POUT to see how this value is recorded.
  • Number of bytes in the completed record - This should be the same as the first value. This trailing integer is not counted in the record size, which covers only the title and data.
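
As a quick way to check this interpretation, a small shell sketch (my own, untested; it assumes GNU stat/od and the 4-byte big-endian markers described above) can walk the records and print each record length and title:

	f=DEC1900.ojlE008smg
	size=$(stat -c%s "$f")
	offset=0
	while [ "$offset" -lt "$size" ]; do
	    # leading record marker: 4 bytes, big endian
	    hex=$(od -An -tx1 -j "$offset" -N 4 "$f" | tr -d ' \n')
	    reclen=$((16#$hex))
	    # Character*80 title immediately follows the marker
	    title=$(dd if="$f" bs=1 skip=$((offset + 4)) count=80 2>/dev/null)
	    echo "$reclen  $title"
	    # skip leading marker + title/data + trailing marker
	    offset=$((offset + 8 + reclen))
	done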

Looking at the routine ODIAG_PRT.f, the subroutine for saving to the ojl file was found. It sets up the data to be saved and then sends the data to the subroutine POUT_JL in the file POUT.f. The statement calling this routine is

	        IF (QDIAG) CALL POUT_JL(TITLE,LNAME,SNAME,UNITS,2,LMO+1,XJL
     *       ,ZOC1,"Latitude","Depth (m)")

The variable XJL is the data and the values 2 and LMO+1 are integers. Looking at POUT_JL we get the following:

  • 2 gets passed to J1 and LMO+1 gets passed to KLMAX
  • JXMAX gets set to JM-J1+1, which is 45
  • XCOOR(1:JXMAX) = LAT_DG(J1:JM,J1), which means that XCOOR is a vector of length 45
  • The following write statement is called
  • 	      WRITE (iu_jl) TITLE,JXMAX,KLMAX,1,1,
         *     ((REAL(XJL(J1+J-1,L),KIND=4),J=1,JXMAX),L=1,KLMAX)
         *     ,(REAL(XCOOR(J),KIND=4),J=1,JXMAX)
         *     ,(REAL(PM(L),KIND=4),L=1,KLMAX)
         *     ,REAL(1.,KIND=4),REAL(1.,KIND=4)
         *     ,CX,CY,CBLANK,CBLANK,'NASAGISS'
         *     ,(REAL(XJL(J,LM+LM_REQ+1),KIND=4),J=J1,JM+3)
         *     ,((REAL(XJL(J,L),KIND=4),J=JM+1,JM+3),L=1,KLMAX)
    
    	
  • Interpretation of this write statement is as follows:
    • write Character*80 title
    • write JXMAX, which is 45
    • write KLMAX, which is 14
    • write 1
    • write 1
    • write XJL(J1+J-1,L), which has dimensions of J=1,45 and L=1,14
    • write XCOOR, which has dimensions of 45
    • write PM, which has dimensions of 14
    • write 1 as real
    • write 1 as real
    • write CX, the x-coordinate label ("Latitude" in this call)
    • write CY, the y-coordinate label ("Depth (m)" in this call)
    • write CBLANK
    • write CBLANK
    • write NASAGISS
    • write XJL(J, 14+LM_REQ), which has dimensions of J=2,49
    • write XJL(J,L), which has dimensions of J=47,49 and L=1,14

Adding this all up gives 80 (title) + 4*(4 integers + 630 + 45 + 14 + 2 reals + 48 + 42) + 4*16 (CX, CY and the two CBLANKs) + 8 (NASAGISS) = 3292 bytes, which matches the record size seen in the hex dump.

June 27, 2010

The file format for the ij, il, and jl files is now figured out. It is now possible to write a program or set up an R script to convert this data into NetCDF format. The following are the ij and il files and the format of each of their records (the jl format was worked out above).

IJ File format
  • Write call from POUT
  • 		      WRITE(iu_ij) TITLE,REAL(XIJ,KIND=4),REAL(XJ,KIND=4),
         *             REAL(XSUM,KIND=4)
    
    		
  • Record format
    • Record size - Integer (4 bytes) Big Endian
    • Title - Character*80
    • XIJ - Real*4 array of 72x46
    • XJ - Real*4 vector of 46 (latitude average)
    • XSUM - Real*4 (total average)
    • Record size - Integer (4 bytes)
  • Next record starts at 8+record size
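  • For the 72x46 grid above, the record size works out to 80 + 4*(72*46 + 46 + 1) = 13,516 bytes, so successive records should start 13,524 bytes apart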
IL File format
  • Write call from POUT
  • 		      WRITE (iu_il) TITLE,IM,KLMAX,1,1,
         *     ((REAL(XIL(I,L),KIND=4),I=1,IM),L=1,KLMAX)
         *     ,(REAL(XCOOR(I),KIND=4),I=1,IM)
         *     ,(REAL(PM(L),KIND=4),L=1,KLMAX)
         *     ,REAL(0.,KIND=4),REAL(0.,KIND=4)
         *     ,CX,CY,CBLANK,CBLANK,'NASAGISS'
         *     ,(REAL(ASUM(I),KIND=4),I=1,IM),REAL(GSUM,KIND=4)
         *     ,(REAL(ZONAL(L),KIND=4),L=1,KLMAX)
    		
  • Record format
    • Record size - Integer (4 byte) Big Endian
    • Title - Character*80
    • IM - Integer (Longitude index - 72)
    • KLMAX - Integer (Layer index - 13)
    • 1 - Integer
    • 1 - Integer
    • XIL - Real*4 array of IMxKLMAX
    • XCOOR - Real*4 vector of IM (longitude value)
    • PM - Real*4 vector of KLMAX (layer value)
    • 0 - Real*4
    • 0 - Real*4
    • CX - Character*16 (x coord label)
    • CY - Character*16 (y coord label)
    • CBLANK - Character*16, written twice (see the write statement above)
    • NASAGISS - Character*8
    • ASUM - Real*4 vector of IM (average by longitude)
    • GSUM - Real*4 (total average)
    • Zonal - Real*4 vector of KLMAX (average by layer)
    • Record size - Integer (4 byte)
  • Next record starts at 8+record size
  • Record size calculated by the formula 176 + 4*(IM*KLMAX + IM + KLMAX + IM + 1 + KLMAX), taking CX, CY, and both CBLANK fields as Character*16

June 28, 2010

Worked on a Java program to read in the GISS format and convert it to NetCDF, using the Java-NetCDF libraries and Eclipse. The development worked fairly well in the Eclipse environment; however, I was not able to get the data to save in the right format. The variable definition worked well and the data field was recorded, but when brought up in a NetCDF viewer such as Panoply, the data field was not visible. The variable description was there and the data was in the file; it just was not accessible. I followed the tutorial given for the Java-NetCDF library, but I just was not able to get it to work.

I went back to R to convert the OIJ file. The format is relatively simple, and it needs to be handled as a two-pass process. The first pass pulls out the data field information, and then the NetCDF file is created with all of the variables defined. On the second pass the data fields for each variable were saved to the file. The following things were observed in this process.

Idiosyncrasies of the NetCDF format

  • In the data title, there cannot be any blank spaces; they cause the saving of the field to fail. I don't know if this is a true blank of x00 or a white space of x20. To solve this I replaced the blanks with an underscore. This may be a problem with how the R libraries work. Need to check against other NetCDF files to see what white space is acceptable.
  • In the data title, the forward slash causes problems. This shows up with units such as cm/s. I replaced this with a dash and it works. This could be a problem with how R uses the forward slash.

OIJ data field information

  • 13 levels of Ocean Potential Temperature (C) - This should be moved to a file called OIJL (3D field).
  • 13 levels of Ocean Salinity (psu) - This should be moved to a file called OIJL (3D field).
  • 3 levels (3,6,9) of Vertical mass flux (kg/s m^2)
  • 3 levels (3,6,9) of Vertical heat diffusion (cm^2/s)
  • 3 levels (3,6,9) of Vertical momentum diffusion (cm^2/s)
  • 3 levels (3,6,9) GM-eddy E-W heat flux (W)
  • 3 levels (3,6,9) GM-eddy N-S heat flux (W)
  • 3 levels (3,6,9) GM-eddy vertical heat flux (W/m^2)
  • Surface solar buoyancy flux (m^2/s^3)
  • Surface friction speed (m/s)
  • Surface buoyancy forcing (KPP) (m^2/s^3)
  • Ocean surface height (m)
  • Ocean bottom pressure anomaly (Pa)
  • Ocean boundary layer depth (KPP) (m)
  • East-west heat flux (W)
  • East-west salt flux (kg/s)
  • East-west velocity (cm/s)
  • Horizontal mass transport streamfunction (Sv)
  • North-south heat flux (W)
  • North-south salt flux (kg/s)
  • North-south velocity (cm/s)

Check to see what white spaces are acceptable for NetCDF character fields and see if there is a way to clean up the title when using R so there is not an underscore running through the data title.

Check to see if a forward slash is acceptable in NetCDF. If so, see if an escape sequence can be used in R to maintain this character in the units field.

See if there is a way to use Java to generate NetCDF files. It would be nice to have more control over strings when generating fields. R seems a bit crippled in the string manipulation area.

June 29, 2010

Now that the ocean data fields can be viewed, it is time to generate a significant amount of additional data. This will only be possible by setting up the model to run as a batch job, which removes the 20-minute limit on computer time per job that applies when working at the terminal.

Several observations were made on May 20th about batch jobs. Since the ocean model has been added, this may no longer be valid. Here is a modified list based on the run E008smg.

  • 1 month of model time takes about 1.1 hours of computer time. With 4 processors running, this drops down to 15 - 16 minutes in real time. It is about 10 days of model time per 20 minutes. This model includes the ocean, but also has a 20-layer atmosphere rather than the 12-layer atmosphere. This version of the model will be used for production because it will be necessary to model aerosols in the stratosphere when volcano effects are added in.
  • Scaling to more processors is not effective beyond 4. I tried 8-, 16-, and 32-processor runs. The 32-processor run took less than 1 minute to run; however, it only went through about 1 1/2 days of model time, where the 4-processor run would get through about 10 days.
  • Each month of data puts out two files. The accumulated statistics file is 9,546,688 bytes and the restart file is 39,291,020 bytes.
  • The remaining files in the directory E008smg have a total size of 116 Mbytes.
  • The files in the directory cmrun (initialization and boundary files) total 314 Mbytes. The larger size is due to the presence of 1DEC1969.rsfE050AoM20A, the restart file with the ocean spun up to 300 years.
  • The run files - 116 MB
  • Initialization - 314 MB
  • Generated Data - 49 MB/month

For a 6-year run the TMP directory should hold about 4 GB (116 + 314 + 49 x 72 = 3,958 MB) and the run should take 79.2 hours of computer time (72 months x 1.1 hours). This is 3.3 times the time estimated on May 27th for the model E001, which does not include the resolved stratosphere or the ocean.

To more accurately determine model performance, multiple runs of 19 minutes of computing time (not the most precise way of stopping a run) were performed. This was done in interactive mode and, therefore, may be affected by system limitations of this mode relative to batch mode. A similar test might be performed at the batch level to determine performance there.

  • 1 processor - 13.4 days model time, 0.71 days/min in real time
  • 2 processor - 11.9 days model time, 1.3 days/min in real time
  • 4 processor - 9.1 days model time, 1.9 days/min in real time
  • 8 processor - 6.0 days model time, 2.5 days/min in real time
  • 16 processor - 1.0 days model time, 0.84 days/min in real time
  • Quickest for real time processing is 8 processors. The drop off at 16 processors is most likely due to IO dominating the actual run time where only one processor can be involved in the process. The computer time clocks off as if all 16 processors were working.

Using 8 processors, the 6-year run should then take 14.6 hours of real time. This comes out to about 1 decade per day of real time, compared to 85.7 hours (3.6 days) per decade of real time with one processor. This latter number may give a realistic estimate of how fast the model can run on a desktop if it is possible to successfully compile and run it at this level. That would be a goal for the future.

Set up the model to compile and run on a desktop. This could be done under Windows using Cygwin or MinGW/MSYS, or through Linux on VirtualBox. Either way, one needs to get a good compile using either gfortran or g95.

The following things need to be done in order to implement a batch job at the OSC:

  • Generate a batch script called E008smg.pbs
  • Submit the batch script to Torque using the command qsub E008smg.pbs
  • The system will respond with a job number and a statement such as #######.opt-batch.osc.edu
  • Once the job is done a log file will be returned to the invoking directory. The name of this log file is E008smg.o#######

The batch script needs PBS header lines. The relevant information from the web page http://www.osc.edu/supercomputing/training/customize/docs/batch/batch_pbsheader.shtml is as follows.

  • The script is used for scheduling priorities. It includes requests for disk space, computing nodes/processors, memory and time.
  • Maximum time for each job is 168 hours and 336 hours if request is made through an alternate scheduler. This is for serial execution (non-parallel).
  • Each job has access to a large partition of a hard drive. This drive is named $TMPDIR.
  • Once a job is completed the files on $TMPDIR are deleted. Therefore, you must copy your files back in the batch script or you will lose them.
  • Useful commands:
    • #PBS -N E008smg ! This gives your project the name of E008smg.
    • #PBS -l walltime=00:20:00 ! This would request 20 minutes of computation.
    • #PBS -S /bin/ksh ! This changes the shell for running the program. Won't be needed for our run. Uses the shell you log in with unless specified.
    • #PBS -l nodes=4:ppn=2 ! This lets you request multiple nodes and processors. In this case 4 nodes and 2 processors per node are requested, giving a total of 8 processors for the job. Check the hardware on the Glenn cluster to make sure one of the machines is able to handle this request. From preliminary runs and the modelE documentation, it seems that 4 processors is sufficient; there is significant degradation of performance beyond this. However, this may be a limitation of the interactive mode versus the batch mode, so it would be good to do an efficiency comparison once batch jobs are started. Each node has a mounted disk drive and each processor has a /tmp drive, accessed with $TMPDIR. Need to determine whether this is a problem with how nodes and processors are requested.
    • #PBS -j oe ! This combines standard output and error streams into one log file.
    • #PBS -m e ! This sends an email message when the job ends with total time and memory consumed. It uses the email address recorded in ${HOME}/.forward.
    • #PBS -o filename ! This renames the output file. This command is not needed.
    • #PBS -q queue_name ! This allows a special queue to be requested in which your model may exceed the normal time or resource limits of the regular queue. This should not be needed.
    • #PBS -l mem=2GB ! This specifies how much memory each node requires; this request asks for 2 gigabytes per node. From memory-use analysis of E008smg a minimum of 524 MB should be requested. The model itself takes up 16 MB, but once arrays are declared it goes up quickly. Use at least 1 GB.
    • #PBS -a [YYYY][MM][DD]hhmm ! This tells the job to execute after the given date and time. This command will not be needed for our runs.

Here is an example PBS script. It contains the PBS commands followed by the shell commands normally given at the terminal. A similar script needs to be generated for running the climate model.

#PBS -N nest
#PBS -l walltime=00:05:00
#PBS -j oe
#PBS -S /bin/ksh
set -x
cd $PBS_O_WORKDIR
gcc nest.c
cp a.out $TMPDIR
cd $TMPDIR
./a.out 459 121

Here is a similar script that reports timing for the program execution.

#PBS -l walltime=00:01:00
#PBS -N nest
#PBS -j oe
#PBS -S /bin/csh
#PBS -m e

set echo 

cd $PBS_O_WORKDIR
gcc nest.c -lm 
cp a.out $TMPDIR

cd $TMPDIR

echo "New run "
time ./a.out 10000 10000

echo "New run "
time ./a.out 15400 10032
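
Putting these pieces together with the ModelE commands used so far, a first cut at E008smg.pbs might look something like the sketch below. This is untested: the walltime, the node/processor request, the copy commands, and the use of a prepared modelErcBatch pointing at $TMPDIR (see the July 1 notes below) are all my assumptions, and it still needs to be checked whether runE behaves sensibly inside a batch job or whether it backgrounds the model and returns too early.

#PBS -N E008smg
#PBS -l walltime=24:00:00
#PBS -l nodes=1:ppn=4
#PBS -j oe
#PBS -m e

set -x
# swap in a .modelErc whose paths point at the batch temp space
cp $HOME/modelErcBatch $HOME/.modelErc
# copy the model tree (decks, exec, cmrun, output) to $TMPDIR
cp -r $HOME/modelE $TMPDIR
cd $TMPDIR/modelE/exec
ulimit -s 32768
./runE E008smg 4
# copy the monthly acc and rsf files back before the job ends and $TMPDIR is wiped
# (paths assume SAVEDISK points at $TMPDIR/modelE/output in modelErcBatch)
mkdir -p $HOME/modelE/output/E008smg
cp $TMPDIR/modelE/output/E008smg/*acc* $TMPDIR/modelE/output/E008smg/*rsf* $HOME/modelE/output/E008smg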

The use of environment variables in a batch job is described in http://www.osc.edu/supercomputing/training/customize/docs/batch/batch_envvar.shtml. Here are some comments on how this may affect the climate model:

  • $TMPDIR - Absolute path and name of the /tmp directory on each node.
  • $PBS_O_WORKDIR - Absolute path to directory from which the batch script is started.
  • $HOME - Topmost directory in a user's partition.
  • Other variables are defined, which deal with processor number and job names. It doesn't look like these will be needed.
  • To copy files to the /tmp directory on each node, use the command pbsdcp. Usage information for this command can be accessed through the command man.
    • pbsdcp -s infile1 infile2 $TMPDIR (This copies the two files into each of the /tmp directories. -s is for scatter.)
    • pbsdcp -g $TMPDIR/outfile* $HOME (This gathers all the files from each /tmp directory and copies them back to the designated directory.)
    • pbsdcp -r modelE $TMPDIR (This will copy recursively all of the files in the directory tree below the specified directory.)
  • It looks like it might be best to use one node and use all the processors on that node.
  • Make sure to check http://www.osc.edu/supercomputing/computing/opt/index.shtml for specifics about running jobs on the Glenn Cluster.
  • It is encouraged that calculations be done in the temporary space. This recommendation parallels the process for running a batch job, and it gives a method for testing the procedure before running it as a batch job. The following is the sequence of commands to get files to and back from the /tmp directory.
  • mkdir /tmp/$USER  	Create your own temporary directory.
    cp files /tmp/$USER 	Copy the necessary files.
    cd /tmp/$USER 	Move to the directory.
    ... 	Do work (compile, execute, etc.).
    ... 	
    cp new files $HOME 	Copy important new files back home.
    cd $HOME 	Return to your home directory.
    rm -rf /tmp/$USER 	Remove your temporary directory.
    exit 	End the session.
    	

July 1, 2010

Looking through the scripts runE and sswE and the Makefile commands, it appears that the physical location of the model could change as long as the environment variables are changed accordingly. As a result, it seems that this is a good time to make a fresh start. I will eliminate all of the previous runs and set up the file .modelErc to reflect a different directory structure. The structure is as follows (this is reflected in the file .modelErc):

  • DECKS_REPOSITORY=$MODELDIR/cmrun/modelE/decks
  • CMRUNDIR=$MODELDIR/cmrun
  • GCMSEARCHPATH=$MODELDIR/cmrun
  • EXECDIR=$MODELDIR/exec
  • NETCDFHOME=/usr/local/netcdf-3.6.2
  • SAVEDISK=$MODELDIR/cmrun

This will place the compiled program and run output into a directory within cmrun. When a batch job is initiated, the file .modelErc will need to be changed so that the string $MODELDIR is replaced everywhere with $TMPDIR. A copy of this file with the appropriate changes made will be called modelErcBatch.

Since the results from previous runs have been copied to a local machine, the runs will be wiped out on the OSC account. The model code will not be removed and the fixed.tar.gz data will be kept. The initialization file for the ocean run will also be retained.

Once the model runs for the July conference are completed, the code and results should be backed up and saved to a DVD. Next, all of the model code and output on the OSC machine should be cleared out and replaced with up-to-date code obtained through CVS. At this point, model runs should be registered through the NASA GISS CVS site.

The following steps were performed to restructure the model on the OSC machine.

  • Define the environment variable MODELDIR. In the file .profile add the following lines.
  • 		MODELDIR=$HOME/modelE
    		export MODELDIR
    		
  • Change the location definitions in the file .modelErc to reflect the directory structure discussed above.
  • Clean out the directory modelE/decks. Only the CVS directory and Makefile should remain.
  • Remove the links to the E00* files in the directory modelE/cmrun.
  • Run gmake clean and gmake vclean in the directory modelE/decks to clean up the object files in the directory modelE/model.
  • Delete the directory modelE/output and all of its subdirectories.

Now begin a new run. Name this one E010smg and use the dynamic ocean by specifying E001o.R as the source deck. Follow the procedure given on June 1st. Once the model is compiled and the start-up is successful, then use a shell script to make sure the model can be moved to a new location and run successfully.

Problem: Can't get the environment variable $MODELDIR to be interpreted correctly. For the Makefile to work it needs to be referenced as $(MODELDIR). This works up through the compile. However, when doing setup it fails to work: the variables get defined literally, without substitution of the environment variable, whether or not parentheses are used. This may be an issue with which shell the model is run under. Need to look into this. If it is not a shell issue, then I may need to define each of the environment variables directly and then place a dummy .modelErc file into $HOME so it does not redefine the environment variables.
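
As a small aside illustrating the syntax difference in play (a sketch only, not part of the model's own scripts): both make and the shell will pick MODELDIR up from the environment, but they spell the reference differently, which is why a value written literally into .modelErc can satisfy one tool and not the other.

export MODELDIR=$HOME/modelE
sh -c 'echo $MODELDIR'                                   # the shell expands $MODELDIR
printf 'show:\n\t@echo $(MODELDIR)\n' | make -f - show   # make expands $(MODELDIR) from the environment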

July 2, 2010

Looking through documentation for environment variables in a shell versus a makefile, it seems that the syntax is not identical. As a result, I don't see a method for placing an environment variable in the file .modelErc that will work; it seems to want an absolute path. This will not work when I go to a batch job because I need to reference $TMPDIR. The solution I will try is to move all of the variables in .modelErc to the environment. That way they can be declared once and should work for both the makefile and the running of the model. The following modifications had to be made.

  • Backup the .modelErc file and then remove all environment variable calls.
  • Add to .profile the environment variables.
  • Back up the file exec/modelErc and modify it by removing the environment variable calls.
  • Go to the directory exec and comment out environment variables in the files pdE, runE, runpmE, setup_e.pl, and sswE.

Using these changes results in the environment variables not being set. There needs to be a different method. I will need to remove the changes and see what can be done.

I contacted Dr. Gavin Schmidt at GISS and he included someone else in the email response. Essentially he said that each system is unique and the make file needs to be adjusted. I will try two more things before I contact Schmidt again. The first is to replace the environment variable with the .. directory designation. If this doesn't work, then I will try to modify the perl script to see if I can get it to parse the environment variable correctly.

ModelE run batch procedure on OSC

  1. Set up the file structure within the modelE directory as follows:
    • aux
    • CVS
    • doc
    • decks - Location of Makefile and place where gmake rundeck, gmake gcm and gmake setup are run
    • exec - Location of run and stop scripts and place where runE and sswE are run
    • bcic - Place boundary and initial conditions in here
    • E010smg - Compiled model and run results will be saved here. Also run pdE with this as the current directory to generate diagnostics.
      • CO2_sources
      • methane
    • E2000
    • model - Model fortran code is here
    • prtdag
    • repository
      • decks - Run deck will be saved here
  2. Set up the file ~/.modelErc in the following way
    • DECKS_REPOSITORY=../repository/decks
    • CMRUNDIR=..
    • GCMSEARCHPATH=../bcic
    • EXECDIR=../exec
    • NETCDFHOME=/usr/local/netcdf-3.6.2
    • SAVEDISK=..
    • MAILTO=gollmers@cedarville.edu
    • OVERWRITE=NO
    • OUTPUT_TO_FILES=YES
    • VERBOSE_OUTPUT=NO
    • MP=YES
    • UMASK=002
    • COMPILER=PGI
  3. Go to the directory ~/modelE/decks.
  4. Generate rundeck using gmake rundeck RUN=E010smg RUNSRC=E001o.
  5. Check E010smg.R to make sure it is the right deck. If the diagnostics are to be saved as NetCDF, POUT needs to be replaced with POUT_netcdf.
  6. Compile the program using gmake gcm RUN=E010smg.
  7. Establish the stack size using ulimit -s 32768.
  8. Run the setup command gmake setup RUN=E010smg.
  9. Change the directory to ~/modelE/exec and run the command ./runE E010smg 4.
  10. Check on the status of the run by issuing the command ps.
  11. Stop the run before 20 minutes so it terminates gracefully using the command ./sswE E010smg.
  12. Check the status and data from the run by going to ~/modelE/cmrun/E010smg. Check the file E010smg.PRT to see how far the run has proceeded.
  13. Go back to ~/modelE/exec and restart the run by issuing the command ./runE E010smg 4.
  14. Monitor the time and when the run time gets close to 20 minutes issue the command ./sswE E010smg.
  15. Once a month of data is generated, the diagnostic program needs to be run. Generate data files from the accumulated monthly diagnostics using the post-processing program pdE. Change to the directory ~/modelE/E010smg and run the command ../exec/pdE E010smg DEC1900.accE010smg.
  16. Continue extending the run so additional months are generated. Check documentation to see how multiple months of data can be processed to generate statistics. (A condensed transcript of these commands follows this list.)
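
For reference, the sequence above condenses to roughly the following commands, assuming the directory layout and run name from step 1 (this is just a restatement of the steps, not a replacement for them):

cd ~/modelE/decks
gmake rundeck RUN=E010smg RUNSRC=E001o
gmake gcm RUN=E010smg
ulimit -s 32768
gmake setup RUN=E010smg
cd ~/modelE/exec
./runE E010smg 4                          # start the run on 4 processors
ps                                        # check that the model is running
./sswE E010smg                            # stop gracefully before 20 minutes of wall time
cd ~/modelE/E010smg
../exec/pdE E010smg DEC1900.accE010smg    # post-process a completed month's diagnostics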

This process worked in interactive mode. Now it needs to be run as a batch job. Run E010smg is cleaned out and repeated as a batch job. The following is the process for running the batch job.

  1. From the process given immediately above perform steps 1 - 4.
  2. Edit the run deck E010smg.R by changing the &INPUTZ line involving YEARE. Only run the model for 5 months. The appropriate line is changed to the following:
    • YEARE=1901,MONTHE=5,DATEE=1,HOURE=0, KDIAG=0,2,2,9*0,9,
  3. Perform steps 6 - 8 from the previous process to compile and set up the model run.
  4. Generate a batch run script. The following script will be saved in file E010smg.pbs
    • #PBS -N E010smg ! Give the run a name
    • #PBS -l walltime=06:00:00 ! Request 6 hours of wall time
    • #PBS -l nodes=1:ppn=4 ! Use one node with 4 processors
    • #PBS -j oe ! Combine the standard output and error streams
    • #PBS -m e ! Send an email when the job ends
    • #PBS mem=2000MB ! Request 2 GB for running the model
    • cd ~/modelE ! Copy relevant files to the temp directory
    • mkdir $TMPDIR/repository
    • mkdir $TMPDIR/bcic
    • mkdir $TMPDIR/E010smg
    • mkdir $TMPDIR/decks
    • mkdir $TMPDIR/exec
    • pbsdcp -s repository/* $TMPDIR/repository
    • pbsdcp -s bcic/* $TMPDIR/bcic
    • pbsdcp -s E010smg/* $TMPDIR/E010smg
    • pbsdcp -s decks/* $TMPDIR/decks
    • pbsdcp -s exec/* $TMPDIR/exec
    • cd $TMPDIR/exec ! Go to the temp directory and run the model
    • ulimit -s 32768
    • ./runE E010smg 4
    • pbsdcp -g $TMPDIR/E010smg/*acc* ~/modelE/E010smg
    • pbsdcp -g $TMPDIR/E010smg/*rsf* ~/modelE/E010smg
    • pbsdcp -g $TMPDIR/E010smg/E010smg.PRT ~/modelE/E010smg
  5. Set the batch file as executable using chmod 555 E010smg.pbs
  6. Submit the batch job with the command qsub E010smg.pbs

Had a problem with the #PBS mem directive. Have eliminated it and tried again.

Returned batch name is 3469115.opt-batch.osc.edu

Batch job failed. Removed the mkdir commands and just did a copy of the ~/modelE directory. Resubmitted the job which is named 3469123.opt-batch.osc.edu. This one also failed. There seems to be a problem with getting the directories set up on $TMPDIR. Will try different variants and give the final form when done. Accidentally deleted the pbs file. Here is the current modified version

  • #PBS -N E010smg ! Give the run a name
  • #PBS -l walltime=06:00:00 ! Request 6 hours of wall time
  • #PBS -l nodes=1:ppn=4 ! Use one node with 4 processors
  • #PBS -j oe ! Combine the standard output and error streams
  • #PBS -m e ! Send an email when the job ends
  • #PBS mem=2000MB ! Request 2 GB for running the model
  • pbsdcp -s -r ~/modelE $TMPDIR
  • cd $TMPDIR/modelE/exec ! Go to the temp directory and run the model
  • ulimit -s 32768
  • ./runE E010smg 4
  • pbsdcp -g $TMPDIR/modelE/E010smg ~/modelE/cmrun

July 7, 2010

Changed the pbs script to copy back everything to modelE/cmrun. Also redirected the runE messages to E010smg.runoutput. Hopefully this will indicate what is going wrong with the batch job.

In an attempt to streamline the process, I am calling E010smg without using runE. This removes the nohup and nice calls. However, to use multiple processors, two environment variables need to be set. They are the following:

  • export MP_SET_NUMTHREADS=4
  • export OMP_NUM_THREADS=4

With this change I was able to get a 2 month run to come to completion in 2 hours of computer time. The following is the pbs file that worked.

	
#PBS -N E010smg
#PBS -l walltime=03:00:00
#PBS -l nodes=1:ppn=4
#PBS -l mem=2GB
#PBS -j oe
#PBS -m e
export MP_SET_NUMTHREADS=4
export OMP_NUM_THREADS=4
cd $TMPDIR
mkdir E010smg
cp $HOME/modelE/E010smg/* ./E010smg
mkdir bcic
cp -r $HOME/modelE/bcic/* ./bcic
cd $TMPDIR/E010smg
ulimit -s 34768
./E010smg > ../E010smg.runoutput
cd $TMPDIR
cp -r ./* $HOME/cmrun

This can be trimmed down a bit more by only copying the ic and bc files needed by this model run. Will need to check the generated files to make sure the results look reasonable. If so, then a 6 - 10 year run can be set up.
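
One way to do that trimming (a sketch; the list file name bcic.list is invented here and would be maintained by hand from the rundeck's data-file entries):

# Copy only the boundary/initial condition files this run actually needs.
mkdir $TMPDIR/bcic
while read f; do
    cp "$HOME/modelE/bcic/$f" "$TMPDIR/bcic/"
done < $HOME/modelE/bcic.list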

July 8, 2010

Since I had a successful batch job, I need to verify that the data looks reasonable. The acc files were converted using pdE and downloaded. An R script was run to convert the oij fields into NetCDF format. This run was then compared to the results from the interactive run.

The fields match up within rounding errors due to precision of calculation. I now want to run the model for 6 years. This will take 3 days of computer time. Give 4 days so there is room for error and use 8 processors. This run will still be called E010smg.

July 13, 2010

The run completed successfully at 5:44 on July 9th. The job number was 3491574. It used 119:52.29 cpu time units, 276.792 Mb memory, 1,137.200 Mb virtual memory, and 15:14:35 walltime. This comes out to 5 days of computer time. There are evidently some additional inefficiencies that I did not account for when running 8 processors.

Statistics need to be generated for the reference run. The following is the process for generating yearly, seasonal, and monthly averages (a loop sketch follows the list).

  1. Copy the acc files to the directory ~/modelE/process and go to that directory. Since the environment variables for the model use relative addressing, we need to initiate in a directory that is a subdirectory of ~/modelE.
  2. Run the command ../exec/pdE E010smg *1901.accE010smg (Yearly)
  3. Repeat this command for the other years over which the model ran.
  4. Download the annual files to the appropriate directory on a local machine.
  5. Run the command ../exec/pdE E010smg JAN*.accE010smg (Monthly)
  6. Repeat this command for the other months.
  7. Download the monthly files to the appropriate directory on a local machine.
  8. Run the command ../exec/pdE E010smg DEC*.accE010smg JAN*.accE010smg FEB*.accE010smg (Seasonal)
  9. Repeat this command for the other seasons.
  10. Download the seasonal files to the appropriate directory on a local machine.
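
A loop sketch of the averaging commands above (assumptions: the acc files follow the MONYYYY.accE010smg naming seen in these notes, they all sit in ~/modelE/process, and the year range is adjusted to the years actually simulated):

cd ~/modelE/process
for y in $(seq 1901 1906); do                      # yearly means; adjust the range to the run
    ../exec/pdE E010smg *${y}.accE010smg
done
for m in JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC; do
    ../exec/pdE E010smg ${m}*.accE010smg           # multi-year monthly means
done
../exec/pdE E010smg DEC*.accE010smg JAN*.accE010smg FEB*.accE010smg   # DJF; repeat with MAM, JJA, SON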

To complete the documentation on the first run it would be good to list the boundary and initial condition files that are used in the dynamic ocean run. Several of them are not used because they are replaced by the restart file, but they are included in case a run from scratch is desired. The list is as follows:

  • AIC.RES_M20A.D771201 ! initial conditions (atm.) needs GIC,OIC ISTART=2 and 300 years spin-up
  • GIC.E046D3M20A.1DEC1955 ! initial conditions (ground)
  • OIC4X5LD.Z12.gas1.CLEV94.DEC01 ! ocean initial conditions
  • 1DEC1969.rsfE050AoM20A ! full IC (GIC,OIC not needed) ISTART=8 (spun up 320 yrs)
  • OFTABLE_NEW ! ocean function table
  • AVR72X46.L13.gas1.modelE ! ocean filter
  • KB4X513.OCN.gas1 ! ocean basin designations
  • Z72X46N_gas.1_nocasp ! ocean bdy.cond
  • CD4X500S.ext
  • V72X46.1.cor2_no_crops.ext ! veg. fractions, crops history
  • CROPS_72X46N.cor4
  • S4X50093.ext ! bdy.cond
  • Z72X46N_gas.1_nocasp (Reused from ocean bdy.cond)
  • REG4X5 ! special regions-diag
  • RD4X525.gas2.RVR ! river direction file
  • sgpgxg.table8 ! rad.tables 8/2003 version
  • radfil33k
  • miescatpar.abcdv2
  • dec2003_PRE_Koch_kg_m2_ChinSEA_Liao_1850 ! pre-industr trop. aerosols
  • sep2003_SUI_Koch_kg_m2_72x46x9_1875-1990 ! industrial sulfates
  • sep2003_OCI_Koch_kg_m2_72x46x9_1875-1990 ! industrial organic carbons
  • sep2003_BCI_Koch_kg_m2_72x46x9_1875-1990 ! industrial black carbons
  • oct2003.relhum.nr.Q633G633.table
  • dust8.tau9x8x13
  • STRATAER.VOL.1850-1999.Apr02
  • cloud.epsilon4.72x46
  • solar.lean02.ann.uvflux ! need KSOLAR=2
  • topcld.trscat8
  • jan2004_o3_shindelltrop_72x46x49x12_1850 ! new ozone files
  • jan2004_o3_shindelltrop_72x46x49x12_1890
  • jan2004_o3_shindelltrop_72x46x49x12_1910
  • jan2004_o3_shindelltrop_72x46x49x12_1930
  • jan2004_o3_shindelltrop_72x46x49x12_1950
  • jan2004_o3_shindelltrop_72x46x49x12_1960
  • jan2004_o3_shindelltrop_72x46x49x12_1970
  • jan2004_o3_shindelltrop_72x46x49x12_1980
  • jan2004_o3_shindelltrop_72x46x49x12_1990
  • jan2004_o3timetrend_46x49x2412_1850_2050
  • GHG.1850-2050.Mar2002
  • dH2O_by_CH4_monthly
  • top_index_72x46.ij.ext
  • MSU.RSS.weights.data

Now the restart file needs to be explored. In order to do a sensitivity run it will be necessary to identify the proper fields to change. Initially I would like to try increasing the temperature of the water at the bottom of the ocean. This could be done in two different ways.

  • Temperature Bubble - Increase the temperature of the lowest level by a certain amount. Since the model uses potential enthalpy in its calculations, it may be necessary to change the enthalpy by a certain percentage.
  • Hot Floor - Determine if the model uses a heat flux across the floor of the ocean. If so, increase the heat flux by a certain percentage.

In both of these cases it should be possible to see how the increased heat redistributes itself. I would anticipate the heat to flow upward and, with it, an increase in vertical motion. This will also result in some regions developing a down-welling motion.

The file format of the RSF file is explored in the file OceanModelNotes.html. Modules are called by the subroutine io_rsf defined in the file IORSF.f. The model values are saved to an rsf file within the file MODELE.f while running the main program loop. The subroutine io_rsf uses a flag iaction to determine whether it reads or writes a restart file.

There are many module calls to save values to the restart file. Each module labels the data it saves using a character string. Once all of the restart parameters are saved, the current status of the accumulated variables is saved. The accumulated variables will be saved in the acc file at the end of the month, under the label DIAG. The following labels are useful for changing the ocean variables for the sensitivity run. The file 1DEC1969.rsfE050AoM20A is used for locating the following labels (a quick way to check the listed sizes against the hex offsets is sketched after the list):

  • MODEL01
    • Start of MODEL01 - 00 00 18 34
    • End at STRAT01 - 00 28 ee 0c
    • Size - 2676184
  • STRAT01 (Not present)
    • Start of STRAT01 - 00 28 ee 0c
    • End at OCDYN01 - 00 28 ee 0c
    • Size - 0
  • OCDYN01
    • Start of OCDYN01 - 00 28 ee 0c
    • End at OCSTR01 - 00 63 8d e4
    • Size - 3842008
  • OCSTR01
    • Start of OCSTR01 - 00 63 8d e4
    • End at LAKE01 - 00 63 b4 dc
    • Size - 9976
  • LAKE01
    • Start of LAKE01 - 00 63 b4 dc
    • End at SICE02 - 00 65 53 34
    • Size - 106072
  • SICE02
    • Start of SICE02 - 00 65 53 34
    • End at EARTH01 - 00 6a 61 4c
    • Size - 331288
  • EARTH01
    • Start of EARTH01 - 00 6a 61 4c
    • End at SOILS02 - 00 6e 6c a4
    • Size - 265048
  • SOILS02
    • Start of SOILS02 - 00 6e 6c a4
    • End at SNOW01 - 00 7b 5d 54
    • Size - 848048
  • SNOW01
    • Start of SNOW01 - 00 7b 5d 54
    • End at GLAIC01 - 00 83 db 2c
    • Size - 556504
  • GLAIC01
    • Start of GLAIC01 - 00 83 db 2c
    • End at BLD02 - 00 85 12 04
    • Size - 79576
  • BLD02
    • Start of BLD02 - 00 85 12 04
    • End at PBL01 - 00 9a 7f dc
    • Size - 1404376
  • PBL01
    • Start of PBL01 - 00 9a 7f dc
    • End at CLD01 - 00 e0 d9 34
    • Size - 4610392
  • CLD01
    • Start of CLD01 - 00 e0 d9 34
    • End at QUS01 - 01 09 47 8c
    • Size - 2649688
  • QUS01
    • Start of QUS01 - 01 09 47 8c
    • End at RAD05 - 01 9a d3 e4
    • Size - 9538648
  • RAD05
    • Start of RAD05 - 01 9a d3 e4
    • End at ICEDYN01 - 01 d9 17 44
    • Size - 4080480
  • ICEDYN01
    • Start of ICEDYN01 - 01 d9 17 44
    • End at DIAG01 - 01 da b5 9c
    • Size - 106072
  • DIAG01
    • Start of DIAG01 - 01 da b5 9c
    • End at OCDIAG01 - 01 dc 74 cc
    • Size - 114480
  • OCDIAG01
    • Start of OCDIAG01 - 01 dc 74 cc
    • End at TICDIAG01 - 02 52 ac 97
    • Size - 7747531
  • TICDIAG01
    • Start of TICDIAG01 - 02 52 ac 97
    • End at file - 02 57 86 f0
    • Size - 318041
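
The sizes above are simply differences of the hex offsets, so they can be spot-checked with shell arithmetic (a sketch; the two offsets below are the OCDYN01 start and the OCSTR01 start from the list):

echo $(( 0x00638de4 - 0x0028ee0c ))   # prints 3842008, the OCDYN01 size listed above
echo $(( 0x00001834 ))                # prints 6196, the MODEL01 start as a decimal byte offset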

The last three fields should also show up in the acc file. Look at an acc file and determine the size of these fields.

  • DIAG01
    • Start of DIAG01 - 00 00 19 d0
    • End at OCDIAG01 - 00 54 22 54
    • Size - 5507204
  • OCDIAG01
    • Start of OCDIAG01 - 00 54 22 54
    • End at TICDIAG01 - 00 8f 3e 67
    • Size - 3873811
  • TICDIAG01
    • Start of TICDIAG01 - 00 8f 3e 67
    • End at file - 00 91 ab c0
    • Size - 159065

Comparing the original restart file with the model-generated restart files, there is a discrepancy of 412 bytes in their length. Looking more closely, the following labels and values are compared between the two restart files.

  • In the 1DEC1969.rsfE050AoM20A restart file (39290608)
    • Starts at 00 28 ee 0c (2682380)
    • Ends at 00 63 8d e4 (6524388)
    • MODEL01 at 00 00 18 34 (6196)
    • 94 parameters saved at PARAM02
    • lmaxsubdd at 00 00 11 e0 (4576)
  • In the Output restart file (39291020) (412 longer)
    • Starts at 00 28 ef a8 (2682792) (412 longer)
    • Ends at 00 63 8f 80 (6524800) (412 longer)
    • MODEL01 at 00 00 19 d0 (6608) (412 longer)
    • 100 parameters saved at PARAM02
    • lmaxsubdd at 00 00 12 e8 (4840) (264)
  • OCSTR01 - Followed by LAKE01
  • Starts at 00 63 8d e4
  • Ends at
  • OCDIAG01 - Followed by TICDIAG01

Once we get to MODEL01, the 412-byte shift is the same throughout the rest of the file.

July 14, 2010

It looks like the ocean parameters can be changed without affecting the rest of the restart file. Therefore, the difference in sizes should not be a problem as long as the right start point is determined. A Java program can be written to find the start position. Next it can step through the different data fields to find the one to be changed. This field can be loaded in and modified. Treat this like a copy command where input data is saved to another file; when the appropriate field is found, substitute the modified field in place of a straight copy. Next we need to determine the fields saved in the OCDYN01 module. This module has a size of 3,842,008 bytes. The OCSTR01 module deals with the straits between larger bodies of water and should not be affected immediately by the changes of water temperature in the larger ocean. The following information about OCDYN01 is present in OceanModelNotes.html and elaborated on below; a dd-based extraction check is sketched after the field list. (Records are saved by io_ocdyn in OCNDYN.f. This subroutine is called by io_ocean, which is the standard call for all ocean models, both prescribed and dynamic.) (Analysis was done in RSF_ACC_Fields.xls)

  • Record Size - Integer
    • Start at 00 28 ee 08 - Length 4
  • MODULE_HEADER - Character*80
    • Start at 00 28 ee 0c - Length 80
  • MO (mass of ocean, kg/m^2) - Real*8 72x46x13
    • Start at 00 28 ee 5c - Length 344,448
  • UO (E-W velocity, m/s) - Real*8 72x46x13
    • Start at 00 2e 2f dc - Length 344,448
  • VO (N-S velocity, m/s) - Real*8 72x46x13
    • Start at 00 33 71 5c - Length 344,448
  • G0M (Potential enthalpy, J) - Real*8 72x46x13
    • Start at 00 38 b2 dc - Length 344,448
  • GXMO (X-moment of potential enthalpy, J) - Real*8 72x46x13
    • Start at 00 3d f4 5c - Length 344,448
  • GYMO (Y-moment of potential enthalpy, J) - Real*8 72x46x13
    • Start at 00 43 35 dc - Length 344,448
  • GZMO (Z-moment of potential enthalpy, J) - Real*8 72x46x13
    • Start at 00 48 77 5c - Length 344,448
  • S0M (Salinity of ocean, kg) - Real*8 72x46x13
    • Start at 00 4d b8 dc - Length 344,448
  • SXMO (X-moment of salinity, kg) - Real*8 72x46x13
    • Start at 00 52 fa 5c - Length 344,448
  • SYMO (Y-moment of salinity, kg) - Real*8 72x46x13
    • Start at 00 58 3b dc - Length 344,448
  • SZMO (Z-moment of salinity, kg) - Real*8 72x46x13
    • Start at 00 5d 7d 5c - Length 344,448
  • OGEOZ (Ocean geopotential at surface, m^2/s^2) - Real*8 72x46
    • Start at 00 62 be dc - Length 26,496
  • OGEOZ_SV - Real*8 72x46
    • Start at 00 63 26 5c - Length 26,496
  • Record Size - Integer
    • Start at 00 63 8d dc
  • Total Size - 3,842,008
    • Next Module begins at 00 63 8d e0
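
Before committing to the Java program, the OCDYN01 span can be pulled out of the restart file with dd for a quick look (a sketch; the skip and count values come from the label list above, the output name ocdyn01.extract is just a name chosen here, and byte-sized blocks make dd slow but keep the arithmetic simple):

dd if=1DEC1969.rsfE050AoM20A of=ocdyn01.extract bs=1 skip=$((0x0028ee0c)) count=3842008
ls -l ocdyn01.extract    # should be exactly 3,842,008 bytes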

Now that the structure of the rsf file is known, a Java program is written to read through the records and extract the ones of interest. This program is developed using Eclipse and is called GISSReader. It looks like the program works, but the values for salt and enthalpy seem large. They need to be considered relative to the amount of water present. The mass of water is given in kg/m^2. Each level of water has a different thickness. From Russell, Miller and Rind, 1995, A Coupled Atmosphere-Ocean Model for Transient Climate Change Studies, the layers are as follows: (The calculated values were obtained using the average density for each level and the average mass per level from an ACC file. See SpecificEnthalpy.xls.)

  • Level 1 - 12 m (Calculated - 12.14 m)
  • Level 2 - 18 m (Calculated - 18.21 m)
  • Level 3 - 27 m (Calculated - 27.30 m)
  • Level 4 - 41 m (Calculated - 40.94 m)
  • Level 5 - 61 m (Calculated - 61.38 m)
  • Level 6 - 91 m (Calculated - 92.00 m)
  • Level 7 - 137 m (Calculated - 137.89 m)
  • Level 8 - 205 m (Calculated - 206.64 m)
  • Level 9 - 308 m (Calculated - 309.5 m)
  • Level 10 - 461 m (Calculated - 463.4 m)
  • Level 11 - 692 m (Calculated - 692.8 m)
  • Level 12 - 1038 m (Calculated - 1035 m)
  • Level 13 - 1557 m (Calculated - 1544 m)

July 15, 2010

Trying to track down how potential temperature is calculated for the ocean. Calculation of the ocean temperature is initiated from subroutine OIJOUT in file ODIAG_PRT.f. The calculation is performed as follows:

  • For each layer and each grid point, calculate the specific enthalpy (enthalpy/mass) and salinity (salt mass/mass).
  • Pass these two parameters into the function TEMGS(GOS,SOS).
  • The function TEMGS makes sure the specific enthalpy falls between -8000 and 160,000.
  • The function TEMGS makes sure the salinity falls between 0 and 0.04.
  • The values for specific enthalpy and salinity are interpolated with respect to a look-up table named TGSP found in the initialization file OFTABLE_NEW.
  • The result of this function is returned to the calling routine OIJOUT.
  • The temperature at each pole is copied to every grid point corresponding to that pole.
  • The temperature is saved to the OIJ field one layer at a time.

The next step is to change the potential enthalpy of the ocean layers and see how it translates into a change in ocean temperatures. Looked at the specific enthalpy for the bottom and the top of the ocean. See SpecificEnthalpy.xls. At the surface, the specific enthalpy is at 84% of the maximum allowed value. At the bottom of the ocean it is at 7% of max. Both top and bottom are within 60% of the minimum value. It looks like the best choice for warming the ocean is to increase the specific enthalpy by 10%. This should prevent any inconsistencies in the moments of the enthalpy and should result in a significantly warmer ocean. Since we are dealing with specific enthalpy, the change in enthalpy for any particular layer will need to be mass weighted.

A more precise calculation of layer thickness was attempted by taking the layer mass (from the ACC file) and dividing it by the average layer density (from the PRT file). Those values are recorded with the previous day's entries of level thickness. Either way, the values are within 1% of each other.

July 16, 2010

Today's project is to increase the potential enthalpy in the restart file and see how it affects the potential temperature of the ocean in the ACC file. Modify the GISSReader program to copy a file until the appropriate record is found and then add a set amount of enthalpy to each layer of the ocean. Each layer has a different amount of enthalpy because the thickness of the layers changes with depth. The Java code was rewritten to make it easier to access different data fields in the future. The OCDYN01 record is loaded in as double arrays for each variable. The variable G0M is changed by adding an amount of specific enthalpy that is 10% of the largest value for the surface. In order to verify that the changes are correct, only the OCDYN01 record was saved for both the original RSF file and the modified one. These files are called DEC1969.ocndyn and HotEnth.ocndyn, respectively, and were converted to NetCDF files using the R script OCDYN2NCDF.R. Values of enthalpy for non-ocean grid points were originally zero, and this was maintained in the modified file. By shifting all of the specific enthalpies by the same amount, it is hoped that there will not be any significant effect on the values of GXMO, GYMO and GZMO (the moments of the enthalpy). When running the model, it is expected that the potential temperature of the whole ocean will be increased, with a more pronounced change in the deep ocean. There should also be an impact on sea surface temperatures and sea ice amounts.
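
A quick consistency check on those two dumps (a sketch relying only on the July 13 field lengths): since only G0M was modified, the number of differing bytes reported by cmp should not exceed the 344,448-byte length of that field.

cmp -l DEC1969.ocndyn HotEnth.ocndyn | head     # 1-based positions of differing bytes
cmp -l DEC1969.ocndyn HotEnth.ocndyn | wc -l    # should be at most 344448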

The modified enthalpy file will be uploaded to OSC and a new run called E011smg will be initiated. A test run of one month will be performed to look at initial statistics. If this run looks good, E011smg will be rerun as a batch job for six years.

  • Follow the interactive model run procedure from June 1, 2010.
  • After generating the run script, edit the rundeck so the AIC entry points to the modified restart file (HotEnth).

Problem encountered during setup. The error message is PBL: Ts out of range. I think it might be too large a gradient between the sea surface temperature and the atmosphere. There are two possible solutions: one is to decrease the change in the sea surface temperature and the other is to leave the top layer unchanged. I will attempt the first one and see if a month of model time can be generated. Keeping the top layer unchanged allowed the setup to work; however, during the initialization of the standard run (beyond the 1st hour), the same message is generated. I have now kept the whole ocean the same except the bottom layer. This allowed a run to go for 1 month. I will look through the ACC file and see what the ocean looks like due to this change.

July 17, 2010

It was possible to run with only the bottom layer increased in enthalpy; however, it didn't show much change. The increase in enthalpy was set at 1% instead of 10%. The run worked and showed about a 0.3 C increase in temperature for the tropics, but a 19 C increase for the poles. No matter the layer, the north pole values were extreme. It is clear that the way enthalpy is increased is not right: the increase needs to take into account not only layer thickness but also latitude (grid cells have less area toward the poles).

Looking at OGEOM.f, the weighting of area by latitude is calculated with the formula DXYPO(J) = DLON*RADIUS*RADIUS*(SINVN-SINVS). Each of the variables in the formula has the following values:

  • DLON = TWOPI/IM (IM = 72)
  • RADIUS = 6375000
  • SINVS = SIN(VLATS)
  • (VLATS = -TWOPI/4.)
  • SINVN = SIN(VLATN)
  • VLATN = DLAT*(J+.5-FJEQ) (IF(J.EQ.JM) VLATN = TWOPI/4.)
  • DLAT = NINT(180./(JM-1))*TWOPI/360.
  • FJEQ = .5*(1+JM)
  • pi = 3.1415926535897932d0
  • twopi = 2d0*pi

The calculated area by latitude is contained in SpecificEnthalpy.xls. At this point the Java program for modifying the Enthalpy will use a specific enthalpy value of 135,200 as a base. This comes to 84% of the maximum allowed value. This value will be increased by 10%, which means 13,520 will be added to every ocean specific enthalpy. To convert this to the potential enthalpy for each ocean grid point and layer, the value added will be 13,520*MO(i,j,l)*area(j).
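
A worked example of that conversion for one cell (a sketch only: awk is used for the trig, VLATS for row J is assumed to follow the same formula as VLATN with J-.5 and the polar rows pinned at ±TWOPI/4, and the MO value is a placeholder rather than model output):

awk 'BEGIN {
    pi = 3.1415926535897932; twopi = 2*pi
    IM = 72; JM = 46; RADIUS = 6375000
    DLON = twopi/IM; DLAT = int(180/(JM-1) + 0.5)*twopi/360; FJEQ = 0.5*(1+JM)
    J = 23                                # example row just south of the equator
    VLATS = DLAT*(J-0.5-FJEQ); VLATN = DLAT*(J+0.5-FJEQ)
    if (J == 1)  VLATS = -twopi/4
    if (J == JM) VLATN = twopi/4
    area = DLON*RADIUS*RADIUS*(sin(VLATN)-sin(VLATS))
    MO = 12400                            # placeholder layer mass, kg/m^2 (roughly a 12 m layer)
    printf "DXYPO(%d) = %.4e m^2\n", J, area
    printf "added G0M = %.4e J\n", 13520*MO*area
}'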

July 18, 2010

Tried to modify the enthalpy as described above. There were some problems when checking the enthalpy field in OCDYN01. After several tries, the Java program was debugged. A new RSF file was generated with specific enthalpy increased by about 10%. A model run of E011smg was performed and the OIJ fields were inspected. The potential temperature in the ACC file ran about 0.3 to 4 C greater than in the ACC file for the control run. It looks like there is a change in salinity due to sea ice melt. The values for the one-month run seem reasonable. A batch job was set up and initiated. Its job number is 3539120. Because the job was submitted at 1 AM Monday morning, it got off the queue very quickly. If this run takes 15 hours like the first one, it should be done by 6 PM tonight.

July 19, 2010

While waiting for the batch job to complete, R scripts need to be written to generate NetCDF files for the model accumulated statistics. The following files are generated by pdE. Some are text files and others are binary. The binary files will be converted to NetCDF format.

  • Ocean
    • OIJ - This is the two dimensional ocean diagnostics. Format of the file is contained in Output_Fields.xls and the script for converting to NetCDF is OIJ_NetCDF.R
    • OIL - This is a two dimensional ocean diagnostic saving vertical sections through the ocean at specified latitudes. Format of this file is contained in Output_Fields.xls and the script for converting to NetCDF is OIL_NetCDF.R
    • OTJ - This text file contains mass, salt and energy transport at different latitudes for the Atlantic, Pacific, Indian and Global ocean basins.
  • Atmosphere
    • WP - This binary file contains wave power for the U and V components of the model. No time was taken to determine the file format or to convert it to NetCDF.
    • ISCCP - This binary file contains cloud statistics for comparison with the ISCCP project. No time was taken to determine the file format or to convert it to NetCDF format.
    • DIURN - This is a text file containing hourly means for various parameters at designated grid points. The default grid points are:
      • AUSD (63,17) - Australia (123.5 E, 26.0 S)
      • MWST (17,34) - Midwest US (97.5 W, 42.0 N)
      • SAHL (37,27) - Sahel Africa (2.5 E, 14.0 N)
      • EPAC (12, 23) - Eastern Pacific (122.5 W, 2.0 S)
    • HDIURN - This is a text file containing hourly values for various parameters at designated grid points. These grid points are the same as for DIURN.

July 20, 2010

The batch job finished at 4:30 pm on July 19. The program pdE was used to generate diagnostic statistics on the resulting data. The process described on July 13, 2010 was used to generate annual and yearly seasonal averages. The yearly seasonal averages were done with the batch file batchconvert. Taking a preliminary look at the data, it looks like the model run was a success. The ocean temperatures at 6 years are cooler than the initial values, but warmer than the reference run.

July 21, 2010

It is time to make a poster and presentation based on the model run. Comparisons will be made between the 6-year winter average of the reference run and both the beginning and ending winters of the modified run. For comparison, the ending winter will also be compared to the ending summer to see how large the seasonal cycle is relative to the perturbation of the model.

Potential temperature comparison

The temperature comparisons are done as a zonal average. This gives some sense of where the mixed layer ends and the deep ocean begins. Comparing the beginning winter with the reference, there is a nearly uniform 3.2 C increase in potential temperature due to the 10% increase of specific enthalpy relative to the surface. The plots generated for comparison are the reference, beginning winter, ending winter, difference between ending and reference, and difference between ending winter and summer. When making these plots, the following parameters are set in Panoply to make the plots look uniform.

  • For the regular plots, the scale goes between -2 and 32 using the panoply GCT color scale.
  • Printing the vertical scale uses the format of %.1f.
  • The plot subtitle is changed to reflect which set of values is being plotted.
  • Plotting the differences uses the same color scale, but the limits are -6 to 6.

Differences between the winter and summer are constrained to the top 3 layers of the model and range from -5.6 to 4.5 C. This layer depth translates into about 57 meters. Changes below this depth are in the range of 0.1 to 0.2 C.

Surface velocities

The surface velocities are plotted for the reference winter and the modified 6th winter.

  • Set the color scale to Blues_09 CPT and the range from 0 to 40 cm/s.
  • Set the overlay to earth mask so only the oceans are displayed.
  • Change the plot label to reflect that it is the magnitude and direction of the surface velocity that is being plotted.

Ocean Boundary Layer Depth

The ocean boundary layer depth is used by the K-Profile Parameterization (KPP) scheme. This scheme predicts an ocean boundary layer depth and then applies a parameterization based on a nonlocal bulk Richardson number and the similarity theory of turbulence. The alternate means of calculating vertical mixing is the Pacanowski and Philander (PP) scheme. For a comparison of these schemes and their use, see Li, X. et al, 2000: A Comparison of Two Vertical Mixing Schemes in a Pacific Ocean General Circulation Model. J. Climate.

  • Use the panoply GCT color scheme and do a log plot. Set the scale from 1 to 2000.
  • Plot the reference, modified 6th year winter and summer. Set the scale to -400 to 400.
  • Change the sub-title and the legend label.
  • The annual BL depths were compared among the reference (1st year control), 6th year control, 1st year modified, and 6th year modified runs.
  • The scale for differences went from -300 to 300 m.
  • The annual BL reference was plotted using a logarithmic scale with the Blues_09 CPT color table.

July 22, 2010

Continued to generate plots for the conference poster session.

Horizontal Mass Transport Streamfunction

This streamfunction is a vertically integrated quantity that determines the meridional mass transport. The units are Sverdrups (Sv), which are the same as 10^6 m^3/s. According to Marshall and Plumb (Atmosphere, Ocean and Climate Dynamics, p. 215), the Amazon river has a volume transport of 0.2 Sv. From Wikipedia, the flow of fresh water from all of the rivers of the world has a value of about 1 Sv. Regions where the volume flow is largest have no zonal flow. Regions with a high northward gradient of this streamfunction are locations of very large flow from east to west. Likewise, if the gradient is southward, the flow of the ocean gyre in that location is west to east. Regions of high streamfunction value occur near the western portion of the ocean basin.

Vertical Mass Flux

The vertical mass fluxes were compared at levels 3, 6, and 9. The differences between the reference, beginning, and ending winters are not significant for levels 3 and 6. However, for level 9 it looks like there is a noticeable increase in the vertical mass flux near the Arctic and Antarctic. This is most likely due to enhanced melting at the poles.

  • Use the panoply GCT color scheme and set the range from -5 to 5 10^2 kg/s m^2 (The range is the same for the differences as well as the regular value.)
  • Change the sub-title and the legend label.
  • Set the overlay to earth mask.

GM/Eddy Heat Flux

There are 9 data fields linked to this quantity. GM stands for the Gent-McWilliams scheme for reducing climate drift in coupled ocean-atmosphere models. This scheme is an ocean eddy parameterization for sub-grid processes. It is assumed that the eddy fluxes are quasi-adiabatic in the ocean interior and can, therefore, be represented as an eddy-induced velocity. (See Ferrari, Eddy-mixed layer interactions in the ocean. (MIT)) The picture this scheme leads to is that the creation of eddies depletes the energy source and, therefore, reduces the slope of the isopycnal surfaces. Once an eddy is formed, it slides along the isopycnal surface and in the process mixes temperature and tracers on the grid scale. (See Neelin and Marotzke, Representing Ocean Eddies in Climate Models (Science)) This scheme does not work in the boundary layer of the ocean because strong diabatic processes are going on there. This leads to the need to taper off the GM scheme and implement a different one, such as the KPP scheme mentioned above. According to correspondence with Gavin Schmidt, the vertical heat flux is positive in the upward direction. When it is used in the subroutine GMFEXP, the flux (RFZT) is added to level L and subtracted from level L+1 (which is deeper in the ocean). Therefore, a positive value for the flux will add heat to an upper level and remove it from a lower level.

  • Plotted the reference and difference from the reference using the panoply GCT color table and set the scale for -9 to 9.
  • In general the flux at the 9th level (749 m) is upward. There is some intensification of flux at the circumpolar current.

Vertical Mass Flux

Looking at levels 3 and 6 of the vertical mass flux, there does not seem to be any significant difference between the control and modified runs.

  • It looks like there is a difference in the vertical mass flux at level 9 that is larger than that with the 6th year of the control.
  • Plot using range of -8 to 8.

Heat Flux Levels 1-13

This plot is a vector plot of the N-S and E-W heat fluxes integrated over the depth of the ocean model.

  • Set the scale from 0 to 4 and use the Blues_09 CPT color table.
  • The vector length is set to a scale length of 3.
  • The heat transport pattern looks the same for all the plots; however, the strength of the flux is doubled at Cape Horn.

Salt Flux Levels 1-13

The integrated salt flux was compared between the control and other runs. There is an increase of flux around Antarctica; however, this is no different from the increase in heat flux and surface velocity in this location. The salt flux increases by 27%. This is probably an indicator of an increased deeper ocean circulation around the continent. No plots were generated.

Salinity

Looking at the vertical profile of salinity shows not much change at greater depths. It looks like any changes are in the upper 4 levels. The salinity is compared at several levels.

  • Not much change shows up in salinity. There is some reduction of salinity at the north pole most likely due to melting ice.
  • Plotting the salinity of the reference uses a scale of 0 to 60 and the div_brown2blue_12 ACT color table.
  • Use the same color table, but use a scale of -10 to 10 for the difference of salinity.

Potential Temperature

The vertical potential temperature profiles for the annual averages do not show much difference. Several levels are plotted along with their differences.

  • For the potential temperature use a scale of -5 to 35 and use the panoply GCT color table.
  • For the difference of potential temperature use a scale of -10 to 10 and the panoply_diff_17 GCT color table.

Ocean Surface Height

The ocean surface height was compared between the reference and modified runs. The warmer ocean does give an increased surface height. On average, the increase corresponds to about 2 meters.

  • Plot the reference surface height between -10 and 10 using the panoply GCT color table.
  • Plot the differences between -5 and 5 using the panoply_diff_17 GCT color table.

July 23, 2010

Continued working on the poster for the CGS meeting next week.

August 3, 2010

While at the conference, I was able to rename the variables in the IJ file to match how the netcdf routines in modelE do it. I was also able to successfully generate an OIL file. If the second dimension is relabeled to lat, Panoply reads and displays it; however, it assumes the file is based on geographic coordinates.

Need to find another program other than Panoply to display NetCDF files. This can be done in R, but look for other sources before developing a series of R scripts to do plotting. One possibility is NCL (NCAR Command Language).

Installing modelE on a Linux virtual machine. Place all of the relevant files on a secondary drive. When using it, the .modelErc file needs to be copied to the home directory of the user. The data files will be saved on the secondary drive. Use the relative addressing scheme from the batch job runs on the OSC computer.

August 5, 2010

Trying to get modelE to run on a Linux desktop. To get a successful compile, the netcdf calls in Rules.make were removed. Also, the function iargc is an implicit function in gfortran (a holdover from g77). To get this to work, the file MODELE.f needs to have all the references to iargc removed and the function call iargc() replaced with command_argument_count(). The program compiled; however, when startup was run for the first month, a segmentation fault was received. The value of ulimit was expanded, but that had no effect. Also the memory of the virtual machine was pushed up to 1.5 GB, but there was still a segmentation fault.

Try the following to get ModelE running on a Linux Desktop.

  • Try to compile ModelE on OSC using gfortran.
  • Take the compiled code from OSC and download it to the Linux Desktop and see if it will run.

The second option (compile and download to desktop) did not work at face value. There may be some issues with the shell script that calls the binary, but there does seem to be a significant problem that needs to be explored if the other option does not pan out.

I looked at the resource usage at OSC. Of the 5000 allocated units, 24.54 are used. The balance as of today is 4975.35768. Compare this balance with a balance at a later time to see if general use of the account reduces the RU balance. Most if not all of this usage is due to the two batch jobs, which each ran modelE with a dynamic ocean for six years. Using this same model, it is then expected that each simulated year will cost about 2 RUs. That means that with the original allocation, a combined simulation of 2,500 years could be performed. Since it was observed that there is some drift in the model using the 1DEC1969 restart file (supposedly already run for 300 years), it is suggested by the folks at GISS that a total of 500 years of spin-up be done. That means that the model should be spun up for an additional 200 years, which would use 400 RUs. Before this is done, the most recent version of ModelE should be downloaded using CVS.
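
Spelled out, the bookkeeping behind those numbers is (bc used just for the division):

echo "scale=2; 24.54/12" | bc    # two 6-year runs so far, so roughly 2 RU per simulated year
echo "5000/2" | bc               # the full allocation covers about 2,500 simulated years
echo "200*2" | bc                # an extra 200-year spin-up would cost about 400 RU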

Before downloading the most recent version of ModelE, the limits of the ocean model should be tested. An increase of 10% to specific enthalpy resulted in an average ocean temperature increase of 3 C. How much more can the ocean be warmed before it causes a problem with the model?

At this point, let's see if gfortran can be used to run the model on the OSC machine. Here are the changes made to the ModelE files to see if a compile can be accomplished.

  • Change the COMPILER environment variable in .modelErc to GFORTRAN.
  • Change the MP environment variable in .modelErc to NO. Get a good compile first before addressing multiple processors.
  • Modify the file Rules.make
    • Under the condition ifdef NETCDFHOME, remove the references to -lnetcdff and -lnetcdf from the LIBS definition.
    • Copy lines 173 - 191 after line 191. This is the compiler definition for PGI. This is modified to fit the Gnu Fortran compiler.
    • Lines 193, 194; change PGI to GFORTRAN.
    • Change line 195: F90 = gfortran -fconvert=big-endian
    • Change line 199: FFLAGS = -cpp -O2
    • Change line 203: FFLAGS += -fopenmp
    • Change line 204: LFLAGS += -fopenmp
    • Copy lines 234-239 after line 239. This is the compiler forced recompile for PGI.
    • Change line 340; change PGI to GFORTRAN.
    • Add another endif at line 348.
  • Remove references to iargc in MODELE.f and replace call iargc() with command_argument_count()
  • Compile was successful, but setup resulted in a runtime error of Invalid argument at line 858 of MODELE.f
  • Ran the same changes with GNU Fortran on the Linux workstation and got the segmentation fault from before. No other error messages are conveyed. It may be a problem with line 858 of MODELE.f.
  • Line 858 of MODELE.f is: read(iu_GIC)
  • iu_GIC is opened two lines earlier using: call openunit("GIC", iu_GIC,.true.,.true.)
  • The Invalid argument error may be related to the reading of big-endian unformatted data. The record marker size can be specified using the compile option -frecord-marker=4, which forces the record marker to be a 32-bit integer.
  • Both the OSC and Linux workstation versions of the model give a segmentation fault at this point. The OSC version gives additional information and says that it occurred on line 5 of the shell script, which states: ./"E001gnu".exe -i I >> E001gnu.PRT
  • Expanding the stack size does not help. Possible causes of a segmentation fault are buffer overflow, uninitialized pointers, and dereferenced NULL pointers. From an earlier error, it looked like reading GIC caused a problem, but this message went away when the -frecord-marker=4 option was added. Is there something else the matter with reading an unformatted data file? Check to see if GIC is read before AIC. A standalone read test is sketched below.
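
To isolate the GIC read, a throwaway test program can be compiled with the same flags (a sketch; the GIC file name below is taken from the boundary/initial condition list earlier in these notes and should be swapped for whatever the rundeck actually points at):

cat > read_gic_test.f <<'EOF'
      PROGRAM TESTGIC
C     Try to read the first record of the ground IC file, opened the
C     same way the model opens it (big-endian unformatted sequential).
      OPEN(10, FILE='GIC.E046D3M20A.1DEC1955', FORM='UNFORMATTED',
     *     STATUS='OLD')
      READ(10)
      PRINT *, 'first GIC record read without error'
      CLOSE(10)
      END
EOF
gfortran -fconvert=big-endian -frecord-marker=4 read_gic_test.f -o read_gic_test
./read_gic_test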