3 pghpf Runtime Options

3.1 Runtime Options and Environment Variables
3.2 Print Execution Information stat
- 3.2.1 CPU Statistics stat

This chapter describes features of the pghpf runtime library that the programmer controls using runtime options or by setting environment variables. The pghpf runtime library uses transport-independent calls and is implemented using different library versions for different targets. The transport-independent calls interact with a transport-dependent interface supporting different communications protocols. The transport mechanisms currently supported include:

MPI (Message Passing Interface). This includes a number of transports, including the public domain software available from Argonne National Laboratory and Mississippi State University.
PVM (Parallel Virtual Machine). Public domain software available from Oak Ridge National Laboratory. PVM is a software system that enables a collection of heterogeneous computers to be used as a coherent and flexible concurrent computational resource.
RPM (PGI Proprietary Communications - Real Parallel Machine). This transport mechanism was developed at PGI to model the behavior of PVM among a homogeneous group of hosts on a network. It offers greater efficiency and performance than PVM with fewer requirements.
RPM1 (RPM, single process only). This transport mechanism runs on a single processor and doesn't fork. This library works like RPM and assists with debugging by allowing a single node version of an HPF program to run and work in a manner very similar to a multi-node program.

There are different versions of the pghpf runtime libraries for each supported transport mechanism. Depending on your hardware and the product or products you purchased from PGI, your pghpf compilation tools may include libraries for one or more transport mechanisms.

From the HPF programmer's point of view, the differences between versions of the pghpf runtime library have little effect on program development. The differences are:

Linking involves using different versions of the pghpf runtime libraries. Using the pghpf compilation driver, you can select the library version by supplying a different -M arg option on the command line during linking, for example -Mpvm. A default communications option is set by the driver, so you need not specify one on the command line unless you want to change the default.
Some runtime options and corresponding environment variables are different for different versions of the pghpf runtime library.
Program debugging techniques differ depending on the pghpf runtime library.

3.1 Runtime Options and Environment Variables

This section describes the runtime options and environment variables for executable HPF programs linked with the pghpf runtime libraries. All runtime options may be specified as command line options or environment variables. The environment variable corresponding to a runtime option -xxx is PGHPF_XXX. Runtime command line options override values specified by environment variables.

The following sections describe the options for each pghpf runtime library. Many options are similar across libraries, but some have slightly different meanings depending on the runtime library.

3.1.1 Specify Command Line Options in the Environment

All runtime options can be specified either as command line options or environment variables. An example using command line options:

a.out -pghpf -np 8 -stat all

An example using environment variables:

setenv PGHPF_NP 8
setenv PGHPF_STAT all
a.out

Options may also be specified using command line format in an environment variable:

setenv PGHPF_OPTS "-np 8 -stat all"

Command line options override environment variables, and individual environment variables override the PGHPF_OPTS environment variable.
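The precedence rules above can be modeled with a small lookup. This is an illustrative sketch only, not the pghpf runtime's own code: the function name resolve_option and the dictionary inputs are hypothetical, and the sketch assumes every option takes exactly one value.

```python
import shlex

def resolve_option(name, cli_opts, env):
    """Return the effective value of runtime option `name` (for example "np").

    Lookup order follows the text: command line first, then the individual
    PGHPF_<NAME> variable, then PGHPF_OPTS.
    `cli_opts` is a dict of options given after -pghpf and `env` is a dict
    standing in for the process environment (both are hypothetical inputs).
    """
    if name in cli_opts:
        return cli_opts[name]
    var = "PGHPF_" + name.upper()
    if var in env:
        return env[var]
    # PGHPF_OPTS holds options in command line format, e.g. "-np 8 -stat all".
    tokens = shlex.split(env.get("PGHPF_OPTS", ""))
    for i, tok in enumerate(tokens[:-1]):
        if tok == "-" + name:
            return tokens[i + 1]
    return None

env = {"PGHPF_NP": "4", "PGHPF_OPTS": "-np 8 -stat all"}
print(resolve_option("np", {"np": "2"}, env))  # command line wins
print(resolve_option("np", {}, env))           # PGHPF_NP beats PGHPF_OPTS
print(resolve_option("stat", {}, env))         # only PGHPF_OPTS sets -stat
```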

3.1.2 RPM Runtime Library

RPM is the PGI Real Parallel Machine system. RPM supports process spawning and communication among processes on a group of homogeneous hosts. Processes use UNIX sockets for communication. Processes on the same host may optionally use shared global memory for communication. RPM is similar to PVM, but RPM offers greater efficiency and performance than PVM, with fewer administrative requirements.

RPM first creates the processes needed to run the program. It uses rsh by default to create remote processes. It also establishes socket connections between each pair of processes (this phase is functionally equivalent to MPI or PVM).

The processes then exchange hostname and process ID information. The processes then determine how many other processes of the same job reside on the same host. If there is only one process on the host, the process always uses sockets for communication.

The executable runtime option -pghpf specifies that the following options are to be passed to the communications control portion of the executable program. The -pghpf option allows you to pass user-defined runtime options to your application program and communications control runtime options to the pghpf runtime library.

In general, the command line format for a compiled program is:

% a.out user_options  -pghpf PGHPF_options

where: a.out is the executable program, user_options are the program's valid runtime options and PGHPF_options are any of the valid options for the pghpf runtime library.
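The division of the command line at -pghpf can be pictured as follows. This is a minimal sketch under the assumption that everything before the first -pghpf belongs to the program and everything after it belongs to the runtime library; the function name split_args and the sample user options are hypothetical.

```python
def split_args(argv):
    """Split argv (without the program name) at the first -pghpf marker.

    Returns (user_options, pghpf_options).
    """
    if "-pghpf" in argv:
        i = argv.index("-pghpf")
        return argv[:i], argv[i + 1:]
    # No marker: every argument is a user option.
    return list(argv), []

user_opts, pghpf_opts = split_args(["-v", "in.dat", "-pghpf", "-np", "8"])
print(user_opts)   # arguments seen by the application
print(pghpf_opts)  # arguments seen by the runtime library
```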

Table 3-1 shows the valid RPM executable options (-pghpf options). Detailed descriptions are provided in the sections following Table 3-1.

Table 3-1 RPM Runtime Library Options and Variables

Option	Environment Variable	Purpose
-curhost hostname	PGHPF_CURHOST hostname	Specify alternate name for current host.
-debugger path	PGHPF_DEBUGGER path	Specify the default debugger.
-g n | all	PGHPF_G [n | all]	Enable parallel debugging.
-heapz n	PGHPF_HEAPZ n	Specify the shared global heap size.
-host hostargs	PGHPF_HOST hostargs	Specify the hosts where additional processes are to be spawned.
-minxfer size	PGHPF_MINXFER size	Specify minimum unbuffered message size.
-mount arg	PGHPF_MOUNT arg	For systems that use auto mounted file systems, this option fixes a problem where the RPM library is not able to resolve NFS pathnames.
-np num	PGHPF_NP num	Specify number of processors.
-rsh shell	PGHPF_RSH	Specify an alternate shell to be used in spawning processes on remote systems.
-stat options	PGHPF_STAT options	Print runtime statistics upon program completion.

Number of Processors np

The runtime option -np n and the environment variable PGHPF_NP num specify the number of processors to use. When neither of these options is specified, the default value for the number of processors is set to one (1). The argument supplied for n or num must be a positive integer.
For example, to compile and run the HPF program test1.hpf using four processors, issue the following commands:
$ pghpf -o test1 test1.hpf
$ test1 -pghpf -np 4
If your program uses the PROCESSORS directive and includes the intrinsic NUMBER_OF_PROCESSORS, then the value supplied to the -np runtime option (or the PGHPF_NP environment variable) will be the number of processors for the current HPF execution. For example:
!HPF$ PROCESSORS NUM_PROC(NUMBER_OF_PROCESSORS())
If your program does not use an intrinsic to calculate the number of processors available, but uses an explicit PROCESSORS directive with a value, then the -np value should be at least as great as the value used in the program's PROCESSORS directive. For example, if the value in the directive is four, then the -np argument following -pghpf should be greater than or equal to four:
!HPF$ PROCESSORS NUM_PROC(4)
If your program does not use a PROCESSORS directive, the program runs on the number of processors specified in the -np runtime option.

Specify Hosts host

The runtime option -host and the PGHPF_HOST environment variable specify the hosts where compute processes will be spawned. The command line option has the following syntax:
-host [ host[:n] |,-file=path |,-v |,-dyn ]...
All items in the command line option are separated by commas. The command line option has the same arguments as the environment variable.
The first process always runs on the host where the program was started. If this option is not specified, all of the processes run on the same host.
This option specifies the hosts where the compute processes will be created. The first process' host is taken into account to properly spread processes across the specified hosts.
The following examples assume the program was started on moe.
The simplest form is simply a list of hostnames:
PGHPF_HOST moe,larry,curly,bill
The hosts are chosen in the order specified. If two compute processes were specified, moe and larry would be chosen. If the program requested five processes, the hosts would be chosen in the following order: moe, larry, curly, bill, and moe.
A hostname can be followed by a power factor, for example:
PGHPF_HOST moe:50,larry:150,curly,bill:25
When any host has a power factor specified, the hosts are chosen in the order of the specified power factor. The default power factor is 100. In the above example, curly has a power factor of 100, moe has a power factor of 50, larry has a power factor of 150 and bill has a power factor of 25. The power factor specifies the following order for host selection: larry, curly, moe, bill.
If multiple hosts have the same power factor, the order in which they are chosen is undefined.
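The two selection rules can be modeled in a few lines. This is a sketch under stated assumptions, not the RPM library's algorithm: the function names are hypothetical, the -file, -v, and -dyn items are not handled, the special treatment of the first process' host is ignored, and wrapping around the list when more processes than hosts are requested is only confirmed by the text for the case without power factors.

```python
def host_order(spec):
    """Return hostnames in selection order for a spec such as
    "moe:50,larry:150,curly,bill:25"."""
    hosts = []
    any_power = False
    for item in spec.split(","):
        name, sep, power = item.partition(":")
        if sep:
            any_power = True
        # Hosts without an explicit power factor default to 100.
        hosts.append((name, int(power) if sep else 100))
    if any_power:
        # Highest power factor first. The text leaves the order of ties
        # undefined; Python's stable sort keeps them in listed order.
        hosts.sort(key=lambda h: -h[1])
    return [name for name, _ in hosts]

def assign(spec, nprocs):
    """Assign nprocs processes to hosts, wrapping around the ordered list."""
    order = host_order(spec)
    return [order[i % len(order)] for i in range(nprocs)]

print(assign("moe,larry,curly,bill", 5))
print(host_order("moe:50,larry:150,curly,bill:25"))
```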
The hostnames and optional power factors can be read from a file. Assuming the file hosts contained:
moe 50
larry 150
curly
The following example is equivalent to the previous example:
PGHPF_HOST -file=hosts,bill:25
Multiple hostnames and -file options can be specified in any order desired. The order is significant only if no host has a power factor.
The -dyn option changes each host's effective power factor based on that host's current load average. The calculation is:
current_power = power / (load+1)
With the -dyn option, the calculated current_power factor is used to choose hosts. If a host fails to respond to the load average request, that host is ignored. This option can take some time to process since each host's load average is requested. Hosts are timed out after 10 seconds and ignored.
Normally, all of the above is done silently. The -v option displays informational messages.
A final example:
PGHPF_HOST -dyn,-file=hosts,bill:25,-v
The order of the -file, -dyn, and -v options is not important.
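The -dyn calculation described above can be sketched as follows. The formula comes from the text; everything else is an assumption made for illustration: the function names are hypothetical, the load averages are made-up values, and a host that failed to respond is modeled simply as a missing entry in the loads dictionary.

```python
def current_power(power, load):
    """Effective power factor under -dyn: power / (load + 1)."""
    return power / (load + 1)

def dyn_order(hosts, loads):
    """Order hosts by their current power factor, highest first.

    hosts: dict mapping hostname to its static power factor.
    loads: dict mapping hostname to its load average; hosts that did not
           answer the load request are absent and are ignored.
    """
    scored = {h: current_power(p, loads[h]) for h, p in hosts.items() if h in loads}
    return sorted(scored, key=scored.get, reverse=True)

hosts = {"moe": 50, "larry": 150, "curly": 100, "bill": 25}
loads = {"moe": 0.0, "larry": 1.0, "curly": 0.0}  # bill did not respond
print(dyn_order(hosts, loads))
```

With these sample loads, larry's power factor of 150 drops to 75, so curly (100) is chosen ahead of it.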
Specifying Alternate Hostname curhost

When systems are interconnected by multiple networks, the hosts often have different names on different networks. Any of a remote host's alternative names may be specified with the -host option. The -curhost option is used to specify an alternative name for the current or local host. The -host and -curhost options together can be used to specify the alternate hostnames that in turn select a different network.
Specify Debugging g and debugger

The -g runtime option or PGHPF_G environment variable invokes a debugger.
Debuggers are specified by setting the PGHPF_DEBUGGER environment variable to the pathname of the debugger. For example,
setenv PGHPF_DEBUGGER /usr/ucb/dbx
If more than one debugger is present, entering commands is difficult. A better method for debugging is to create a shell script to create a window for each debugger. For example, create a script named rpmdbg (any name will do) containing the following.
exec xterm -e /usr/ucb/dbx $1
And then set PGHPF_DEBUGGER to rpmdbg.
All pghpf runtime options are specified on the command line that invokes the program. The user-supplied runtime arguments must be specified with each debugger's run command.
The file rpmdbg must be accessible in the current PATH.
The optional n and all arguments to the -g option and the PGHPF_G variable select the processes to debug. The integer n specifies a logical process number (this should be between 0 and the number of processors). The all keyword specifies that all processes are debugged.
Specify Minimum Unbuffered Message Size minxfer

The -minxfer option specifies the minimum message size that will not be buffered. Messages greater than or equal to the specified size are sent as individual messages. Consecutive messages less than the specified size sent to the same destination are buffered. The default value is currently 2 Kilobytes. Changing this value may improve performance on some systems.
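The buffering rule can be illustrated with a small model. This sketch only groups messages according to the size rule stated above; it is not how RPM is implemented, the function name plan_sends is hypothetical, and it does not model when a buffer is actually flushed.

```python
def plan_sends(messages, minxfer=2048):
    """Group a sequence of (destination, nbytes) messages into transfers.

    Messages of at least `minxfer` bytes become individual transfers.
    Consecutive smaller messages to the same destination share one
    buffered transfer.
    """
    transfers = []  # each entry: [dest, [sizes], is_buffered]
    for dest, nbytes in messages:
        small = nbytes < minxfer
        last = transfers[-1] if transfers else None
        if small and last and last[0] == dest and last[2]:
            last[1].append(nbytes)
        else:
            transfers.append([dest, [nbytes], small])
    return [(dest, sizes) for dest, sizes, _ in transfers]

print(plan_sends([(1, 100), (1, 200), (1, 4096), (1, 50), (2, 50)]))
```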
Specify NFS Mount Path for Automounted Systems mount

There is a problem when NFS filesystems are mounted at different points on different hosts. Pathnames valid on one host are not necessarily valid on other hosts.
If filesystems are not mounted at the same directories on remote systems, remote RPM processes will not be able to find files. This can usually be corrected by specifying the -mount runtime command line option or the PGHPF_MOUNT environment variable.
The value specified is a list of match strings and replacement strings.
-mount match0:replace0,match1:replace1,...
For example:
a.out -pghpf -mount /home/u:/home/1,/home/g:/home/1
The pathname /home/u would be changed to /home/1 and /home/g would be changed to /home/1.
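The match and replacement list can be pictured as a pathname rewrite. This is a sketch only: the text does not say how RPM matches pathnames, so the prefix matching and first-match-wins behavior here are assumptions, and the function names are hypothetical.

```python
def parse_mount(arg):
    """Parse "match0:replace0,match1:replace1" into a list of pairs."""
    pairs = []
    for item in arg.split(","):
        match, replace = item.split(":", 1)
        pairs.append((match, replace))
    return pairs

def remap(path, pairs):
    """Rewrite `path` using the first pair whose match string is a prefix."""
    for match, replace in pairs:
        if path.startswith(match):
            return replace + path[len(match):]
    return path  # no match: leave the pathname unchanged

pairs = parse_mount("/home/u:/home/1,/home/g:/home/1")
print(remap("/home/u", pairs))
print(remap("/home/g/src/a.out", pairs))
```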
Specify Remote Shell -rsh

The -rsh option and the PGHPF_RSH environment variable specify the name of the remote shell used for creating remote processes. The default is "rsh". If your preferred remote shell were named "ksh", the option would be:
a.out -pghpf -rsh ksh

Print Execution Information

Refer to section 3.2, "Print Execution Information", for details on printing runtime execution statistics and on the options that control them.
3.1.3 Shared Memory Systems - RPM

If there is more than one process on the same host, RPM creates a shared-memory segment or mmaps a temporary file to use for communications (depending on the target system). The processes on the host then attach the other processes' shared-memory segments or mmap the other processes' temporary files. The RPM runtime option -heapz specifies the size of the shared-memory segment or the temporary file.
The processes then allocate all globally-accessible data from the shared-memory segment or from the mmap'ed file called the global heap. Communication between processes on the same host is then a copy between global heaps. Communication between processes on different hosts uses sockets.
Never use shared memory with more processes than available processors; doing so severely degrades performance.
Global Heap Size heapz

Processes by default use Unix sockets for communication. Processes on the same host may optionally use a shared global heap for communication.

NOTE

The use of a shared global heap will generally improve performance on systems where there is a one-to-one correspondence between logical and physical processors.

The -heapz size command line option and PGHPF_HEAPZ size environment variable specify the size of the shared global heap. The default size is zero, which forces all processes to use sockets for communication. The size may be specified with a 'k' or 'm' suffix: 'k' specifies kilobytes and 'm' specifies megabytes. For example, specifying 4m is the same as specifying 4194304.
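The size syntax can be checked with a short conversion routine. This is an illustrative sketch, with a hypothetical function name, that assumes the suffixes are binary multiples, which is consistent with 4m being equal to 4194304.

```python
def parse_heap_size(text):
    """Convert a -heapz argument such as "4m", "512k", or "4194304" to bytes."""
    units = {"k": 1024, "m": 1024 * 1024}
    text = text.strip()
    if text and text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text)  # no suffix: the value is already in bytes

print(parse_heap_size("4m"))  # 4194304
```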
If the program attempts to allocate more memory than available in the shared global heap, a message is displayed and the program aborts. A rough estimate of the shared global heap size required can be determined by running the program with the "-stat mems" option and looking at the "heap used" value. Refer to section 3.2, "Print Execution Information", for more information on the -stat option.
The -heapz option is recommended when all of the processes of a program are run on a single shared memory multi-processor system. However, if the processes of a program run on multiple systems, each group of processes on a single system will use a shared global heap for communication. Sockets are still used for communication between processes on different systems.
3.1.4 RPM1 Runtime Library

RPM1 is the PGI Real Parallel Machine system for a single processor. RPM1 supports debugging for an RPM program. This system runs on a single processor and doesn't fork. This library works like RPM and assists with debugging by allowing a single node version of an HPF program to run and work in a manner very similar to a multi-node program.
The executable runtime option -pghpf specifies that the following options are to be passed to the communications control portion of the executable program. The -pghpf option allows you to pass user-defined runtime options to your application program and communications control runtime options to the pghpf runtime library.
In general, the command line format for a compiled program is:
% a.out user_options -pghpf PGHPF_options
where: a.out is the executable program, user_options are the program's valid runtime options and PGHPF_options are any of the valid options for the pghpf runtime library.
Table 3-2 shows the valid RPM1 executable options (-pghpf options).

Table 3-2 RPM1 Runtime Library Option and Variable

Option	Environment Variable	Purpose
-stat options	PGHPF_STAT options	Print runtime statistics upon program completion.
Print Execution Information
Refer to section 3.2, "Print Execution Information", for details on printing runtime execution statistics and on the options that control them.
3.1.5 PVM Runtime Library

PVM is the Parallel Virtual Machine System available from Oak Ridge National Laboratory. PVM is a software system that enables a collection of computers to be used as a coherent and flexible concurrent computational resource. The options in this section apply to programs using PVM for communications.
The executable runtime option -pghpf specifies that the following options are to be passed to the communications control portion of the executable program. The -pghpf option allows you to pass user-defined runtime options to your application program and communications control runtime options to the pghpf runtime library.
In general, the command line format for a compiled program is:
% a.out user_options -pghpf pghpf_options
where: a.out is the executable program, user_options are the program's valid runtime options and pghpf_options are any of the valid options for the pghpf runtime library.
Running a PVM program requires that PVM is installed on your system, and that the PVM daemon is running. In addition, the PVM_ROOT and PVM_ARCH environment variables need to be set, and the executable needs to reside in the directory appropriate for your system, $PVM_ROOT/bin/$PVM_ARCH.
Table 3-3 shows the valid PVM executable options (-pghpf options).

Table 3-3 PVM Runtime Library Options and Variables

Option	Environment Variable	Purpose
-np num	PGHPF_NP num	Specify number of processors.
-stat options	PGHPF_STAT options	Print runtime statistics upon program completion.

PVM - Number of Processors np

The command line option -np n and the environment variable PGHPF_NP num specify the number of processors to use. When neither of these options is specified, the default value for the number of processors is set to one (1). The argument supplied for n or num must be a positive integer.
For example, to compile and run an HPF program, test1.hpf, using four processors, issue the following commands:
$ pghpf -Mpvm -o test1 test1.hpf
$ test1 -pghpf -np 4
If your program uses the PROCESSORS directive and includes the intrinsic NUMBER_OF_PROCESSORS, then the value supplied to the -np command line option (or the PGHPF_NP environment variable) will be the number of processors for the current execution. For example:
!HPF$ PROCESSORS NUM_PROC(NUMBER_OF_PROCESSORS())
If your program does not use an intrinsic to calculate the number of processors available, but uses an explicit PROCESSORS directive with a value, then the -np value should be at least as great as the value used in the program's PROCESSORS directive. For example if the value supplied is four, then the -pghpf -np argument should be greater than or equal to four.
!HPF$ PROCESSORS NUM_PROC(4)
If your program does not use a PROCESSORS directive, the program runs on the number of processors specified in the -np command line option.
Print Execution Information

Refer to section 3.2, "Print Execution Information", for details on printing runtime execution statistics and on the options that control them.
3.1.6 MPI Runtime Library

MPI consists of a number of transport mechanisms conforming to the Message Passing Interface standard. One version of MPI supported by pghpf is the Public Domain software available from Argonne National Laboratory and Mississippi State University. MPI is a software system that enables a collection of computers or processors of a single computer to be used as a coherent and flexible computational resource.
The options in this section apply to programs using MPI for communication.
The executable runtime option -pghpf specifies that the following options are to be passed to the communication control portion of the executable program. The -pghpf option allows you to pass user-defined options to your application program and communications control runtime options to the pghpf runtime library.
In general, the command line format for a compiled program is:
% mpirun mpirun_options a.out user_options -pghpf pghpf_options
where:

mpirun	is the command used to execute MPI programs.
mpirun_options	are the options to the mpirun command.
a.out	is the executable program.
user_options	are the program's options.
pghpf_options	are any of the valid options to the pghpf runtime library.
Most systems require the use of the mpirun command or some other command to execute programs. Refer to your system's documentation for details.
Running an MPI program requires that MPI is installed on your system.
Table 3-4 shows the valid MPI executable options (-pghpf options).

Table 3-4 MPI Runtime Library Options and Variables

Option	Environment Variable	Purpose
-unsafe value	PGHPF_UNSAFE yes | no	Enable/Disable unsafe optimizations

Print Execution Information

Refer to section 3.2, "Print Execution Information", for details on printing runtime execution statistics and on the options that control them.

3.2 Print Execution Information stat

On most systems, the -stat runtime option or the PGHPF_STAT environment variable causes runtime statistics to be displayed when the program finishes running. Several types of statistics are available, including CPU usage, memory usage, and message transfers. The statistics option allows any of the following arguments:
-stat [cpu|mem|msg|all|cpus|mems|msgs|alls]
The "s" versions provide information for all processors running the program on a per-processor basis. Options without the "s" provide summary information.
The PGHPF_STAT environment variable allows any of the following arguments corresponding to those available for the -stat runtime option.
PGHPF_STAT [cpu|mem|msg|all|cpus|mems|msgs|alls]
The message statistics will be zero, unless the -Mstats option is specified on the pghpf compiler command line (refer to Chapter 2, pghpf Compiler Options, for details). Enabling the message statistic collection with this option may slightly reduce performance on some systems.
3.2.1 CPU Statistics stat

To run a program, for example test2, and display the CPU-related execution statistics, use the command:
% test2 -pghpf -np 8 -stat cpus
cpu	real	user	sys	ratio	node
0*	5.22	4.91	0.08	96%	0
1	5.11	4.87	0.07	97%	1
2	5.11	4.86	0.04	96%	2
3	5.12	4.87	0.02	96%	3
4	5.10	4.86	0.00	95%	4
5	5.01	4.87	0.01	97%	5
6	5.09	4.86	0.02	96%	6
7	5.09	4.86	0.02	96%	7
min	5.01	4.86	0.00
avg	5.11	4.87	0.03
max	5.22	4.91	0.08
total	5.22	38.96	0.26	7.52x
The first eight lines show information for each processor. The * in the first line indicates the processor printing the information.
The first column shows the logical processor number.
The second column shows the real or elapsed time in seconds.
The third column shows the user CPU time in seconds.
The fourth column shows the system CPU time in seconds.
The fifth column shows the percentage of the elapsed time that the CPU was active. This is calculated using the following formula:
(user time + sys time)/ real time
The last column shows the processors' hostname or system-dependent id; on some systems this is the physical processor number or node, on others it is the socket or process id.
The three lines before the last line show the minimum, average, and maximum of the real, user, and system times reported.
The last line shows the maximum elapsed time, the total user and system times, and the speedup factor. The speedup factor is calculated from the values on this line using the following formula: (total user time + total sys time) / maximum real time.
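The ratio and speedup formulas can be checked against the sample output. The following sketch recomputes them from the per-processor times shown above; the variable and function names are hypothetical, and the code is an illustration rather than the runtime's own calculation.

```python
# Per-processor (real, user, sys) times in seconds, copied from the
# -stat cpus sample output above.
rows = [
    (5.22, 4.91, 0.08), (5.11, 4.87, 0.07), (5.11, 4.86, 0.04),
    (5.12, 4.87, 0.02), (5.10, 4.86, 0.00), (5.01, 4.87, 0.01),
    (5.09, 4.86, 0.02), (5.09, 4.86, 0.02),
]

def ratio(real, user, system):
    """Fraction of elapsed time during which the CPU was active."""
    return (user + system) / real

total_real = max(real for real, _, _ in rows)     # maximum elapsed time
total_user = sum(user for _, user, _ in rows)     # total user time
total_sys = sum(system for _, _, system in rows)  # total system time
speedup = (total_user + total_sys) / total_real

for node, row in enumerate(rows):
    print(f"{node}  ratio {ratio(*row):.0%}")
print(f"total {total_real:.2f} {total_user:.2f} {total_sys:.2f} {speedup:.2f}x")
```

Recomputing from the two-decimal values printed in the sample gives a speedup of roughly 7.51 rather than the 7.52x shown, presumably because the runtime works from unrounded times.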
On most systems, the -stat cpu runtime option or the PGHPF_STAT environment variable set to cpu displays only the minimum, average, maximum, and total time for the CPU statistics.
% test2 -pghpf -np 8 -stat cpu
cpu	real	user	sys	ratio	node
min	5.01	4.86	0.00
avg	5.11	4.87	0.03
max	5.22	4.91	0.08
total	5.22	38.96	0.26	7.52x
If only one processor executes the program, the minimum, average, and maximum times are not shown.
All times are acquired from the host operating system using calls such as gettimeofday(), getrusage(), and times(). Note that the output is not always accurate; for example, the user and system time for a single processor may occasionally exceed the real time.
Memory Statistics

To run a program, for example test2, and see the memory-related execution statistics, use the command:
%test2 -pghpf -np 8 -stat mems
The mems option displays the following columns: heap used, page faults, signals received, voluntary switches, involuntary switches, and res size, as shown in Table 3.6.

Table 3.6 Memory Statistics

memory	heap used	pag flts no i/o	pag flts i/o	signals received	voluntary switches	involunt switches	res size (pages)
0* 8MB	2306	0	0	0	0	4901	89
1 8MB	2291	0	0	0	0	4907	99
total 16MB	4597	0	0	0	0	9808	188
The second column, heap used, shows the total heap used by all nodes.
The third column, pag flts (no i/o), shows the total number of page faults that did not require I/O.
The fourth column, pag flts (i/o), shows the total number of page faults that did require I/O.
The fifth column shows the total number of signals received.
The sixth column shows the total number of voluntary context switches.
The seventh column shows the total number of involuntary context switches.
The eighth column shows the total number of resident set pages.
The third through eighth columns show information returned by getrusage().
Some systems do not fully support getrusage(); these systems display zero for unsupported fields. Refer to your system's man pages for more information. [1]
One line is displayed for each node present. The last line is the same as the line displayed by the option -stat mem.
On most systems, the -stat mem runtime option or the PGHPF_STAT environment variable set to mem displays only the totals for the memory statistics.
Message Statistics

To run a program, for example test2, and see the message-related execution statistics, use the command:
%test2 -pghpf -np 8 -stat msgs
The message statistics will be zero, unless the -Mstats option is specified on the pghpf compiler command line (refer to Chapter 2, pghpf Compiler Options, for details). Enabling the message statistic collection with this option may slightly reduce performance on some systems.
The msgs option will display the following:

messages	send cnt	send total	send avg	recv cnt	recv total	recv avg	copy cnt	copy total	copy avg
0*	0	0B	0B	100	400B	4B	0	0B	0B
1	0	0B	0B	100	400B	4B	0	0B	0B
2	0	0B	0B	100	400B	4B	0	0B	0B
3	300	1KB	4B	0	0B	0B	100	400B	4B
total	300	1KB	4B	300	1KB	4B	100	400B	4B
One line is displayed for each node.
The msg and msgs options display statistics on send counts, receive counts, and copy counts.
All Statistics

The all and alls options are equivalent to specifying the cpu, mem, and msg options or the cpus, mems, and msgs options, respectively. The message statistics will be zero, unless the -Mstats option is specified on the pghpf compiler command line (refer to the preceding section for details).
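The relationship between the -stat arguments can be summarized in code. This is a sketch with a hypothetical function name; it only models how the documented argument names relate to each other, not how the runtime parses them.

```python
def expand_stat(option):
    """Return (kinds, per_processor) for a -stat argument.

    kinds is the list of statistics selected; per_processor is True for
    the "s" forms (cpus, mems, msgs, alls), which report every processor.
    """
    per_processor = option.endswith("s")
    base = option[:-1] if per_processor else option
    if base not in ("cpu", "mem", "msg", "all"):
        raise ValueError("unknown -stat argument: " + option)
    kinds = ["cpu", "mem", "msg"] if base == "all" else [base]
    return kinds, per_processor

print(expand_stat("alls"))
print(expand_stat("mem"))
```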

[1] Although Solaris does not fully support getrusage, all fields are supported using the proc file.
