This chapter describes features of the pghpf runtime library that the programmer controls using runtime options or by setting environment variables. The pghpf runtime library uses transport-independent calls and is implemented using different library versions for different targets. The transport-independent calls interact with a transport-dependent interface supporting different communications protocols. The transport mechanisms currently supported include:
From the HPF programmer's point of view, the differences between versions of the pghpf runtime library have little effect on program development. The differences are:
This section describes the runtime options and environment variables for executable HPF programs linked with the pghpf runtime libraries. All runtime options may be specified either on the command line or as environment variables. The environment variable corresponding to a runtime option -xxx is PGHPF_XXX. Runtime command-line options override values specified by environment variables.
The following sections describe the options for each pghpf runtime library. Many options are shared across libraries, but an option sometimes has a slightly different meaning depending on the runtime library.
All runtime options can be specified either as command line options or environment variables. An example using command line options:
a.out -pghpf -np 8 -stat all
An example using environment variables:
setenv PGHPF_NP 8
setenv PGHPF_STAT all
a.out
Options may also be specified using command line format in an environment variable:
setenv PGHPF_OPTS "-np 8 -stat all"
Command line options override environment variables, and individual environment variables override the PGHPF_OPTS environment variable.
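The precedence rules above can be illustrated with a short Python sketch. This is not the runtime's actual implementation; the function names resolve_options and parse_opts are hypothetical, and the parser is deliberately simplified (it assumes each option takes at most one value and that values do not begin with a dash, which does not hold for some -host arguments).

```python
import shlex


def parse_opts(tokens):
    """Turn ['-np', '8', '-stat', 'all'] into {'np': '8', 'stat': 'all'}.

    Simplification (assumption): a token starting with '-' is an option
    name, and the token that follows it, if any, is its value.
    """
    opts = {}
    name = None
    for token in tokens:
        if token.startswith("-"):
            name = token[1:]
            opts[name] = ""
        elif name is not None:
            opts[name] = token
            name = None
    return opts


def resolve_options(command_line_opts, environ):
    """Merge the three option sources, lowest precedence first."""
    merged = {}
    # 1. PGHPF_OPTS holds options in command line format.
    merged.update(parse_opts(shlex.split(environ.get("PGHPF_OPTS", ""))))
    # 2. Individual PGHPF_XXX variables override PGHPF_OPTS.
    for var, value in environ.items():
        if var.startswith("PGHPF_") and var != "PGHPF_OPTS":
            merged[var[len("PGHPF_"):].lower()] = value
    # 3. Options after -pghpf on the command line override everything.
    merged.update(parse_opts(command_line_opts))
    return merged


env = {"PGHPF_OPTS": "-np 8 -stat all", "PGHPF_NP": "4"}
effective = resolve_options(["-stat", "cpu"], env)
```

With these inputs, PGHPF_NP replaces the -np value from PGHPF_OPTS, and the command line -stat value replaces the one from PGHPF_OPTS.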
RPM is the PGI Real Parallel Machine system. RPM supports process spawning and communication among processes on a group of homogeneous hosts. Processes use UNIX sockets for communication. Processes on the same host may optionally use shared global memory for communication. RPM is similar to PVM, but RPM offers greater efficiency and performance than PVM, with fewer administrative requirements.
RPM first creates the processes needed to run the program. It uses rsh by default to create remote processes. It also establishes socket connections between each pair of processes (this phase is functionally equivalent to MPI or PVM).
The processes then exchange hostname and process ID information. The processes then determine how many other processes of the same job reside on the same host. If there is only one process on the host, that process always uses sockets for communication.
The executable runtime option -pghpf specifies that the following options are to be passed to the communications control portion of the executable program. The -pghpf option allows you to pass user-defined runtime options to your application program and communications control runtime options to the pghpf runtime library.
In general, the command line format for a compiled program is:
% a.out user_options -pghpf PGHPF_options
where a.out is the executable program, user_options are the program's valid runtime options, and PGHPF_options are any of the valid options for the pghpf runtime library.
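As a rough illustration of this command line format, the following Python sketch separates the program's own options from the options passed to the runtime library. The function name split_command_line is hypothetical, and the sketch assumes that everything after the first -pghpf belongs to the runtime library.

```python
def split_command_line(args):
    """Split arguments (without the program name) at the -pghpf marker.

    Returns (user_options, pghpf_options). Assumption: every argument
    after the first -pghpf is a pghpf runtime option.
    """
    if "-pghpf" not in args:
        return list(args), []
    marker = args.index("-pghpf")
    return list(args[:marker]), list(args[marker + 1:])


user_opts, pghpf_opts = split_command_line(
    ["input.dat", "-pghpf", "-np", "8", "-stat", "all"]
)
```

Here input.dat stands in for an arbitrary user option; it is not an option defined by pghpf.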
Table 3-1 shows the valid RPM executable options (-pghpf options). Detailed descriptions are provided in the sections following Table 3-1.
Table 3-1 RPM Runtime Library Options and Variables
Option | Environment Variable | Purpose
-curhost hostname | PGHPF_CURHOST hostname | Specify alternate name for current host.
-debugger path | PGHPF_DEBUGGER path | Specify the default debugger.
-g n|all | PGHPF_G [n|all] | Enable parallel debugging.
-heapz n | PGHPF_HEAPZ n | Specify the shared global heap size.
-host hostargs | PGHPF_HOST hostargs | Specify the hosts where additional processes are to be spawned.
-minxfer size | PGHPF_MINXFER size | Specify minimum unbuffered message size.
-mount arg | PGHPF_MOUNT arg | For systems that use auto mounted file systems, this option fixes a problem where the RPM library is not able to resolve NFS pathnames.
-np num | PGHPF_NP num | Specify number of processors.
-rsh shell | PGHPF_RSH | Specify an alternate shell to be used in spawning processes on remote systems.
-stat options | PGHPF_STAT options | Print runtime statistics upon program completion.
The runtime option -np n and the environment variable PGHPF_NP num specify the number of processors to use. When neither of these options is specified, the default value for the number of processors is set to one (1). The argument supplied for n or num must be a positive integer.
For example, to compile and run the HPF program test1.hpf using four processors, issue the following commands:
$ pghpf -o test1 test1.hpf
$ test1 -pghpf -np 4
If your program uses the PROCESSORS directive and includes the intrinsic NUMBER_OF_PROCESSORS, then the value supplied to the -np runtime option (or the PGHPF_NP environment variable) will be the number of processors for the current HPF execution. For example:
!HPF$ PROCESSORS NUM_PROC(NUMBER_OF_PROCESSORS())
If your program does not use an intrinsic to calculate the number of processors available, but uses an explicit PROCESSORS directive with a value, then the -np value should be at least as great as the value used in the program's PROCESSORS directive. For example, if the value supplied is four, then the -pghpf -np argument should be greater than or equal to four.
!HPF$ PROCESSORS NUM_PROC(4)
If your program does not use a PROCESSORS directive, the program runs on the number of processors specified in the -np runtime option.
The runtime option -host and the PGHPF_HOST environment variable specify the hosts where compute processes will be spawned. The command line option has the following syntax:
-host [ host[:n] |,-file=path |,-v |,-dyn ]...
All items in the command line option are separated by commas. The command line option has the same arguments as the environment variable.
The first process always runs on the host where the program was started. If this option is not specified, all of the processes run on the same host.
This option specifies the hosts where the compute processes will be created. The first process' host is taken into account to properly spread processes across the specified hosts.
The following examples assume the program was started on moe.
The simplest form is simply a list of hostnames:
PGHPF_HOST moe,larry,curly,bill
The hosts are chosen in the order specified. If two compute processes were specified, moe and larry would be chosen. If the program requested five processes, the hosts would be chosen in the following order: moe, larry, curly, bill, and moe.
A hostname can be followed by a power factor, for example:
PGHPF_HOST moe:50,larry:150,curly,bill:25
When any host has a power factor specified, the hosts are chosen in the order of the specified power factor. The default power factor is 100. In the above example, curly has a power factor of 100, moe has a power factor of 50, larry has a power factor of 150, and bill has a power factor of 25. The power factor specifies the following order for host selection: larry, curly, moe, bill.
If multiple hosts have the same power factor, the order in which they are chosen is undefined.
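The ordering rules described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the RPM implementation: the function names are hypothetical, the wrap-around in choose_hosts is documented only for the plain host list and is assumed here for the power factor case, and the sketch does not model how the first process's host is taken into account.

```python
from itertools import cycle, islice

DEFAULT_POWER = 100  # default power factor stated in the text


def parse_hosts(spec):
    """Parse 'moe:50,larry:150,curly' into [(name, power or None), ...]."""
    hosts = []
    for item in spec.split(","):
        name, sep, power = item.partition(":")
        hosts.append((name, int(power) if sep else None))
    return hosts


def selection_order(spec):
    """Return host names in the order they would be chosen."""
    hosts = parse_hosts(spec)
    if all(power is None for _, power in hosts):
        # No power factors at all: use the order given.
        return [name for name, _ in hosts]
    # Otherwise the highest power factor comes first. Ties keep the listed
    # order here, although the text leaves the order of ties undefined.
    ranked = sorted(
        hosts,
        key=lambda h: h[1] if h[1] is not None else DEFAULT_POWER,
        reverse=True,
    )
    return [name for name, _ in ranked]


def choose_hosts(spec, nprocs):
    """Assign nprocs processes to hosts, wrapping around the list."""
    return list(islice(cycle(selection_order(spec)), nprocs))


plain = choose_hosts("moe,larry,curly,bill", 5)
weighted = selection_order("moe:50,larry:150,curly,bill:25")
```

For the two host lists used in this section, plain reproduces the five-process order given above and weighted reproduces the power factor order.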
The hostnames and optional power factors can be read from a file. Assuming the file hosts contained:
moe 50
larry 150
curly
The following example is identical to the previous example:
PGHPF_HOST -file=hosts,bill:25
Multiple hostnames and -file options can be specified in any order desired. The order is significant only if no host has a power factor.
The -dyn option changes each host's effective power factor based on that host's current load average. The calculation is:
current_power = power / (load+1)
With the -dyn option, the calculated current_power factor is used to choose hosts. If a host fails to respond to the load average request, that host is ignored. This option can take some time to process since each host's load average is requested. Hosts are timed out after 10 seconds and ignored.
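The -dyn calculation can be tried out with a small Python sketch. The load averages below are invented for illustration, the function names are hypothetical, and a load of None stands in for a host that did not answer the load average request; how the runtime actually gathers load averages is not shown.

```python
def current_power(power, load):
    """Effective power factor for a host with the given load average."""
    return power / (load + 1)


def rank_by_current_power(powers, loads):
    """Order hosts by effective power, highest first.

    powers: {host: power factor}
    loads:  {host: load average, or None if the host did not respond}
    """
    ranked = []
    for host, power in powers.items():
        load = loads.get(host)
        if load is None:
            continue  # hosts that fail to respond are ignored
        ranked.append((current_power(power, load), host))
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return [host for _, host in ranked]


powers = {"moe": 50, "larry": 150, "curly": 100, "bill": 25}
loads = {"moe": 0.0, "larry": 3.0, "curly": 0.0, "bill": None}  # made-up loads
order = rank_by_current_power(powers, loads)
```

In this made-up case larry's load of 3.0 reduces its effective power factor from 150 to 37.5, so it drops behind curly and moe, and bill is left out because it did not respond.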
Normally, all of the above is done silently. The -v option displays interesting informative messages.
A final example:
PGHPF_HOST -dyn,-file=hosts,bill:25,-v
The order of the -file, -dyn, and -v options is not important.
When systems are interconnected by multiple networks, the hosts often have different names on different networks. Any of a remote host's alternative names may be specified with the -host option. The -curhost option is used to specify an alternative name for the current or local host. The -host and -curhost options together can be used to specify the alternate hostnames that in turn select a different network.
The -g runtime option or PGHPF_G environment variable invokes a debugger.
Debuggers are specified by setting the PGHPF_DEBUGGER environment variable to the pathname of the debugger. For example,
setenv PGHPF_DEBUGGER /usr/ucb/dbx
If more than one debugger is present, entering commands is difficult. A better method for debugging is to create a shell script to create a window for each debugger. For example, create a script named rpmdbg (any name will do) containing the following.
exec xterm -e /usr/ucb/dbx $1
And then set PGHPF_DEBUGGER to rpmdbg.
All pghpf runtime options are specified on the command line that invokes the program. The user-supplied runtime arguments must be specified with each debugger's run command.
The file rpmdbg must be accessible in the current PATH.
The optional arguments n and all to the -g option and the PGHPF_G variable specify debugging parameters. The integer n specifies a logical process number (this should be between 0 and the number of processors). The all keyword specifies that all processes are to be debugged.
The -minxfer option specifies the minimum message size that will not be buffered. Messages greater than or equal to the specified size are sent as individual messages. Consecutive messages less than the specified size sent to the same destination are buffered. The default value is currently 2 Kilobytes. Changing this value may improve performance on some systems.
There is a problem when NFS filesystems are mounted at different points on different hosts. Pathnames valid on one host are not necessarily valid on other hosts.
If filesystems are not mounted at the same directories on remote systems, remote RPM processes will not be able to find files. This can usually be corrected by specifying the -mount runtime command line option or the PGHPF_MOUNT environment variable.
The value specified is a list of match strings and replacement strings.
-mount match0:replace0,match1:replace1,...
For example:
a.out -pghpf -mount /home/u:/home/1,/home/g:/home/1
The pathname /home/u would be changed to /home/1 and /home/g would be changed to /home/1.
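The match and replace pairs can be pictured with the following Python sketch. It is a guess at the behavior rather than the RPM implementation: the function names are hypothetical, and the sketch assumes that a match string is compared against the start of a pathname and that the first matching pair wins.

```python
def parse_mount(spec):
    """Parse 'match0:replace0,match1:replace1' into a list of pairs."""
    pairs = []
    for item in spec.split(","):
        match, _, replace = item.partition(":")
        pairs.append((match, replace))
    return pairs


def translate(path, pairs):
    """Rewrite the leading part of path using the first matching pair.

    Assumption: matching is a plain string-prefix test, so /home/u would
    also match /home/users; the real library may be stricter.
    """
    for match, replace in pairs:
        if path.startswith(match):
            return replace + path[len(match):]
    return path


pairs = parse_mount("/home/u:/home/1,/home/g:/home/1")
remote_path = translate("/home/u/data.in", pairs)
```

The file name data.in is only an example; pathnames that match no pair are returned unchanged.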
The -rsh option and the PGHPF_RSH environment variable specify the name of the remote shell used for creating remote processes. It is normally "rsh". If your preferred remote shell were named "ksh", the option would be:
a.out -pghpf -rsh ksh
Refer to section 3.2, "Execution Information", for details on printing runtime execution statistics and the options controlling the execution statistics.
If there is more than one process on the same host, RPM creates a shared-memory segment or mmaps a temporary file to use for communications (depending on the target system). The processes on the host then attach the other processes' shared-memory segments or mmap the other processes' temporary files. The RPM runtime option -heapz specifies the size of the shared-memory segment or the temporary file.
The processes then allocate all globally-accessible data from the shared-memory segment or from the mmap'ed file called the global heap. Communication between processes on the same host is then a copy between global heaps. Communication between processes on different hosts uses sockets.
Never use shared memory with more processes than available processors; doing so will severely degrade performance.
Processes by default use Unix sockets for communication. Processes on the same host may optionally use a shared global heap for communication.
The -heapz size command line option and the PGHPF_HEAPZ size environment variable specify the size of the shared global heap. The default size is zero, which forces all processes to use sockets for communication. The size may be specified with a 'k' or 'm' suffix: 'k' specifies kilobytes and 'm' specifies megabytes. For example, specifying 4m is the same as specifying 4194304.
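The size suffixes can be illustrated with a short Python sketch. The function name parse_heap_size is hypothetical. The text confirms that 'm' means 1048576 bytes (4m equals 4194304); treating 'k' as 1024 bytes is an assumption made by analogy.

```python
def parse_heap_size(text):
    """Convert a -heapz style size such as '4m' or '512k' to bytes."""
    multipliers = {
        "k": 1024,         # assumption: binary kilobytes, by analogy with 'm'
        "m": 1024 * 1024,  # confirmed by the 4m == 4194304 example
    }
    suffix = text[-1:]
    if suffix in multipliers:
        return int(text[:-1]) * multipliers[suffix]
    return int(text)  # no suffix: the value is already in bytes


four_megabytes = parse_heap_size("4m")
```

Passing "4m" and passing "4194304" give the same result, matching the example in the text.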
If the program attempts to allocate more memory than available in the shared global heap, a message is displayed and the program aborts. A rough estimate of the shared global heap size required can be determined by running the program with the "-stat mems" option and looking at the "heap used" value. Refer to section 3.2, "Execution Information", for more information on the -stat option.
The -heapz option is recommended when all of the processes of a program are run on a single shared memory multi-processor system. However, if the processes of a program run on multiple systems, each group of processes on a single system will use a shared global heap for communication. Sockets are still used for communication between processes on different systems.
RPM1 is the PGI Real Parallel Machine system for a single processor. RPM1 supports debugging for an RPM program. This system runs on a single processor and does not fork. The library works like RPM and assists with debugging by allowing a single-node version of an HPF program to run and behave in a manner very similar to a multi-node program.
The executable runtime option -pghpf specifies that the following options are to be passed to the communications control portion of the executable program. The -pghpf option allows you to pass user-defined runtime options to your application program and communications control runtime options to the pghpf runtime library.
In general, the command line format for a compiled program is:
% a.out user_options -pghpf PGHPF_options
where a.out is the executable program, user_options are the program's valid runtime options, and PGHPF_options are any of the valid options for the pghpf runtime library.
Table 3-2 shows the valid RPM1 executable options (-pghpf options).
Table 3-2 RPM1 Runtime Library Option and Variable
Option | Environment Variable | Purpose
-stat options | PGHPF_STAT options | Print runtime statistics upon program completion.
Print Execution Information
Refer to section 3.2, "Execution Information", for details on printing runtime execution statistics and the options controlling the execution statistics.
PVM is the Parallel Virtual Machine System available from Oak Ridge National Laboratory. PVM is a software system that enables a collection of computers to be used as a coherent and flexible concurrent computational resource. The options in this section apply to programs using PVM for communications.
The executable runtime option -pghpf specifies that the following options are to be passed to the communications control portion of the executable program. The -pghpf option allows you to pass user-defined runtime options to your application program and communications control runtime options to the pghpf runtime library.
In general, the command line format for a compiled program is:
% a.out user_options -pghpf pghpf_options
where a.out is the executable program, user_options are the program's valid runtime options, and pghpf_options are any of the valid options for the pghpf runtime library.
Running a PVM program requires that PVM is installed on your system and that the PVM daemon is running. In addition, the PVM_ROOT and PVM_ARCH environment variables need to be set, and the executable needs to reside in the directory appropriate for your system, $PVM_ROOT/bin/$PVM_ARCH.
Table 3-3 shows the valid PVM executable options (-pghpf options).
Table 3-3 PVM Runtime Library Options and Variables
Option | Environment Variable | Purpose
-np num | PGHPF_NP num | Specify number of processors.
-stat options | PGHPF_STAT options | Print runtime statistics upon program completion.
The command line option -np n and the environment variable PGHPF_NP num specify the number of processors to use. When neither of these options is specified, the default value for the number of processors is set to one (1). The argument supplied for n or num must be a positive integer.
For example, to compile and run an HPF program, test1.hpf, using four processors, issue the following commands:
$ pghpf -Mpvm -o test1 test1.hpf
$ test1 -pghpf -np 4
If your program uses the PROCESSORS directive and includes the intrinsic NUMBER_OF_PROCESSORS, then the value supplied to the -np command line option (or the PGHPF_NP environment variable) will be the number of processors for the current execution. For example:
!HPF$ PROCESSORS NUM_PROC(NUMBER_OF_PROCESSORS())
If your program does not use an intrinsic to calculate the number of processors available, but uses an explicit PROCESSORS directive with a value, then the -np value should be at least as great as the value used in the program's PROCESSORS directive. For example, if the value supplied is four, then the -pghpf -np argument should be greater than or equal to four.
!HPF$ PROCESSORS NUM_PROC(4)
If your program does not use a PROCESSORS directive, the program runs on the number of processors specified in the -np command line option.
Refer to section 3.2, "Execution Information", for details on printing runtime execution statistics and the options controlling the execution statistics.
MPI consists of a number of transport mechanisms conforming to the Message Passing Interface standard. One version of MPI supported by pghpf is the Public Domain software available from Argonne National Laboratory and Mississippi State University. MPI is a software system that enables a collection of computers or processors of a single computer to be used as a coherent and flexible computational resource.
The options in this section apply to programs using MPI for communication.
The executable runtime option -pghpf specifies that the following options are to be passed to the communication control portion of the executable program. The -pghpf option allows you to pass user-defined options to your application program and communications control runtime options to the pghpf runtime library.
In general, the command line format for a compiled program is:
% mpirun mpirun_options a.out user_options -pghpf options
where:
Running an MPI program requires that MPI is installed on your system.
Table 3-4 shows the valid MPI executable options (-pghpf options).
Table 3-4 MPI Runtime Library Options and Variables
Option | Environment Variable | Purpose
-unsafe value | PGHPF_UNSAFE yes | no | Enable/Disable unsafe optimizations
Refer to section 3.2, "Execution Information" for details on printing runtime execution statistics, and the options controlling the execution statistics.
On most systems, the -stat runtime option or the PGHPF_STAT environment variable causes runtime statistics to be displayed when the program completes. Several types of statistics are available, including CPU usage, memory usage, and message transfers. The statistics option allows any of the following arguments:
-stat [cpu|mem|msg|all|cpus|mems|msgs|alls]
The "s" versions provide information for all processors running the program on a per-processor basis. Options without the "s" provide summary information.
The PGHPF_STAT environment variable allows any of the following arguments, corresponding to those available for the -stat runtime option.
PGHPF_STAT [cpu|mem|msg|all|cpus|mems|msgs|alls]
The message statistics will be zero unless the -Mstats option is specified on the pghpf compiler command line (refer to Chapter 2, pghpf Compiler Options, for details). Enabling message statistics collection with this option may slightly reduce performance on some systems.
To run a program, for example test2, and display the CPU-related execution statistics, use the command:
% test2 -pghpf -np 8 -stat cpus
cpu      real    user     sys   ratio   node
0*       5.22    4.91    0.08     96%   0
1        5.11    4.87    0.07     97%   1
2        5.11    4.86    0.04     96%   2
3        5.12    4.87    0.02     96%   3
4        5.10    4.86    0.00     95%   4
5        5.01    4.87    0.01     97%   5
6        5.09    4.86    0.02     96%   6
7        5.09    4.86    0.02     96%   7
min      5.01    4.86    0.00
avg      5.11    4.87    0.03
max      5.22    4.91    0.08
total    5.22   38.96    0.26   7.52x
The first eight lines show information for each processor. The * in the first line indicates the processor printing the information.
The first column shows the logical processor number.
The second column shows the real or elapsed time in seconds.
The third column shows the user CPU time in seconds.
The fourth column shows the system CPU time in seconds.
The fifth column shows the percentage of the elapsed time that the CPU was active. This is calculated using the following formula:
(user time + sys time) / real time
The last column shows the processor's hostname or system-dependent id; on some systems this is the physical processor number or node, on others it is the socket or process id.
The three lines before the last line show the minimum, average, and maximum of the real, user, and system times reported.
The last line shows the maximum elapsed time, the total user and system times, and the speedup factor. The speedup factor is calculated using the following formula: (user time + sys time)/real time.
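The ratio and speedup formulas can be checked against the sample output with a short Python sketch. The per-processor times are copied from the -stat cpus example in this section; the function and variable names are hypothetical. Because the printed times are rounded to two decimal places, a value recomputed from them can differ in the last digit from the value the runtime prints.

```python
def cpu_ratio(user, system, real):
    """Fraction of elapsed time during which the CPU was active."""
    return (user + system) / real


# (real, user, sys) for logical processors 0 through 7, from the sample output.
rows = [
    (5.22, 4.91, 0.08),
    (5.11, 4.87, 0.07),
    (5.11, 4.86, 0.04),
    (5.12, 4.87, 0.02),
    (5.10, 4.86, 0.00),
    (5.01, 4.87, 0.01),
    (5.09, 4.86, 0.02),
    (5.09, 4.86, 0.02),
]

# Per-processor ratio column, as whole percentages.
ratios = [round(100 * cpu_ratio(user, system, real)) for real, user, system in rows]

# The "total" line: maximum elapsed time and summed user and system times.
max_real = max(real for real, _, _ in rows)
total_user = sum(user for _, user, _ in rows)
total_sys = sum(system for _, _, system in rows)

# Speedup factor: the same formula applied to the totals.
speedup = cpu_ratio(total_user, total_sys, max_real)
```

The recomputed ratios match the ratio column of the sample output, and the recomputed speedup is about 7.5, in line with the printed speedup factor.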
On most systems the -stat cpu runtime option or the PGHPF_STAT environment variable set to cpu displays only the minimum, average, maximum, and total time for the CPU statistics.
% test2 -pghpf -np 8 -stat cpu
cpu      real    user     sys   ratio   node
min      5.01    4.86    0.00
avg      5.11    4.87    0.03
max      5.22    4.91    0.08
total    5.22   38.96    0.26   7.52x
If only one processor executes the program, the minimum, average, and maximum times are not shown.
All times are acquired from the host operating systems using calls such as gettimeofday(), getrusage(), and times(). Note that the output is not always accurate; for example, the user and system time for a single processor may occasionally exceed the real time.
To run a program, for example test2, and see the memory-related execution statistics, use the command:
% test2 -pghpf -np 8 -stat mems
The mems option displays output showing the columns: heap used, page faults, signals received, voluntary switches, involuntary switches and res size shown in the next two tables.
Table 3.6 Memory Statistics
memory | heap used | pag flts no i/o | pag flts i/o | signals received | voluntary switches | involunt switches | res size (pages)
0* 8MB | 2306 | 0 | 0 | 0 | 0 | 4901 | 89
1 8MB | 2291 | 0 | 0 | 0 | 0 | 4907 | 99
total 16MB | 4597 | 0 | 0 | 0 | 0 | 9808 | 188
The second column, heap used, shows the total heap used by all nodes.
The third column, pag flts (no i/o), shows the total number of page faults that did not require I/O.
The fourth column, pag flts ( i/o) shows the total number of page faults that did require I/O.
The fifth column shows the total number of signals received.
The sixth column shows the total number of voluntary context switches.
The seventh column shows the total number of involuntary context switches.
The eighth column shows the total number of resident set pages.
The third through eighth columns show information returned by getrusage().
Some systems do not fully support getrusage; these systems display zero for unsupported fields. Refer to your system's man pages for more information. [1]
One line is displayed for each node present. The last line is the same as the line displayed by the option -stat mem.
On most systems the -stat mem runtime option or the PGHPF_STAT environment variable set to mem displays only the totals for the memory statistics.
To run a program, for example test2, and see the message-related execution statistics, use the command:
% test2 -pghpf -np 8 -stat msgs
The message statistics will be zero unless the -Mstats option is specified on the pghpf compiler command line (refer to Chapter 2, pghpf Compiler Options, for details). Enabling message statistics collection with this option may slightly reduce performance on some systems.
The msgs option will display the following:
messages | send cnt | send total | send avg | recv cnt | recv total | recv avg | copy cnt | copy total | copy avg
0* | 0 | 0B | 0B | 100 | 400B | 4B | 0 | 0B | 0B
1 | 0 | 0B | 0B | 100 | 400B | 4B | 0 | 0B | 0B
2 | 0 | 0B | 0B | 100 | 400B | 4B | 0 | 0B | 0B
3 | 300 | 1KB | 4B | 0 | 0B | 0B | 100 | 400B | 4B
total | 300 | 1KB | 4B | 300 | 1KB | 4B | 100 | 400B | 4B
The msg and msgs options display statistics on send counts, receive counts, and copy counts.
The all and alls options are equivalent to specifying the cpu, mem, and msg options or the cpus, mems, and msgs options, respectively.
The message statistics will be zero unless the -Mstats option is specified on the pghpf compiler command line (refer to the preceding section for details).
[1] Although Solaris does not fully support getrusage, all fields are supported using the proc file.