PGI Workstation User's Guide - Contents
Preface
1 Getting Started
1.1 Overview
1.2 Invoking the command-level PGI Compilers
1.2.1 Command-line Syntax
1.2.2 Command-line Options
1.2.3 Fortran Directives and C/C++ Pragmas
1.3 Filename Conventions
1.3.1 Input Files
1.3.2 Output Files
1.4 Parallel Programming Using the PGI Compilers
1.4.1 Running SMP Parallel Programs
1.4.2 Running Data Parallel HPF Programs
1.5 Using the PGI Compilers on Linux
1.5.1 Linux Header Files
1.5.2 Running Parallel Programs on Linux
1.6 Using the PGI Compilers on Win32
2 Optimization & Parallelization
2.1 Overview of Optimization
2.2 Invoking Optimization
2.3 Selecting Appropriate Optimizations - Checklist
2.3.1 Guidelines for Selecting An Optimization Level
2.4 Minimal Optimization (-O0)
2.5 Local Optimization (-O1)
2.6 Global Optimization (-O2, -O)
2.7 Vectorization (-Mvect)
2.8 Parallelization (-Mconcur, -mp)
2.9 Loop Unrolling (-Munroll)
2.10 Default Optimization Levels
2.11 Local Optimization Using Directives and Pragmas
2.12 Execution Timing and Instruction Counting
3 Optimization Features
3.1 Using Optimization Features
3.1.1 Using the -Mvect Vectorization Option
3.1.2 Using the -Mconcur Auto-parallelization Option
3.1.3 Determining Why a Loop is Not Vectorized or Parallelized
3.1.4 Data Dependences and Safe Pointers
3.1.5 Using the -Munroll Loop Unroller Option
3.2 Types of Optimization
3.2.1 Local Optimization
3.2.2 Global Optimization
3.3 Loop Unrolling
3.4 Vectorization
3.4.1 Loop Interchange
3.4.2 Nested Loop Distribution
3.4.3 Outer Loop Distribution
3.4.4 Inner Loop Distribution
3.4.5 Automatic Usage of Pentium III SSE Instructions
3.4.6 Pentium III and Athlon Prefetch Instructions
3.5 High-Level Transformations
3.5.1 Reductions
3.5.2 Expandable Scalars
3.5.3 Calls
3.6 Parallelization Transformations
3.6.1 Candidate Parallel Loops
3.6.2 Loops Which Fail to Parallelize
3.6.3 Timing Loops
3.6.4 Scalars
3.6.5 The safe_lastval Option
4 Function Inlining
4.1 Invoking Function Inlining
4.1.1 Using an Inline Library
4.2 Creating an Inline Library
4.2.1 Working with Inline Libraries
4.2.2 Updating Inline Libraries - Makefiles
4.3 Error Detection During Inlining
4.4 Examples
4.5 Restrictions on Inlining
5 Fortran, C and C++ Data Types
5.1 Fortran Data Types
5.1.1 Fortran Scalars
5.1.2 FORTRAN 77 Aggregate Data Type Extensions
5.1.3 Fortran 90 Aggregate Data Types (Derived Types)
5.2 C and C++ Data Types
5.2.1 C and C++ Scalars
5.2.2 C and C++ Aggregate Data Types
5.2.3 Class and Object Data Layout
5.2.4 Aggregate Alignment
5.2.5 Bit-field Alignment
5.2.6 Other Type Keywords in C and C++
6 Inter-language Calling
6.1 Overview of Calling Conventions
6.2 Inter-language Calling Considerations
6.3 Functions and Subroutines
6.4 Upper and Lower Case Conventions, Underscores
6.5 Compatible Data Types
6.5.1 Fortran Named Common Blocks
6.6 Argument Passing and Return Values
6.6.1 Passing By Value (%VAL)
6.6.2 Character Return Values
6.6.3 Complex Return Values
6.7 Array Indices
6.8 Example - Fortran Calling C
6.9 Example - C Calling Fortran
6.10 Example - C++ Calling C
6.11 Example - C Calling C++
6.12 Example - Fortran Calling C++
6.13 Example - C++ Calling Fortran
6.14 Win32 Calling Conventions
6.14.1 Win32 Fortran Calling Conventions
6.14.2 Symbol Name Construction and Calling Example
6.14.3 Using the Default Calling Convention
6.14.4 Using the STDCALL Calling Convention
6.14.5 Using the C Calling Convention
6.14.6 Using the UNIX Calling Convention
7 Command-line Options
7.1 Generic PGI Compiler Options
7.2 C and C++-specific Compiler Options
8 Libraries
8.1 Libraries and Startup Routines
8.2 Using builtin Math Functions in C/C++
8.3 Creating and Using Shared Object Files on UNIX
8.4 Creating and Using DLLs on Win32
8.5 Using LIB3F on Win32
8.6 LAPACK, the BLAS and FFTs
8.7 The C++ Standard Template Library
9 Optimization Directives and Pragmas
9.1 Adding Directives to Fortran
9.2 Fortran Directive Summary
9.3 Scope of Directives and Command Line Options
9.4 Adding Pragmas to C and C++
9.5 C/C++ Pragma Summary
9.6 Scope of C/C++ Pragmas and Command Line Options
10 OpenMP Parallelization Directives for Fortran
10.1 Parallelization Directives
10.2 PARALLEL ... END PARALLEL
10.3 CRITICAL ... END CRITICAL
10.4 MASTER ... END MASTER
10.5 SINGLE ... END SINGLE
10.6 DO ... END DO
10.7 BARRIER
10.8 DOACROSS
10.9 PARALLEL DO
10.10 SECTIONS ... END SECTIONS
10.11 PARALLEL SECTIONS
10.12 ORDERED
10.13 ATOMIC
10.14 FLUSH
10.15 THREADPRIVATE
10.16 Run-time Library Routines
10.17 Environment Variables
11 OpenMP Parallelization Pragmas for C and C++
11.1 Parallelization Pragmas
11.2 omp parallel
11.3 omp critical
11.4 omp master
11.5 omp single
11.6 omp for
11.7 omp barrier
11.8 omp parallel for
11.9 omp sections
11.10 omp parallel sections
11.11 omp ordered
11.12 omp atomic
11.13 omp flush
11.14 omp threadprivate
11.15 Run-time Library Routines
11.16 Environment Variables
12 C++ Template Instantiation
12.1 Command Line control of template instantiation
12.2 Pragma control of template instantiation
12.3 Automatic template instantiation
12.4 Implicit inclusion
12.5 Template Libraries
13 C++ Name Mangling
13.1 Types of Mangling
13.2 Mangling Summary
13.2.1 Type Name Mangling
13.2.2 Nested Class Name Mangling
13.2.3 Local Class Name Mangling
13.2.4 Template Class Name Mangling
14 The PGPROF Profiler
14.1 Introduction
14.1.1 Definition of Terms
14.1.2 Compilation
14.1.3 Program Execution
14.1.4 Profiler Invocation and Initialization
14.1.5 Data Interpretation
14.1.6 Virtual Timer
14.1.7 Profile Data
14.1.8 Caveats
14.1.8.1 Clock Granularity
14.1.8.2 Optimization
14.2 X-Windows Graphical User Interface
14.2.1 Command Line Switches And X-Windows Resources
14.2.2 Using the PGPROF X-Windows GUI
14.2.2.1 File Menu
14.2.2.2 Options Menu
14.2.2.3 Sort Menu And The Sort Option Box
14.2.2.4 Select Menu And The Select Option Box
14.2.2.5 Processors Menu
14.2.2.6 View Menu
14.2.2.7 Help Menu
14.3 Command Language
14.3.1 Command Usage
15 The PGDBG Debugger
15.1 Definition of Terms
15.1.1 Definition of Terms
15.1.2 Invocation and Initialization
15.1.2.1 Command-Line Arguments
15.1.3 Command Language
15.1.4 Commands
15.1.4.1 Constants
15.1.4.2 Symbols
15.1.4.3 Scope Rules
15.1.4.4 Register Symbols
15.1.4.5 Source Code Locations
15.1.4.6 Statements
15.1.4.7 Events
15.1.4.8 Expressions
15.1.5 Debugging Fortran
15.1.5.1 Arrays
15.1.5.2 Operators
15.1.5.3 Name of Main Routine
15.1.5.4 Fortran Common Blocks
15.1.6 Debugging C++
15.1.7 Core Files
15.1.8 Debugger Commands
15.2 PGDBG Commands
15.2.1 Commands
15.2.1.1 Process Control
15.2.1.2 Events
15.2.1.3 Program Locations
15.2.1.4 Printing and Setting Variables
15.2.1.5 Symbols and Expressions
15.2.1.6 Scope
15.2.1.7 Register Access
15.2.1.8 Memory Access
15.2.1.9 Conversions
15.2.1.10 Miscellaneous
15.3 Commands and Registers
15.3.1 Command Summary
15.3.2 IA-32 Register Symbols
15.4 X-Windows Graphical User Interface
15.4.1 Startup
15.4.2 Main Window
15.4.3 Disassembly Window
15.4.4 Register Window
15.4.5 Memory Window
15.4.6 Custom Window
A Run-time Environment
A.1 Programming Model
A.2 Function Calling Sequence
A.3 Functions Returning Scalars or No Value
A.4 Integral and Pointer Arguments
A.5 Floating-Point Arguments
A.6 Structure and Union Arguments
B Messages
B.1 Diagnostic Messages
B.2 Phase Invocation Messages
B.3 Compiler Error Messages
B.3.1 Message Format
B.3.2 Message List
B.4 Runtime Error Messages
B.4.1 Message Format
B.4.2 Message List
C C++ Dialect Supported
C.1 Anachronisms Accepted
C.2 New Language Features Accepted
C.3 The following language features are not accepted
C.4 Extensions Accepted in Normal C++ Mode
C.5 cfront 2.1 Compatibility Mode
C.6 cfront 2.1/3.0 Compatibility Mode