1.0.7 (revision 953)
Opari2 is a tool that automatically instruments C, C++ and Fortran source files in which OpenMP is used. It inserts function calls to a POMP2 API around OpenMP directives. By implementing this API, a tool can make detailed measurements of the runtime behavior of an OpenMP application. A conforming POMP2 implementation needs to implement all POMP2 functions; see pomp2_lib.h for a list of those.
OpenMP 3.0 introduced tasking to OpenMP. To support this feature, the POMP2 adapter needs to do some bookkeeping of task IDs. The pomp2_lib.c provided with this package includes the necessary code, so it is strongly advised to use it as a basis for writing an adapter to your own tool.
A detailed description of the first Opari version has been published by Mohr et al. in "Design and prototype of a performance tool interface for OpenMP" (The Journal of Supercomputing, 23, 2002).
Opari2 was developed with Autotools. After downloading and unpacking, change into your build directory and run the usual Autotools sequence: configure (optionally with --prefix=<installation directory>), make, and make install.
See the file INSTALL for further information.
To create an instrumented version of an OpenMP application, each file of interest is transformed by the OPARI2 tool. The application is then linked against the POMP2 runtime measurement library and, optionally, against a special initialization file (see section LINKING (startup initialization only) and SUMMARY for further details).
A call to Opari2 has the following syntax:
Usage: opari2 [OPTION] ... infile [outfile]

with following options and parameters:

[--f77|--f90|--c|--c++]
    [OPTIONAL] Specifies the programming language of the input source
    file. This option is only necessary if the automatic language
    detection based on the input file suffix fails.
[--free-form]
    [OPTIONAL] Specifies that free formatting is used for Fortran source
    files. This is the default for Fortran 90/95.
[--fix-form]
    [OPTIONAL] Specifies that fixed formatting is used for Fortran source
    files. This is the default for Fortran 77.
[--nosrc]
    [OPTIONAL] If specified, OPARI2 does not generate #line constructs,
    which allow to preserve the original source file and line number
    information, in the transformation process. This option might be
    necessary if the OpenMP compiler does not understand #line
    constructs. The default is to generate #line constructs.
[--nodecl]
    [OPTIONAL] Disables the generation of POMP2_DLISTXXXXX macros. These
    are used in the parallel directives of the instrumentation to make
    the region handles shared. By using this option the shared clause is
    used directly on the parallel directive with the respective region
    handles.
[--tpd]
    [OPTIONAL] Adds the clause 'copyin(<pomp_tpd>)' to any parallel
    construct. This allows to pass data from the creating thread to its
    children. The variable is declared externally in all files, so it
    needs to be defined by the pomp library.
[--disable=<constructs>]
    [OPTIONAL] Disable the instrumentation of manually-annotated POMP
    regions or the more fine-grained OpenMP constructs such as
    !$OMP ATOMIC. <constructs> is a comma separated list of the
    constructs for which the instrumentation should be disabled.
    Accepted tokens are atomic, critical, master, flush, single, ordered
    or locks (as well as sync to disable all of them) or regions.
[--task=abort|warn|remove]
    Special treatment for the task directive.
    abort:  Stop instrumentation with an error message when encountering
            a task directive.
    warn:   Resume but print a warning.
    remove: Remove all task directives.
[--untied=abort|keep|no-warn]
    Special treatment for the untied task attribute. The default
    behavior is to remove the untied attribute, thus making all tasks
    tied, and print out a warning.
    abort:   Stop instrumentation with an error message when
             encountering a task directive with the untied attribute.
    keep:    Do not remove the untied attribute.
    no-warn: Do not print out a warning.
[--tpd-mangling=gnu|intel|sun|pgi|ibm|cray]
    [OPTIONAL] If programming languages are mixed (C and Fortran), the
    <pomp_tpd> needs to use the Fortran mangled name also in C files.
    This option specifies to use the mangling scheme of the gnu, intel,
    sun, pgi or ibm compiler. The default is to use the mangling scheme
    of the compiler used to build opari2.
[--version]
    [OPTIONAL] Prints version information.
[--help]
    [OPTIONAL] Prints this help text.
infile
    Input file name.
[outfile]
    [OPTIONAL] Output file name. If not specified, opari2 uses the name
    infile.mod.suffix if the input file is called infile.suffix.

Report bugs to <scorep-bugs@groups.tu-dresden.de>.
If you run Opari2 on the input file example.c, it will create two files:
- example.mod.c is the instrumented version of example.c, i.e. it contains the original code plus calls to the POMP2 API referencing handles to the OpenMP regions identified by Opari2.
- example.c.opari.inc contains the OpenMP region handle definitions together with all the relevant data needed by the handles. This compile time context (CTC) information is encoded into a string for maximum portability. For each region, the tuple (region_handle, ctc_string) is passed to an initializing function (POMP2_Assign_handle()). All calls to these initializing functions are gathered in a function named POMP2_Init_reg_XXX_YY, where XXX_YY is unique for each compilation unit.
At some point during the runtime of the instrumented application, the region handles need to be initialized using the information stored in the CTC string. This can be done in one of two ways:
- during startup of the measurement system (startup initialization), or
- during runtime (runtime initialization).
We highly recommend using the first option as it incurs much less runtime overhead than the second one (no locking, no lookup needed). In this case all POMP2_Init_reg_XXX_YY functions introduced by opari2 need to be called. See LINKING (startup initialization only) for further details. For runtime initialization, the CTC string is passed as an argument to the relevant POMP2 function calls.
As mentioned above, CTC strings are passed to different POMP2 functions. These functions need to parse the string in order to process the encoded information. With POMP2_Region_info and ctcString2RegionInfo(), the opari2 package provides the means to do this; see pomp2_region_info.h.
The CTC string is a string in the format "length*key=value*key=value*[key=value]**", for example:
*82*regionType=parallel*sscl=xmpl.c:61:61*escl=xmpl.c:66:66*hasIf=1**
Mandatory keys are:
Optional keys are
The optional values are set to 0 by default, i.e. the presence of the key denotes the presence of the respective clause.
You can use the function ctcString2RegionInfo() to decode CTC strings. It can be found in pomp2_region_info.c and pomp2_region_info.h, installed under <opari-prefix>/share/opari2/devel.
For startup initialization, all POMP2_Init_reg_XXX_YY functions that can be found in the object files and libraries of the application are called. This is done by creating an additional compilation unit that contains calls to the following POMP2 functions:
The resulting object file is linked to the application. During startup of the measurement system the only thing to be done is to call POMP2_Init_region() which then calls all POMP2_Init_reg_XXX_YY functions.
In order to create the additional compilation unit (for example pomp2_init_file.c), the following command sequence can be used:
% `opari2-config --nm` <objs_and_libs> | \
    `opari2-config --region-initialization` > pomp2_init_file.c
Here, <objs_and_libs> denotes the entire set of object files and libraries that were instrumented by opari2.
For portability reasons, nm and the awk script that creates the additional file are not called directly but via the provided opari2-config tool.
A call to the opari2-config tool has the following syntax:
Usage: opari2-config [OPTION] ... <command>

with the following commands:

--nm
    Prints the nm command.
--region-initialization
    Prints the script used to create the pomp2_init_regions.c file.
--create-pomp2-regions <object files>
    Prints the whole command necessary for creating the initialization
    file.
--awk-cmd
    [Deprecated, use --region-initialization instead.] Prints the awk
    command.
--awk-script
    [Deprecated, use --region-initialization instead.] Prints the awk
    script.
--egrep
    [Deprecated, use --region-initialization instead.] Prints the egrep
    command.
--cflags
    Prints compiler options to include installed headers.
--version
    Prints the opari2 version number.
--interface-version
    Prints the pomp2 API version that instrumented files conform to.
--opari2-revision
    Prints the revision number of the OPARI2 package.
--common-revision
    Prints the revision number of the common package.
--help
    Prints this help text.

and the following options:

[--build-check]
    Tells opari2-config to use build paths instead of install paths.
    Used for build testing.
[--config=<config file>]
    Reads in a configuration from the given file.

Report bugs to <scorep-bugs@groups.tu-dresden.de>.
For manual user instrumentation the following pragmas are provided.
C/C++:
#pragma pomp inst init
#pragma pomp inst begin(region_name)
#pragma pomp inst altend(region_name)
#pragma pomp inst end(region_name)
#pragma pomp noinstrument
#pragma pomp instrument
Fortran:
!$POMP INST INIT
!$POMP INST BEGIN(region_name)
!$POMP INST ALTEND(region_name)
!$POMP INST END(region_name)
!$POMP NOINSTRUMENT
!$POMP INSTRUMENT
Users can mark code regions, such as functions, with INST BEGIN and INST END. If a region has several exit points (return, break, exit, ...), all but the last need to be marked with INST ALTEND pragmas. The INST INIT pragma should be used for initialization at the beginning of main if no other initialization method is used. The NOINSTRUMENT and INSTRUMENT pragmas turn the instrumentation of OpenMP pragmas off and on. OpenMP pragmas between NOINSTRUMENT and INSTRUMENT are not instrumented, except for parallel regions, which are always instrumented to allow correct thread management in the performance tool. See the EXAMPLE section for an example of how to use user instrumentation.
The directory <prefix>/share/opari2/doc/example contains the following files:
example.c example.f Makefile
The Makefile contains all required information for building the instrumented and uninstrumented binaries. It demonstrates the compilation and linking steps as described above.
Additional examples which illustrate the use of user instrumentation can be found in <prefix>/share/opari2/doc/example_user_instrumentation. The folder contains the following files:
example_user_instrumentation.c example_user_instrumentation.f Makefile
Opari2 uses a new mechanism to link files. The main advantage is that no opari.rc file is needed anymore. Libraries can now be preinstrumented, and parallel builds are supported. To achieve this, the handles for parallel regions are instrumented using a ctc_string.
The POMP2 interface is not compatible with the original POMP interface. All functions of the new API begin with POMP2_. The declaration prototypes can be found in pomp2_lib.h.
The POMP2_Parallel_fork() call has an additional argument to pass the requested number of threads to the POMP2 library. This allows the library to prepare data structures and allocate memory for the threads before they are created. The value passed to the library is determined as follows:
- If a num_threads clause is present, the expression inside this clause is evaluated into a local variable pomp_num_threads. This variable is afterwards passed in the call to POMP2_Parallel_fork() and in the num_threads clause itself.
- If no num_threads clause is present, the value of omp_get_max_threads() is stored in pomp_num_threads and passed to the POMP2_Parallel_fork() call.
In Fortran, instead of omp_get_max_threads(), a wrapper function pomp_get_max_threads_XXX_X is used. This function is needed to avoid multiple definitions of omp_get_max_threads(), since we do not know whether it is defined in the user code or not. Removing all definitions in the user code would require much more Fortran parsing than is done with opari2, since function definitions cannot easily be distinguished from variable definitions.
If the POMP2 library needs to pass information from the master thread to its children, the option --tpd can be used. Opari2 uses the copyin clause to pass a threadprivate variable pomp_tpd to the newly spawned threads at the beginning of a parallel region. This is a 64-bit integer variable, since Fortran does not allow pointers. However, a pointer can be stored in this variable, passed to child threads with the copyin clause (in C/C++ or Fortran), and later cast back to a pointer in the pomp library.
To support mixed programming (C/Fortran), the variable name depends on the name mangling of the Fortran compiler. This means that for GNU, Sun, Intel and PGI C compilers the variable is called pomp_tpd_, and for IBM it is called pomp_tpd in C. In Fortran it is always called pomp_tpd. The --tpd-mangling option can be used to change this. The variable is declared extern in all program units, so the pomp library must contain the actual definition of pomp_tpd as a 64-bit integer.
In OpenMP 3.0 the new tasking construct was introduced. All parts of a program are now implicitly executed as tasks and the user gets the possibility of creating tasks that can be scheduled for asynchronous execution. Furthermore these tasks can be interrupted at certain scheduling points and resumed later on (see the OpenMP API 3.0 for more detailed information).
Opari2 inserts calls to POMP2_Task_create_begin and POMP2_Task_create_end to allow the recording of the task creation time. For the task execution time, calls to POMP2_Task_begin and POMP2_Task_end are inserted into the code. To correctly record a profile or a trace of a program execution, the different task instances need to be distinguished. Since OpenMP does not provide task IDs, the performance measurement system needs to create and maintain its own. This cannot be done by the code instrumentation of Opari2 alone; it requires some administration of task IDs at runtime. To allow the measurement system to administer these IDs, additional task ID parameters (pomp_old_task/pomp_new_task) were added to all functions belonging to OpenMP constructs that are task scheduling points. This package includes a "dummy" library, which can be used as an adapter to your measurement system. The library contains all the relevant functionality to keep track of the different task instances, and it is highly recommended to use it as a template for implementing your own adapter.
For more detailed information on this mechanism see:
"How to Reconcile Event-Based Performance Analysis with Tasking in OpenMP"
by Daniel Lorenz, Bernd Mohr, Christian Rössel, Dirk Schmidl, and Felix Wolf
In: Proc. of 6th Int. Workshop on OpenMP (IWOMP), LNCS, vol. 6132, pp. 109-121
DOI: 10.1007/978-3-642-13217-9_9
The typical usage of OPARI2 consists of the following steps:
1. Call OPARI2 for each input source file
% opari2 file1.f90
...
% opari2 fileN.f90
2. Compile all modified output files *.mod.* using the OpenMP compiler
3. Generate the initialization file
% `opari2-config --nm` <objs_and_libs> | \
    `opari2-config --region-initialization` > pomp2_init_file.c
4. Link the resulting object files against the pomp2 runtime measurement library.