I have used vprof, which is free and also works well with Open MPI:
One might need slight code modifications to get output, depending on
compilers used, such as adding
to start profiling and
to end profiling where rank is the MPI rank integer.
vprof can also use PAPI, but I have not (yet) tried this.
On 2009-04-23 02:00:01, Brock Palen <brockp_at_[hidden]> wrote:
> There is a tool (not free) that I have liked that works great with
> OMPI, and can use gprof information.
> Also, I am not sure, but Tau (which is free) might support some gprof
> Brock Palen
> Center for Advanced Computing
> On Apr 22, 2009, at 7:37 PM, jgans wrote:
>> Yes you can profile MPI applications by compiling with -pg. However, by
>> default each process will produce an output file called "gmon.out",
>> which is a problem if all processes are writing to the same global file
>> system (i.e. all processes will try to write to the same file).
>> There is an undocumented feature of gprof that allows you to specify
>> the filename for profiling output via the environment variable
>> GMON_OUT_PREFIX. For example, one can set this variable in the .bashrc
>> file for every node to ensure unique profile filenames, e.g.:
>> export GMON_OUT_PREFIX='gmon.out-'`/bin/uname -n`
>> The filename will appear as GMON_OUT_PREFIX.pid, where pid is the
>> process ID on a given node (so this will also work when multiple
>> processes run on a single host).
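A minimal sketch of how the variable might be set per process from a small wrapper script instead of .bashrc. The function name gmon_prefix is hypothetical, and OMPI_COMM_WORLD_RANK is assumed to be exported by the Open MPI launcher for each process (check your version); without it, the sketch falls back to the host name alone, as in the export line above:

```shell
#!/bin/sh
# Build a prefix for gprof output that is unique per host and,
# when the launcher provides a rank, per MPI rank as well.
gmon_prefix() {
    host=$(uname -n)
    # Assumption: Open MPI sets OMPI_COMM_WORLD_RANK for each launched process.
    if [ -n "${OMPI_COMM_WORLD_RANK:-}" ]; then
        printf 'gmon.out-%s-rank%s\n' "$host" "$OMPI_COMM_WORLD_RANK"
    else
        printf 'gmon.out-%s\n' "$host"
    fi
}

GMON_OUT_PREFIX=$(gmon_prefix)
export GMON_OUT_PREFIX

# Replace this shell with the real program, if one was given as arguments.
if [ "$#" -gt 0 ]; then
    exec "$@"
fi
```

Saved under a made-up name such as gmon_wrap.sh, it would be launched as "mpirun -np 4 ./gmon_wrap.sh ./my_app", where my_app stands for the binary built with -pg. The process ID is still appended to the prefix, so the resulting files look like gmon.out-HOST-rankN.PID.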
>> Tiago Almeida wrote:
>>> I've never done this, but I believe that an executable compiled with
>>> profiling support (-pg) will generate the gmon.out file in its
>>> current directory, regardless of running under MPI or not. So I think
>>> that you'll have a gmon.out on each node and therefore you can "gprof"
>>> them independently.
>>> Best regards,
>>> Tiago Almeida
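Running gprof on each file independently could look roughly like the sketch below. It assumes the profile files in the current directory are named gmon.out-*, as they would be with the GMON_OUT_PREFIX setting quoted above; profile_all, the report- file names, and the GPROF override variable are all hypothetical names chosen for illustration:

```shell
#!/bin/sh
# Run gprof once per profile data file in the current directory and
# write one text report per file.
# $1 is the executable that was compiled with -pg.
profile_all() {
    app=$1
    for f in gmon.out-*; do
        # Skip the literal pattern when no profile files exist.
        [ -f "$f" ] || continue
        # GPROF may be set to use a different gprof binary (hypothetical knob).
        "${GPROF:-gprof}" "$app" "$f" > "report-$f.txt"
    done
}
```

After a run, "profile_all ./my_app" (my_app being a placeholder) would leave one report-gmon.out-*.txt file per process. For a combined view, GNU gprof also has a -s option that sums several profile data files into gmon.sum, which can then be passed to gprof like any single gmon.out.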
>>> jody wrote:
>>>> I wanted to profile my application using gprof, and proceeded like
>>>> when profiling a normal application:
>>>> - compile everything with option -pg
>>>> - run application
>>>> - call gprof
>>>> This returns a normal-looking output, but I don't know
>>>> whether this is the data for node 0 only or accumulated for all nodes.
>>>> Does anybody have experience in profiling parallel applications?
>>>> Is there a way to have profile data for each node separately?
>>>> If not, is there another profiling tool which can?
>>>> Thank you
>>>> users mailing list