Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Problem using VampirTrace
From: Thomas Ropars (tropars_at_[hidden])
Date: 2008-09-15 09:04:07


Hello,

I don't have a common file system for all cluster nodes.

I've tried to run the application again with VT_UNIFY=no and to call
vtunify manually. It works well. I managed to get the .otf file.
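
For anyone hitting the same problem, the manual workflow could be sketched roughly as follows. This is only an illustration, not the exact commands used here: the helper names collect_traces and unify_cmd, the directory layout, and the prefix "ring-vt" (taken from the log quoted below) are assumptions. Only the "vtunify <number of processes> <file prefix>" call itself comes from Andreas's instructions.

```shell
#!/bin/sh
# Sketch: gather per-node trace files into one directory, then unify them.
# The helper names and directory layout are hypothetical.

# collect_traces DEST PREFIX SRC_DIR...
# Copies every file whose name starts with PREFIX from each SRC_DIR into DEST.
collect_traces() {
    dest=$1; prefix=$2; shift 2
    mkdir -p "$dest" || return 1
    for src in "$@"; do
        for f in "$src/$prefix"*; do
            [ -e "$f" ] || continue   # glob matched nothing in this directory
            cp "$f" "$dest/" || return 1
        done
    done
}

# unify_cmd NPROCS PREFIX
# Prints the vtunify invocation to run from inside the common directory.
unify_cmd() {
    printf 'vtunify %s %s\n' "$1" "$2"
}
```

Without a shared file system, each SRC_DIR would first have to be filled by copying the files from the run directory of the corresponding node (with scp or rsync, for example). The application itself has to be started with VT_UNIFY=no in its environment; with Open MPI, "mpirun -x VT_UNIFY=no" forwards the variable to all processes. After collect_traces has run, change into DEST and execute the command printed by unify_cmd, e.g. "vtunify 4 ring-vt" for the 4-process run.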

Thank you.

Thomas Ropars

Andreas Knüpfer wrote:
> Hello Thomas,
>
> sorry for the delay. My first assumption about the cause of your problem is the
> so-called "unify" process. This is a post-processing step which is performed
> automatically after the trace run. This step needs read access to all files,
> though. So, do you have a common file system for all cluster nodes?
>
> If yes, set the env variable VT_PFORM_GDIR to point there. Then the traces will
> be copied there from the location VT_PFORM_LDIR which still can be a
> node-local directory. Then everything will be handled automatically.
>
> If not, please set VT_UNIFY=no in order to disable automatic unification. Then
> you need to call vtunify manually. Please copy all files from the run
> directory that start with your OTF file prefix to a common directory and call
>
> %> vtunify <number of processes> <file prefix>
>
> there. This should give you the <prefix>.otf file.
>
> Please give this a try. If it is not working, please give me an 'ls -alh' from
> your trace directory/directories.
>
> Best regards, Andreas
>
>
> P.S.: Please have my email on CC, I'm not on the users_at_[hidden] list.
>
>
>
>
>>> From: Thomas Ropars <tropars_at_[hidden]>
>>> Date: August 11, 2008 3:47:54 PM IST
>>> To: users_at_[hidden]
>>> Subject: [OMPI users] Problem using VampirTrace
>>> Reply-To: Open MPI Users <users_at_[hidden]>
>>>
>>> Hi all,
>>>
>>> I'm trying to use VampirTrace.
>>> I'm working with r19234 of svn trunk.
>>>
>>> When I try to run a simple application with 4 processes on the same
>>> computer, it works well.
>>> But if I try to run the same application with the 4 processes executed
>>> on 4 different computers, I never get the .otf file.
>>>
>>> I've tried to run with VT_VERBOSE=yes, and I get the following trace:
>>>
>>> VampirTrace: Thread object #0 created, total number is 1
>>> VampirTrace: Opened OTF writer stream [namestub /tmp/ring-
>>> vt.fffffffffe8349ca.3294 id 1] for generation [buffer 32000000 bytes]
>>> VampirTrace: Thread object #0 created, total number is 1
>>> VampirTrace: Opened OTF writer stream [namestub /tmp/ring-
>>> vt.fffffffffe834bca.3020 id 1] for generation [buffer 32000000 bytes]
>>> VampirTrace: Thread object #0 created, total number is 1
>>> VampirTrace: Opened OTF writer stream [namestub /tmp/ring-
>>> vt.fffffffffe834aca.3040 id 1] for generation [buffer 32000000 bytes]
>>> VampirTrace: Thread object #0 created, total number is 1
>>> VampirTrace: Opened OTF writer stream [namestub /tmp/ring-
>>> vt.fffffffffe834fca.3011 id 1] for generation [buffer 32000000 bytes]
>>> Ring : Start
>>> Ring : End
>>> [1]VampirTrace: Flushed OTF writer stream [namestub /tmp/ring-
>>> vt.fffffffffe834aca.3040 id 1]
>>> [2]VampirTrace: Flushed OTF writer stream [namestub /tmp/ring-
>>> vt.fffffffffe834bca.3020 id 1]
>>> [1]VampirTrace: Closed OTF writer stream [namestub /tmp/ring-
>>> vt.fffffffffe834aca.3040 id 1]
>>> [3]VampirTrace: Flushed OTF writer stream [namestub /tmp/ring-
>>> vt.fffffffffe834fca.3011 id 1]
>>> [2]VampirTrace: Closed OTF writer stream [namestub /tmp/ring-
>>> vt.fffffffffe834bca.3020 id 1]
>>> [0]VampirTrace: Flushed OTF writer stream [namestub /tmp/ring-
>>> vt.fffffffffe8349ca.3294 id 1]
>>> [1]VampirTrace: Wrote unify control file ./ring-vt.2.uctl
>>> [2]VampirTrace: Wrote unify control file ./ring-vt.3.uctl
>>> [3]VampirTrace: Closed OTF writer stream [namestub /tmp/ring-
>>> vt.fffffffffe834fca.3011 id 1]
>>> [0]VampirTrace: Closed OTF writer stream [namestub /tmp/ring-
>>> vt.fffffffffe8349ca.3294 id 1]
>>> [0]VampirTrace: Wrote unify control file ./ring-vt.1.uctl
>>> [0]VampirTrace: Checking for ./ring-vt.1.uctl ...
>>> [0]VampirTrace: Checking for ./ring-vt.2.uctl ...
>>> [1]VampirTrace: Removed trace file /tmp/ring-vt.fffffffffe834aca.
>>> 3040.1.def
>>> [2]VampirTrace: Removed trace file /tmp/ring-vt.fffffffffe834bca.
>>> 3020.1.def
>>> [3]VampirTrace: Wrote unify control file ./ring-vt.4.uctl
>>> [1]VampirTrace: Removed trace file /tmp/ring-vt.fffffffffe834aca.
>>> 3040.1.events
>>> [2]VampirTrace: Removed trace file /tmp/ring-vt.fffffffffe834bca.
>>> 3020.1.events
>>> [3]VampirTrace: Removed trace file /tmp/ring-vt.fffffffffe834fca.
>>> 3011.1.def
>>> [1]VampirTrace: Thread object #0 deleted, leaving 0
>>> [2]VampirTrace: Thread object #0 deleted, leaving 0
>>> [3]VampirTrace: Removed trace file /tmp/ring-vt.fffffffffe834fca.
>>> 3011.1.events
>>> [3]VampirTrace: Thread object #0 deleted, leaving 0
>>>
>>>
>>> Regards
>>>
>>> Thomas