Open MPI Development Mailing List Archives

Subject: Re: [OMPI devel] XML request
From: Ralph Castain (rhc_at_[hidden])
Date: 2009-09-01 07:30:46


Hmmm...well, for now, let's go with passing a filename. I'll add it to
the trunk code base over the next few days.

I may play/ponder a little from there to see if we can't come up with
a more efficient solution.

Thanks
Ralph

On Aug 31, 2009, at 7:26 PM, Greg Watson wrote:

> Hey Ralph,
>
> Unfortunately I don't think this is going to work for us. Most of
> the time we're starting the mpirun command using the ssh exec or
> shell service, neither of which provides any mechanism for reading
> from file descriptors other than 1 or 2. The only alternatives I see
> are:
>
> 1. Provide a separate command that starts mpirun at the end of a
> pipe that is connected to the fd passed using the -xml-fd argument.
> This command would need to be part of the OMPI distribution, because
> the whole purpose of the XML was to provide an out-of-the-box
> experience when using PTP with OMPI.
>
> 2. Implement an -xml-file option; I could write the code for you.
>
> 3. Go back to limiting XML output to the map only.
>
> None of these are particularly ideal. If you can think of anything
> else, let me know.
>
> Regards,
> Greg
>
> On Aug 30, 2009, at 10:36 AM, Ralph Castain wrote:
>
>> What if we instead offered a -xml-fd N option? I would rather not
>> create a file myself. However, since you are calling mpirun
>> yourself, this would allow you to create a pipe on your end, and
>> then pass us the write end of the pipe. We would then send all XML
>> output down that pipe.
>>
>> Jeff and I chatted about this and felt this might represent the
>> cleanest solution. Sound okay?
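
The pipe hand-off proposed above can be sketched from the launcher's side. This is a minimal sketch under stated assumptions, not Open MPI code: `-xml-fd` is only a proposal at this point in the thread, so the child here is a stand-in Python one-liner that writes a made-up XML fragment to the descriptor number it receives. The names `run_with_xml_fd` and `STAND_IN_CHILD` and the `{fd}` placeholder convention are hypothetical, and `pass_fds` is POSIX-only.

```python
import os
import subprocess
import sys


def run_with_xml_fd(argv_template):
    """Run a child that inherits the write end of a pipe; return what it wrote.

    Any "{fd}" in argv_template is replaced with the inherited descriptor
    number (a hypothetical convention used only in this sketch).
    """
    read_fd, write_fd = os.pipe()
    try:
        argv = [arg.replace("{fd}", str(write_fd)) for arg in argv_template]
        # pass_fds keeps write_fd open, under the same number, in the child.
        child = subprocess.Popen(argv, pass_fds=(write_fd,))
    except BaseException:
        os.close(read_fd)
        raise
    finally:
        # Close the parent's copy of the write end so the read below
        # sees EOF once the child has exited.
        os.close(write_fd)
    with os.fdopen(read_fd, "r") as xml_stream:
        xml_text = xml_stream.read()
    child.wait()
    return xml_text, child.returncode


# Stand-in for the real launcher: XML goes to the passed descriptor,
# ordinary output goes to stdout as usual.
STAND_IN_CHILD = [
    sys.executable, "-c",
    "import os, sys; "
    "os.write(int(sys.argv[1]), b'<map><host name=\"node0\"/></map>'); "
    "print('ordinary stdout')",
    "{fd}",
]

if __name__ == "__main__":
    xml_text, status = run_with_xml_fd(STAND_IN_CHILD)
    print(xml_text)
```

The point of the design is that the XML stream never shares a channel with stdout or stderr, so the caller can parse it without filtering out unrelated messages. It does require the caller to control which descriptors the child inherits.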
>>
>>
>> On Aug 28, 2009, at 6:33 AM, Greg Watson wrote:
>>
>>> Ralph,
>>>
>>> Would this be doable? If we could guarantee that the only output
>>> that went to the file was XML then that would solve the problem.
>>>
>>> Greg
>>>
>>> On Aug 28, 2009, at 5:39 AM, Ashley Pittman wrote:
>>>
>>>> On Thu, 2009-08-27 at 23:46 -0400, Greg Watson wrote:
>>>>> I didn't realize it would be such a problem. Unfortunately there is
>>>>> simply no way to reliably parse this kind of output, because it is
>>>>> impossible to know what the error messages are going to be, and
>>>>> presumably they could include XML-like formatting as well. The whole
>>>>> point of the XML was to try and simplify the parsing of the mpirun
>>>>> output, but it now looks like it's actually more difficult.
>>>>
>>>> I thought this might be difficult when I saw you were attempting it.
>>>>
>>>> Let me tell you about what Valgrind does, because they have similar
>>>> problems. Initially they had just added a --xml=yes option, which put
>>>> most of the valgrind (as distinct from application) output in xml
>>>> tags. This works for simple cases, and if you mix it with
>>>> --log-file=<filename> it keeps the valgrind output separate from the
>>>> application output.
>>>>
>>>> Unfortunately there are lots of places throughout the code where
>>>> developers have inserted print statements (in the valgrind case these
>>>> all go to the logfile), which means the xml is interspersed with
>>>> non-xml output and hence impossible to parse reliably.
>>>>
>>>> What they have now done in the current release is to add an extra
>>>> --xml-file=<file> option as well as the --log-file=<file> option. Now
>>>> in the simple case all output from a normal run goes well formatted
>>>> to the xml file and the log file remains empty. Any tool that wraps
>>>> around valgrind can parse the xml, which is guaranteed to be well
>>>> formatted, and it can detect the presence of other messages by
>>>> looking for output in the standard log file. The onus is then on tool
>>>> writers to look at the remaining cases and decide if they are common
>>>> or important enough to wrap in xml, and to propose a patch or the
>>>> removal of the non-formatted message entirely.
>>>>
>>>> The above seems to work well, having a separate log file for xml is a
>>>> huge step forward as it means whilst the xml isn't necessarily
>>>> complete you can both parse it and are able to tell when it's missing
>>>> something.
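
The two-file scheme described above can be sketched from a wrapping tool's side as follows. This is a minimal sketch, not Valgrind's or Open MPI's code: `read_tool_output` is a hypothetical helper, and it assumes the XML file is complete and well formed by the time it is read.

```python
import os
import xml.etree.ElementTree as ET


def read_tool_output(xml_path, log_path):
    """Parse the XML file and report whether the plain log holds extra messages.

    Returns (root_element, has_unformatted_output).
    """
    root = ET.parse(xml_path).getroot()
    # A non-empty log means some message bypassed the XML channel, so the
    # parsed XML may be missing information.
    has_unformatted_output = os.path.getsize(log_path) > 0
    return root, has_unformatted_output
```

A tool could call this on the files from a run such as `valgrind --xml=yes --xml-file=out.xml --log-file=out.log ./app` and surface the raw log to the user whenever the second return value is true.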
>>>>
>>>> Of course when looking at this level of tool integration it's better
>>>> to use sockets than files (e.g. --xml-socket=localhost:1234 rather
>>>> than --xml-file=/tmp/app_XXXX.xml) but I'll leave that up to you.
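
The socket variant mentioned above would need the wrapping tool to listen before it starts the job. The following is a minimal sketch of that receiving side, assuming the sender connects once, writes its XML, and closes the connection. `open_listener` and `receive_xml` are hypothetical names, and nothing here reflects an actual mpirun option.

```python
import socket


def open_listener(host="127.0.0.1", port=0):
    """Listen on host:port; port 0 lets the OS pick a free port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(1)
    return listener


def receive_xml(listener):
    """Accept one connection and read until the peer closes it."""
    conn, _addr = listener.accept()
    chunks = []
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:  # empty read: the sender closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8")
```

The tool would read the chosen port with `listener.getsockname()[1]` and pass it to the job on its command line, which avoids temporary files and works when the job runs on another host.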
>>>>
>>>> I hope this gives you something to think over.
>>>>
>>>> Ashley,
>>>>
>>>> --
>>>>
>>>> Ashley Pittman, Bath, UK.
>>>>
>>>> Padb - A parallel job inspection tool for cluster computing
>>>> http://padb.pittman.org.uk
>>>>
>>>> _______________________________________________
>>>> devel mailing list
>>>> devel_at_[hidden]
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel