Would this be doable? If we could guarantee that the only output that
went to the file was XML then that would solve the problem.
On Aug 28, 2009, at 5:39 AM, Ashley Pittman wrote:
> On Thu, 2009-08-27 at 23:46 -0400, Greg Watson wrote:
>> I didn't realize it would be such a problem. Unfortunately there is
>> simply no way to reliably parse this kind of output, because it is
>> impossible to know what the error messages are going to be, and
>> presumably they could include XML-like formatting as well. The whole
>> point of the XML was to try and simplify the parsing of the mpirun
>> output, but it now looks like it's actually more difficult.
> I thought this might be difficult when I saw you were attempting it.
> Let me tell you about what Valgrind does because they have similar
> problems. Initially they just added an --xml=yes option which put all
> of the valgrind (as distinct from application) output in xml tags. This
> works for simple cases and if you mix it with --log-file=<filename> it
> keeps the valgrind output separate from the application output.
> Unfortunately there are lots of places throughout the code where
> developers have inserted print statements (in the valgrind case these
> all go to the logfile) which means the xml is interspersed with non-xml
> output and hence impossible to parse reliably.
> What they have now done in the current release is to add an extra
> --xml-file=<file> option as well as the --log-file=<file> option. Now
> in the simple case all output from a normal run goes well formatted to
> the xml file and the log file remains empty, any tool that wraps
> valgrind can parse the xml which is guaranteed to be well formatted and
> it can detect the presence of other messages by looking for output in
> the standard log file. The onus is then on tool writers to look at
> remaining cases and decide if they are common or important enough to
> wrap in xml and propose a patch or removal of the non-formatted output.
> The above seems to work well, having a separate log file for xml is a
> huge step forward as it means whilst the xml isn't necessarily complete
> you can both parse it and are able to tell when it's missing something.
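
A wrapper tool built around this split could be quite small. Here is a
minimal Python sketch, assuming the tool was started with something like
"valgrind --xml=yes --xml-file=run.xml --log-file=run.log ./app". The
function name check_run and the file names are made up for illustration;
nothing here is taken from valgrind or mpirun themselves.

```python
import os
import xml.etree.ElementTree as ET


def check_run(xml_path, log_path):
    """Inspect the two output files of one finished run.

    Returns (root, stray_text):
      root       -- parsed XML root element, or None if the XML file is
                    missing or not well formed
      stray_text -- whatever unformatted text ended up in the plain log
                    file (empty string if the log is empty or absent)
    """
    try:
        root = ET.parse(xml_path).getroot()
    except (ET.ParseError, OSError):
        root = None

    stray_text = ""
    if os.path.exists(log_path):
        with open(log_path, encoding="utf-8", errors="replace") as f:
            stray_text = f.read()

    return root, stray_text


if __name__ == "__main__":
    # Hypothetical file names; adjust to whatever the tool was told to write.
    root, stray = check_run("run.xml", "run.log")
    if root is None:
        print("XML output missing or not well formed")
    if stray.strip():
        print("unformatted messages were also produced:")
        print(stray)
```

The XML is parsed on its own, and a non-empty log file only signals that
something was printed outside the XML channel, so the parser never has to
guess which lines are markup.
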
> Of course when looking at this level of tool integration it's better to
> use sockets than files (e.g. --xml-socket=localhost:1234 rather than
> --xml-file=/tmp/app_XXXX.xml) but I'll leave that up to you.
> I hope this gives you something to think over.
> Ashley Pittman, Bath, UK.
> Padb - A parallel job inspection tool for cluster computing