Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] Application logging
From: Barry Rountree (rountree_at_[hidden])
Date: 2008-05-07 17:45:31


On Wed, May 07, 2008 at 01:51:03PM -0400, Alberto Giannetti wrote:
>
> On May 7, 2008, at 1:32 PM, Barry Rountree wrote:
>
> > On Wed, May 07, 2008 at 12:33:59PM -0400, Alberto Giannetti wrote:
> >> I need to log application-level messages on disk to trace my program
> >> activity. For better performance, one solution is to dedicate one
> >> processor to the actual I/O logging, while the other working
> >> processors would trace their activity through non-blocking string
> >> message sends:
> >
> > A few comments:
> >
> > If you're using a cluster where each node has dedicated disk space, it
> > would probably be better to open a local file and log messages there.
> > After the application completes, then collate all the local files
> > together.
> >
> > Even simpler is to open a file in the directory from which you started
> > the application. Using 16 older Opterons and writing 2 lines per MPI
> > call per node, the overhead for doing this was small enough to be lost
> > in the noise.
>
> What I want to avoid is disk I/O operations in some of my real-time
> processors. fprintf, fputs, and other write operations are among the
> most time-consuming calls, and I'd rather dedicate a processor/CPU
> to that task.
>
> Assuming my logger processor is allocated on a remote host (a worst-
> case scenario), are you saying that, for instance, a 256-byte disk
> write is faster than a non-blocking send to the remote node?

The best thing for you to do is to instrument the logger you want to use
and see how much overhead it generates. I think you'll be surprised at
how fast fprintf is these days.
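
For what it's worth, here's a rough (untested) sketch of how to measure
that: time a burst of fprintf calls with MPI_Wtime and look at the
average cost per call, then swap the loop body for your MPI_Isend-based
version and compare. The file name, message format, and iteration count
below are arbitrary.

#include <stdio.h>
#include <mpi.h>

/* Sketch: average cost of one fprintf to a per-rank log file. */
int main(int argc, char *argv[])
{
    const int N = 100000;
    int rank, i;
    char fname[64];
    FILE *logfile;
    double t0, elapsed;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    sprintf(fname, "bench.%02d.dat", rank);
    logfile = fopen(fname, "w");
    if (logfile == NULL)
        MPI_Abort(MPI_COMM_WORLD, 1);

    t0 = MPI_Wtime();
    for (i = 0; i < N; i++)
        fprintf(logfile, "rank %d: iteration %d\n", rank, i);
    elapsed = MPI_Wtime() - t0;

    printf("rank %d: %.3f us per fprintf\n", rank, 1e6 * elapsed / N);

    fclose(logfile);
    MPI_Finalize();
    return 0;
}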

>
> > Call this after MPI_Init and after you've figured out
> > which node you are.
> >
> > /* Needs <stdio.h> and <assert.h>; blr_logfile is a file-scope global
> >    so the rest of the code can fprintf() to it. */
> > FILE *blr_logfile = NULL;
> >
> > FILE *
> > initialize_logfile(int rank)
> > {
> >     char format[] = "runtime.%02d.dat";
> >     char fname[64];
> >
> >     sprintf(fname, format, rank);
> >     blr_logfile = fopen(fname, "w");
> >     assert(blr_logfile);
> >     return blr_logfile;
> > }
> >
> > Then just fprintf(logfile, ...) as needed.
> >
> > There are configurations where this won't work, of course, and it
> > won't scale to thousands of nodes. But I've found it to be rock-solid
> > for my work.
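
As a rough usage sketch for the initialize_logfile() helper quoted above
(the loop body and log messages are just placeholders, not part of the
original code):

#include <stdio.h>
#include <mpi.h>

FILE *initialize_logfile(int rank);   /* the helper quoted above */

int main(int argc, char *argv[])
{
    int rank, i;
    FILE *logfile;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* One log file per rank, opened once we know our rank. */
    logfile = initialize_logfile(rank);

    for (i = 0; i < 10; i++)
        fprintf(logfile, "rank %d: step %d\n", rank, i);

    fclose(logfile);
    MPI_Finalize();
    return 0;
}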
> >
> >>
> >> /* LOGGER PROCESSOR MAIN LOOP */
> >> void logger(void)
> >> {
> >>     MPI_Status status;
> >>     int count;
> >>     char buf[LOGMSG_MAXSIZE + 1];  /* +1 so the terminator always fits */
> >>
> >>     printf("Logger: Started\n");
> >>
> >>     while( 1 ) {
> >>         MPI_Recv(buf, LOGMSG_MAXSIZE, MPI_CHAR, MPI_ANY_SOURCE,
> >>                  LOGMSG_TAG, MPI_COMM_WORLD, &status);
> >>         /* The received length is not a public MPI_Status field;
> >>            query it with MPI_Get_count() */
> >>         MPI_Get_count(&status, MPI_CHAR, &count);
> >>         buf[count] = '\0';
> >>         /* ACTUAL I/O */
> >>         printf("Processor %d ==> %s\n", status.MPI_SOURCE, buf);
> >>     }
> >> }
> >>
> >>
> >> /* WORKER PROCESSOR LOGGING */
> >> void mylog(char* msg)
> >> {
> >>     MPI_Request req;
> >>     int msglen = strlen(msg);
> >>
> >>     if( msglen > LOGMSG_MAXSIZE ) {
> >>         /* Truncate; the logger adds its own terminator */
> >>         msg[LOGMSG_MAXSIZE-1] = '\0';
> >>         msglen = LOGMSG_MAXSIZE-1;
> >>     }
> >>
> >>     /* Non-blocking send; msg must not be modified or freed until the
> >>        send completes, so freeing the request here only hands it back
> >>        to MPI rather than making the buffer immediately reusable */
> >>     MPI_Isend(msg, msglen, MPI_CHAR, LOGGER, LOGMSG_TAG,
> >>               MPI_COMM_WORLD, &req);
> >>     MPI_Request_free(&req);
> >> }
> >>
> >>
> >> I figured this must be a common problem in MPI applications and was
> >> wondering whether there are any libraries available or related
> >> discussions.