Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] MPI_AllGather null terminator character
From: Gabriele Fatigati (g.fatigati_at_[hidden])
Date: 2012-01-28 05:22:18


Hi Jeff,

I had the same idea, so in my simple code I have already used calloc and
memset.

The same warning still appears when using strncmp, which should exclude the
uninitialized bytes in hostname_recv_buf :(

My apologies for being so insistent, but I would like to understand whether
there is a bug in MPI_Allgather, strcmp, or Valgrind itself.
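
A minimal sketch of the buffer handling described above (zero the slot, fill it,
compare with a bound), for reference. The names fill_hostname_slot,
same_hostname and MAX_NAME_LEN are hypothetical and not taken from the attached
all_gather.c. It only shows the pattern; it does not settle whether the
remaining warning comes from MPI_Allgather, strcmp, or Valgrind.

```c
#define _POSIX_C_SOURCE 200112L  /* so <unistd.h> declares gethostname() */
#include <assert.h>
#include <string.h>
#include <unistd.h>

#define MAX_NAME_LEN 128   /* hypothetical fixed slot size per rank */

/* Fill one fixed-size slot with the local hostname.
 * Every byte is zeroed first, so no byte of the slot is left uninitialized,
 * and the last byte is never handed to gethostname(), so a \0 always remains. */
static int fill_hostname_slot(char *slot, size_t len)
{
    assert(slot != NULL && len > 1);
    memset(slot, 0, len);
    return gethostname(slot, len - 1);
}

/* Compare two fixed-size slots without reading past len bytes of either. */
static int same_hostname(const char *a, const char *b, size_t len)
{
    return strncmp(a, b, len) == 0;
}
```

A slot filled this way can be passed to MPI_Allgather with a count of
MAX_NAME_LEN, and each received slot compared with same_hostname() using the
same length.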

2012/1/27 Jeff Squyres <jsquyres_at_[hidden]>

> Ah, I have an idea what might be happening here: I believe that valgrind
> is actually pretty smart.
>
> If you have a buffer of size 128, and gethostname() only fills in, say,
> the first 32 bytes (including the \0), the other 128-32=96 bytes are
> uninitialized. You can MPI_Allgather these, in which case those 96
> uninitialized bytes will be copied over to the hostname_recv_buf buffer.
>
> For each rank, valgrind can actually track the local memcpy from
> local_hostname to hostname_recv_buf[rank * MAX_LEN_SIZE], and it knows that
> those 96 bytes are still uninitialized.
>
> So when you go to strcmp them later, valgrind says "ah ha! those are
> uninitialized!"
>
> Meaning: I think that in some cases, valgrind is actually tracking the
> memcpy of uninitialized bytes and then alerting you later when you access
> those secondary uninitialized bytes.
>
> If I'm right, you can memset the local_hostname buffer (or use calloc),
> and then valgrind warnings will go away.
>
>
>
> On Jan 27, 2012, at 8:21 AM, Gabriele Fatigati wrote:
>
> > Hi Jeff,
> >
> > yes, that was a very stupid bug in my code, but even with the correction
> > the problem with Valgrind in strcmp remains:
> >
> > ==21779== Conditional jump or move depends on uninitialised value(s)
> > ==21779== at 0x4A0898C: strcmp (mc_replace_strmem.c:711)
> > ==21779== by 0x400BA8: main (all_gather.c:28)
> > ==21779==
> > ==21779== Conditional jump or move depends on uninitialised value(s)
> > ==21779== at 0x4A0899A: strcmp (mc_replace_strmem.c:711)
> > ==21779== by 0x400BA8: main (all_gather.c:28)
> > ==21779==
> > ==21779== Conditional jump or move depends on uninitialised value(s)
> > ==21779== at 0x4A089BA: strcmp (mc_replace_strmem.c:711)
> > ==21779== by 0x400BA8: main (all_gather.c:28)
> >
> >
> > Do you get the same warning with Valgrind? The local hostname is something
> > like "node343", "node344", and so on.
> >
> >
> > 2012/1/27 Jeff Squyres <jsquyres_at_[hidden]>
> > I see one problem:
> >
> > gethostname(local_hostname, sizeof(local_hostname));
> >
> > That should be:
> >
> > gethostname(local_hostname, max_name_len);
> >
> > because sizeof(local_hostname) will be sizeof(void*).
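
A minimal sketch of the sizeof pitfall described above, assuming local_hostname
is a heap-allocated char pointer (the attached code is not shown here, so that
is an assumption). The helper name alloc_hostname is hypothetical;
max_name_len is the capacity mentioned above.

```c
#define _POSIX_C_SOURCE 200112L  /* so <unistd.h> declares gethostname() */
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Allocate a zeroed name buffer and fill it, passing the real capacity.
 * Returns NULL if the allocation fails; the caller frees the result. */
static char *alloc_hostname(size_t max_name_len)
{
    char *local_hostname;

    assert(max_name_len > 1);
    local_hostname = calloc(max_name_len, sizeof(char));
    if (local_hostname == NULL) {
        return NULL;
    }
    /* sizeof(local_hostname) would be the size of a pointer here,
     * not max_name_len, so the capacity is passed explicitly.
     * One byte is held back so the zeroed last byte stays a \0. */
    gethostname(local_hostname, max_name_len - 1);
    return local_hostname;
}
```

With a true array such as char name[128], sizeof(name) does give 128; the
pointer case is the one that silently shrinks the length.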
> >
> > But if that's what you were intending, just to simulate a small hostname
> > buffer, then be aware that gethostname() will not put a \0 after the
> > string, because it'll copy in sizeof(local_hostname) characters and then
> > stop.
> >
> > Specifically, the man page on OS X says:
> >
> > The gethostname() function returns the standard host name for the current
> > processor, as previously set by sethostname(). The namelen argument
> > specifies the size of the name array. The returned name is null-terminated,
> > unless insufficient space is provided.
> >
> > Hence, MPI is transmitting the entire 255 characters in your source array
> > (regardless of content -- MPI is not looking for \0's; you gave it the
> > explicit length of the buffer), but if they weren't filled with \0's, then
> > the receiver's printf will have problems handling it.
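
A minimal sketch of one way a receiver could handle a gathered slot without
relying on a \0 being present, using memchr and a "%.*s" precision. The helper
names slot_name_len and format_slot are hypothetical; this is an illustration
of bounded reads, not code from the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Length of the name stored in a fixed-size slot, never reading past
 * slot_len bytes. Returns slot_len if the slot holds no \0 at all. */
static size_t slot_name_len(const char *slot, size_t slot_len)
{
    const char *nul;

    assert(slot != NULL);
    nul = memchr(slot, '\0', slot_len);
    return nul ? (size_t)(nul - slot) : slot_len;
}

/* Format one slot into out (out_len bytes) without needing a \0 in the slot. */
static int format_slot(char *out, size_t out_len, int rank,
                       const char *slot, size_t slot_len)
{
    /* "%.*s" reads at most the given number of bytes from slot. */
    return snprintf(out, out_len, "rank %d: %.*s", rank,
                    (int)slot_name_len(slot, slot_len), slot);
}
```

For slot i of a receive buffer with slots of MAX_LEN bytes, the call would be
format_slot(out, sizeof out, i, recv_buf + i * MAX_LEN, MAX_LEN).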
> >
> >
> >
> > On Jan 27, 2012, at 4:03 AM, Gabriele Fatigati wrote:
> >
> > > Sorry,
> > >
> > > this is the right code.
> > >
> > > 2012/1/27 Gabriele Fatigati <g.fatigati_at_[hidden]>
> > > Hi Jeff,
> > >
> > > The problem is that Valgrind raises a warning when I use strcmp on the
> > > Allgather buffer.
> > >
> > > Please check whether the attached code is right; in it, the size of
> > > local_hostname is very small.
> > >
> > > Valgrind is used as:
> > >
> > > mpirun valgrind --leak-check=full --tool=memcheck ./all_gather
> > >
> > > and openmpi/1.4.4 compiled with "-O0 -g"
> > >
> > > Thanks!
> > >
> > > 2012/1/26 Jeff Squyres <jsquyres_at_[hidden]>
> > > I'm not sure what you're asking.
> > >
> > > The entire contents of hostname[] will be sent -- from position 0 to
> > > position (MAX_STRING_LEN-1). If there's a \0 in there, it will be sent.
> > > If the \0 occurs after that, then it won't.
> > >
> > > Be aware that gethostname(buf, size) will not put a \0 in the buffer if
> > > the hostname is exactly "size" bytes. So you might want to double check
> > > that your gethostname() call is returning a \0-terminated string.
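
A minimal sketch of the fixed-count behaviour described above, with a plain
memcpy standing in for the per-rank copy (this is not MPI code, and copy_slot,
slot_is_terminated and SLOT_LEN are hypothetical names). It shows that exactly
count bytes land in each slot, so the \0 arrives only if it sits within those
bytes.

```c
#include <assert.h>
#include <string.h>

#define SLOT_LEN 8   /* hypothetical stand-in for MAX_STRING_LEN */

/* Stand-in for what a fixed-count gather does with one rank's buffer:
 * exactly count bytes are copied into that rank's slot, whatever they hold. */
static void copy_slot(char *recv_buf, int rank, const char *send_buf,
                      size_t count)
{
    assert(recv_buf != NULL && send_buf != NULL && rank >= 0);
    memcpy(recv_buf + (size_t)rank * count, send_buf, count);
}

/* Does the slot for this rank contain a \0 within its count bytes? */
static int slot_is_terminated(const char *recv_buf, int rank, size_t count)
{
    return memchr(recv_buf + (size_t)rank * count, '\0', count) != NULL;
}
```

A 7-character name in an 8-byte slot arrives terminated; a name that fills all
8 bytes arrives without a \0, which is the case to check for before calling
strcmp or printf on the slot.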
> > >
> > > Does that make sense?
> > >
> > > Here's a sample I wrote to verify this:
> > >
> > > #include <stdio.h>
> > > #include <string.h>
> > > #include <stdlib.h>
> > > #include <unistd.h>   /* gethostname() */
> > > #include <mpi.h>
> > >
> > > #define MAX_LEN 64
> > >
> > > /* Report where the first \0 sits in the len-byte buffer at ptr. */
> > > static void where_null(char *ptr, int len, int rank)
> > > {
> > >     int i;
> > >
> > >     for (i = 0; i < len; ++i) {
> > >         if ('\0' == ptr[i]) {
> > >             printf("Rank %d: Null found at position %d (string: %s)\n",
> > >                    rank, i, ptr);
> > >             return;
> > >         }
> > >     }
> > >
> > >     /* No \0 in the buffer: print byte by byte instead of using %s. */
> > >     printf("Rank %d: Null not found! (string: ", rank);
> > >     for (i = 0; i < len; ++i) putc(ptr[i], stdout);
> > >     fputs(")\n", stdout);
> > > }
> > >
> > > int main(void)
> > > {
> > >     int i;
> > >     char hostname[MAX_LEN];
> > >     char *hostname_recv_buf;
> > >     int rank, size;
> > >
> > >     MPI_Init(NULL, NULL);
> > >     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> > >     MPI_Comm_size(MPI_COMM_WORLD, &size);
> > >
> > >     /* Zero the buffer first so every byte that gets sent is initialized. */
> > >     memset(hostname, 0, sizeof(hostname));
> > >     gethostname(hostname, MAX_LEN - 1);
> > >     where_null(hostname, MAX_LEN, rank);
> > >
> > >     hostname_recv_buf = calloc(size * (MAX_LEN), (sizeof(char)));
> > >     if (NULL == hostname_recv_buf) MPI_Abort(MPI_COMM_WORLD, 1);
> > >
> > >     /* All MAX_LEN bytes are sent per rank, including any \0 in them. */
> > >     MPI_Allgather(hostname, MAX_LEN, MPI_CHAR,
> > >                   hostname_recv_buf, MAX_LEN, MPI_CHAR, MPI_COMM_WORLD);
> > >     for (i = 0; i < size; ++i) {
> > >         where_null(hostname_recv_buf + i * MAX_LEN, MAX_LEN, rank);
> > >     }
> > >
> > >     free(hostname_recv_buf);
> > >     MPI_Finalize();
> > >     return 0;
> > > }
> > >
> > >
> > >
> > > On Jan 13, 2012, at 2:32 AM, Gabriele Fatigati wrote:
> > >
> > > > Dear OpenMPI,
> > > >
> > > > using MPI_Allgather with the MPI_CHAR type, I have a question about the
> > > > null terminator. Imagine I want to gather the names of the nodes my
> > > > program is running on:
> > > >
> > > >
> > > > ----------------------------------------
> > > >
> > > > char hostname[MAX_STRING_LEN];
> > > >
> > > > char *hostname_recv_buf =
> > > >     (char *)calloc(num_procs * MAX_STRING_LEN, sizeof(char));
> > > >
> > > > MPI_Allgather(hostname, MAX_STRING_LEN, MPI_CHAR, hostname_recv_buf,
> > > >               MAX_STRING_LEN, MPI_CHAR, MPI_COMM_WORLD);
> > > >
> > > > ----------------------------------------
> > > >
> > > >
> > > > Now, is the null terminator of each local string included? Or do I
> > > > have to send and receive MAX_STRING_LEN+1 elements in MPI_Allgather?
> > > >
> > > > With Valgrind, a subsequent simple strcmp:
> > > >
> > > > for (i = 0; i < num_procs; i++) {
> > > >     if (strcmp(&hostname_recv_buf[MAX_STRING_LEN * i],
> > > >                local_hostname) == 0) {
> > > >         ... doing something....
> > > >     }
> > > > }
> > > >
> > > > raises a warning:
> > > >
> > > > Conditional jump or move depends on uninitialised value(s)
> > > > ==19931== at 0x4A06E5C: strcmp (mc_replace_strmem.c:412)
> > > >
> > > > The same warning is not present if I use MAX_STRING_LEN+1 in
> > > > MPI_Allgather.
> > > >
> > > >
> > > > Thanks in advance.
> > > >
> > > > --
> > > > Ing. Gabriele Fatigati
> > > >
> > > > HPC specialist
> > > >
> > > > SuperComputing Applications and Innovation Department
> > > >
> > > > Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
> > > >
> > > > www.cineca.it Tel: +39 051 6171722
> > > >
> > > > g.fatigati [AT] cineca.it
> > > > _______________________________________________
> > > > users mailing list
> > > > users_at_[hidden]
> > > > http://www.open-mpi.org/mailman/listinfo.cgi/users
> > >
> > >
> > > --
> > > Jeff Squyres
> > > jsquyres_at_[hidden]
> > > For corporate legal information go to:
> > > http://www.cisco.com/web/about/doing_business/legal/cri/
> > >
> > > <all_gather.c>
> >
>

-- 
Ing. Gabriele Fatigati
HPC specialist
SuperComputing Applications and Innovation Department
Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy
www.cineca.it                    Tel:   +39 051 6171722
g.fatigati [AT] cineca.it