Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] File seeking with shared filepointer issues
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2011-06-25 06:54:32


I'm not super-familiar with the IO portions of MPI, but I think you might be running afoul of the definition of "collective." "Collective," in MPI terms, does *not* mean "synchronize." It just means that all processes must invoke it, potentially with the same (or similar) parameters.
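
To illustrate the point, here's a small sketch I just typed up (not part of your program, and only an illustration): every rank calls MPI_Bcast as the standard requires, but nothing forces them to enter or leave the call at the same time -- rank 0 can be long gone before a slow rank even arrives.

#include <mpi.h>
#include <unistd.h>    /* sleep(), only to stagger the ranks for illustration */
#include <iostream>

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 1)
        sleep(5);    /* rank 1 arrives at the collective late */

    int value = (rank == 0) ? 42 : 0;
    MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);    /* collective, but not a barrier */

    /* Rank 0 may well reach this point while rank 1 is still sleeping;
       a collective call is not required to synchronize the ranks. */
    std::cout << "Rank " << rank << " has value " << value << std::endl;

    MPI_Finalize();
    return 0;
}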

Hence, I think you're seeing cases where MPI processes are showing incorrect values, but only because the updates have not completed in the background. Using a barrier forces those updates to complete before you query the file position.
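
Concretely, the ordering your (commented-out) barriers enforce is roughly: no rank is allowed to move the shared pointer until every rank has finished reading it. A sketch using the variables from your program below -- I have not verified what ROMIO actually guarantees here:

MPI_File_get_position_shared(fh, &cur_filepos);   /* every rank reads the shared pointer   */
MPI_Barrier(MPI_COMM_WORLD);                      /* ...and only then may anyone move it   */
MPI_File_seek_shared(fh, 0, MPI_SEEK_END);        /* collective seek to end of file        */
MPI_File_get_position_shared(fh, &eof_filepos);   /* every rank reads the new position     */
MPI_Barrier(MPI_COMM_WORLD);                      /* same ordering before the rewind       */
MPI_File_seek_shared(fh, 0, MPI_SEEK_SET);        /* collective seek back to the start     */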

...although, as I type that out, it seems odd. A barrier shouldn't (and isn't guaranteed to) force the completion of collectives (file-based or otherwise). It could be a side effect of linear message passing behind the scenes, but that would make for a strange interface.

Rob -- can you comment on this, perchance? Is this a bug in ROMIO, or if not, how is one supposed to use this interface to get consistent answers in all MPI processes?
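
In the meantime, if all you're after is the file size, one way to sidestep the shared file pointer entirely is MPI_File_get_size. Again just a sketch, using fh and ThisTask from your program, and untested on my end:

MPI_Offset filesize;
MPI_File_get_size(fh, &filesize);   /* size in bytes of the file associated with fh */
std::cout << "Task " << ThisTask << " reports a filesize of " << filesize << std::endl;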

On Jun 23, 2011, at 10:04 AM, Christian Anonymous wrote:

> I'm having some issues with MPI_File_seek_shared. Consider the following small test C++ program
>
>
> #include <iostream>
> #include <mpi.h>
>
>
> #define PATH "simdata.bin"
>
> using namespace std;
>
> int ThisTask;
>
> int main(int argc, char *argv[])
> {
> MPI_Init(&argc,&argv); /* Initialize MPI */
> MPI_Comm_rank(MPI_COMM_WORLD,&ThisTask);
>
> MPI_File fh;
> int success;
> success = MPI_File_open(MPI_COMM_WORLD,(char *) PATH,MPI_MODE_RDONLY,MPI_INFO_NULL,&fh);
>
> if(success != MPI_SUCCESS){ //Successful open?
> char err[256];
> int err_length, err_class;
>
> MPI_Error_class(success,&err_class);
> MPI_Error_string(err_class,err,&err_length);
> cout << "Task " << ThisTask << ": " << err << endl;
> MPI_Error_string(success,err,&err_length);
> cout << "Task " << ThisTask << ": " << err << endl;
>
> MPI_Abort(MPI_COMM_WORLD,success);
> }
>
>
> /* START SEEK TEST */
> MPI_Offset cur_filepos, eof_filepos;
>
> MPI_File_get_position_shared(fh,&cur_filepos);
>
> //MPI_Barrier(MPI_COMM_WORLD);
> MPI_File_seek_shared(fh,0,MPI_SEEK_END); /* Seek is collective */
>
> MPI_File_get_position_shared(fh,&eof_filepos);
>
> //MPI_Barrier(MPI_COMM_WORLD);
> MPI_File_seek_shared(fh,0,MPI_SEEK_SET);
>
> cout << "Task " << ThisTask << " reports a filesize of " << eof_filepos << "-" << cur_filepos << "=" << eof_filepos-cur_filepos << endl;
> /* END SEEK TEST */
>
> /* Finalizing */
> MPI_File_close(&fh);
> MPI_Finalize();
> return 0;
> }
>
> Note the comments before each MPI_Barrier. When the program is run with mpirun -np N (N strictly greater than 1), task 0 reports the correct filesize, while every other process reports either 0, minus the filesize, or the correct filesize. Uncommenting the MPI_Barrier calls makes each process report the correct filesize. Is this working as intended? Since MPI_File_seek_shared is a collective, blocking function, each process has to synchronise at the return point of the function, but not when the function is called. It seems that using MPI_File_seek_shared without an MPI_Barrier call first is very dangerous, or am I missing something?
>
>

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/