I don't think so. We are using the HDF5 serial I/O module; our hosts have just 1 Gb Ethernet and our OSSes have gigabit as well. But again, our Lustre setup is brand-new with only a few users, so it's effectively idle.

We also see the same behavior on NFSv3 backed by OnStor Bobcats.

Brock Palen
Center for Advanced Computing
brockp@umich.edu
(734)936-1985


On Jan 25, 2008, at 5:01 PM, Jeff Pummill wrote:

Brock,

The only thing that came to mind was that on the second dump, the I/O might be substantial enough to overload the OSSes (I/O servers), resulting in a process or task hang. Can you tell whether your Lustre environment is getting overwhelmed when the Open MPI / FLASH combination checkpoints the second time? I know you write files > 2 GB all the time, but this particular combination may be delivering that amount of data in a much shorter period of time.

Just a thought :-\


Jeff F. Pummill
University of Arkansas



Brock Palen wrote:
I started a new run with some changes,

Shortening the run won't work well; it takes 3 days just to get through the AMR.

Brock Palen
Center for Advanced Computing
brockp@umich.edu
(734)936-1985


On Jan 25, 2008, at 3:01 PM, Daniel Pfenniger wrote:

Hi,

Brock Palen wrote:
Is anyone using FLASH with Open MPI? We are here, but whenever it
tries to write its second checkpoint file, it segfaults once it
reaches 2.2 GB, always in the same location.

Debugging is a pain, as it takes 3 days to get to that point. Just
wondering if anyone else has seen this same behavior.
Just to make testing faster, you might consider reducing the checkpoint
output interval (the trstrt or nrstrt parameters in flash.par) and
decreasing the resolution (lrefine_max) to produce smaller files, to
see whether the problem is related to file size.
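For reference, Dan's suggestion might look like the following flash.par fragment. The parameter names (trstrt, lrefine_max) come from his message; the values are placeholders for illustration, not a tested configuration.

```
# Checkpoint more often so the failure reproduces sooner
trstrt      = 0.01    # checkpoint time interval (placeholder value)

# Cap the AMR refinement depth to keep checkpoint files smaller
lrefine_max = 4       # placeholder; lower than the production run
```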

	Dan

_______________________________________________
users mailing list
users@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
