> Even inside MPICH2, I have given little attention to thread safety and
> the MPI-IO routines. In MPICH2, each MPI_File* function grabs the big
> critical section lock -- not pretty but it gets the job done.
> When ported to OpenMPI, I don't know how the locking works.
> Furthermore, the MPI-IO library inside OpenMPI-1.4.3 is pretty old. I
> wonder if the locking we added over the years will help? Can you try
> openmpi-1.5.3 and report what happens?
In OpenMPI-1.5.3 with threading support enabled, the MPI-IO routines work
without any problems. However, a deadlock now occurs when calling
mpi_finalize, with the backtrace given below. This deadlock is independent
of the number of MPI tasks.
The deadlock during mpi_finalize does not occur when no MPI-IO routines
were called beforehand. Unfortunately, in that case the program terminates
with a segfault after returning from mpi_finalize (at the end of the program).
opal_mutex_lock(): Resource deadlock avoided
#0 0x0012e416 in __kernel_vsyscall ()
#1 0x01035941 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#2 0x01038e42 in abort () at abort.c:92
#3 0x00d9da68 in ompi_attr_free_keyval (type=COMM_ATTR, key=0xbffda0e4, predefined=0 '\000') at attribute/attribute.c:656
#4 0x00dd8aa2 in PMPI_Keyval_free (keyval=0xbffda0e4) at pkeyval_free.c:52
#5 0x01bf3e6a in ADIOI_End_call (comm=0xf1c0c0, keyval=10, attribute_val=0x0, extra_state=0x0) at ad_end.c:82
#6 0x00da01bb in ompi_attr_delete (type=UNUSED_ATTR, object=0x6, attr_hash=0x2c64, key=14285602, predefined=232 '\350', need_lock=128 '\200')
#7 0x00d9fb22 in ompi_attr_delete_all (type=COMM_ATTR, object=0xf1c0c0, attr_hash=0x8d0fee8) at attribute/attribute.c:1043
#8 0x00dbda65 in ompi_mpi_finalize () at runtime/ompi_mpi_finalize.c:133
#9 0x00dd12c2 in PMPI_Finalize () at pfinalize.c:46
#10 0x00d6b515 in mpi_finalize_f (ierr=0xbffda2b8) at pfinalize_f.c:62
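For reference, a minimal reproducer sketch along these lines exercises the same code path (the file name "testfile" and the use of MPI_COMM_WORLD are arbitrary choices, not from the original program): any MPI-IO call initializes ROMIO, which registers ADIOI_End_call as an attribute delete callback, and that callback then runs inside mpi_finalize, as in the backtrace above.

```c
/* Minimal reproducer sketch: request full thread support, touch MPI-IO
 * once, then finalize.  ROMIO's finalize hook (ADIOI_End_call) runs as
 * an attribute delete callback during MPI_Finalize -- the frame seen in
 * the backtrace above.  The file name "testfile" is arbitrary.
 * Build with mpicc and run under mpirun; behavior is MPI-library
 * specific, so error checking is kept minimal here. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided;
    MPI_File fh;

    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE)
        fprintf(stderr, "warning: only thread level %d provided\n", provided);

    /* Any MPI-IO call initializes ROMIO and registers its finalize hook. */
    MPI_File_open(MPI_COMM_WORLD, "testfile",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_File_close(&fh);

    MPI_Finalize();   /* deadlock (or segfault) observed at this point */
    return 0;
}
```

Compile with "mpicc reproducer.c" and run with "mpirun -np 1 ./a.out"; a single task suffices, since the deadlock is independent of the number of MPI tasks.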