I removed ompi/mca/io/romio/romio/acinclude.m4. I put "autoreconf -ivf -I confdb" in And I "chmod +x" (my
stupid error was that this file wasn't executable).
And all is now OK.
These modifications have been pushed to bitbucket.
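
For anyone hitting the same executable-bit problem, here is a minimal sketch of the check. It runs in a scratch directory, and the script name "regen.sh" is a hypothetical placeholder for this illustration, not the name used in the tree.

```shell
# Sketch only: write a one-line regeneration script and verify its execute bit.
# "regen.sh" is a hypothetical name; nothing here touches a real source tree.
workdir=$(mktemp -d)
script="$workdir/regen.sh"
printf '%s\n' '#!/bin/sh' 'autoreconf -ivf -I confdb' > "$script"

# A freshly written file normally has no execute bit, so a caller that
# tries to run it directly would fail.
if [ ! -x "$script" ]; then
    echo "not executable yet"
fi

# Set the execute bit and check again.
chmod +x "$script"
if [ -x "$script" ]; then
    echo "executable now"
fi
```

The script is only written and inspected here, never executed, so autoreconf does not need to be installed to try this.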

I tried to run the ROMIO tests and got an error in ompi/mpi/c/profile/MPI_File_set_errhandler.c:
OBJ_RELEASE(tmp) triggers an assertion failure:

 pfile_set_errhandler.c:75: PMPI_File_set_errhandler: Assertion `((0xdeafbeedULL << 32) + 0xdeafbeedULL) == ((opal_object_t *) (tmp))->obj_magic_id' failed.
[cuzco10:10336] *** Process received signal ***
[cuzco10:10336] Signal: Aborted (6)
[cuzco10:10336] Signal code:  (-6)
[cuzco10:10336] [ 0] /lib64/ [0x3e8560f440]
[cuzco10:10336] [ 1] /lib64/ [0x3e852329c5]
[cuzco10:10336] [ 2] /lib64/ [0x3e852341a5]
[cuzco10:10336] [ 3] /lib64/ [0x3e8522b945]
[cuzco10:10336] [ 4] /home_nfs/devezep/ATLAS/openmpi-default/lib/ [0x7fcbee89d1d4]
[cuzco10:10336] [ 5] /home_nfs/devezep/ATLAS/openmpi-default/lib/openmpi/ [0x7fcbe7dbc4ea]
[cuzco10:10336] [ 6] /home_nfs/devezep/ATLAS/openmpi-default/lib/openmpi/ [0x7fcbe7d8e764]
[cuzco10:10336] [ 7] /home_nfs/devezep/ATLAS/openmpi-default/lib/ [0x7fcbee853309]
[cuzco10:10336] [ 8] /home_nfs/devezep/ATLAS/openmpi-default/lib/ [0x7fcbee852aa0]
[cuzco10:10336] [ 9] /home_nfs/devezep/ATLAS/openmpi-default/lib/ [0x7fcbee896832]
[cuzco10:10336] [10] ./a.out(main+0x3a4) [0x402434]
[cuzco10:10336] [11] /lib64/ [0x3e8521ec5d]
[cuzco10:10336] [12] ./a.out() [0x401fc9]
[cuzco10:10336] *** End of error message ***

I am currently analysing the problem (MPI_File_close() now calls MPI_File_set_errhandler()).


Jeff Squyres wrote:
On Dec 1, 2010, at 7:35 AM, Pascal Deveze wrote:

I am not on AIM nor on google talk. Sorry. If you think it is necessary, I could ask for an ID.

FWIW.  Many of us find it convenient for quickie/informal discussions.  We can keep going here in email and switch to phone if it becomes necessary.
I see that we have the whole romio/confdb directory, so it seems like we should use that tree rather than copy to acinclude.m4.
I agree with you. But, as I said, I have a problem with the macro PAC_FUNC_NEEDS_DECL and the only way to solve it is to put it in acinclude.m4.

Per below, I think this is now moot -- the romio/ script should fix this.

- there's no .hgignore file -- making "hg status" difficult.  In your SVN+HG tree, can you run ./contrib/hg/ and commit/push the resulting .hgignore?  That would be most helpful.  

I have done it, and pushed.

Awesome; thanks.

- ompi/mca/io/romio/romio/adio/include/ is in the hg repo, but should not be (it's generated).

I removed it and pushed the modification.
- I don't see a romio/acinclude.m4 file in the repo, so whatever you did there doesn't show up for me.  

I see the file romio/romio/acinclude.m4 in bitbucket:

Weird.  Ok.  But I think this is now moot.

- I tried to add an ompi/mca/io/romio/romio/ executable file that contained:

autoreconf -ivf -I confdb

and that seems to make everything work.  Can you confirm/double check?  

Yes, I tried what you suggested (without acinclude.m4), and it seems that everything works:
autoreconf -ivf -I confdb
autoreconf: Entering directory `.'
autoreconf: not using Gettext
autoreconf: running: aclocal -I confdb --force 
autoreconf: tracing
autoreconf: running: libtoolize --copy --force
libtoolize: putting auxiliary files in AC_CONFIG_AUX_DIR, `confdb'.
libtoolize: copying file `confdb/'
libtoolize: Consider adding `AC_CONFIG_MACRO_DIR([m4])' to and
libtoolize: rerunning libtoolize, to keep the correct libtool macros in-tree.
libtoolize: Consider adding `-I m4' to ACLOCAL_AMFLAGS in
libtoolize: `AC_PROG_RANLIB' is rendered obsolete by `LT_INIT'
autoreconf: running: /homes/openmpi/tools/2010-10-12/bin/autoconf --include=confdb --force
autoreconf: running: /homes/openmpi/tools/2010-10-12/bin/autoheader --include=confdb --force
autoreconf: running: automake --add-missing --copy --force-missing
autoreconf: Leaving directory `.'

If I try to generate the whole MPI, works but configure fails in the romio directory.

I'm confused by this statement.  Did you run the top-level first?  That should automatically invoke the romio/ in the Right place, do a few extra things, etc.  Then you should be able to run configure properly (and have it invoke ROMIO's configure at the Right time, etc.).

Is that what you tried?

I just did a fresh checkout of your hg, removed ompi/mca/io/romio/romio/acinclude.m4 and put in an (and made it executable) that contained:

autoreconf -ivf -I confdb

I then ran the top-level and configure, and it all worked.

You can see that ompi/mca/io/romio/romio/aclocal.m4 m4_include()'s all the relevant m4 macro files in the confdb directory, including aclocal_cc.m4, which defines PAC_FUNC_NEEDS_DECL.
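
One way to double check this is sketched below. The helper name check_macro is made up for this illustration, and it assumes the macro is defined with a line starting "AC_DEFUN([NAME]" and that aclocal.m4 pulls files in with "m4_include([confdb/FILE])". The demo runs against a tiny mock tree in a scratch directory rather than the real ROMIO sources.

```shell
# check_macro DIR MACRO (hypothetical helper): find the confdb file that
# defines MACRO and report whether DIR/aclocal.m4 m4_include()s it.
check_macro() {
    dir=$1
    macro=$2
    for f in "$dir"/confdb/*.m4; do
        [ -f "$f" ] || continue
        # Fixed-string match, so the brackets are taken literally.
        if grep -qF "AC_DEFUN([$macro]" "$f"; then
            name=$(basename "$f")
            if grep -qF "m4_include([confdb/$name])" "$dir/aclocal.m4"; then
                echo "$macro: defined in confdb/$name, included by aclocal.m4"
                return 0
            fi
            echo "$macro: defined in confdb/$name, but NOT included by aclocal.m4"
            return 1
        fi
    done
    echo "$macro: no definition found under confdb/"
    return 1
}

# Mock tree for the demo; the file contents are placeholders, not real macros.
demo=$(mktemp -d)
mkdir "$demo/confdb"
printf '%s\n' 'AC_DEFUN([PAC_FUNC_NEEDS_DECL],[dnl body omitted' '])' \
    > "$demo/confdb/aclocal_cc.m4"
printf '%s\n' 'm4_include([confdb/aclocal_cc.m4])' > "$demo/aclocal.m4"

check_macro "$demo" PAC_FUNC_NEEDS_DECL
```

Pointing the first argument at ompi/mca/io/romio/romio instead of the mock tree should give the same kind of report, provided the assumptions above about the file layout hold.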

If I try your autoreconf, then it works for ROMIO.
===== This does not work without acinclude.m4 ==================
./configure --prefix=$HOME/bitbucket/new-romio-for-openmpi/install --disable-ipv6 --with-openib=${OFED_BUILDROOT}/usr --enable-openib-connectx-xrc --enable-contrib-no-build=libnbc,vt --with-io-romio-flags="CFLAGS=-I$LUSTRE_PATH/usr/include/ --with-file-system=ufs+nfs+lustre"

===== This works without acinclude.m4 ==================
cd ompi/mca/io/romio/romio
autoreconf -ivf -I confdb
cd -
./configure --prefix=$HOME/bitbucket/new-romio-for-openmpi/install --disable-ipv6 --with-openib=${OFED_BUILDROOT}/usr --enable-openib-connectx-xrc --enable-contrib-no-build=libnbc,vt --with-io-romio-flags="CFLAGS=-I$LUSTRE_PATH/usr/include/ --with-file-system=ufs+nfs+lustre"

My conclusion is: There is something to change in to deal with ROMIO (call autoreconf -ivf -I confdb). In that case, the file acinclude.m4 is no longer useful.

I'm not sure what you mean...

Maybe try getting a fresh checkout that does not have any auto* kruft in it at all, remove the aclocal/acinclude, and then put in the file and re-run the top-level to see what happens.
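
A rough way to see whether a checkout still has generated files in it is sketched here. The helper name list_autotools_leftovers and the list of file names are assumptions (typical autotools outputs), not an exhaustive or project-specific list, and the demo uses a mock directory.

```shell
# list_autotools_leftovers DIR (hypothetical helper): print typical generated
# autotools files and directories found under DIR.
list_autotools_leftovers() {
    find "$1" \( -name aclocal.m4 -o -name configure -o -name Makefile.in \
        -o -name autom4te.cache \) -print | sort
}

# Mock tree for the demo: two generated items and one hand-written input.
demo=$(mktemp -d)
mkdir -p "$demo/romio/autom4te.cache"
: > "$demo/romio/aclocal.m4"
: > "$demo/romio/configure.in"   # hand-written input, should not be listed

leftovers=$(list_autotools_leftovers "$demo")
if [ -n "$leftovers" ]; then
    echo "tree is not clean:"
    echo "$leftovers"
fi
```

An empty result suggests the tree is clean with respect to these names only; other generated files could still be present.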

I attached the stdout/stderr from running, configure, and make so that you can see what my output looks like.


