On Wed, 16 Jul 2008, Adam Jundt wrote:
> I have been working on getting a nightly tarball of Open MPI to build on
> a Cray XT4 system running CNL. I found the following post on the forum:
> http://www.open-mpi.org/community/lists/users/2007/09/4059.php. I had to
> modify the configure options a little (added another include directory
> to CFLAGS, and inserted the '--disable-mpi-f77' flag) to get it to build
> for me. Here is what I used:
> ./configure CC=/opt/xt-pe/default/bin/snos64/linux-pgcc
> LDFLAGS=-L/opt/xt-mpt/default/lib/snos64/ LIBS="-lpct -lalpslli
> -lalpsutil" --build=x86_64-unknown-linux-gnu
> --with-io-romio-flags=--disable-aio build_alias=x86_64-unknown-linux-gnu
> host_alias=x86_64-cray-linux-gnu --enable-ltdl-convenience
> --no-recursion --disable-mpi-f77 --prefix=~/OpenMPI
I don't think it's a huge deal, but I think things will be a bit more sane
if you change the --with-platform argument to cray_xt_cnl_romio instead of
cray_xt3_romio (which is really targeting Catamount instead of CNL). One
of the ORNL guys can probably be more helpful than I can here, as I'm only
familiar with building on Red Storm / Catamount.
> ~/OpenMPI/lib/libopen-pal.a(timer_catamount_component.o): In function
> timer_catamount_component.c:(.text+0x6): undefined reference to `__cpu_mhz'
> Looking into timer_catamount_component.c, __cpu_mhz is defined within
> the <catamount/dclock.h> file (which it should have already pulled in).
> I realize that this is a very specific question, but I was curious if
> anyone else had successfully gotten Open MPI to work on a similar
> system, and if so, what configure options were used? If not, is anyone
> aware of how to circumvent the problem?
> By the way, I did try modifying the file timer_catamount_component.c to
> not reference __cpu_mhz to see the result, and the program is able to
> successfully compile, but hangs upon execution, i.e.:
That's a weird result. The configure test for the timer catamount
component checks to see if __cpu_mhz is defined when linking. Can you
send me (off list is probably best) the config.log generated by configure?
That component was added to the trunk/v1.3 branch only in the last month,
which is probably why no one on CNL has noticed yet (obviously it works great
on Catamount). I'm not really familiar with CNL -- does
catamount/dclock.h exist on a standard CNL setup?
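
To narrow things down before sending the whole log, something like the following could pull out the lines of config.log that mention the symbol. This is only a rough sketch: the helper name show_symbol_checks is made up for illustration, and it assumes config.log sits in the directory where configure was run.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI): print the lines of a
# configure log that mention a given symbol, with their line numbers.
# Usage: show_symbol_checks SYMBOL [LOGFILE]
show_symbol_checks() {
    symbol="$1"
    log="${2:-config.log}"   # assumed default: config.log in the current directory

    if [ ! -f "$log" ]; then
        echo "no such file: $log" >&2
        return 1
    fi

    # -n prints line numbers; -- guards against symbols starting with '-'
    grep -n -- "$symbol" "$log"
}
```

Running `show_symbol_checks __cpu_mhz` in the build directory should list wherever configure tested for the symbol, which may show whether the link test passed for a different reason than expected.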
>> ~/OpenMPI/bin/mpicc test.c
> ~/OpenMPI/lib/libopen-rte.a(session_dir.o): In function
> session_dir.c:(.text+0x7e): warning: Using 'getpwuid' in statically
> linked applications requires at runtime the shared libraries from the
> glibc version used for linking
> ~/OpenMPI/lib/libmpi.a(btl_tcp_component.o): In function
> btl_tcp_component.c:(.text+0x11c0): warning: Using 'getaddrinfo' in
> statically linked applications requires at runtime the shared libraries
> from the glibc version used for linking
>> aprun -n 2 ./a.out
> ... program hangs...
I'm afraid I can't help a whole lot here. However, there are some
differences between how Open MPI initializes Portals between CNL and
Catamount. Since you configured for Catamount, it's possible that's the
cause of the hang. Again, the ORNL people would probably know better than