I have attached a small program that, when run on my machine, produces
the error message below and then locks up:
[node0000:06319] [mpool_gm_module.c:100] error(8) registering gm memory
I get the error when I run on 32 processors, but not on 4 (even if I
increase the loop count to 20000). This is on a cluster of dual
dual-core Opterons with Myrinet switches (i.e. using the GM routines).
Unfortunately, I don't have the configure options that were used to
build Open MPI, but I don't think there was anything unusual. I've also
attached the ompi_info output. Here is the compile line for the code:
g95 -o allreducetest allreducetest.F -I/usr/local/ompi/1.1-gcc/include
Also note that I had to change the Fortran include files in Open MPI
to force all of the integers to be of size 4 (i.e. declaring them
integer(4)), since the default integer size used by g95 is 8 bytes
but the Open MPI Fortran interface was compiled with f77, which uses 4
bytes.
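A quick way to confirm the default integer size a given compiler uses is a tiny test program like the sketch below. The program name intsize is only a placeholder. Building it once with g95 and once with the compiler used for the Open MPI Fortran interface should show whether the two disagree.

```fortran
      program intsize
      implicit none
      integer i
c --- kind() reports the kind parameter of the default integer.
      print *, 'default integer kind: ', kind(i)
c --- bit_size() returns the number of bits, so /8 gives bytes.
      print *, 'default integer bytes:', bit_size(i)/8
      end
```

If the two compilers report different sizes, every integer argument passed to an MPI routine (counts, communicators, error codes) is suspect, not just the ones declared in the include files.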
Any suggestions on what to look for?
Thanks for the help,
c Use reduction routines to sum whole beam moments across all
c of the processors. It also shares z moment data at PE boundaries.
c --- temporary for z moments
      zmmnts0 = my_index
      zmmnts = my_index
c --- Do reduction on beam z moments.
      ztemp = zmmnts
      nn = (1+360)*28*(1+8)
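For reference, here is a minimal sketch of the kind of reduction this fragment leads into. It is not the attached program itself: the array bounds are assumed from nn = (1+360)*28*(1+8) = 90972, the parameter names nz, nm, and ns are hypothetical, and the data type is assumed to be double precision (about 728 KB per buffer). It must be built with the same default integer size as the MPI library.

```fortran
      program allreduce_sketch
      implicit none
      include 'mpif.h'
c --- Bounds assumed from nn = (1+360)*28*(1+8); names hypothetical.
      integer nz, nm, ns
      parameter (nz = 360, nm = 28, ns = 8)
      double precision zmmnts(0:nz,nm,0:ns), ztemp(0:nz,nm,0:ns)
      integer my_index, nprocs, nn, ierr

      call MPI_INIT(ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD, my_index, ierr)
      call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
c --- Fill the local array with this rank's index.
      zmmnts = my_index
c --- ztemp is the send buffer; zmmnts receives the global sum.
      ztemp = zmmnts
      nn = (1+nz)*nm*(1+ns)
      call MPI_ALLREDUCE(ztemp, zmmnts, nn, MPI_DOUBLE_PRECISION,
     &                   MPI_SUM, MPI_COMM_WORLD, ierr)
c --- Every element should now equal 0+1+...+(nprocs-1).
      if (my_index .eq. 0) then
        print *, 'nn =', nn, ' zmmnts(0,1,0) =', zmmnts(0,1,0)
      endif
      call MPI_FINALIZE(ierr)
      end
```

Each element of the result should equal nprocs*(nprocs-1)/2, which gives a simple check that the reduction completed on every rank.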