Open MPI User's Mailing List Archives

From: Ralph H Castain (rhc_at_[hidden])
Date: 2007-01-09 10:10:44


Hi Michael

I would suggest using the nightly snapshot off of the trunk - the poe module
compiles correctly there. I suspect we need an update to bring that fix over
to the 1.2 branch.

Ralph
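
For anyone who wants to look for the imbalance in the 1.2 beta source in the meantime: the "Unexpected end of file" error quoted below typically means the compiler reached the end of the file with a brace or a preprocessor conditional still open. A rough sketch of a check for the conditional case follows. It is not part of Open MPI; the name unclosed_conditionals is made up for illustration, and it ignores comments, string literals, and line continuations, so treat its answer as a hint only.

```c
#include <string.h>

/* Hypothetical helper: counts preprocessor conditionals that are opened
 * in src but never closed. Returns the number still open at end of input,
 * or -1 if an #endif appears with no conditional open. */
int unclosed_conditionals(const char *src)
{
    int depth = 0;
    const char *p = src;

    while (*p != '\0') {
        /* Skip leading blanks on this line. */
        while (*p == ' ' || *p == '\t') p++;

        if (*p == '#') {
            p++;
            /* Blanks are allowed between '#' and the directive name. */
            while (*p == ' ' || *p == '\t') p++;

            if (strncmp(p, "if", 2) == 0) {
                depth++;                 /* #if, #ifdef, #ifndef */
            } else if (strncmp(p, "endif", 5) == 0) {
                if (depth == 0) return -1;
                depth--;
            }
            /* #else and #elif do not change the nesting depth. */
        }

        /* Advance to the start of the next line. */
        while (*p != '\0' && *p != '\n') p++;
        if (*p == '\n') p++;
    }
    return depth;
}
```

Reading pls_poe_module.c into a buffer and passing it to this function would report how many conditionals are left open; a result of 0 would point towards a missing brace instead.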

On 1/9/07 7:55 AM, "Michael Marti" <m.marti_at_[hidden]> wrote:

> Thanks Jeff for the hint.
>
> Unfortunately neither openmpi-1.2b3r12956 nor openmpi-1.2b2 compiles
> on aix-5.3/power5, so I was not able to check whether the poll
> issue is gone in these versions. Both (beta2 and beta3) fail for the
> same reason:
>
> "pls_poe_module.c", line 640.2: 1506-204 (S) Unexpected end of file.
> make: 1254-004 The error code from the last command is 1.
>
> I presume there is a missing bracket or similar, probably inside some
> ifdef. As soon as I have a little more time I will look into
> it - any suggestions as to where to start are welcome...
>
> Thanks again, Michael.
>
> On Jan 2, 2007, at 3:50 PM, Jeff Squyres wrote:
>
>> Yikes - that's not a good error. :-(
>>
>> We don't regularly build / test on AIX, so I don't have much
>> immediate guidance for you. My best suggestion at this point would
>> be to try the latest 1.2 beta or nightly snapshot. We did an update
>> of the event engine (the portion of the code that your error is
>> coming from) that *may* alleviate the problem...? (I have no
>> idea, actually -- I'm just kinda hoping that the new version of the
>> event engine will fix your problem :-\ )
>>
>>
>> On Dec 27, 2006, at 10:29 AM, Michael Marti wrote:
>>
>>> Dear All
>>>
>>> I am trying to get openmpi-1.1.2 to work on AIX 5.3 / power5.
>>>
>>> :: Compilation seems to have worked with the following sequence:
>>> ====================================================================
>>> setenv OBJECT_MODE 64
>>>
>>> setenv CC xlc
>>> setenv CXX xlC
>>> setenv F77 xlf
>>> setenv FC xlf90
>>>
>>> setenv CFLAGS "-qthreaded -O3 -qmaxmem=-1 -qarch=pwr5x -qtune=pwr5 -q64"
>>> setenv CXXFLAGS "-qthreaded -O3 -qmaxmem=-1 -qarch=pwr5x -qtune=pwr5 -q64"
>>> setenv FFLAGS "-qthreaded -O3 -qmaxmem=-1 -qarch=pwr5x -qtune=pwr5 -q64"
>>> setenv FCFLAGS "-qthreaded -O3 -qmaxmem=-1 -qarch=pwr5x -qtune=pwr5 -q64"
>>> setenv LDFLAGS "-Wl,-brtl"
>>>
>>> ./configure --prefix=/ist/openmpi-1.1.2 \
>>> --disable-mpi-cxx \
>>> --disable-mpi-cxx-seek \
>>> --enable-mpi-threads \
>>> --enable-progress-threads \
>>> --enable-static \
>>> --disable-shared \
>>> --disable-io-romio
>>> ====================================================================
>>>
>>> :: After the compilation I ran make check and all 11 tests passed
>>> successfully.
>>>
>>> :: Now I'm trying to run the following command just for test:
>>> # mpirun -hostfile /gpfs/MICHAEL/MPI_hostfiles/mpinodes_b41-b44_1.asc -np 2 /usr/bin/hostname
>>> - The file /gpfs/MICHAEL/MPI_hostfiles/mpinodes_b41-b44_1.asc contains 4 hosts:
>>> r1blade041 slots=1
>>> r1blade042 slots=1
>>> r1blade043 slots=1
>>> r1blade044 slots=1
>>> - The mpirun command eventually hangs with the following message:
>>> [r1blade041:418014] poll failed with errno=25
>>> [r1blade041:418014] opal_event_loop: ompi_evesel->dispatch() failed.
>>> - In this state mpirun cannot be killed by hitting <ctrl-c>; only a
>>> kill -9 will do the trick.
>>> - While the mpirun still hangs I can see that the "orted" has been
>>> launched on both requested hosts.
>>>
>>> :: I turned on all debug options in openmpi-mca-params.conf. The
>>> output for the same call of mpirun is in the file
>>> mpirun-debug.txt.gz.
>>> <mpirun-debug.txt.gz>
>>>
>>> :: As suggested in the mailing list rules, I include config.log
>>> (config.log.gz) and the output of ompi_info (ompi_info.txt.gz).
>>> <config.log.gz>
>>>
>>> <ompi_info.txt.gz>
>>>
>>>
>>> :: As I am completely new to openmpi (I have some experience with
>>> lam) I am lost at this stage. I would really appreciate it if someone
>>> could give me some hints as to what is going wrong and where I
>>> could get more info.
>>>
>>> Best regards,
>>>
>>> Michael Marti.
>>>
>>>
>>> --
>>> ----------------------------------------------------------------------
>>> Michael Marti
>>> Centro de Fisica dos Plasmas
>>> Instituto Superior Tecnico
>>> Av. Rovisco Pais
>>> 1049-001 Lisboa
>>> Portugal
>>>
>>> Tel: +351 218 419 379
>>> Fax: +351 218 464 455
>>> Mobile: +351 968 434 327
>>> ----------------------------------------------------------------------
>>>
>>>
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>> --
>> Jeff Squyres
>> Server Virtualization Business Unit
>> Cisco Systems
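
The errno=25 in the poll failure quoted above is just a number; the text behind it depends on the platform's errno.h. On Linux 25 is ENOTTY, and I believe AIX uses the same value, but that is an assumption worth checking on the machine itself. A minimal sketch for doing so is below; describe_errno is a hypothetical helper, not an Open MPI function.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "errno=N (message)" into buf, using the
 * local C library's strerror() to look up the message for err.
 * Returns the value returned by snprintf(). */
int describe_errno(int err, char *buf, size_t len)
{
    return snprintf(buf, len, "errno=%d (%s)", err, strerror(err));
}
```

Calling describe_errno(25, buf, sizeof buf) on the affected node and printing buf shows what that system calls error 25, which may help when searching the archives or reporting the hang.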