Open MPI User's Mailing List Archives

Subject: Re: [OMPI users] possible bug exercised by mpi4py
From: Jeff Squyres (jsquyres_at_[hidden])
Date: 2012-05-23 18:04:07


Thanks for all the info!

But still, can we get a copy of the test in C? That would make it significantly easier for us to tell if there is a problem with Open MPI -- mainly because we don't know anything about the internals of mpi4py.
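To be concrete about what would help: something along the lines of the sketch below, i.e., a standalone C program that builds an intercommunicator and does an allgather across it. To be clear, this is my guess at what the cco_obj_inter test exercises, not mpi4py's actual code -- the group split and leader choices are assumptions:

```c
/* Hypothetical minimal reproducer: MPI_Allgather over an
 * intercommunicator. Run with e.g. "mpirun -np 5 ./a.out".
 * This is a sketch of what the mpi4py cco_obj_inter tests
 * appear to do, not the actual test code. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, size, color, rsize;
    MPI_Comm intra, inter;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Split odd/even world ranks into two groups, then join the
     * groups with an intercommunicator.  With -np 5 the groups
     * have 3 and 2 processes. */
    color = rank % 2;
    MPI_Comm_split(MPI_COMM_WORLD, color, rank, &intra);
    MPI_Intercomm_create(intra, 0 /* local leader */, MPI_COMM_WORLD,
                         color ? 0 : 1 /* other group's leader in world */,
                         1 /* tag */, &inter);

    MPI_Comm_remote_size(inter, &rsize);
    int sendval = rank;
    int *recvbuf = malloc(rsize * sizeof(int));

    /* The collective under suspicion: an allgather across the
     * intercommunicator (each group gathers the other's values). */
    MPI_Allgather(&sendval, 1, MPI_INT, recvbuf, 1, MPI_INT, inter);

    printf("rank %d: gathered %d remote values\n", rank, rsize);

    free(recvbuf);
    MPI_Comm_free(&inter);
    MPI_Comm_free(&intra);
    MPI_Finalize();
    return 0;
}
```

If a plain-C version like this hangs intermittently at -np 5, that would point at Open MPI's intercommunicator allgather rather than the Python bindings.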

On May 23, 2012, at 5:43 PM, Bennet Fauber wrote:

> Thanks, Ralph,
>
> On Wed, 23 May 2012, Ralph Castain wrote:
>
>> I don't honestly think many of us have any knowledge of mpi4py. Does this test work with other MPIs?
>
> The mpi4py developers have said they've never seen this using mpich2. I have not been able to test that myself.
>
>> MPI_Allgather seems to be passing our tests, so I suspect it is something in the binding. If you can provide the actual test, I'm willing to take a look at it.
>
> The actual test is included in the install bundle for mpi4py, along with the C source code used to create the bindings.
>
> http://code.google.com/p/mpi4py/downloads/list
>
> The install is straightforward: unpack the tarball, make sure that mpicc is in your path, and then:
>
> $ cd mpi4py-1.3
> $ python setup.py build
> $ python setup.py install --prefix=/your/install
> $ export PYTHONPATH=/your/install/lib/pythonN.M/site-packages
> $ mpirun -np 5 python test/runtests.py \
> --verbose --no-threads --include cco_obj_inter
>
> where N.M are the major.minor numbers of your python distribution.
>
> What I find most puzzling is that with -np 5 it runs to completion only occasionally (maybe 1 time in 10), while with every other number of processes I've tested it always completes.
>
> -- bennet
>
>> On May 23, 2012, at 2:52 PM, Bennet Fauber wrote:
>>
>>> I've installed the latest mpi4py-1.3 on several systems, and there is a reproducible failure when running
>>>
>>> $ mpirun -np 5 python test/runtests.py
>>>
>>> where it throws an error in the allgather tests with openmpi-1.4.4 and hangs with openmpi-1.3.
>>>
>>> It runs to completion and passes all tests when run with -np of 2, 3, 4, 6, 7, 8, 9, 10, 11, and 12.
>>>
>>> There is a thread on this at
>>>
>>> http://groups.google.com/group/mpi4py/browse_thread/thread/509ac46af6f79973
>>>
>>> where others report being able to replicate, too.
>>>
>>> The compiler used first was gcc-4.6.2, with openmpi-1.4.4.
>>>
>>> These are all Red Hat machines, RHEL 5 or 6 and with multiple compilers and versions of openmpi 1.3.0 and 1.4.4.
>>>
>>> Lisandro, the primary developer of mpi4py, is able to replicate this on Fedora 16.
>>>
>>> Someone else is able to reproduce with
>>>
>>> [ quoting from the groups.google.com page... ]
>>> ===============================================================
>>> It also happens with the current hg version of mpi4py and
>>> $ rpm -qa openmpi gcc python
>>> python-2.7.3-6.fc17.x86_64
>>> gcc-4.7.0-5.fc17.x86_64
>>> openmpi-1.5.4-5.fc17.1.x86_64
>>> ===============================================================
>>>
>>> So, I believe this is a bug to be reported. Per the advice at
>>>
>>> http://www.open-mpi.org/community/help/bugs.php
>>>
>>> If you feel that you do have a definite bug to report but are
>>> unsure which list to post to, then post to the user's list.
>>>
>>> Please let me know if there is additional information that you need to replicate.
>>>
>>> Some output is included below the signature in case it is useful.
>>>
>>> -- bennet
>>> --
>>> East Hall Technical Services
>>> Mathematics and Psychology Research Computing
>>> University of Michigan
>>> (734) 763-1182
>>>
>>> On RHEL 5, openmpi 1.3, gcc 4.1.2, python 2.7
>>>
>>> $ mpirun -np 5 --mca btl ^sm python test/runtests.py --verbose --no-threads --include cco_obj_inter
>>> [0_at_[hidden]] Python 2.7 (/home/bennet/epd7.2.2/bin/python)
>>> [0_at_[hidden]] MPI 2.0 (Open MPI 1.3.0)
>>> [0_at_[hidden]] mpi4py 1.3 (build/lib.linux-x86_64-2.7/mpi4py)
>>> [1_at_[hidden]] Python 2.7 (/home/bennet/epd7.2.2/bin/python)
>>> [1_at_[hidden]] MPI 2.0 (Open MPI 1.3.0)
>>> [1_at_[hidden]] mpi4py 1.3 (build/lib.linux-x86_64-2.7/mpi4py)
>>> [2_at_[hidden]] Python 2.7 (/home/bennet/epd7.2.2/bin/python)
>>> [2_at_[hidden]] MPI 2.0 (Open MPI 1.3.0)
>>> [2_at_[hidden]] mpi4py 1.3 (build/lib.linux-x86_64-2.7/mpi4py)
>>> [3_at_[hidden]] Python 2.7 (/home/bennet/epd7.2.2/bin/python)
>>> [3_at_[hidden]] MPI 2.0 (Open MPI 1.3.0)
>>> [3_at_[hidden]] mpi4py 1.3 (build/lib.linux-x86_64-2.7/mpi4py)
>>> [4_at_[hidden]] Python 2.7 (/home/bennet/epd7.2.2/bin/python)
>>> [4_at_[hidden]] MPI 2.0 (Open MPI 1.3.0)
>>> [4_at_[hidden]] mpi4py 1.3 (build/lib.linux-x86_64-2.7/mpi4py)
>>> testAllgather (test_cco_obj_inter.TestCCOObjInter) ... testAllgather (test_cco_obj_inter.TestCCOObjInter) ... testAllgather (test_cco_obj_inter.TestCCOObjInter) ... testAllgather (test_cco_obj_inter.TestCCOObjInter) ... testAllgather (test_cco_obj_inter.TestCCOObjInter) ...
>>> [ hangs ]
>>>
>>> RHEL6
>>> ===================================================
>>> $ python
>>> Python 2.6.6 (r266:84292, Sep 12 2011, 14:03:14)
>>> [GCC 4.4.5 20110214 (Red Hat 4.4.5-6)] on linux2
>>>
>>> $ gcc -v
>>> Using built-in specs.
>>> COLLECT_GCC=gcc
>>> COLLECT_LTO_WRAPPER=/home/software/rhel6/gcc/4.7.0/libexec/gcc/x86_64-unknown-linux-gnu/4.7.0/lto-wrapper
>>> Target: x86_64-unknown-linux-gnu
>>> Configured with: ../gcc-4.7.0/configure --prefix=/home/software/rhel6/gcc/4.7.0 --with-mpfr=/home/software/rhel6/gcc/mpfr-3.1.0/ --with-mpc=/home/software/rhel6/gcc/mpc-0.9/ --with-gmp=/home/software/rhel6/gcc/gmp-5.0.5/ --disable-multilib
>>> Thread model: posix
>>> gcc version 4.7.0 (GCC)
>>>
>>> $ mpirun -np 5 python test/runtests.py --verbose --no-threads --include cco_obj_inter
>>> [4..._at_[hidden]] Python 2.6 (/usr/bin/python)
>>> [4..._at_[hidden]] MPI 2.1 (Open MPI 1.6.0)
>>> [4..._at_[hidden]] mpi4py 1.3 (build/lib.linux-x86_64-2.6/mpi4py)
>>> [2..._at_[hidden]] Python 2.6 (/usr/bin/python)
>>> [2..._at_[hidden]] MPI 2.1 (Open MPI 1.6.0)
>>> [2..._at_[hidden]] mpi4py 1.3 (build/lib.linux-x86_64-2.6/mpi4py)
>>> [1..._at_[hidden]] Python 2.6 (/usr/bin/python)
>>> [1..._at_[hidden]] MPI 2.1 (Open MPI 1.6.0)
>>> [1..._at_[hidden]] mpi4py 1.3 (build/lib.linux-x86_64-2.6/mpi4py)
>>> [0..._at_[hidden]] Python 2.6 (/usr/bin/python)
>>> [0..._at_[hidden]] MPI 2.1 (Open MPI 1.6.0)
>>> [0..._at_[hidden]] mpi4py 1.3 (build/lib.linux-x86_64-2.6/mpi4py)
>>> [3..._at_[hidden]] Python 2.6 (/usr/bin/python)
>>> [3..._at_[hidden]] MPI 2.1 (Open MPI 1.6.0)
>>> [3..._at_[hidden]] mpi4py 1.3 (build/lib.linux-x86_64-2.6/mpi4py)
>>> testAllgather (test_cco_obj_inter.TestCCOObjInter) ... testAllgather (test_cco_obj_inter.TestCCOObjInter) ... testAllgather (test_cco_obj_inter.TestCCOObjInter) ... testAllgather (test_cco_obj_inter.TestCCOObjInter) ... testAllgather (test_cco_obj_inter.TestCCOObjInter) ... ERROR
>>> _______________________________________________
>>> users mailing list
>>> users_at_[hidden]
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 
Jeff Squyres
jsquyres_at_[hidden]
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/