
Open MPI User's Mailing List Archives


Subject: Re: [OMPI users] Bug in MPI_REDUCE in CUDA-aware MPI
From: Peter Zaspel (zaspel_at_[hidden])
Date: 2013-12-02 08:29:07



Hi Rolf,

OK, I didn't know that. Sorry.

Yes, it would be a pretty important feature when you run reduction
operations on many, many entries in parallel. Each individual reduction is
then not very complex or time-consuming, but potentially hundreds of
thousands of reductions are done at the same time. This is definitely a
point where a CUDA-aware implementation could give some performance
improvement.
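Until reductions on GPU memory are supported, one common workaround is to stage the data through host memory yourself, so that MPI_Reduce only ever sees host pointers. A minimal sketch of that pattern (this is not the attached example file; buffer names, sizes, and the fill step are illustrative, and error checking is omitted):

```c
/* Sketch of the host-staging workaround: device -> host, reduce, host -> device.
 * Compile with mpicc + nvcc toolchain; names and sizes are illustrative. */
#include <mpi.h>
#include <cuda_runtime.h>
#include <stdlib.h>

#define N 1024

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double *d_buf;                              /* device buffer */
    cudaMalloc((void **)&d_buf, N * sizeof(double));
    /* ... fill d_buf on the GPU ... */

    double *h_send = malloc(N * sizeof(double)); /* host staging buffers */
    double *h_recv = malloc(N * sizeof(double));

    /* Copy device -> host before the reduction ... */
    cudaMemcpy(h_send, d_buf, N * sizeof(double), cudaMemcpyDeviceToHost);

    /* ... reduce on host pointers only ... */
    MPI_Reduce(h_send, h_recv, N, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    /* ... and copy the result back to the device on the root, if needed. */
    if (rank == 0)
        cudaMemcpy(d_buf, h_recv, N * sizeof(double), cudaMemcpyHostToDevice);

    free(h_send);
    free(h_recv);
    cudaFree(d_buf);
    MPI_Finalize();
    return 0;
}
```

The extra cudaMemcpy calls are exactly the overhead a CUDA-aware MPI_Reduce could avoid or pipeline.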

I'm curious: rather complex operations like allgatherv are CUDA-aware, but
a reduction is not. Is there a reason for this? And is there any
documentation on which MPI calls are CUDA-aware and which are not?

Best regards

Peter

On 12/02/2013 02:18 PM, Rolf vandeVaart wrote:
> Thanks for the report. CUDA-aware Open MPI does not currently support doing reduction operations on GPU memory.
> Is this a feature you would be interested in?
>
> Rolf
>
>> -----Original Message-----
>> From: users [mailto:users-bounces_at_[hidden]] On Behalf Of Peter Zaspel
>> Sent: Friday, November 29, 2013 11:24 AM
>> To: users_at_[hidden]
>> Subject: [OMPI users] Bug in MPI_REDUCE in CUDA-aware MPI
>>
>> Hi users list,
>>
>> I would like to report a bug in the CUDA-aware Open MPI 1.7.3
>> implementation. I'm using CUDA 5.0 and Ubuntu 12.04.
>>
>> Attached you will find an example code file to reproduce the bug.
>> The point is that MPI_Reduce works fine with normal CPU memory, but
>> using GPU memory leads to a segfault. (GPU memory is used when
>> USE_GPU is defined.)
>>
>> The segfault looks like this:
>>
>> [peak64g-36:25527] *** Process received signal ***
>> [peak64g-36:25527] Signal: Segmentation fault (11)
>> [peak64g-36:25527] Signal code: Invalid permissions (2)
>> [peak64g-36:25527] Failing at address: 0x600100200
>> [peak64g-36:25527] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7ff2abdb24a0]
>> [peak64g-36:25527] [ 1] /data/zaspel/openmpi-1.7.3_build/lib/libmpi.so.1(+0x7d410) [0x7ff2ac4b9410]
>> [peak64g-36:25527] [ 2] /data/zaspel/openmpi-1.7.3_build/lib/openmpi/mca_coll_tuned.so(ompi_coll_tuned_reduce_intra_basic_linear+0x371) [0x7ff2a5987531]
>> [peak64g-36:25527] [ 3] /data/zaspel/openmpi-1.7.3_build/lib/libmpi.so.1(MPI_Reduce+0x135) [0x7ff2ac499d55]
>> [peak64g-36:25527] [ 4] /home/zaspel/testMPI/test_reduction() [0x400ca0]
>> [peak64g-36:25527] [ 5] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7ff2abd9d76d]
>> [peak64g-36:25527] [ 6] /home/zaspel/testMPI/test_reduction() [0x400af9]
>> [peak64g-36:25527] *** End of error message ***
>> --------------------------------------------------------------------------
>> mpirun noticed that process rank 0 with PID 25527 on node peak64g-36 exited
>> on signal 11 (Segmentation fault).
>> --------------------------------------------------------------------------
>>
>> Best regards,
>>
>> Peter
> _______________________________________________
> users mailing list
> users_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>

--
Dipl.-Inform. Peter Zaspel
Institut fuer Numerische Simulation, Universitaet Bonn
Wegelerstr.6, 53115 Bonn, Germany
tel: +49 228 73-2748 mailto:zaspel_at_[hidden]
fax: +49 228 73-7527 http://wissrech.ins.uni-bonn.de/people/zaspel.html
