Hi Jeff,

Thanks for your reply. 

I don't understand how MPI_Reduce would be useful here.

Say I have 3 processes and each process holds the array [1,2,3,4].

When each process calculates the prefix sum of its array using CUDA, each process ends up with [1,3,6,10].

So if I use MPI_Reduce to combine the results, it returns the sum, 30.
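
To make that concrete, here is a small plain-Python sketch (no MPI calls; the variable names are only for illustration) of what a sum reduction does with the three local results. It assumes the reduction behaves like MPI_SUM, which combines buffers element by element across ranks.

```python
# Simulate a sum reduction over 3 ranks in plain Python (no MPI involved).
# Each rank holds the prefix sum of its own chunk [1, 2, 3, 4].
local_scans = [[1, 3, 6, 10] for _ in range(3)]

# Reducing the whole 4-element buffer: element-wise sum across ranks.
reduced_elementwise = [sum(column) for column in zip(*local_scans)]

# Reducing only each rank's last element (its local total).
reduced_totals = sum(scan[-1] for scan in local_scans)

print(reduced_elementwise)  # [3, 9, 18, 30]
print(reduced_totals)       # 30
```

Either way the result collapses across ranks, so the per-rank ordering needed for a prefix sum is lost.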

But the original array is [1,2,3,4,1,2,3,4,1,2,3,4], and the prefix sum of this array should be

[1,3,6,10,11,13,16,20,21,23,26,30]
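
One way to get from the per-process results to this array is to shift each process's local prefix sum by the total of all preceding processes. The sketch below simulates this in plain Python rather than MPI or CUDA, with hypothetical variable names. In an MPI program, the offsets in step 2 should correspond to what MPI_Exscan with MPI_SUM returns on each rank (the standard leaves rank 0's receive buffer undefined, so it would be treated as 0). That mapping is an assumption about how this would be ported and has not been tested on a cluster.

```python
from itertools import accumulate

# Data held by ranks 0, 1 and 2 (simulated as a list of chunks).
chunks = [[1, 2, 3, 4] for _ in range(3)]

# Step 1: each rank computes the prefix sum of its own chunk
# (the part done on the GPU in this thread).
local_scans = [list(accumulate(chunk)) for chunk in chunks]

# Step 2: exclusive prefix sum over the local totals gives each rank's offset.
totals = [scan[-1] for scan in local_scans]
offsets = [0] + list(accumulate(totals))[:-1]

# Step 3: each rank adds its offset to every element of its local scan.
global_scan = [x + off for scan, off in zip(local_scans, offsets) for x in scan]

print(offsets)      # [0, 10, 20]
print(global_scan)  # [1, 3, 6, 10, 11, 13, 16, 20, 21, 23, 26, 30]
```

Only one number per process (its local total) has to be communicated to compute the offsets; the arrays themselves can stay where they are until the final gather.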

Is my understanding wrong somewhere?

On Fri, May 18, 2012 at 7:05 AM, Jeff Squyres <jsquyres@cisco.com> wrote:
You probably want MPI_Reduce, instead.

   http://www.open-mpi.org/doc/v1.6/man3/MPI_Reduce.3.php


On May 15, 2012, at 11:27 PM, Rohan Deshpande wrote:

> I am performing a prefix scan operation on a cluster.
>
> I have 3 MPI tasks, and the master task is responsible for distributing the data.
>
> Now, each task calculates the sum of its own part of the array using GPUs and returns the result to the master task.
>
> The master task also calculates its own part of the array using the GPU.
>
> When each task returns its result (which would be an array), the master task needs to combine all the results to get the final result.
>
> Can I use MPI_SCAN to combine the results?
>
>
>
>
>
> _______________________________________________
> users mailing list
> users@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


--
Jeff Squyres
jsquyres@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/





--

Best Regards,

ROHAN DESHPANDE