Thanks for your reply.
I don't understand how MPI_Reduce would be useful here.
Say I have 3 processes, and each process holds the array [1,2,3,4].
When each process calculates the prefix sum of its array using CUDA, each
process will have the array [1,3,6,10].
If I then use MPI_Reduce to combine the results, it gives me the sum 30.
But the original array is [1,2,3,4,1,2,3,4,1,2,3,4], and the prefix sum of
this array should be [1,3,6,10,11,13,16,20,21,23,26,30].
Is my understanding wrong somewhere?
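
To make the gap concrete: each local scan is missing the sum of all elements held by lower-ranked processes. Below is a minimal sketch in plain C that simulates the 3 processes as consecutive blocks of one array, so it runs without MPI. The function names are hypothetical, and the CPU loop only stands in for the CUDA scan. In a real MPI program the per-rank offset would typically come from calling MPI_Exscan with MPI_SUM on each rank's local total (the result on rank 0 is undefined and should be treated as 0); that mapping is my assumption about how this would be wired up, not something confirmed in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Inclusive prefix sum of one block, in place.
 * Stands in for the per-process CUDA scan (assumption: any correct
 * local scan produces the same values). */
static void local_inclusive_scan(int *block, size_t n)
{
    for (size_t i = 1; i < n; ++i)
        block[i] += block[i - 1];
}

/* Turn per-block scans into one global scan.
 * data holds nblocks consecutive blocks of block_len elements, each
 * already scanned locally. Every block is shifted by the sum of the
 * totals of all earlier blocks, i.e. an exclusive scan of the block
 * totals. With MPI, that offset is what each rank would obtain from
 * an exclusive scan over the ranks' local totals. */
static void add_block_offsets(int *data, size_t nblocks, size_t block_len)
{
    assert(block_len > 0);
    int offset = 0; /* block 0 (rank 0) gets no offset */
    for (size_t b = 0; b < nblocks; ++b) {
        int *block = data + b * block_len;
        /* Last element of an inclusive scan is the block's sum;
         * read it before the block is shifted. */
        int total = block[block_len - 1];
        for (size_t i = 0; i < block_len; ++i)
            block[i] += offset;
        offset += total;
    }
}
```

With the array above, the three local scans are each [1,3,6,10], the offsets are 0, 10 and 20, and adding them gives the prefix sum of the full 12-element array. Only one integer per process has to be communicated, rather than the whole local array.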
On Fri, May 18, 2012 at 7:05 AM, Jeff Squyres <jsquyres_at_[hidden]> wrote:
> You probably want MPI_Reduce, instead.
> On May 15, 2012, at 11:27 PM, Rohan Deshpande wrote:
> > I am performing a prefix scan operation on a cluster.
> > I have 3 MPI tasks and master task is responsible for distributing the
> > Now, each task calculates the sum of its own part of the array using GPUs and
> > returns the results to the master task.
> > The master task also calculates its own part of the array using the GPU.
> > When each task returns its result (which would be an array), the master task
> > needs to combine all the results to get the final result.
> > Can I use MPI_SCAN to combine the results?
> > _______________________________________________
> > users mailing list
> > users_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> Jeff Squyres