> Rank 0 accumulates all the res_cpu values into a single array, res. It
> starts with its own res_cpu and then adds all other processes. When
> np=2, that means the order is prescribed. When np>2, the order is no
> longer prescribed and some floating-point rounding variations can start
> to occur.
Yes, you are right. The question now is why these floating-point rounding
variations would occur for np>2. Surely it cannot be due to the order not
being prescribed!
> If you want results to be more deterministic, you need to fix the order
> in which res is aggregated. E.g., instead of using MPI_ANY_SOURCE, loop
> over the peer processes in a specific order.
> P.S. It seems to me that you could use MPI collective operations to
> implement what you're doing. E.g., something like:
I could use these operations for the res variable (would that make the
summation any faster?). However, I cannot use them for the other 3 variables.