amjad ali wrote:
> thanks T.Prince,
> Your saying:
> "I'll just mention that we are well into the era of 3 levels of
> programming parallelization: vectorization, threaded parallel (e.g.
> OpenMP), and process parallel (e.g. MPI)." is a really great new
> learning for me. Now I can perceive better.
> Can you please explain a bit about:
> " This application gains significant benefit from cache blocking, so
> vectorization has more opportunity to gain than for applications which
> have less memory locality."
> So now should I conclude from your reply that if we have a single-core
> processor in a PC, even then we can get the benefit of auto-vectorization?
> And we do not need free cores to get the benefit of auto-vectorization?
> Thank you very much.
Yes. Vectorization uses the vector instructions of a single core, so it
needs no free cores. We were using auto-vectorization before the
beginnings of MPI, back in the days of single-core CPUs; in fact, it
would often show a greater gain then than it did on later multi-core CPUs.
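As a minimal sketch of what that means in practice (not taken from the
application discussed above; the function name saxpy and its signature are
only illustrative assumptions), here is the kind of single-threaded loop a
compiler can typically auto-vectorize with no threads and no MPI involved:

```c
#include <stddef.h>

/* Hypothetical example: y[i] = a * x[i] + y[i] on one thread.
 * No OpenMP and no MPI: any speedup from vectorization comes from the
 * vector instructions of the single core that runs this loop.
 * "restrict" promises the compiler that x and y do not overlap, which
 * removes an aliasing doubt that can otherwise block vectorization. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

With GCC, for example, compiling with -O3 -fopt-info-vec should report
whether the loop was vectorized; other compilers have their own reporting
options, and the result depends on the target and the flags used.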
The reason for the greater effectiveness of auto-vectorization with cache
blocking, and possibly with single-core CPUs, would be less saturation of