On Tue, 2009-12-01 at 05:47 -0800, Tim Prince wrote:
> amjad ali wrote:
> > Hi,
> > thanks T.Prince,
> > Your saying:
> > "I'll just mention that we are well into the era of 3 levels of
> > programming parallelization: vectorization, threaded parallel (e.g.
> > OpenMP), and process parallel (e.g. MPI)." is a really great new
> > learning for me. Now I can perceive better.
> > Can you please explain a bit about:
> > " This application gains significant benefit from cache blocking, so
> > vectorization has more opportunity to gain than for applications which
> > have less memory locality."
> > So now should I conclude from your reply that if we have a single core
> > processor in a PC, even then we can get the benefit of auto-vectorization?
> > And we do not need free cores to get the benefit of auto-vectorization?
> > Thank you very much.
> Yes, we were using auto-vectorization from before the beginnings of MPI
> back in the days of single core CPUs; in fact, it would often show a
> greater gain than it did on later multi-core CPUs.
> The reason for the greater effectiveness of auto-vectorization with cache
> blocking, and possibly with single core CPUs, would be less saturation of
> the memory bus.
Just for the record, there's a huge difference between "back in the days
of single core CPUs" and "before the beginnings of MPI". They're
separated by a decade or two.
Vectorisation (automatic or otherwise) is useful on pipeline
architectures, and pipeline architectures go back a long way, at least
to the 80s. They predate MPI, I think, but not parallel programming
and message passing in general. Multi-core chips are much more recent
still.