Hiho,
On 12:33 Tue 19 Dec , Harakiri wrote:
> i need some input on the parallelisation of quicksort.
> If anyone knows a more appropriate forum/list than
> this, please tell me =)
uhm, do you have access to a library? Get yourself a textbook on
algorithms. Knuth should be a good basis to start with...
>
> I've written 2 different approaches to the quicksort
> parallelisation.
>
> 1. The "master" server splits the array of numbers
> into N seperate parts, where N is the number of
> available clients or slave servers. Each slave then
> does a quicksort on its own subarray - finally all
> slaves sent their subarray to the master. The master
> then does a quicksort on the presorted list.
> *This was done for the assumption that presorted list
> sort faster with quicksort than very unordered/random
> numbers lists.
Just some random ideas:
* Are you bound to use Quicksort in the master? Since the arrays are
presorted, Mergesort would be more advisable.
* I wouldn't combine the arrays at the master but do this step in
parallel, too. Given 4 machines 0..3, machine 0 would merge its own
subarray with no. 1's while machine 2 would merge with
no. 3. finally no. 0 and no. 3 would combine their arrays.
* All of this is quite obsolete, as there is an optimal parallel
Quicksort at hands (all praise Wikipedia):
http://citeseer.ist.psu.edu/cache/papers/cs/16170/ftp:zSzzSzai.ist.flinders.edu.auzSzpubzSzaizSzpaperszSz199109-PaCT.pdf/powers91parallelized.pdf
> 2. The "master" server finds out the MIN and MAX value
> >from the to sorting array (iterates through all
> numbers). Then it will assign boundaries to each slave
> starting from MIN for the first slave, to MAX for the
> last slave (i.e. for 2 slaves it would be from MIN ->
> MAX/2 and MAX/2+1 -> MAX).
That would imply assumptions on the distribution of the
elements. This may be clever from a practical point of view, but is
flawed because of theoretical considerations. Assume you had 1000
elements ranging from 1.0 to 1.1 and just one element with a value of
100.0. No good.
> The master then sends the whole array to each slave,
> each slave filters through the whole array all numbers
> within their own boundary (slave 1 for example would
> find all numbers between MIN -> MAX/2),
Communication is expensive. Sending n elements is more expensive than
filtering them on the local machine and sending only the remainder. At
least this is what I'd expect, although I haven't benchmarked this.
> Anyone has any different ideas ? I've read a few
> papers about it - the basic approach would be to
> parallel the divide and conquer part - which would
> result in ALOT of network messages...
As already said, please read Powers' paper from above. I could imagine
that even though this results in _many_ messages, the algorithms
optimal runtime complexity will compensate for it.
But benchmark your own ;-)
hth
-Andreas
--
============================================
Andreas Schäfer
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany
+49 3641 9 46376 - voice
============================================
- application/pgp-signature attachment: stored
|