On 12:33 Tue 19 Dec, Harakiri wrote:
> i need some input on the parallelisation of quicksort.
> If anyone knows a more appropriate forum/list than
> this, please tell me =)
Uhm, do you have access to a library? Get yourself a textbook on
algorithms. Knuth should be a good starting point...
> I've written 2 different approaches to the quicksort
> 1. The "master" server splits the array of numbers
> into N separate parts, where N is the number of
> available clients or slave servers. Each slave then
> does a quicksort on its own subarray - finally all
> slaves send their subarray to the master. The master
> then does a quicksort on the presorted list.
> *This was done on the assumption that presorted lists
> sort faster with quicksort than very unordered/random
> number lists.
Just some random ideas:
* Are you bound to use Quicksort in the master? Since the arrays are
presorted, Mergesort would be more advisable.
* I wouldn't combine the arrays at the master but do this step in
parallel, too. Given 4 machines 0..3, machine 0 would merge its own
subarray with no. 1's while machine 2 would merge with
no. 3's. Finally, no. 0 and no. 2 would combine their arrays.
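The pairwise merging could look roughly like this - a minimal Python sketch, assuming the presorted runs already sit in one process (in reality each merge round would of course run on a different machine, with the runs shipped over the network):

```python
import heapq

def tree_merge(sorted_runs):
    """Merge presorted runs pairwise, as machines 0..3 would do.

    Each round halves the number of runs, so with P machines the
    combine step takes about log2(P) merge rounds instead of one
    big final sort at the master.
    """
    runs = list(sorted_runs)
    while len(runs) > 1:
        # Pair up neighbours (0 with 1, 2 with 3, ...) and merge each pair;
        # an odd run at the end just survives into the next round.
        runs = [list(heapq.merge(runs[i], runs[i + 1]))
                if i + 1 < len(runs) else runs[i]
                for i in range(0, len(runs), 2)]
    return runs[0]
```

Note that `heapq.merge` does exactly the linear-time merge step of Mergesort, which is why it beats re-running Quicksort on the concatenated presorted data.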
* All of this is quite obsolete, as there is an optimal parallel
Quicksort at hand - Powers' algorithm (all praise Wikipedia).
> 2. The "master" server finds out the MIN and MAX value
> from the to-be-sorted array (iterates through all
> numbers). Then it will assign boundaries to each slave
> starting from MIN for the first slave, to MAX for the
> last slave (i.e. for 2 slaves it would be from MIN ->
> MAX/2 and MAX/2+1 -> MAX).
That would imply assumptions about the distribution of the
elements. This may be clever from a practical point of view, but it is
flawed in the worst case. Assume you had 1000 elements ranging from
1.0 to 1.1 and just one element with a value of 100.0: almost the whole
array would land on a single slave. No good.
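To see the imbalance, here is a quick Python sketch of the equal-width bucketing (the name `bucket_counts` is just made up for illustration):

```python
def bucket_counts(values, nbuckets):
    """Split [min, max] into nbuckets equal-width ranges, as the
    proposed scheme would, and count how many elements land in each."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / nbuckets or 1  # guard against all-equal input
    counts = [0] * nbuckets
    for v in values:
        # max value maps to the last bucket, not one past the end
        counts[min(int((v - lo) / width), nbuckets - 1)] += 1
    return counts

# 1000 values clustered near 1.0 plus a single outlier at 100.0:
data = [1.0 + (i % 100) / 1000.0 for i in range(1000)] + [100.0]
print(bucket_counts(data, 4))  # -> [1000, 0, 0, 1]
```

With 4 slaves, slave 0 would get 1000 elements and slave 3 exactly one, so the "parallel" sort degenerates to a sequential one.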
> The master then sends the whole array to each slave,
> each slave filters through the whole array all numbers
> within their own boundary (slave 1 for example would
> find all numbers between MIN -> MAX/2),
Communication is expensive. Sending n elements is more expensive than
filtering them on the local machine and sending only the remainder. At
least this is what I'd expect, although I haven't benchmarked this.
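What I mean by local filtering, as a rough Python sketch (assuming each node already holds a 1/P chunk of the data; names are hypothetical):

```python
def split_local_chunk(chunk, lo, hi):
    """Partition a node's local chunk: keep what already falls into
    this node's value range [lo, hi) and return the rest - which is
    all that would need to go over the network."""
    keep = [v for v in chunk if lo <= v < hi]
    ship = [v for v in chunk if not (lo <= v < hi)]
    return keep, ship
```

This way each element crosses the wire at most once, instead of the master broadcasting all n elements to every one of the P slaves.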
> Anyone have any different ideas? I've read a few
> papers about it - the basic approach would be to
> parallelise the divide and conquer part - which would
> result in A LOT of network messages...
As already said, please read Powers' paper from above. I could imagine
that even though this results in _many_ messages, the algorithm's
optimal runtime complexity will compensate for it.
But benchmark it yourself ;-)
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany
+49 3641 9 46376 - voice