Subsections
The four processors available really limited the scope of
investigation. Measurements should be done on computers with
far more processors to see what happens with performance.
Some questions to be answered are: Will bus contention limit the
parallelism? Can the Improved Parallel Quicksort keep up the
parallelism as the number of processors increase, or will
lock contention increase? Will thread latency increase even
more for small number of elements?
The FastThread library, although simple and robust enough,
has its problems. One is the inability to do I/O in a
straightforward manner as mentioned in section 2.2.
The other is the operating system interference that
causes thread latency, described in
section 5.2 above.
If the sorting algorithms were implemented
on an operating system with integrated threads,
performance should increase.
For my measurements, I've use arbitrary input data (though
chosen with some thought). To get a more useful evaluation
of the algorithms, real applications should be examined in
order to locate typical sort data. With measurements based
on real data, program designers could choose the most
appropriate algorithm for each problem.
The Improved Parallel Quicksort is, as mentioned above, far
from optimal. Further tuning of the critical parts should
improve the efficiency to nearly that of the ordinary
Quicksort.
The parallel quicksort algorithms suggested have the
parallel threshold hard-coded. This means that they will
perform optimally only for a rather limited data set. With
some sort of dynamic threshold, the algorithms should be
able to adapt to different input data.
The external algorithms should be exhaustively timed using
different I/O system configurations. Especially the I/O
configuration should affect performance.
A different thread implementation than FastThreads' should be
used when implementing the external algorithms. The Unix
process is probably sufficient for this purpose.
© Ola Sigurdson 1991-08-20 Ola@Sigurdson.SE