7.4 Bin Sort
Assume that in the bin sort
T(n) = c1n + c2m
and it is strictly O(n + m).
Now if m <= n, this is clearly O(n).
However if m >> n, then it is O(m).
For example, if we wish to sort 104 32-bit integers,
then m = 232 and we need 232 operations
(and a rather large memory!).
For n = 104:
nlogn ~ 104 x 13 ~ 213 x 24
~ 217
So quicksort or heapsort would clearly be preferred.
An implementation of bin sort might look like:
Having highlighted this constraint, there is a version of
bin sort which can sort in place:
Assume that in the bin sort
- the keys of the items that we wish to sort lie in a small fixed range and
- that there is only one item with each value of the key.
- Set up an array of "bins" - one for each value of the key - in order,
- Examine each item and use the value of the key to place it in the appropriate bin.
Constraints on bin sort
To understand these restrictions, let's be a little more precise about the specification of the problem and assume that there are m values of the key. To recover our sorted collection, we need to examine each bin. This adds a third step to the algorithm above,- Examine each bin to see whether there's an item in it.
For n = 104:
#define EMPTY -1 /* Some convenient flag */
void bin_sort( int *a, int *bin, int n ) {
int i;
/* Pre-condition: for 0<=i<n : 0 <= a[i] < M */
/* Mark all the bins empty */
for(i=0;i<M;i++) bin[i] = EMPTY;
for(i=0;i<n;i++)
bin[ a[i] ] = a[i];
}
main() {
int a[N], bin[M]; /* for all i: 0 <= a[i] < M */
.... /* Place data in a */
bin_sort( a, bin, N );
If there are duplicates, then each bin can be replaced by a linked list.
The third step then becomes:
- Link all the lists into one list.
so that we would normally use memory rather profligately to obtain performance, memory consumes power and in some circumstances, eg computers in space craft, power might be a higher constraint than performance. |
#define EMPTY -1 /* Some convenient flag */
void bin_sort( int *a, int n ) {
int i;
/* Pre-condition: for 0<=i<n : 0 <= a[i] < n */
for(i=0;i<n;i++)
if ( a[i] != i )
SWAP( a[i], a[a[i]] );
}
However, this assumes that there are n distinct keys
in the range 0 .. n-1.
In addition to this restriction, the SWAP operation is relatively
expensive, so that this version trades space for time.
The bin sorting strategy may appear rather limited,
but it can be generalised into a strategy known as
Radix sorting.
No comments:
Post a Comment