Friday, February 11, 2011

Types of Sorting and Sorting Algorithms


Sorting
Sorting is any process of arranging items in some sequence and/or in different sets, and accordingly, it has two common, yet distinct meanings:
  1. Ordering: arranging items of the same kind, class, nature, etc. in some ordered sequence,
  2. Categorizing: grouping and labeling items with similar properties together (by sorts).

Here, I will present four kinds of sorting: Selection Sort, Bubble Sort, Insertion Sort, and Shell Sort.

SELECTION SORT
Definition:
            Selection sort is a simple sorting algorithm that improves on the performance of bubble sort. It works by first finding the smallest element using a linear scan and swapping it into the first position in the list, then finding the second smallest element by scanning the remaining elements, and so on. Selection sort is unusual among sorting algorithms in that its running time is not influenced by the initial ordering of the list: because of its simple structure, it performs the same number of comparisons in every case. Selection sort requires only (n - 1) swaps and hence O(n) memory writes. Thus selection sort can be very attractive if writes are the most expensive operation, but otherwise it will frequently be outperformed by insertion sort or by more sophisticated algorithms.
Selection sort is an attempt to localize the exchanges of array elements by finding a misplaced element first and putting it in its final place.

• The list is divided into two sub lists, sorted and unsorted, which are separated by an imaginary wall.

• We select the smallest element from the unsorted sub list and swap it with the element at the beginning of the unsorted data.

• Then the smallest value among the remaining elements is selected and put in the second position, and so on.

• After each selection and swap, the imaginary wall between the two sub lists moves one element ahead, increasing the number of sorted elements and decreasing the number of unsorted ones.

• Each time we move one element from the unsorted sub list to the sorted sub list, we say that we have completed a sort pass.

• A list of n elements requires n-1 passes to completely rearrange the data.

  • Time Complexity of Selection Sort
§  The number of comparisons and moves is independent of the data (the initial order of elements doesn’t matter).
§  Therefore, the best, average, and worst case time complexities of Selection Sort are all O(n²).
  • Space Complexity of Selection Sort
§  Besides the collection itself, the only extra storage for this sort is the single temp variable used in the swap.
§  Therefore, the space complexity of Selection Sort is O(1).


Selection Sort Algorithm  
/* Sorts by selecting smallest element in unsorted
   portion of array and exchanging it with element
   at the beginning of the unsorted list.
*/
void selectionSort(int list[], int n)
{
    int cur, j, smallest, tempData;
    for (cur = 0; cur < n - 1; cur++) {
        smallest = cur;
        for (j = cur + 1; j < n; j++)
            if (list[j] < list[smallest])
                smallest = j;
        // Smallest selected; swap with current element
        tempData = list[cur];
        list[cur] = list[smallest];
        list[smallest] = tempData;
    }
}
The outer loop executes n - 1 times, and the inner loop iterates (n - 1 - cur) times.
Summing over all values of cur gives n(n - 1)/2 comparisons, so the order turns out to be O(n²).
The number of comparisons remains the same in all cases.
There may be some saving in terms of data movements (swaps).


BUBBLE SORT
Imagine all elements are objects of various sizes and are placed in a vertical column. If the objects are allowed to float, then the smallest element will bubble its way to the top. This is how the algorithm gets its name.

• The list is scanned from the bottom up, and two adjacent elements are interchanged if they are found to be out of order with respect to each other. Then the next pair of adjacent elements is considered, and so on.

• Thus the smallest element is bubbled from the unsorted list to the top of the array.

• After that, the wall moves one element ahead, increasing the number of sorted elements and decreasing the number of unsorted ones.

• Each time an element moves from the unsorted part to the sorted part, one sort pass is completed.

• Given a list of n elements, bubble sort requires up to n-1 passes to sort the data.

• Bubble sort was originally written to “bubble up” the highest element in the list. From an efficiency point of view, it makes no difference whether the high element or the low element is bubbled.

In some texts, bubble sort is referred to as “sorting by repeated comparison and exchange.” It is easy to remember and to program, and little time is needed to complete a single step. However, this sorting algorithm is normally used only where the value of n is not too big and programming effort is to be kept to a minimum.



Bubble Sort Algorithm
/* Sorts list using bubble sort. Adjacent elements
   are compared and exchanged until list is
   completely ordered.
*/
void bubbleSort(int list[], int n)
{
    int cur, j, temp;
    for (cur = 0; cur < n - 1; cur++) {
        for (j = n - 1; j > cur; j--)
            if (list[j] < list[j - 1]) {
                temp = list[j];
                list[j] = list[j - 1];
                list[j - 1] = temp;
            }
    }
}

What is the time complexity of the Bubble sort algorithm?

The number of comparisons in the worst case (when the array is sorted in reverse order) equals the number of iterations of the inner loop, i.e. O(n²). How many comparisons are made in the best case? It turns out that the number of comparisons remains the same.

What about the number of swaps?
In the worst case, the number of movements is 3n(n - 1)/2. In the best case, when all elements are already ordered, no swaps are necessary.
For a randomly ordered array, the number of movements is around half of the worst-case figure.

Compare it with insertion sort. In bubble sort, each element is bubbled one step at a time; it does not go directly to its proper place as in insertion sort. It could be said that insertion sort is roughly twice as fast as bubble sort in the average case, because bubble sort makes approximately twice as many comparisons.

In summary, we can say that bubble sort, insertion sort, and selection sort are not very efficient methods, as their complexity is O(n²).

INSERTION SORT
The key idea is to pick up a data element and insert it into its proper place in the partial data considered so far. An outline of the algorithm is as follows:
Let the n data elements be stored in an array list[ ]. Then,
for ( cur = 1; cur < n ; cur++ )
    move all elements list[ j ] greater than list[ cur ] up by one position;
    place list[ cur ] in its proper position.

Note that the sorting is restricted to a fraction of the array in each iteration.
The list is divided into two parts: sorted and unsorted.
• In each pass, the first element of the unsorted part is picked up, transferred to the sorted sub list, and inserted at the appropriate place. A list of n elements will take at most n-1 passes to sort the data.


Insertion Sort Algorithm
/* With each pass, first element in unsorted
   sub list is inserted into sorted sub list.
*/
void insertionSort(int list[], int n)
{
    int cur, located, temp, j;
    for (cur = 1; cur < n; cur++) {
        located = 0;
        temp = list[cur];
        for (j = cur - 1; j >= 0 && !located; )
            if (temp < list[j]) {
                list[j + 1] = list[j];
                j--;
            }
            else
                located = 1;
        list[j + 1] = temp;
    }
}

An advantage of this method is that it moves data only when it is really necessary:
if the array is already in order, no moves are performed.
However, it overlooks the fact that some elements may already be in their proper positions.
When an item is to be inserted, all elements greater than this item have to be moved,
so there may be a large number of redundant moves: a properly located element may be
moved aside and later brought back to its position.

The best case is when the data are already in order. Only one comparison is made for each
position, so the comparison complexity is O(n),
and the data movement is 2(n - 1), i.e. it is O(n).

The worst case is when the data are in reverse order.
Each data element must be moved to a new position, and for that each of the other elements
has to be shifted. This works out to a complexity of O(n²).

SHELL SORT
 
Shell sort is one of the oldest fast sorting algorithms. Also called the diminishing increments sort, it can be seen as an improved insertion sort. It is named after its inventor, Donald Shell. Insertion sort removes at most one inversion per exchange; Shell sort can remove numerous inversions with one exchange. It typically works very well on real data, better than on random sequences. It is fast, easy to understand, and easy to implement; however, its complexity analysis is a little more complicated. The idea of Shell sort is the following: arrange the data sequence in a two-dimensional array and sort the columns of the array. The result is that the data sequence is partly sorted; the process is repeated with narrower arrays until the sequence is fully sorted.
Its running time is proved to be O(N^(3/2)) in the worst case, and seems on average to be O(N^(5/4)) (Weiss, p. 230). It might even be as good as O(N^(7/6)).

            We can consider shell sort an elegant extension of insertion sort that gains speed by allowing exchanges of elements that are far apart. It sorts slices taken with a particular step h; such a file is said to be h-sorted. If we first h-sort a file using a large value of h, elements move long distances, and the efficiency of h-sorting for smaller values of h improves. When we reach h = 1, we perform a regular insertion sort and thus obtain a sorted file.
But the details of this elegant idea are non-trivial. First of all, h-sorting a file that is already k-sorted yields a file that is both h- and k-sorted. And while any sequence of values of h will work as long as it ends with h = 1, some sequences are clearly better than others.


Shell Sort Algorithm
void shellsort(int a[], int n)
{
    int i, j, k, h, v;
    int cols[] = {1391376, 463792, 198768, 86961, 33936, 13776, 4592,
                  1968, 861, 336, 112, 48, 21, 7, 3, 1};
    for (k = 0; k < 16; k++)
    {
        h = cols[k];
        for (i = h; i < n; i++)
        {
            v = a[i];
            j = i;
            while (j >= h && a[j-h] > v)
            {
                a[j] = a[j-h];
                j = j - h;
            }
            a[j] = v;
        }
    }
}

