DotNet Programming World

Tuesday, August 25, 2015

Changgyu Oh's new fixed array size based sorting algorithm O(n)

It's been a while that I encountered sorting algorithms.

Last night, I had a chance to look at the bubble sorting, merge sorting, insertion sorting, and etc for the integer sorting. While reading those algorithms, I started thinking that why do I have to swap the input elements in order to sort.

I think it would be a lot faster if I can get rid of data comparison operations and swapping functions. And I categorize my idea how to implement such sorting algorithm as follow:

  • If you already know max value of the input array, that will be good because it will be your sorted array's size. Otherwise find the max value.(O(n)). 
  • It may be faster if assign integer value as address index of sorted array.
    • This is a key to my idea that after assigning each element of input array to the new sorted array, you are basically done sorting.
    • If the same values happen to exist, increase its value by one each time in the new sorted array.
  • In order to distinguish the difference between nth element having value and not having value, I used nullable integer type for the sorted array type
  • For reading, iterate through the sorted array; if you encounter null as element, skip to the next item.
    • I think this is not too bad performance because our array is listed in the memory consequently and performing address access is in linear process. I want to avoid using pointer like structure because random address access may take longer time to go the different memory address and updating address in CPU may require more bit update operation for the next address accessing. Also creating such linkedList like structure takes a lot of time(?!). 
    • I think the speed of linear scanning array in the consequent memory space and checking null value will overcome the operation cost of creating complex sorting operation.
    • If I use a binary file and using address calculation function to mimicking the nullable array  it would have the same behavoir and an outcome of such operation may require one more step to display outcome data because of displaying data

Here is a sample code I come up with for quick sorting performance testing with merge and insertion sorting.

As you can see, var array is my input array and I already know what is my max value because of arraySize and where I generate random integer value. Without max value knowing, I have to iterate through array once in order to get the max value O(n).

This is my core logic of my own sorting idea. if you want distinct sorting un-comment the commented and comment out if-else phrase.  once for loop is done, sorting is done as well.

This is how I display the outcome of my sorting logic. Note that I commented out Console.Write() because I want to the printing time in the console. And I also measured displaying time because I want to compare performances of printing outcome due to the fact that my logic using a lot more memory space that others and my logic check a null(O(n))
 
I am a lazy developer so I don't want to type more. This is not creative work. So I will cut to the bottom and I will not do more writing. But I will show you the outperforming outcome of my sorting over merge and insertion sort. 


As you can see my sorting algorithm is very steady and more than 600 % faster than merge sort for 15 million random integer sorting.

Also, I found an interesting behavior of the merge sorting comparing mine. With merge sorting, I couldn't sort more than 100 million. My computer shows out-of-memory exception.
While mine works up to 200 million random number sorting.

If you want to share this with other, please get my permission.
I would like to know who viewed my idea and like to hear your feedback.

Following is merge sample I used in my code.


10:42AM 8/25/2015 PSD

0 Comments:

Post a Comment

<< Home