Ben Cantrick (mackys) wrote,
Ben Cantrick

Heapsort, Quicksort, and Entropy

Quicksort and heapsort have been thoroughly compared by Paul Hsieh. He says `I suspected that heapsort should do better than its poor reputation and I think these results bear that out.' In his tests, the best compiler (for either heapsort or quicksort) produced a heapsort that was about 20% faster than quicksort, in total CPU time. The question I'd like to address, however, is, why Heapsort uses more comparisons than quicksort. Paul Hsieh says `what struck me is that I could not see really why heapsort is slower than quicksort. And I've not heard or read a credible explanation for this either.' I think there is a simple explanation, based on the idea of expected information content.

Heapsort is inefficient in its comparison count because it pulls items from the bottom of the heap and puts them on the top, allowing them to trickle down, exchanging places with bigger items. This always struck me as odd, putting a quite-likely-to-be-small individual up above a quite-likely-to-be-large individual, and seeing what happens. Why does heapsort do this? Could no-one think of an elegant way to promote one of the two sub-heap leaders to the top of the heap? Let's call that algorithm Fast Heapsort. It is not an in-place algorithm, but, just like Heapsort, it extracts the sorted items one at a time from the top of the heap. I evaluated the performance of Fast Heapsort on random permutations. Performance was measured solely on the basis of the number of binary comparisons required.
Tags: reddit
  • Post a new comment


    default userpic

    Your reply will be screened

    Your IP address will be recorded 

    When you submit the form an invisible reCAPTCHA check will be performed.
    You must follow the Privacy Policy and Google Terms of use.