Talk:Algorithm Implementation/Sorting/Quicksort

From Wikibooks, the open-content textbooks collection

Jump to: navigation, search

Contents

[edit] Transwiki

This article was created from Transwiki:Quicksort implementations. This is the content of the talk page:

[edit] Quicksort implementations edit history

[edit] Talk:Quicksort implementations edit history

[edit] Talk:Quicksort implementations text

The following text is from a Wikipedia talk page. Remember that Wikibooks is not Wikipedia, so the current module at Wikibooks does not necessarily need to be encyclopedia content.

[edit] Sorted?

"... sorted by number of non-comment lines of code. ..." Is that true? Not for the last two i gues. -- Panzi 22:04, 19 Feb 2005 (UTC)

People add things without sorting them, as one might expect. Feel free to either reorder them or remove the statement about how they're sorted. Deco 01:06, 21 Feb 2005 (UTC)


Also, the C++ example won't work. It needs stable_partition or something. See [1]

[edit] C: Built-in Quicksort function

In C (and thus also C++) there is a built-in Quicksort implementation called "qsort", as from [2]. Should this be mentioned? --210.6.198.187 08:07, 4 May 2005 (UTC)

I don't think the qsort() function is technically required by the standard to use quicksort, despite its name, but sure, maybe. Deco 02:39, 7 May 2005 (UTC)

[edit] I've discovered an optimization

I've discovered an optimization for the standard quicksort algorithm that could be useful in many of these implementations. I have tested it to be sure it works (other optimizations I thought about were in fact slower, because they added cache faults). I didn't got the idea from any other site, it's just my personal work, and I would want to release the information as GNU, appearing as a contributor.

It's very late at night, so I can't talk more about it. I'll recheck this page in a few days to see if there's someone interested. It's my first time here, so if someone could help me explaining what steps I should take, it would be great. I don't know HTML (yet), if that's important. If there's something I should learn first, please tell me so.

Thank you!

Well, this page is for discussing this article, but if you have a good idea that's great. I would be advised though that a lot of really smart people have put time into engineering fast quicksort implementations and you're quite possibly rediscovering something or missing something. You're also a little confused about software licenses. The GNU software license is called the GPL, and you can only release documents such as source code under it, not algorithms or ideas (these can be patented, but this isn't usually worthwhile). If you want to tell everybody about it, though, you could try a newsgroup like comp.programming. Deco 08:57, 19 May 2005 (UTC)


Maybe someone else discovered it before me, but the optimization doesn't appear on this article, so I'm free to post it as mine, and allow anyone to modify it. I'm aware of the GPL, and I was referring to the source code, and the text I posted on the other article explaining how it worked.

Here you have a simple C source code, which only sorts arrays of integers with the 'less than' comparison:

 //Quicksort Implementation without swap by DrJones
 void quicksort(int *start, int *end) {
      if (start<end) {
           MEDIAN(*start, start[1], *end);
           register int pivot=start[1];
           register int *i=start+1;
           register int *j=end;
           for (;;) {
                while (pivot<*--j) ;
                if (j<=i)
                     break; //j may be lower than i
                *i=*j;
                while (*++i<pivot) ;
                if (j<=i) {
                     i=j;
                     break; //i may be lower than j, as the test comparison comes after the 'while'
                }
                *j=*i;
           }
           *i=pivote;
           quicksort(start, i-1);
           quicksort(i+1, end);
      } //end of 'if(start<end)'
 } //end of method

Some points:

  • MEDIAN(a,b,c) is not described here, it's a function that put on position 'a' the lowest of the three elements, then put on position 'c' the highest of the elements, and the remaining element on position b.
  • This code can be improved, but I chose not to do it, so I could focus on the replacement for the swap method. Feel free to post a code which uses all optimizations available.
  • The code with the swap method could be as follows:
 //Quicksort Implementation with swap by DrJones
 void quicksort(int *start, int *end) {
      if (start<end) {
           register int pivot=start[1]; //though it could be any, I suggest (start+end)/2
           register int *i=start;
           register int *j=end;
           while (i<j) {
                while (pivot<*j) --j;
                while (*i<pivot) ++i;
                swap(i, j);
                i++; j--;
           }
           quicksort(start, i-1);
           quicksort(i+1, end);
      } //end of if
 } //end of method
I'd say you have an infinite loop there. It will happen when the pivot value appears more than once. When i and j reach those elements, the conditions of the inner loops won't be met any more; i and j will not change, and the two pivots will be swapped once and again... forever --Comocomocomocomo 16:15, 8 Jun 2005 (UTC)
Yes, you are right. I made a quick fix, but I haven't tried if it works yet. DrJones 15:15, 9 Jun 2005 (UTC)

I wrote back the code. I hope you can understand it, I'm not used to write these things. DrJones 09:55, 22 May 2005 (UTC)


You don't have to remove anything from here if you don't want to. Most of our policies don't apply to talk pages. Deco 23:46, 21 May 2005 (UTC)

I finally found a webpage that could be interesting: http://linux.wku.edu/~lamonml/algor/sort/quick.html

At this time, it's down, so I had to read the page on Google's Cache. It uses the method I found instead of swapping, but it has to do a lot more comparisons in its inner loop than mine, to prevent one index surpassing the other.

It seems my code could be included on the main page, afterall.

Here you got a macro function for the MEDIAN function, if you want.

 #define SWAP(x,y) register int temp = (x); (x) = (y); (y) = temp
 #define MEDIAN(a, b, c) \
         if (c<b) { SWAP(b, c); } \
         if (b<a) { SWAP(a, b); \
                    if (c<b) { SWAP(c, b); } \
         }

This MEDIAN macro works exactly as the insertion algorithm for 3 elements. I wrote it using a SWAP macro for clarity, but I feel it's weird using the SWAP function here, when I'm not using it on the Inner loop. Other thing I don't know if would be an enhacement, is using for(;;) instead of while(1). I'm going to write it on the code anyways. DrJones 12:41, 22 May 2005 (UTC)

There are very no modern compilers which would compile for(;;) differently from while(1). Deco 21:06, 3 Jun 2005 (UTC)
That converging pointer method of partitioning was first invented by R Sedgewick, according to "The Art of Computer Programming". -BadJim

[edit] Comprehensible, iterative, stack-safe version (though longish)

What do you think about this one?

#define MAX 30  // Sort up to pow(2,MAX)==1G elements

void QuickSort (int * arr, int count)
{
  int pivot;               // Pivot value (hopefully the median)
  int tmp;                 // Temp variable for swapping
  int i, j;                // Inner bounds (high and low part)
  int beg, end;            // Bounds of current task (outer)
  int stack[MAX*2];        // Pending tasks (can be made static)
  int top;                 // 2 * (number of pending tasks)

  if (count<=1)            // If less than two elements,
    return;                // nothing to sort

  top = 0;
  stack[top++] = 0;        // First "beginning"
  stack[top++] = count-1;  // First "ending"

  do  // ---- TASK LOOP: repeat until there's no pending task
  {
    end = stack[--top];           // Pop pending task
    beg = stack[--top];
                                  // Take the value in the
    pivot = arr[(beg+end)/2];     // middle as pivot value

    // i = beg + (int) (rand() /           // _OR_: Take the
    //                  (RAND_MAX+1.0) *   // pivot value from
    //                  (end-beg+1));      // any position at
    //                                     // random (store pos
    // pivot = arr[ i<beg ? beg :          // in i and check it
    //              i>end ? end : i ];     // just in case)

    i = beg;                      // Low position: beginning
    j = end;                      // High position: ending

    for (;;)  // -------------- PARTITION LOOP --------------
    {
      while (i<j &&                  // Search the low part for
             arr[i]<pivot)           // an element that goes up
        i ++;                        // (skip lesser elements)

      while (i<j &&                  // Search the high part for
             arr[j]>pivot)           // an element that goes down
        j --;                        // (skip greater elements)

      if (i>=j)                      // If both positions met,
        break;                       // partition is finished

      if (arr[i]!=arr[j])            // If the reached elements
      {                              // are different from each
        tmp = arr[i];                // other, swap them
        arr[i] = arr[j];
        arr[j] = tmp;                // If both==pivot, skip
      }

      i ++;                          // These are already ok;
      j --;                          // go to the next ones

    }         // ---------- END OF PARTITION LOOP ----------

                                  // NOTE: i and j swapped
    while (i<=end &&              // their roles when they met
           arr[i]<=pivot)
      i ++;                       // Discard lower elements
                                  // (and values eq. to pivot)
    while (j>=beg &&              // from the high part
           arr[j]>=pivot)
      j --;                       // Same thing with low part

    if (end-i>j-beg)              // If the high part is longer,
    {                             // push it first
      if (end>i)
      {                           // If high part is not empty,
        stack[top++] = i;         // push it as pending task
        stack[top++] = end;
      }

      if (j>beg)
      {                           // If low part is not empty,
        stack[top++] = beg;       // push it as pending task
        stack[top++] = j;
      }
    }
    else                          // Otherwise, push the low
    {                             // part first.
      if (j>beg)
      {                           // The last pushed part will
        stack[top++] = beg;       // be popped inmediately. The
        stack[top++] = j;         // other one will remain in
      }                           // the stack until later.
                                  // Therefore, it's better to
      if (end>i)                  // push first the longer one
      {                           // in order to achieve an
        stack[top++] = i;         // efficient usage of the
        stack[top++] = end;       // stack. This way, stack
      }                           // overflow is avoided while
    }                             // count remains < pow(2,MAX)
  }
  while (top);  // ---- END OF TASK LOOP ----
}

Most implementations use function recursion. In the worst case (where the chosen pivot value allways happens to be the minimum or the maximun) the maximun recursion level will be O(n). I propose using a stack array in order to apply tail recursion in the critical recursive call. --Comocomocomocomo 16:37, 8 Jun 2005 (UTC)

Explicit stacks are not necessary to implement the partially tail recursive variant. It suffices to use a loop and only make a recursive call on the smaller of the two sublists, as the pseudocode here shows. On the other hand, explicit stacks can save a significant amount of time/space overhead because of all the other junk that standard calling conventions tend to throw on the stack; it's a good idea to use one in an efficient implementation, but (no offense) not particularly innovative. Deco 19:49, 8 Jun 2005 (UTC)
Don't worry, I'm not trying to invent anything new. I just want to write a fast implementation (pivot taken from the middle or from a pos. at random, minimun number of swap operations...) and make it comprehensible. I intentionally discarded other refinements (general implementation with pointers to functions or C++ templates, saving push/pop operations...). So, my question is: do you find it easy to understand? (Not only for expert programmers, but for beginners too) If not, what would you change? --Como 10:56, 9 Jun 2005 (UTC)

[edit] Wikisource

Wouldn't this article be better placed at Wikisource? Wikipedia already has a Quicksort article, and unlike Wikipedia, Wikisource is intended to be a repository of things like sample implementations. They already have a Wikisource:Quicksort page - it could easily be expanded with the content of this article. --Allan McInnes 03:16, 28 January 2006 (UTC)

That's a possibility, but it would make it impossible to do the transclusion trick where we include a few of the implementations in both quicksort and this page. I've proposed elsewhere that there should be a project for code with a suitable license, but that got nowhere. Deco 04:25, 28 January 2006 (UTC)
Ah. Then maybe the question I should have asked was whether or not Wikipedia should include sample implementations of algorithms at all. IMHO it would be more appropriate to point to external sites that carry sample implementations, instead of having (unverified, potentially buggy) sample implementations of every algorithm, in every language. Unfortunately, I suspect that trying to achieve a consensus answer to the question of sample implementations is likely to lead into the same kind of interminable debate that the Wikipedia:Wikicode proposal provoked... --Allan McInnes 06:08, 28 January 2006 (UTC)
I moved these to here to provide a way of removing large amounts of content from the Quicksort article without upsetting the innumerable contributors who inexplicably enjoy adding sample implementations. I think it's alright as long as it's separated from the main content and its purpose is clear. It might be compared to lists of minor characters from a TV series or video game or Notable lines in the Star Wars series. Incidentally, I'm glad you're familiar with my ill-fated proposal. 08:33, 28 January 2006 (UTC)
Oh, I agree with your approach. It seems like a reasonable compromise, given the small likelihood that I could ever get consensus for my own personal views on the matter. And I certainly think it's better to have these things on a separate page from the main article. I just wanted to get my opinion "on record" so to speak :-)
Going slightly off on a tangent: I thought wikicode seemed like a reasonable proposal, and I may well have supported if I'd been a contributor when the debates were going on. Much like this article, it's not what I would consider ideal (the curly braces bug me a bit for starters), but after reading through some of the debate it seems like the best compromise between the various viewpoints. I applaud your ability to find workable solutions! I'll most likely end up using wikicode or some close derivative of it for any pseudocoding I need to do here. I may even see if I can get Wikipedia:WikiProject Computer science to adopt it as part of some kind of style guideline. --Allan McInnes 10:21, 28 January 2006 (UTC)

[edit] Bad History?

I am puzzled that the history of the Quicksort Implementations page seems to be corrupted. For example, compare the following two versions:

  • 2005-11-18 21.42.19 (PST) DCoetzee - A less obtuse J example
  • 2005-11-28 19.25.16 (PST) DCoetzee - Mathematica - a test driver

I am pretty sure that at that time the "Core implementations" contains a "obtuse J example". The comment in the J example (the comment is still there in the current revision) also indicates that at that time, there was a J example in the "Core implementations". Yet I can not find any revision of the page that has the J example in the Core implementations. Roger Hui 08:02, 28 January 2006 (UTC)

You're looking at the wrong page. Mere hours ago, editor R. Koot removed swathes of content from the core implementations template. Here's how it looked before. I'm reverting it. Deco 08:36, 28 January 2006 (UTC)

[edit] Encyclopedic content?

WP:WWIN#Wikipedia is not a mirror or a repository of links, images, or media files says, explicitly, that wikipedia is not:

"Mere collections of public domain or other source material such as entire books or source code..."

This page looks an awful lot like a mere collection of source code; is it really encyclopedic in nature? If so, what is its importance? It seems to me that quicksort implementations belong somewhere else, like the Great Compiler Shootout or Sourceforge. --bmills 05:17, 15 February 2006 (UTC)

Or Wikisource. The only problem is that it makes it harder to share the implementations at the top with the quicksort page if they're on another site. I created the article originally to get implementations out of the quicksort article, which people feel compelled to add implementations to. Deco 06:44, 15 February 2006 (UTC)
Isn't that the point? An encyclopedia article on Quicksort shouldn't contain any more implementations than required to describe the algorithm. If people feel compelled to add them, they should add them to a code repository, not Wikipedia. --bmills 15:13, 15 February 2006 (UTC)

[edit] Why is this here?

@The first C implementation above does not sort the list properly if the initial input is a reverse sorted list, or any time in which the pivot turns out be the largest element in the list:

If it's not a proper quicksort, then why is it here? It should be fixed or removed. 82.139.85.118 01:21, 8 May 2006 (UTC)

[edit] what on earth??

why do you choose really random pivot.. choose the first one.

[edit] Python 3K

I'm just wondering if the Python version runs under Py3K. I haven't got Py3K up and running on any boxes yet, and I haven't worked with Python for a while, so I was wondering if anyone knew? Tinned Tuna (talk)