Search

6.4 — Sorting an array using selection sort

A case for sorting

Sorting an array is the process of arranging all of the elements in the array in a particular order. There are many different cases in which sorting an array can be useful. For example, your email program generally displays emails in order of time received, because more recent emails are typically considered more relevant. When you go to your contact list, the names are typically in alphabetical order, because it’s easier to find the name you are looking for that way. Both of these presentations involve sorting data before presentation.

Sorting an array can make searching an array more efficient, not only for humans, but also for computers. For example, consider the case where we want to know whether a name appears in a list of names. In order to see whether a name was on the list, we’d have to check every element in the array to see if the name appears. For an array with many elements, searching through them all can be expensive.

However, now assume our array of names is sorted alphabetically. In this case, we only need to search up to the point where we encounter a name that is alphabetically greater than the one we are looking for. At that point, if we haven’t found the name, we know it doesn’t exist in the rest of the array, because all of the names we haven’t looked at in the array are guaranteed to be alphabetically greater!

It turns out that there are even better algorithms to search sorted arrays. Using a simple algorithm, we can search a sorted array containing 1,000,000 elements using only 20 comparisons! The downside is, of course, that sorting an array is comparatively expensive, and it often isn’t worth sorting an array in order to make searching fast unless you’re going to be searching it many times.

In some cases, sorting an array can make searching unnecessary. Consider another example where we want to find the best test score. If the array is unsorted, we have to look through every element in the array to find the greatest test score. If the list is sorted, the best test score will be in the first or last position (depending on whether we sorted in ascending or descending order), so we don’t need to search at all!

How sorting works

Sorting is generally performed by repeatedly comparing pairs of array elements, and swapping them if they meet some predefined criteria. The order in which these elements are compared differs depending on which sorting algorithm is used. The criteria depends on how the list will be sorted (e.g. in ascending or descending order).

To swap two elements, we can use the std::swap() function from the C++ standard library, which is defined in the algorithm header. For efficiency reasons, std::swap() was moved to the utility header in C++11.

This program prints:

Before swap: x = 2, y = 4
After swap:  x = 4, y = 2

Note that after the swap, the values of x and y have been interchanged!

Selection sort

There are many ways to sort an array. Selection sort is probably the easiest sort to understand, which makes it a good candidate for teaching even though it is one of the slower sorts.

Selection sort performs the following steps to sort an array from smallest to largest:
1) Starting at array index 0, search the entire array to find the smallest value
2) Swap the smallest value found in the array with the value at index 0
3) Repeat steps 1 & 2 starting from the next index

In other words, we’re going to find the smallest element in the array, and swap it into the first position. Then we’re going to find the next smallest element, and swap it into the second position. This process will be repeated until we run out of elements.

Here is an example of this algorithm working on 5 elements. Let’s start with a sample array:

{ 30, 50, 20, 10, 40 }

First, we find the smallest element, starting from index 0:

{ 30, 50, 20, 10, 40 }

We then swap this with the element at index 0:

{ 10, 50, 20, 30, 40 }

Now that the first element is sorted, we can ignore it. Now, we find the smallest element, starting from index 1:

{ 10, 50, 20, 30, 40 }

And swap it with the element in index 1:

{ 10, 20, 50, 30, 40 }

Now we can ignore the first two elements. Find the smallest element starting at index 2:

{ 10, 20, 50, 30, 40 }

And swap it with the element in index 2:

{ 10, 20, 30, 50, 40 }

Find the smallest element starting at index 3:

{ 10, 20, 30, 50, 40 }

And swap it with the element in index 3:

{ 10, 20, 30, 40, 50 }

Finally, find the smallest element starting at index 4:

{ 10, 20, 30, 40, 50 }

And swap it with the element in index 4 (which doesn’t do anything):

{ 10, 20, 30, 40 50 }

Done!

{ 10, 20, 30, 40, 50 }

Note that the last comparison will always be with itself (which is redundant), so we can actually stop 1 element before the end of the array.

Selection sort in C++

Here’s how this algorithm is implemented in C++:

The most confusing part of this algorithm is the loop inside of another loop (called a nested loop). The outside loop (startIndex) iterates through each element one by one. For each iteration of the outer loop, the inner loop (currentIndex) is used to find the smallest element in the remaining array (starting from startIndex+1). smallestIndex keeps track of the index of the smallest element found by the inner loop. Then smallestIndex is swapped with startIndex. Finally, the outer loop (startIndex) advances one element, and the process is repeated.

Hint: If you’re having trouble figuring out how the above program works, it can be helpful to work through a sample case on a piece of paper. Write the starting (unsorted) array elements horizontally at the top of the paper. Draw arrows indicating which elements startIndex, currentIndex, and smallestIndex are indexing. Manually trace through the program and redraw the arrows as the indices change. For each iteration of the outer loop, start a new line showing the current state of the array.

Sorting names works using the same algorithm. Just change the array type from int to std::string, and initialize with the appropriate values.

std::sort

Because sorting arrays is so common, the C++ standard library includes a sorting function named std::sort. std::sort lives in the <algorithm> header, and can be invoked on an array like so:

We’ll talk more about std::sort in a future chapter.

Quiz

1) Manually show how selection sort works on the following array: { 30, 60, 20, 50, 40, 10 }. Show the array after each swap that takes place.

2) Rewrite the selection sort code above to sort in descending order (largest numbers first). Although this may seem complex, it is actually surprisingly simple.

3) This one is going to be difficult, so put your game face on.

Another simple sort is called “bubble sort”. Bubble sort works by comparing adjacent pairs of elements, and swapping them if the criteria is met, so that elements “bubble” to the end of the array. Although there are quite a few ways to optimize bubble sort, in this quiz we’ll stick with the unoptimized version here because it’s simplest.

Unoptimized bubble sort performs the following steps to sort an array from smallest to largest:
A) Compare array element 0 with array element 1. If element 0 is larger, swap it with element 1.
B) Now do the same for elements 1 and 2, and every subsequent pair of elements until you hit the end of the array. At this point, the last element in the array will be sorted.
C) Repeat the first two steps again until the array is sorted.

Write code that bubble sorts the following array according to the rules above:

Print the sorted array elements at the end of your program.

Hint: If we’re able to sort one element per iteration, that means we’ll need to iterate as many times as there are numbers in our array to guarantee that the whole array is sorted.
Hint: When comparing pairs of elements, be careful of your array’s range.

4) Add two optimizations to the bubble sort algorithm you wrote in the previous quiz question:

  • Notice how with each iteration of bubble sort, the biggest number remaining gets bubbled to the end of the array. After the first iteration, the last array element is sorted. After the second iteration, the second to last array element is sorted too. And so on… With each iteration, we don’t need to recheck elements that we know are already sorted. Change your loop to not re-check elements that are already sorted.
  • If we go through an entire iteration without doing a swap, then we know the array must already be sorted. Implement a check to determine whether any swaps were made this iteration, and if not, terminate the loop early. If the loop was terminated early, print on which iteration the sort ended early.

Your output should match this:

Early termination on iteration 6
1 2 3 4 5 6 7 8 9

Quiz solutions

1) Show Solution

2) Show Solution

3) Show Solution

4) Show Solution

6.5 -- Multidimensional Arrays
Index
6.3 -- Arrays and loops

173 comments to 6.4 — Sorting an array using selection sort

  • Gio

    I may have done things differently without realizing, but my program keeps telling me it went through 5 iterations instead of 6. I did the swaps by paper and they seem to check out:

    original:{ 6, 3, 2, 9, 7, 1, 5, 4, 8 }
    1st {3, 2, 6, 7, 1, 5, 4, 8 , 9}
    2nd {2, 3, 6, 1, 5, 4, 7, 8 , 9}
    3rd {2, 3, 1, 5, 4, 6, 7, 8, 9}
    4th {2, 1, 3, 4, 5, 6, 7, 8, 9}
    5th {1, 2, 3, 4, 5, 6, 7, 8, 9}

    Did I do something off, or am I thinking about this in a weird way? Or are we also counting the original form of the array as  an iteration?

    • Peter

      The task was to print in "which iteration the sort ended early" it means to print the number of iteration that did nothing.  
      As you correctly found there are 5 iterations swapping some array elements. The 6th iteration is the first one that did nothing.

  • Quang

    Thank you for this tutorial. I just wonder how do i know if my program is optimized or not, is it based on the time the compiler runs the program or there are more different aspects to let me know that???

    • Alex

      When we use the word optimized in reference to programs, we usually mean in terms of either speed or memory usage. So an optimized program is one in which the developer (or developers) have worked to reduce the memory usage or increase the speed at which the program runs.

      There are programs called profilers that can help identify where bottlenecks exist in your code (places where your code is taking lots of time). However, removing bottlenecks is beyond the scope of this tutorial!

  • Pablo

    Hi Alex,

    Great site, you really are doing a great job! Know that it is much appreciated.

    Regarding question 3, I have a performance doubt. Below you have my code, with a logic almost matching that of the given answer. However, I tried to follow the principle of modularity and put the sorting code in a function. The result is that my code takes around 8 seconds (!) to finish, while yours is almost instantaneous.

    Am I doing something terribly wrong here?

    Thanks for the help!

    • Alex

      Your loop termination condition is wrong:

      indexfin is always less than size until it goes so negative it overflows into a huge positive number.

      • Pablo

        Of course! I changed the logic behind the loop and forgot to fix that. Shame on me -_-

        Thanks a lot!

        PS: It’s a pity you don’t get an error message when variables are overflowed!

  • When trying to blindly right the bubble sort, I wrote (the code below)
    and it sorted perfectly. however, that wasn’t how you wrote it in the answer section (question 3).

    Is this a bubble sort, or an entirely different sort?
    EDIT: Actually I see they arent that differently wrote now, besides me doing (index + 1) and you doing (size -1).

  • Typo:
    "For each iteration of the outer loop, the inner loop (currentIndex) is used to find the smallest element in the remaining array (starting from startIndex)"

    currentIndex is started from startIndex+1

    “Then smallestIndex is then swapped with startIndex”
    “then” used twice 🙂

  • Ali

    Hi Alex,

    Wonderful lessons I appreciate your effort, I was just wondering in the quiz solution 3 you wrote

    for (int currentIndex = 0; currentIndex < endOfArray-1; ++currentIndex)

    since endofArray is not defined did you mean size here?

  • Shijian

    Hi,

    I have a question about how to pass an array as a parameter to a function. For example, a function named doSomething, it takes an integer array as input, and return void. How to declare that function?

    void doSomething(int array[ ])?

    it seems the array size depends on the actual array we passed. So how to define the array size in [ ]?

    Thanks.

    • Eric Jones

      I don’t fully understand it yet, but the way I understand it so far is:  you have to use a pointer to pass a function to an array.  section 6.10 at the bottom says that that will be covered in functions.  Hope this helped.

    • Alex

      The answer to this question is slightly complicated. 🙂 There are two syntaxes you can use.

      If you’re passing a fixed array with a known size, you can do this:

      (where ARRAY_SIZE is some constant representing the known size of the array).

      However, if you want to pass arrays where the size of the array is not known in advance, you have to use the pointer syntax (along with a separate size parameter):

      We discuss pointers and their relationship to arrays in upcoming lessons.

  • Eric Jones

    My Solution was in a slightly different order:

    30 60 20 50 40 10
    10 60 30 50 40 20
    10 20 60 50 40 30
    10 20 30 60 50 40
    10 20 30 40 60 50
    10 20 30 40 50 60
    10 20 30 40 50 60

    I didn’t initialize new variables within the statements but just used the loop counts.  
    Is there a good reason to use those variables instead?  Here is my code:
    --------------------
    #include "stdafx.h"
    #include <iostream>
    #include <algorithm>

    int const arraySize = 6;
    int anArray[arraySize] = { 30,60,20,50,40,10 };

    void printArray()
    {
        for (int indexCount = 0; indexCount < arraySize; ++indexCount)
            std::cout << " " << anArray[indexCount];
        std::cout << "\n";
    }

    void assendSort()
    {
        using namespace std;
        for (int outCount = 0; outCount < arraySize; ++outCount)
        {
            for (int inCount = outCount + 1; inCount < arraySize; ++inCount)
            {
                if (anArray[outCount] > anArray[inCount])
                    swap(anArray[outCount], anArray[inCount]);
            }
            printArray();
        }
    }

    int _tmain(int argc, _TCHAR* argv[])
    {
        std::cout << "Starting Array\n\n";
        printArray();
        std::cout << "\n";

        assendSort();

        return 0;
    }

  • Samirax

    // If the current element is smaller than our previously found smallest

    in Solution to Question 2 should be changed to:

    //If the current element is larger than our previously found largest

    P.S.
    Sorry for being "picky" 😉

    BTW
    I enjoy your tutorials so much. They’re the best I’ve seen so far 🙂

  • C++ newbie

    Code isn’t performing how I expect it to perform.  I was wondering if someone could help me out.

    Upon printing out a[0], each time I do it, it prints out 30.  Obviously, I want it to print out 10, and then 60 (the lowest and highest numbers respectively).

    • Todd

      Your two functions, sortAscending() and sortDescending(), are only printing to the terminal; they are not affecting array a[] as defined in main(). Hence, your calls in main() always grab from your original a[] defined in main.

      I don’t know how to make a function return a new array. I think you need pointers for that (which Alex covers in future sessions).

    • Alex

      Arrays are funny things in C++.

      When you try to pass an array to a function in C++, the array “decays” into a pointer to the array. And while that pointer still points to the array, it no longer remembers what size the array was. Consequently this line:

      will not do what you want because a is no longer an array, it’s a pointer. In your case, sizeof(a) and sizeof(a[0]) probably both evaluated to 4, which means this evaluated to 1 instead of the actual length of your array. So your algorithms are likely sorting just the first element of your array!

      This could be fixed by passing the array size as a second parameter and using that instead.

  • Cody Blaucks

    Great tutorial web-site!  This is the first quiz I found somewhat challenging.  My code was similar to yours, yet a little different.  People have posted enough code, so I won't bother with mine.  I did do it without using the nSmallestIndex variable, which made quiz 2 even easier.

    I had errors, so I went into debug mode.  I know you haven't covered that yet, but it wasn't hard to figure out.  I really like the Code::Blocks interface, expecially when combined with certain Linux windows environment features.  Coding in Linux is awesome!!!

  • papagym177

    At the beginning of this lesson Alex says “The swap() function from the C++ standard library. Swap is defined in the algorithm header, and lives in the std namespace.”

    Excuse me for asking but I’m confused about why a function would be defined in one place and it lives in another.

    When we are programming how are we supposed to know what, when, and where,& why to include: using namespace std, which #include , or a function name from a library?

    • I am not a pro but you probably have to use a reference. For example, http://www.cplusplus.com/reference/ is a good place. Learncpp says swap(a, b) is in algorithm header file but in C++11 it’s in the utility header file. These things changes and you have to have a updated reference to follow.

    • Alex

      Swap used to be declared in the algorithm header. As yes23 mentioned, it was moved to the utility header in C++11.

      My use of the word “lives” was poor wording on my part. I simply meant to indicate that it was part of the std namespace (as all standard library functionality is).

      yes23 is also correct: if you want to use a bit of functionality, you need to know what header to include, and that typically means using a reference. It’s a good thing that search engines exist. 🙂

  • 3000blasters

    sir just a simple doubt
    i am using turboo C++ and i guess this headerfile
    ALGORITHM.H is not available there
    soo can you please suggest me an alternative for swap function

    • Alex

      You can write your own:

      But honestly, if you’re using Turbo C++, you should update your compiler. There are plenty of free, modern alternatives available.

  • lampshade

    Can’t seem to get the following to work in bloodshed C++. The output just brings me the unsorted list.

    #include
    #include
    #include

    int main(){
    using namespace std;
    const int size=6;
    int array[size]={30, 60, 20, 50, 40, 10};

    for(int outer; outer<size; outer++)
    {

    int smallest=outer;

    for(int inner=outer+1;inner<size;inner++)
    {if(array[inner] < array[smallest])
    smallest=inner;}

    swap(array[outer], array[smallest]);

    }

    for (int nIndex=0; nIndex < size; nIndex++){
    cout << array[nIndex] << endl;

    }
    cin.clear();
    cin.ignore(255, '\n');
    cin.get();
    }

    anyone see where this went wrong?

    • lampshade

      needed to make that outerloop’s loop variable = to zero… silly me.

      • Ahmed Gurie

        move swap(array[outer], array[smallest]); from the nested loop

        // swap(array[outer], array[smallest]);

        }
        swap(array[outer], array[smallest]); // Here is the right place

        for (int nIndex=0; nIndex < size; nIndex++){
        cout << array[nIndex] << endl;

  • Divyank

    Can anyone tell me what is wrong with this code ?

    It outputs -

    3 5 237 34 264 234 767 656 786

  • AUASP

    a single statement can swap 2 values.
    a^=b^=a^=b;//swaps a and b

  • My 2ct:

  • Zach

    I am definitely new to programming, but so far I’m understanding everything reasonably well. I do have one question though. If you are re-initializing “nSmallestIndex” to a value of “nStartIndex” each time, why do you use “nSmallestIndex” in this part of your loop?

    If the purpose of “nSmallestIndex” is just to temporarily hold the smallest index whilst you search the rest, wouldn’t it make more sense just to use “nStartIndex” and not bother giving “nSmallestIndex” a value of anything when it’s initialized, thus using “nSmallestIndex” Only as a temp holding place? I hope this is clear, I’m not a C++ pro like some of you. It just took me a while to understand it because of this seemingly unnecessary assignment. Please let me know if there is a good reason for doing so. Thanks -Zach

    • Zach

      Disregard that. I figured it out. Upon further examination I realized that if you don’t continually check against your hold value, the last value that evaluated as “true” gets swapped, in this case 40. Had it been checking against nSmallestIndex, which gets incrementally smaller with each check, the loop would be checking 40 < 10 rather than 40 < 50. For the record, this is a hard concept to grab when your asked to write one off the top of your head lol. I hope I get quicker with this stuff.

  • Petey

    hello, i was wondering how swap works since you’re passing an argument by value, not reference, or am i missing something? just wondering, that’s all. great tutorials by the way

  • GrantSolar

    After following your lessons (Awesome, by the way) I always enjoy the quiz’s however I’ve been having some difficulty with this one. I tried the first question without out-right copying what you did but repeatedly got the same results. I compared yours against mine and changed my variable names to make the differences more obvious. I couldn’t see where I’d gone wrong. I opened a new project and copied your code (and added #include etc.) Yours worked fine.
    My code reads:

    My other issue is that, whilst yours works straight off, mine refuses to compile unless nStartIndex is declared earlier whilst you don’t. Any ideas why that might be?

  • skyler

    nevermind now i understand perfectly, just had to write out what was happening on paper so my brain could fathom it.

  • Skyler

    Two questions, when the outerloop runs the first time nCurrentIndex is set to 1. If the nested loop runs all of its iterations before the outerloop can run again, then nCurrentIndex would get keep getting incremented by 1 until (nCurrentIndex < nSize) wasnt true. So does it then proceed to the swap() command?

    Second, if thats true, then on the second round of the outerloop, does the nested loop reiterate (int nCurrentIndex = nStartIndex + 1)? I thought that ‘for loops’ only use the initializing statement once??? I have the program working, i am just not understanding how exactly it is working.

  • FuturePixstar

    im just lost completely here, i ran through the code the first time, everything when fine, smallest to largest. then to switch it to largest to smallest i did exactly what the solution showed (before looking at it) and well… see for yourself

    • rawr

      You wait until you have found the smallest, THEN swap(). I commented out the old swap function, then located it outside the internal for loop.

  • runner

    I’m using Selection Sort as a function. Can I get the function to return the sorted array, or do I have to use it to change a global variable?

  • Mat

    Who ever wrote this post thank you. The book i’m using gave me a horrible example for sorting. Great job who ever put this up.

  • Jeffey

    attach the loop statement from the last lesson to the end of the for statement and see this program work through the sorting. Thought it was kinda neat how it works.

  • Chad

    C++ is definitely harder than MATLAB, but better in some ways.

  • Allen01

    In the sort example/problem, it would seem that the output statement should read:

      cout < < anArray << endl; 
    But this yields a funky output. The only thing that yields the output below 10 20 30 40 50 Press any key to conitiue… is this output statement:
      cout < < anArray[nStartIndex] << endl; 
    Can you elaborate on this a little more? Thanks.
    • anArray refers to the array itself, and the array variable is really just a pointer. So when you do this:

      You’re printing out the address held in anArray pointer (which as you say “yields funky output”)

      When you really want is to print the value of the ELEMENTS of anArray. To do that, you need to index the array using the [] operator -- in this case, the index is the value of nStartIndex.

      Generally speaking, it’s a better idea to sort your array and THEN re-loop through it to print out the sorted result. You could do this by adding the following lines:

      By the way, the issue with the disappearing + symbols should be fixed now.

  • I didn’t compile this, so I could be wrong, but on line 5, shouldn’t the for loop expression be:

    nStartIndex < (nSize - 1)
    would that be a one off error?
    • The code was written to follow the 5-element example that proceeds it. You will note the last step in the example is redundant because it will always swap the last element with itself. Similarly, the last iteration of the outer loop will always cause the last element to be swapped with itself!

      This doesn’t affect the correctness of the algorithm, it’s just non-optimal. As you note, changing the loop to nStartIndex < (nSize - 1) removes this non-optimality.

      Generally when we talk about "off by one" errors, we're talking about errors that make the algorithm perform incorrectly because the loop executes one too many or too few times. In this case, the algorithm is correct, it just has a redundant step that can be removed by changing that loop condition.

  • Abhishek

    Then what’s the faster and difficult to understand sort?
    Will my small mind be able to understand that?

    • There are a large number of faster sorting algorithms. Two of the more popular faster sorts are quicksort and mergesort. However, sorting algorithms are really the topic for a data structures/algorithms class or tutorial (since they can be implemented in any language), not a C one. 🙂

      See wikipedia for more information.

      • James Ray

        I did a couple of hours reading or so of that article, plus others:
        http://en.cppreference.com/w/cpp/algorithm/sort, plus links on the page, links on linked to pages, and so on.
        http://www.cplusplus.com/articles/NhA0RXSz/
        http://www.cplusplus.com/reference/algorithm/sort/
        https://stackoverflow.com/questions/1840121/which-type-of-sorting-is-used-in-the-stdsort
        https://stackoverflow.com/questions/5038895/does-stdsort-implement-quicksort

Leave a Comment

Put C++ code inside [code][/code] tags to use the syntax highlighter