Search

6.4 — Sorting an array using selection sort

A case for sorting

Sorting an array is the process of arranging all of the elements in the array in a particular order. There are many different cases in which sorting an array can be useful. For example, your email program generally displays emails in order of time received, because more recent emails are typically considered more relevant. When you go to your contact list, the names are typically in alphabetical order, because it’s easier to find the name you are looking for that way. Both of these presentations involve sorting data before presentation.

Sorting an array can make searching an array more efficient, not only for humans, but also for computers. For example, consider the case where we want to know whether a name appears in a list of names. In order to see whether a name was on the list, we’d have to check every element in the array to see if the name appears. For an array with many elements, searching through them all can be expensive.

However, now assume our array of names is sorted alphabetically. In this case, we only need to search up to the point where we encounter a name that is alphabetically greater than the one we are looking for. At that point, if we haven’t found the name, we know it doesn’t exist in the rest of the array, because all of the names we haven’t looked at in the array are guaranteed to be alphabetically greater!

It turns out that there are even better algorithms to search sorted arrays. Using a simple algorithm, we can search a sorted array containing 1,000,000 elements using only 20 comparisons! The downside is, of course, that sorting an array is comparatively expensive, and it often isn’t worth sorting an array in order to make searching fast unless you’re going to be searching it many times.

In some cases, sorting an array can make searching unnecessary. Consider another example where we want to find the best test score. If the array is unsorted, we have to look through every element in the array to find the greatest test score. If the list is sorted, the best test score will be in the first or last position (depending on whether we sorted in ascending or descending order), so we don’t need to search at all!

How sorting works

Sorting is generally performed by repeatedly comparing pairs of array elements, and swapping them if they meet some predefined criteria. The order in which these elements are compared differs depending on which sorting algorithm is used. The criteria depends on how the list will be sorted (e.g. in ascending or descending order).

To swap two elements, we can use the std::swap() function from the C++ standard library, which is defined in the algorithm header. For efficiency reasons, std::swap() was moved to the utility header in C++11.

This program prints:

Before swap: x = 2, y = 4
After swap:  x = 4, y = 2

Note that after the swap, the values of x and y have been interchanged!

Selection sort

There are many ways to sort an array. Selection sort is probably the easiest sort to understand, which makes it a good candidate for teaching even though it is one of the slower sorts.

Selection sort performs the following steps to sort an array from smallest to largest:
1) Starting at array index 0, search the entire array to find the smallest value
2) Swap the smallest value found in the array with the value at index 0
3) Repeat steps 1 & 2 starting from the next index

In other words, we’re going to find the smallest element in the array, and swap it into the first position. Then we’re going to find the next smallest element, and swap it into the second position. This process will be repeated until we run out of elements.

Here is an example of this algorithm working on 5 elements. Let’s start with a sample array:

{ 30, 50, 20, 10, 40 }

First, we find the smallest element, starting from index 0:

{ 30, 50, 20, 10, 40 }

We then swap this with the element at index 0:

{ 10, 50, 20, 30, 40 }

Now that the first element is sorted, we can ignore it. Now, we find the smallest element, starting from index 1:

{ 10, 50, 20, 30, 40 }

And swap it with the element in index 1:

{ 10, 20, 50, 30, 40 }

Now we can ignore the first two elements. Find the smallest element starting at index 2:

{ 10, 20, 50, 30, 40 }

And swap it with the element in index 2:

{ 10, 20, 30, 50, 40 }

Find the smallest element starting at index 3:

{ 10, 20, 30, 50, 40 }

And swap it with the element in index 3:

{ 10, 20, 30, 40, 50 }

Finally, find the smallest element starting at index 4:

{ 10, 20, 30, 40, 50 }

And swap it with the element in index 4 (which doesn’t do anything):

{ 10, 20, 30, 40 50 }

Done!

{ 10, 20, 30, 40, 50 }

Note that the last comparison will always be with itself (which is redundant), so we can actually stop 1 element before the end of the array.

Selection sort in C++

Here’s how this algorithm is implemented in C++:

The most confusing part of this algorithm is the loop inside of another loop (called a nested loop). The outside loop (startIndex) iterates through each element one by one. For each iteration of the outer loop, the inner loop (currentIndex) is used to find the smallest element in the remaining array (starting from startIndex+1). smallestIndex keeps track of the index of the smallest element found by the inner loop. Then smallestIndex is swapped with startIndex. Finally, the outer loop (startIndex) advances one element, and the process is repeated.

Hint: If you’re having trouble figuring out how the above program works, it can be helpful to work through a sample case on a piece of paper. Write the starting (unsorted) array elements horizontally at the top of the paper. Draw arrows indicating which elements startIndex, currentIndex, and smallestIndex are indexing. Manually trace through the program and redraw the arrows as the indices change. For each iteration of the outer loop, start a new line showing the current state of the array.

Sorting names works using the same algorithm. Just change the array type from int to std::string, and initialize with the appropriate values.

std::sort

Because sorting arrays is so common, the C++ standard library includes a sorting function named std::sort. std::sort lives in the <algorithm> header, and can be invoked on an array like so:

By default, std::sort sorts in ascending order using operator< to compare pairs of elements and swapping them if necessary (much like our selection sort example does above).

We’ll talk more about std::sort in a future chapter.

Quiz

1) Manually show how selection sort works on the following array: { 30, 60, 20, 50, 40, 10 }. Show the array after each swap that takes place.

2) Rewrite the selection sort code above to sort in descending order (largest numbers first). Although this may seem complex, it is actually surprisingly simple.

3) This one is going to be difficult, so put your game face on.

Another simple sort is called “bubble sort”. Bubble sort works by comparing adjacent pairs of elements, and swapping them if the criteria is met, so that elements “bubble” to the end of the array. Although there are quite a few ways to optimize bubble sort, in this quiz we’ll stick with the unoptimized version here because it’s simplest.

Unoptimized bubble sort performs the following steps to sort an array from smallest to largest:
A) Compare array element 0 with array element 1. If element 0 is larger, swap it with element 1.
B) Now do the same for elements 1 and 2, and every subsequent pair of elements until you hit the end of the array. At this point, the last element in the array will be sorted.
C) Repeat the first two steps again until the array is sorted.

Write code that bubble sorts the following array according to the rules above:

Print the sorted array elements at the end of your program.

Hint: If we’re able to sort one element per iteration, that means we’ll need to iterate as many times as there are numbers in our array to guarantee that the whole array is sorted.
Hint: When comparing pairs of elements, be careful of your array’s range.

4) Add two optimizations to the bubble sort algorithm you wrote in the previous quiz question:

  • Notice how with each iteration of bubble sort, the biggest number remaining gets bubbled to the end of the array. After the first iteration, the last array element is sorted. After the second iteration, the second to last array element is sorted too. And so on… With each iteration, we don’t need to recheck elements that we know are already sorted. Change your loop to not re-check elements that are already sorted.
  • If we go through an entire iteration without doing a swap, then we know the array must already be sorted. Implement a check to determine whether any swaps were made this iteration, and if not, terminate the loop early. If the loop was terminated early, print on which iteration the sort ended early.

Your output should match this:

Early termination on iteration 6
1 2 3 4 5 6 7 8 9

Quiz solutions

1) Show Solution

2) Show Solution

3) Show Solution

4) Show Solution

6.5 -- Multidimensional Arrays
Index
6.3 -- Arrays and loops

407 comments to 6.4 — Sorting an array using selection sort

  • hellmet

    Hmm... I noticed that std::swap is able to change the values of x, y. Is it safe to assume that the function is not working 'by value', rather working on the 'pointer' as alluded in the previous Arrays chapter?

  • Muirgheal Wen

    Hi Alex and nascardriver,
    Would you mind looking over this code? I tested it with a few arrays and it seems to work, although I'm not sure about edge cases. I did it slightly differently than the optimized solution above. I think it should be the same logically but I'm not sure.

    • My comments about your last submission apply to this code too. Other than that:
      - Use single quotation marks for characters.
      - You can move line 46 into the `if` and change it to `isSorted = true`. That's faster and easier to read.
      - Don't use `std::endl` unless you need the stream to be flushed.
      - If your program prints anything, the last thing it prints should be a line feed.

      > for some reason it doesn't work when i pass the array into this function
      The array is decaying to a pointer. Your compiler should've printed a warning. Use `std::size` instead of `sizeof` to get an array's size.

  • Parsa

    I really couldn't come up with a solution for question 3 and the solution was hard to understand.

    Is that normal?

    • Cypeace

      As a beginner I try not to worry too much if I can't figure out some problem. I give it couple of tries and if I'm still unable to resolve it, I just look at the solution and try to understand it from there.
      For example I couldn't solve the 4) and after 2 days of thinking I just looked at the solution and was like "oh, I would never think of that" :-D
      But now I might think of it next time ;-)
      Don't be too hard on yourself, it all comes with time and practice. And even professionals have to ask sometimes..

  • Pat Hardikar

    Hi Alex and Nascardriver, something very weird is going and i cant find why.

    On the early termination line using std::cout, i can write

    ```
    int iteration = itr + 1;
    std::cout << "Early termination on iteration " << iteration << std::endl;
    ```

    but i cant write:

    ```
    std::cout << "Early termination on iteration " << itr + 1 << std::endl;
    ```

    interestingly, this works:
    ```
    std::cout << "Early termination on iteration " << itr + 1.0 << std::endl;
    ```
    Not sure why, since itr is `int` datatype.

    ```
    #include <iostream>
    #include <algorithm>

    void printArray(int array[], int length)
    {
        for (auto itr = 0; itr < length; itr++)
        {
            std::cout << array[itr] << " " ;
        }

        std::cout << "\n";
    }
    int main()
    {

        const int length(9);
        int array[length] = {6, 3, 2, 9, 7, 1, 5, 4, 8};

        printArray(array, length);
        
        bool swapped = false;

        for (int itr = 0; itr < length; itr++)
        {
            swapped = false;
            for (int index = 0; index < length; index++)
            {
                if (array[index] > array[index + 1])
                {
                    std::swap(array[index], array[index + 1]);
                    swapped = true;
                }
                
            }

            if (!swapped && itr < length - 1)
            {
                int iteration = itr + 1;
                // This works, but directly using itr + 1 fails. why???
                std::cout << "Early termination on iteration " << iteration << std::endl;
                break;
            }

        }

        printArray(array, length);
        return 0;
    }
    ```

  • Hi Alex!

    The `std::sort` paragraph

    • Alex

      Thanks for the suggestion! I integrated std::begin and std::end because those are C++11 functions, but am not quite ready to adopt a C++20 only solution for the array length yet. :)

  • Avijit Pandey

    Hi, thanks again for the great tutorials.
    In question 4, I kept track of how many times to iterate slightly differently, I wanna ask if it's okay to do so.
    Please point out any mistakes that I'm not noticing too.

    • - Use `false`/`true` for `bool`. Not 0.
      - Only iterate over 1 variable for better readability.
      - Use single quotation marks for characters.
      - If your program prints anything, the last thing it prints should be a line feed '\n'.

      It's ok, but it's better the way Alex did it. If two variables carry the same information (Which is the case for `currentIndex` and `iterationCount`), their calculations should be based upon one or the other. If something else modified `currentIndex` in your code, the 2 variable would get out of sync, potentially causing unwanted behavior.

  • Grzego

    Hey all,

    Could I ask you to look through my code for the exercise 4?
    It does its job properly, but I would appreciate feedback on the code quality.
    I know I could move some of the functionality to other functions than main, but since it fits on one screen, I did not feel it was overgrown.
    The header has all the includes, below is the main file:

    • Hi!

      - Initialize your variables with brace initializers.
      - Only iterate over one variable. Declare `tempLength` inside the loop.
      - Line 9, 17: If you're using the initial value of a variable, explicitly initialize it.
      - `swapCount` can be replaced by a bool. Increasing an integer is more expensive than setting a bool to true. Using a bool expresses what you're trying to do, making your code more readable.
      - If your program prints anything, the last thing it prints should be a line feed ('\n').

      > I know I could move some of the functionality to other functions than main, but since it fits on one screen, I did not feel it was overgrown.
      You could use your screen in portrait and on the smallest font size with this argument. You can't break this code up in too many functions, but the sorting itself should happen in its own function to increase reusability.

      • Grego

        Thanks for the reply!

        >Initialize your variables with brace initializers.
        - Am I missing something? I looked through the code line by line a couple times, but I cannot find what you mean here.

        >You could use your screen in portrait and on the smallest font size with this argument.
        >But the sorting itself should happen in its own function to increase reusability.
        - Good Point, I should focus on reusability instead of other factors when thinking about functions.

        Thanks again, I appreciate you!

        • > Am I missing something?
          Line 4, 7, 11, 35. If there's an equals sign in your initialization, you're doing it wrong.

  • Denis

    Hi! Check it out my version #4 =D
    Thank you!

  • Andrey

    Hi, first of all thanks for this beautiful educational website! Helps a lot studying C++ on my own. Any mistakes in my code for 4th assignment?

    Thank you!!

    • Hi!

      No mistakes, but room for improvement:

      - Line 7: Initialize your variables with brace initialization.
      - Line 10, 19: If you're using the initial value of a variable, explicitly initialize it.
      - "flag" is a poor name. "swapped" or "shouldContinue" are much more helpful.
      - If your program prints anything, the last thing it prints should be a line feed ('\n').

  • potterman28wxcv

    Here is my code for 4), just in case anyone is curious

    • Hello there!

      - Line 5 and 7 should be merged into `bool swapped{ false };`.
      - Use ++prefix unless you need postfix++. postfix++ can lead to huge amounts of wasted resources due to unnecessary copied being created.
      - Line 27 and 28 should use brace initialization. Alex is to blame for this, but you're posting the code so I'll treat it as yours.
      - Line 31-37 should be in a separate function. You'll never use `bubble_step` without repeating these lines.

      Assuming you're still using camel case for variables, your naming is now fine :)

      • potterman28wxcv

        Hello, thanks again for your feedback!

        > - Use ++prefix unless you need postfix++. postfix++ can lead to huge amounts of wasted resources due to unnecessary copied being created.

        You can disregard my previous question earlier on - I would think that, after compilation, this doesn't have any impact. Indeed, this temporary is never used, so I imagine it could be discarded during some dead code analysis pass.

        I did some experiment on Godbolt, I wrote two versions of some piece of C++ code using either ++a or a++. In both -O0 and -O3, on GCC, it seems to have no influence on the generated assembly.

        https://godbolt.org/z/7Slt4I

        But I guess there could be a performance impact if operator++ is overloaded?

        • > But I guess there could be a performance impact if operator++ is overloaded?
          Exactly. For fundamental types, the compiler is probably able to optimize postfix++ to ++prefix.
          Since you should use ++prefix for all non-fundamental types, you might as well use it for fundamental types.

  • TheFutureCoder

    Hi teacher, can you give me some suggestion on my code?

    • - Initialize your variables with brace initializers.
      - Limit your lines to 80 characters in length for better readability on small displays.
      - Don't compare booleans to false/true.
      - Use ++prefix unless you need postfix++.
      - Use single quotation marks for characters (' ' instead of " ").
      - `main`: Missing return-statement.
      - Unnecessary forward declaration of `printArray`.
      - If your program prints anything, the last thing it prints should be a line feed ('\n' or `std::endl`).

  • Anastasia

    3, 4.
    It was quite a challenge to try and implement a sorting algorithm, a very interesting task too. This one seems to be working:

    • Hi!

      - Line 16: This may or may not work. See lesson O.3.3. Enable compiler warnings.
      - Declare `sorted` inside the loop so you don't have to reset it in every cycle.
      - `index` should be initialized to `((length - 1) - count)`. See the paragraph about optimizations.
      - Inconsistent formatting. Use your editor's auto-formatting feature.

      • Anastasia

        Hi nascardriver!

        I guess my sort is too messy to try to fix it, but still.

        - Line 16: This may or may not work. See lesson O.3.3. Enable compiler warnings.
        Is it better now? Not modifying the same variable in one statement anymore.

        - Declare `sorted` inside the loop so you don't have to reset it in every cycle.
        Where exactly? Couldn't figure out that.

        - `index` should be initialized to `((length - 1) - count)`. See the paragraph about optimizations.
        Hm, I've tried to optimize it so that by the time `count` is decreased by 1 that element is already sorted and not touched anymore. Can you explain in a beginner-friendly way what exactly did I do wrong, please?

        - Inconsistent formatting. Use your editor's auto-formatting feature.
        Sorry again for that, this course is an opportunity for me to learn vim too. I'll try to use another editor for the lessons for now.

        • > Is it better now?
          Yep, that fixed it.

          > Where exactly?
          Sorry, I missed one line in your code. It's fine where it is.

          > Can you explain in a beginner-friendly way what exactly did I do wrong, please?
          Sorry again, I was thinking the front was sorted, but it's the back.

          > learn vim
          Ha! I just started too. There are auto-formatting plugins for vim. You don't need one as long as you're consistent. Auto-formatting definitely saves time though.

          • Anastasia

            When writing something for the first time the code goes through a lot of modifications sometimes resulting in a total mess, having a decent auto-format would certainly help. I've tried to format code with clang, but didn't like the result much (it's more than likely that I just didn't set it up properly), need to look into other options or find a friendlier secondary editor untill I'm more comfortable with vim.

            Anyway, thank you for looking into the code. I guess I'll leave it be as is then (with the line 16 fixed, thanks for pointing that out).

        • YSC

          and also the 8th line

          should be

          because bubble sort only takes (length - 1) to finish

  • Merlin

    My dear teachers,
    what are your thougths about my code for 3)?

    Thanks with regards,
    Merlin

    • - Initialize your variables with brace initializers.
      - If your program prints anything, the last thing it prints should be a line feed ('\n').

      Rest looks good!

      • Nirbhay

        Hey!

        When should we use () and {} for initializations?

        Alex mentioned that {} doesn't drop the value after the decimal in a float value so using {} is better.

        Why did you say "Initialize your variables with brace initializers."?

        Thanks

        • > When should we use () and {} for initializations?
          Use {} unless the type you're initializing has a list constructor that you don't want to use. Since you don't know about list constructors yet, use {} unless it does something weird.

          > Why did you say "Initialize your variables with brace initializers."?
          Line 6, 9, 11, 13, 22. Use {}.

  • DangerousVanilla

    Is it true that the algorithm in quiz question 4 is actually completing the sort in 5 iterations? I think it was just a matter of wording that confused me at first.
    I added a couple visual prints to show me what the array looked like in each step, and where each iteration was ending. It seems to end after 5 iterations in my code below. I just wanted to make sure I hadn't broken something, and that when your code says "Early termination on iteration 6" that this means it's just stopping before iterating the 6th time. Thoughts?

    • You're breaking (line 31) before you get to print in line 33. If you moved line 33 above line 28, you'd get "End iteration 6." as well.

      • DangerousVanilla

        That makes sense. Is there somewhere better I should be putting the "step print loop" as I've called it here to actually see the output of the 6th iteration? As it stands, this is the entire output of the program (no step print after iteration 5):

        6 3 2 9 7 1 5 4 8
        3 6 2 9 7 1 5 4 8
        3 2 6 9 7 1 5 4 8
        3 2 6 7 9 1 5 4 8
        3 2 6 7 1 9 5 4 8
        3 2 6 7 1 5 9 4 8
        3 2 6 7 1 5 4 9 8
        End iteration 1.
        3 2 6 7 1 5 4 8 9
        2 3 6 7 1 5 4 8 9
        2 3 6 1 7 5 4 8 9
        2 3 6 1 5 7 4 8 9
        End iteration 2.
        2 3 6 1 5 4 7 8 9
        2 3 1 6 5 4 7 8 9
        2 3 1 5 6 4 7 8 9
        End iteration 3.
        2 3 1 5 4 6 7 8 9
        2 1 3 5 4 6 7 8 9
        End iteration 4.
        2 1 3 4 5 6 7 8 9
        End iteration 5.
        Done on iteration 6. Final result below.
        1 2 3 4 5 6 7 8 9

  • arcad

    Hi, how does this look like. To avoid to recheck values that are already ordered, I initialize my nested loop at one additional value than the previous for loop. This saved from going over things that gave been already ordered.

  • noobmaster

    Took me more than one hour just to solve the 3rd quiz..

  • Torraturd

    I did this for the fourth assignment with some help from looking at other solutions after getting a little stuck on how to go about it

    • - Limit your lines to 80 characters in length for better readability on small displays.
      - Line 17: You don't need to compare booleans to false/true, you can use `if (!isswapped)`.
      - Line 24: Should be ' '.

      Good job!

  • Michael

    I got this code for the 3rd quiz. Any comment?

    • - Initialize your variables with brace initializers.
      - `main`: Missing return-statement.
      - Missing printed trailing line feed (Print a line feed when your program is done).

      • Michael

        Thanks sir. And I'm having trouble figuring out how to make quiz 4 from my code.

        • You can start the inner loop at `length - (lastIndex + 1)`, that way the inner loop starts at 0, then at 1, then at 2, etc. You can do this because you know that the beginning of the array is already sorted.
          Add a `bool` that's false by default. When you swap 2 elements, set it to true. When the inner loop is done, you can check whether a swap has occurred. If not, the array is sorted.

  • OmegaX128

    Hi there! How does this look for assignment 3 & 4?

  • Alireza

    Hello,
    sorting is very hard to understand . Is there any easier way to ?

    • Alex

      If you're having trouble with sorting, it can help to run the sort algorithm yourself. Take a piece of paper, and put some random numbers in a horizontal line along the top.
      Then execute the algorithm by hand -- for each swap, add a new row below the preceding one with the current sort order.
      After doing that a couple of times, it should help you better understand what these algorithms are actually doing.

      • Alireza

        Thanks for helping,
        And is it good to use other sorting algorithm ? I don't know do they work exactly, though.
        Is it enough to just know selection-sort and not to know other algorithms (e.g. bubble-sort, binary-sort, Linear-sort, ...) ?

        • You don't need to know any sorting algorithm, because you can use @std::sort.
          If you want to write efficient programs (eg. for databases), you'll have to learn sorting algorithms and implement them yourself.

  • Dimbo1911

    Also this (fourth) :)

  • Dimbo1911

    Third assingment

Leave a Comment

Put all code inside code tags: [code]your code here[/code]