Search

6.4 — Sorting an array using selection sort

A case for sorting

Sorting an array is the process of arranging all of the elements in the array in a particular order. There are many different cases in which sorting an array can be useful. For example, your email program generally displays emails in order of time received, because more recent emails are typically considered more relevant. When you go to your contact list, the names are typically in alphabetical order, because it’s easier to find the name you are looking for that way. Both of these presentations involve sorting data before presentation.

Sorting an array can make searching an array more efficient, not only for humans, but also for computers. For example, consider the case where we want to know whether a name appears in a list of names. In order to see whether a name was on the list, we’d have to check every element in the array to see if the name appears. For an array with many elements, searching through them all can be expensive.

However, now assume our array of names is sorted alphabetically. In this case, we only need to search up to the point where we encounter a name that is alphabetically greater than the one we are looking for. At that point, if we haven’t found the name, we know it doesn’t exist in the rest of the array, because all of the names we haven’t looked at in the array are guaranteed to be alphabetically greater!

It turns out that there are even better algorithms to search sorted arrays. Using a simple algorithm, we can search a sorted array containing 1,000,000 elements using only 20 comparisons! The downside is, of course, that sorting an array is comparatively expensive, and it often isn’t worth sorting an array in order to make searching fast unless you’re going to be searching it many times.

In some cases, sorting an array can make searching unnecessary. Consider another example where we want to find the best test score. If the array is unsorted, we have to look through every element in the array to find the greatest test score. If the list is sorted, the best test score will be in the first or last position (depending on whether we sorted in ascending or descending order), so we don’t need to search at all!

How sorting works

Sorting is generally performed by repeatedly comparing pairs of array elements, and swapping them if they meet some predefined criteria. The order in which these elements are compared differs depending on which sorting algorithm is used. The criteria depends on how the list will be sorted (e.g. in ascending or descending order).

To swap two elements, we can use the std::swap() function from the C++ standard library, which is defined in the algorithm header. For efficiency reasons, std::swap() was moved to the utility header in C++11.

This program prints:

Before swap: x = 2, y = 4
After swap:  x = 4, y = 2

Note that after the swap, the values of x and y have been interchanged!

Selection sort

There are many ways to sort an array. Selection sort is probably the easiest sort to understand, which makes it a good candidate for teaching even though it is one of the slower sorts.

Selection sort performs the following steps to sort an array from smallest to largest:
1) Starting at array index 0, search the entire array to find the smallest value
2) Swap the smallest value found in the array with the value at index 0
3) Repeat steps 1 & 2 starting from the next index

In other words, we’re going to find the smallest element in the array, and swap it into the first position. Then we’re going to find the next smallest element, and swap it into the second position. This process will be repeated until we run out of elements.

Here is an example of this algorithm working on 5 elements. Let’s start with a sample array:

{ 30, 50, 20, 10, 40 }

First, we find the smallest element, starting from index 0:

{ 30, 50, 20, 10, 40 }

We then swap this with the element at index 0:

{ 10, 50, 20, 30, 40 }

Now that the first element is sorted, we can ignore it. Now, we find the smallest element, starting from index 1:

{ 10, 50, 20, 30, 40 }

And swap it with the element in index 1:

{ 10, 20, 50, 30, 40 }

Now we can ignore the first two elements. Find the smallest element starting at index 2:

{ 10, 20, 50, 30, 40 }

And swap it with the element in index 2:

{ 10, 20, 30, 50, 40 }

Find the smallest element starting at index 3:

{ 10, 20, 30, 50, 40 }

And swap it with the element in index 3:

{ 10, 20, 30, 40, 50 }

Finally, find the smallest element starting at index 4:

{ 10, 20, 30, 40, 50 }

And swap it with the element in index 4 (which doesn’t do anything):

{ 10, 20, 30, 40 50 }

Done!

{ 10, 20, 30, 40, 50 }

Note that the last comparison will always be with itself (which is redundant), so we can actually stop 1 element before the end of the array.

Selection sort in C++

Here’s how this algorithm is implemented in C++:

The most confusing part of this algorithm is the loop inside of another loop (called a nested loop). The outside loop (startIndex) iterates through each element one by one. For each iteration of the outer loop, the inner loop (currentIndex) is used to find the smallest element in the remaining array (starting from startIndex+1). smallestIndex keeps track of the index of the smallest element found by the inner loop. Then smallestIndex is swapped with startIndex. Finally, the outer loop (startIndex) advances one element, and the process is repeated.

Hint: If you’re having trouble figuring out how the above program works, it can be helpful to work through a sample case on a piece of paper. Write the starting (unsorted) array elements horizontally at the top of the paper. Draw arrows indicating which elements startIndex, currentIndex, and smallestIndex are indexing. Manually trace through the program and redraw the arrows as the indices change. For each iteration of the outer loop, start a new line showing the current state of the array.

Sorting names works using the same algorithm. Just change the array type from int to std::string, and initialize with the appropriate values.

std::sort

Because sorting arrays is so common, the C++ standard library includes a sorting function named std::sort. std::sort lives in the <algorithm> header, and can be invoked on an array like so:

By default, std::sort sorts in ascending order using operator< to compare pairs of elements and swapping them if necessary (much like our selection sort example does above).

We’ll talk more about std::sort in a future chapter.

Quiz time

Question #1


Manually show how selection sort works on the following array: { 30, 60, 20, 50, 40, 10 }. Show the array after each swap that takes place.

Show Solution

Question #2


Rewrite the selection sort code above to sort in descending order (largest numbers first). Although this may seem complex, it is actually surprisingly simple.

Show Solution

Question #3


This one is going to be difficult, so put your game face on.

Another simple sort is called “bubble sort”. Bubble sort works by comparing adjacent pairs of elements, and swapping them if the criteria is met, so that elements “bubble” to the end of the array. Although there are quite a few ways to optimize bubble sort, in this quiz we’ll stick with the unoptimized version here because it’s simplest.

Unoptimized bubble sort performs the following steps to sort an array from smallest to largest:
A) Compare array element 0 with array element 1. If element 0 is larger, swap it with element 1.
B) Now do the same for elements 1 and 2, and every subsequent pair of elements until you hit the end of the array. At this point, the last element in the array will be sorted.
C) Repeat the first two steps again until the array is sorted.

Write code that bubble sorts the following array according to the rules above:

Print the sorted array elements at the end of your program.

Hint: If we’re able to sort one element per iteration, that means we’ll need to iterate roughly as many times as there are numbers in our array to guarantee that the whole array is sorted.
Hint: When comparing pairs of elements, be careful of your array’s range.

Show Solution

Question #4


Add two optimizations to the bubble sort algorithm you wrote in the previous quiz question:

  • Notice how with each iteration of bubble sort, the biggest number remaining gets bubbled to the end of the array. After the first iteration, the last array element is sorted. After the second iteration, the second to last array element is sorted too. And so on… With each iteration, we don’t need to recheck elements that we know are already sorted. Change your loop to not re-check elements that are already sorted.
  • If we go through an entire iteration without doing a swap, then we know the array must already be sorted. Implement a check to determine whether any swaps were made this iteration, and if not, terminate the loop early. If the loop was terminated early, print on which iteration the sort ended early.

Your output should match this:

Early termination on iteration 6
1 2 3 4 5 6 7 8 9

Show Solution


6.5 -- Multidimensional Arrays
Index
6.3 -- Arrays and loops

456 comments to 6.4 — Sorting an array using selection sort

  • Shady

    That's what i came up with. Can i get some feedback for it please? :)

  • Abhishek A Udupa

    Here's the bubble sort code I wrote:

    Could you please check and tell me if this could be done any better?

    • nascardriver

      - Initialize variables with brace initialization for higher type safety.
      - Line 6 is `true` when it's evaluated the first time. Use a do-while-loop instead.
      - Use single quotation marks for characters (' ' vs " "). Strings are more expensive.
      - If you don't need an index, use a range-based for-loop (Line 16).

  • Ayrton Fithiadi Sedjati

    I have an objection regarding the hint in the quiz no. 3 saying, "...we’ll need to iterate as many times as there are numbers in our array to guarantee that the whole array is sorted."

    I found that it is inconsistent with your solution. We have 9 numbers in our array, but your first for loop:

    suggests that the number of required iteration is 8 (starting with zero and ending *before* "length-1" which means 7).

    I believe that your code is correct but your hint is not. The number of required iteration should be "one less than the number of elements in your array". We can show that for an array with 4 elements, in the worst case scenario (sorted the opposite way), it would require no more than 3 iterations to sort using the bubble sort algorithm:

    initial state:             4 3 2 1
    1st iteration result:  3 2 1 4
    2nd iteration result: 2 1 3 4
    3rd iteration result: 1 2 3 4

    I think this holds true for any number of elements in the array.

    • nascardriver

      I agree, the hint is wrong. Though, I'll keep it like that, because writing one less than the number of elements is too complicated to write in a short and easy to understand way, and it's too specific. Being too specific will cause readers to try to come up with exactly this number of iterations, which is not required. Anyway, thank you for your suggestion and example!

  • paprika

    You may insert this dynamic picture to make the description of selection sort more clearly. :D
    [https://images2017.cnblogs.com/blog/849589/201710/849589-20171015224719590-1433219824.gif]

  • Constantine

    Hi! I wanted to get size of an array inside of a function, but, I ran onto some issues that I can't understand by myself. Run the code below:

    So, in output, everything seems to be fine when in main(), but, fun begins when in printLength(); where only the size of an element is "measured" correctly.

    • nascardriver

      When you pass an array to a function, it will decay to a pointer and lose its size information (We talk about this later in this chapter).
      For the time being, you can pass the size of the array as a separate argument to the function that uses the array.

  • Panda23

    So this is the code for the program to sort an array in ascending format without messing with the original array. So I used an array of pointers. And the code worked fine at first. But then I introduced the lines in the main function (which I have commented out for you to see) that compile successfully but give a run time error on Visual studio compiler. The error says "Debug Assertation failed". What am I doing wrong?

    • nascardriver

      You're calling `delete[]` on an object that wasn't dynamically allocated. `array_of_ptr[i]` points to `arr[i]`, which has automatic storage duration, ie. you can't free it manually.

      - Initialize variables with brace initialization for higher type safety.
      - Use ++prefix. postfix++ is slower.
      - Use single quotation marks for characters, they're faster.

      • Panda23

        Oh so I should just do :

        delete[]array_of_ptr;

        Got it.

        One more question:Project vs Solution

        I have searched on stack overflow and various other sites....but I cannot quite grasp the difference between a Project and a Solution in the Visual Studio IDE.

        What I understand so far:

        -> Solution is a container of sorts that can contain multiple Projects and multiple assemblies dont have to be generated. But what does this actually mean?

        -> Its all a bit confusing because, though I use Visual Studio daily, I still have a very basic idea of a project as a combination of header files, cpp files and executable files.

        ->So what does the statement "multiple projects in one solution" means in light of this? Do some projects use files and codes from other projects when they are from one solution?

        Something tells me there are more details to these. Please explain and/or point me to a place where I can learn more about these things! I would be really grateful!

  • spazzlo

    I'm trying to have some code mix up the array so I can sort it again, but it doesn't seem to be mixing anything up!

    What I expect it to output would be a random list of numbers, like 2, 4, 3, 1, 0
    But what actually prints out is 0, 1, 2, 3, 4

  • sherrigan

    Hello hope you are having a good day!
    So I am writing a sort function which should push all the 0s present in an array to the right of the array, also preserving the order of the other elements of the array that are not zero. My array is:

    The /**/ indicate the place where my logic seems to breaking because the output produced is:

    Array Elements:
    1,1,1,2,2,1,2,1,2,2,1,2,0,1,0,0,0,0,0,0,0,0,0,0,0,0,

    I cannot figure out why?

    My code:

    • nascardriver

      Hello!

      - Initialize variables with brace initialization for higher type safety (Lesson 1.4).
      - Use `std::size` to get an array's size (Lesson 6.2).
      - Use your editor's auto-formatter.
      - Name variables descriptively. Avoid abbreviations.
      - Use ++prefix unless you need postfix++, it's faster than postfix++.

      Change your array to `{ 0, 0, 1 }`. That should give you an idea of what's going wrong.

      • sherrigan

        Thanks! Solved the problem!
        Any tips on how to come up with potential test cases - for problems related to array sortings and arragments - that have the chance of exposing any shortcoming in my logic that I might have?I

        • nascardriver

          Look at what the algorithm is supposed to do, cover its edge cases. In this example, 1 and 2 element arrays, zeroes at the beginning and end. Then add some cases that have no reason to fail, zeroes in the middle and a random array.

  • kitabski

    Good day!!!
    I will really appriciate if someone could clarify one point...

    Here is the code:

    Once initializing int "outer" in the first loop to 0 than within the brackets it should immediately become 1,
    but it remains 0 till the outer loops starts running the second time.
    What have i missed?

    • nascardriver

      The iteration expression is executed _after_ every cycle.

      • kitabski

        Thanks a lot, nascardriver!!!
        Just to clarify...
        Does it mean after closing brace brackets?

        • nascardriver

          `i` doesn't exist after the for-loop. I suggest you re-read lesson L.5.7.

          • kitabski

            Thank you a lot, nascardriver!!!!
            As it turned out i just had to read more carefully:

            First, we declare a loop variable named count, and assign it the value 0.

            Second, count < 10 is evaluated, and since count is 0, 0 < 10 evaluates to true. Consequently, the statement executes, which prints 0.

            Third, ++count is evaluated, which increments count to 1. Then the loop goes back to the second step.

            Thanks again for the best tutorial and immediate help!!!
            You guys are awesome!!!

  • Sinethemba

    Hi. I have managed to solve question 4 as in below. Please feel free to comment, your input would be highly appreciated.

    • nascardriver

      You don't care how many swaps were made, so you don't need to count them. Setting `swapped` to `true` when you swap is enough. Then you can check `swapped` _after_ the inner loop.

  • kavin

    Hi, i was able to solve the 1st 3 quizzes without problems. But for 4th quiz i was not able to figure out how to implement bool. So, i checked with your solution and implemented bool to printout early termination for my code. Is there any other way to implement early termination ? Btw, is my solution style ok or ?  

    • nascardriver

      Hi!

      Instead of `swap`, you could use a variable that counts how many swaps where made and check if that variable is greater than 0 at the end. This isn't recommended, but several other people did this so it might be easier to understand. If you look through the comments, you'll find it.

      Your code looks great :) There's an extra space in- and above your array declaration and a missing space at your include. Your auto-formatter would have fixed those.

      • kavin

        Thank you @nascardriver. I am using auto format by edit-->advanced-->format document [or] selection all and using edit-->advanced-->format selection. It formats all code except #include(s) and spacing between statements. Which setting should i change ?

        • nascardriver

          I don't have VS, I can't help you there. Seems odd that the formatter doesn't add/remove spaces. Look through the settings and google.

  • Suyash

    Here is my complete implementation of the Bubble Sort algorithm (after implementing both the optimizations)... Works as expected but as usual, I would love to hear from you about any unintentional mistakes...

    (This exercise reminded me of my algorithms course in college... Good times...)

    In main.cpp (Compiler is C++17 compatible)

Leave a Comment

Put all code inside code tags: [code]your code here[/code]