Search

10.25 — Introduction to standard library algorithms

New programmers typically spend a lot of time writing custom loops to perform relatively simple tasks, such as sorting or counting or searching arrays. These loops can be problematic, both in terms of how easy it is to make an error, and in terms of overall maintainability, as loops can be hard to understand.

Because searching, counting, and sorting are such common operations to do, the C++ standard library comes with a bunch of functions to do these things in just a few lines of code. Additionally, these standard library functions come pre-tested, are efficient, work on a variety of different container types, and many support parallelization (the ability to devote multiple CPU threads to the same task in order to complete it faster).

The functionality provided in the algorithms library generally fall into one of three categories:

  • Inspectors -- Used to view (but not modify) data in a container. Examples include searching and counting.
  • Mutators -- Used to modify data in a container. Examples include sorting and shuffling.
  • Facilitators -- Used to generate a result based on values of the data members. Examples include objects that multiply values, or objects that determine what order pairs of elements should be sorted in.

These algorithms live in the algorithms library. In this lesson, we’ll explore some of the more common algorithms -- but there are many more, and we encourage you to read through the linked reference to see everything that’s available!

Note: All of these make use of iterators, so if you’re not familiar with basic iterators, please review lesson 10.24 -- Introduction to iterators.

Using std::find to find an element by value

std::find searches for the first occurrence of a value in a container. std::find takes 3 parameters: an iterator to the starting element in the sequence, an iterator to the ending element in the sequence, and a value to search for. It returns an iterator pointing to the element (if it is found) or the end of the container (if the element is not found).

For example:

Sample run when the element is found

Enter a value to search for and replace with: 5 234
13 90 99 234 40 80

Sample run when the element isn’t found

Enter a value to search for and replace with: 0 234
Could not find 0
13 90 99 5 40 80

Using std::find_if to find an element that matches some condition

Sometimes we want to see if there is a value in a container that matches some condition (e.g. a string that contains a specific substring) rather than an exact value. In such cases, std::find_if is perfect. The std::find_if function works similarly to std::find, but instead of passing in a value to search for, we pass in a callable object, such as a function pointer (or a lambda, which we’ll cover later) that checks to see if a match is found. std::find_if will call this function for every element until a matching element is found (or no more elements remain in the container to check).

Here’s an example where we use std::find_if to check if any elements contain the substring “nut”:

Output

Found walnut

If you were to write the above example by hand, you’d need at least three loops (one to loop through the array, and two to match the substring). The standard library functions allow us to do the same thing in just a few lines of code!

Using std::count and std::count_if to count how many occurrences there are

std::count and std::count_if search for all occurrences of an element or an element fulfilling a condition.

In the following example, we’ll count how many elements contain the substring “nut”:

Output

Counted 2 nut(s)

Using std::sort to custom sort

We previously used std::sort to sort an array in ascending order, but std::sort can do more than that. There’s a version of std::sort that takes a function as its third parameter that allows us to sort however we like. The function takes two parameters to compare, and returns true if the first argument should be ordered before the second. By default, std::sort sorts the elements in ascending order.

Let’s use std::sort to sort an array in reverse order using a custom comparison function named greater:

Output

99 90 80 40 13 5

Once again, instead of writing our own custom loop functions, we can sort our array however we like in just a few lines of code!

Our greater function needs 2 arguments, but we’re not passing it any, so where do they come from? When we use a function without parentheses (), it’s only a function pointer, not a call. You might remember this from when we tried to print a function without parentheses and std::cout printed “1”. std::sort uses this pointer and calls the actual greater function with any 2 elements of the array. We don’t know which elements greater will be called with, because it’s not defined which sorting algorithm std::sort is using under the hood. We talk more about function pointers in a later chapter.

Tip

Because sorting in descending order is so common, C++ provides a custom type (named std::greater) for that too (which is part of the functional header). In the above example, we can replace:

with:

Note that the std::greater{} needs the curly braces because it is not a callable function. It’s a type, and in order to use it, we need to instantiate an object of that type. The curly braces instantiate an anonymous object of that type (which then gets passed as an argument to std::sort).

For advanced readers

To further explain how std::sort uses the comparison function, we’ll have to take a step back to a modified version of the selection sort example from lesson 10.4 -- Sorting an array using selection sort.

So far, this is nothing new and sort always sorts elements from low to high. To add a comparison function, we have to use a new type, std::function, to store a function that takes 2 int parameters and returns a bool. Treat this type as magic for now, we will explain it in chapter 11.

We can now pass a comparison function like greater to sort, but how does sort use it? All we need to do is replace the line

with

Now the caller of sort can choose how to compare two elements.

Using std::for_each to do something to all elements of a container

std::for_each takes a list as input and applies a custom function to every element. This is useful when we want to perform the same operation to every element in a list.

Here’s an example where we use std::for_each to double all the numbers in an array:

Output

2 4 6 8

This often seems like the most unnecessary algorithm to new developers, because equivalent code with a range-based for-loop is shorter and easier. But there are benefits to std::for_each. Let’s compare std::for_each to a range-based for-loop.

With std::for_each, our intentions are clear. Call doubleNumber with each element of arr. In the range-based for-loop, we have to add a new variable, i. This leads to several mistakes that a programmer could do when they’re tired or not paying attention. For one, there could be an implicit conversion if we don’t use auto. We could forget the ampersand, and doubleNumber wouldn’t affect the array. We could accidentally pass a variable other than i to doubleNumber. These mistakes cannot happen with std::for_each.

Additionally, std::for_each can skip elements at the beginning or end of a container, for example to skip the first element of arr, std::next can be used to advance begin to the next element.

This isn’t possible with a range-based for-loop.

Like many algorithms, std::for_each can be parallelized to achieve faster processing, making it better suited for large projects and big data than a range-based for-loop.

Order of execution

Note that most of the algorithms in the algorithms library do not guarantee a particular order of execution. For such algorithms, take care to ensure any functions you pass in do not assume a particular ordering, as the order of invocation may not be the same on every compiler.

The following algorithms do guarantee sequential execution: std::for_each, std::copy, std::copy_backward, std::move, and std::move_backward.

Best practice

Unless otherwise specified, do not assume that standard library algorithms will execute in a particular sequence. std::for_each, std::copy, std::copy_backward, std::move, and std::move_backward have sequential guarantees.

Ranges in C++20

Having to explicitly pass arr.begin() and arr.end() to every algorithm is a bit annoying. But fear not -- C++20 adds ranges, which allow us to simply pass arr. This will make our code even shorter and more readable.

Conclusion

The algorithms library has a ton of useful functionality that can make your code simpler and more robust. We only cover a small subset in this lesson, but because most of these functions work very similarly, once you know how a few work, you can make use of most of them.

Best practice

Favor using functions from the algorithms library over writing your own functionality to do the same thing


10.x -- Chapter 10 comprehensive quiz
Index
10.24 -- Introduction to iterators

112 comments to 10.25 — Introduction to standard library algorithms

  • Vincent W Vesser

    In the very first example why is there a * in front of found (i.e. *found) in the else statement. Shouldn't you just be able to assign found a new value like : found = replace;

  • Waldo Lemmer

    This lesson uses 2-space indentation.

    Section "Using std::sort to custom sort":
    >

    The parameters should be `const`, since (https://en.cppreference.com/w/cpp/algorithm/sort):
    > While the signature does not need to have const &, the function must not modify the objects passed to it

    `const` is thus safer.

    • nascardriver

      There's no code style-guide for learncpp. The code examples with 2 spaces were likely posted by me.

      Pass-by-value parameters can never be modified, `const` on a pass-by-value parameter has no effect concerning the caller.

      Adding `const` would only confuse the caller.

      If you separate functions into declaration and definition, `const` at the definition makes sense, because it prevents the author of the function from modifying the variables (like local variables), but the `const` isn't visible to the caller.

      • Waldo Lemmer

        > There's no code style-guide for learncpp

        Ok, I'll stop pointing code formatting stuff out

        > Pass-by-value parameters can never be modified

        Oh yeah, I didn't think of that xD

        > `const` at the definition makes sense [...]

        I appreciate the tip, thanks!

  • "So far, this is nothing new and sort always sorts elements from low to high. To add a comparison function, we have to use a new type, std::function, to store a function that takes 2 int parameters and returns a bool. Treat this type as magic for now, we will explain it in chapter 10."

    it is not explained in chapter 10

  • yeokaiwei

    Hi Alex,
    Is the C++ standard library sorting algorithm called Introsort?

    Does Introsort switch between quicksort and heapsort?

    Lastly, do you cover Radix sort?

    • nascardriver

      There is no "standard library sorting algorithm", standard libraries can implement `std::sort()` however they like as long as it is O(N*log(N)).
      libstdc++ (The GNU standard library, which is usually used on linux), does use introsort.
      It recursively partitions the input on a median-of-3 up to a depth of log2(input.size)*2 (But stops on partitions with less than 16 elements).
      Each partition (If it has at least 16 elements), is heap-sorted.
      Finally, the entire array is insertion-sorted.

      Radix-sort is not covered on learncpp.

      • yeokaiwei

        Thanks nascar.

        Got it, so std::sort chooses the optimal method for sorting based on the number of elements.

        Is Radix-sort for special cases only, like >1000000 elements?

        Is Radix-sort used in the Standard Library?

        • nascardriver

          Radix sort can be faster than libstdc++'s `std::sort()` at small amounts of elements already.
          I've benchmarked [cpu007's radix sort](https://github.com/cpu007/Fast-Radix-Sort) vs libstdc++'s `std::sort()` here https://quick-bench.com/q/e14xHKAJEvXfReO516VtmAQnU3A (Lower is faster)
          `std::sort()` is faster for <1000 elements, but radix sort has the upper hand past that.

          The standard library only has `std::sort()` (An arbitrary sorting algorithm), `std::stable_sort()` (A stable but still arbitrary algorithm), and `std::sort_heap()`.
          libstdc++ does not have a radix sort as far as I can tell.

          The standard library is meant to make everyone somewhat happy. If you need a special sorting algorithm because it's better in your use-case, you'll need to write it yourself or get it from somewhere that's not the standard library.

          I suppose radix sort is not used much because it is too restrictive, because elements need to be separable.

  • Brad

    Typo:
    "Now the caller of sort can chose how to compare two elements."
    -> chose should be choose.

  • frog

    Hello, thanks for the tutorials.

    I want to ask about the order of execution of algorithms in the algorithms library, let's take std::sort for example

    The output is like what I expected (using VS2019), the elements sorted in ascending order. So is the concern with this is that other compilers might invoke the ascending sort first and then the descending sort after?

    • Alex

      No, your program will always execute deterministically. When we talk about sequential execution, we mean that a given algorithm always executes in sequential order. For example, since std::for_each has a sequential guarantee, we know it will look at the first element first, the second element second, etc... Without that guarantee, it might have worked in reverse order, or in a randomized order.

  • Haldhar Patel

    So find_if only returns the first occurrence of something in a container.

    Hence if we apply find_if(for "nut") on below array we will get only peanut not both(peanut and walnut)

    But I want to print both the occurrence of "nut" So I change the program like this:

    Output: peanut walnut

    But as you can see findif is a custom function and it don't have the third parameter(function name), I tried passing it but it asks for type of function in findif implementation and I don't know what to do till chapter 9.

    maybe I will make it work later in this tutorials so that this custom function can be used globally like if it is a algorithm.

    Also I don't know if there is any algorithm which does the same thing I did but still I wanted to try :)

    • std::for_each will be perfect for this case.

    • nascardriver

      If you want to copy the found elements, you can use `std::copy_if()`

      The vector is expensive, it can be avoided if you don't need to store the found elements. They can be printed immediately

      Since C++20, you can also use `std::views::filter()` to avoid the copy

  • Forhad Rahman

    In the advanced reader section of Using std::sort to custom sort, it is said that - "Treat this type as magic for now, we will explain it in chapter 7."

    If I am not wrong(or lost), the chapter '7' would be something else? Because we are already in chapter 9.x, and chapter 7 has already been past so...

  • Chris

    Why

    bool containsNut(std::string_view str)

    instead of just

    bool containsNut(){
    std::string_view str
    }

    Why does it work inside std::find_if when it should usually just return error "too few arguments in function call" ?

    • nascardriver

      This `containsNut` is not a function call (Note the missing parentheses). We're passing the `containsNut` function to `std::find_if`, and `std::find_if` calls `containsNut` with each element of the array as an argument.

      This use of `containsNut` is a function pointer, it's covered in chapter 10.

  • Charles

    For the `std::for_each(std::next(arr.begin()), arr.end(), doubleNumber);` example, is there a drawback to using `std::for_each(arr.begin() + 1, arr.end(), doubleNumber);` as done in earlier examples in previous chapters for pointers?

    • nascardriver

      If `arr` is a `std::array`, either way works. However, for other container types, the iterator returned by `arr.begin()` might not support `operator+`. `std::next()` is compatible with more iterator types (eg. `std::list` or `std::map`).

  • chai

    Hi there ! How does a type work in the sort function above?

    Is there a chapter to visit?
    Thanks.

  • Ryan

    Found my answer - explained later in the article!

  • Albert Hendriks

    Thanks for this great tutorial. I've been patiently working through chapters 1-8 while looking forward to chapter 9 and I wasn't disappointed. Just one question about this notation:

        std::function<bool(int, int)> compare

    Coming from Java, I would expect something else instead of the above notation. I'd expect something like:

        std::function2<bool, int, int> compare

    A round bracket () inside triangular brackets <> seems to only make sense in this specific context of `std::function` (where else would you use () inside <> it if it's not for a function?), but I don't suppose this C++ syntax was created particularly for `std::function`?

    I'm thinking that perhaps `bool(int, int)` is a valid type in C++, but then why isn't `compare` of type `bool(int, int)` instead of type `std::function<bool(int, int)>`?

    • nascardriver

      Function pointers are covered in chapter 10, that should clear things up for you. If you have any questions after that, please ask again.

  • Oran

    at the std::sort section , right before the tip it says "We talk more about function pointers in chapter 7" and also in the advanced section before the line "void sort(int *begin, int *end, std::function<bool(int, int)> compare)"

    thanks for the great tutorial!

  • Fr1end

    in this line

    if (found == arr.end ())

    why not replace it with this

    if (*found)

  • yeokaiwei

    1. Query
    Sorry, may I ask what std::string_view::npos does?

    I'm at 7.16 but I just realised I'm not really certain what it does.

    Is npos an error message?

    What does npos stand for?

    I figured it out from a video on Youtube, npos stands for "NOT A POSITION".

    "As a return value, it is usually used to indicate no matches.

    This constant is defined with a value of -1, which because size_t is an unsigned integral type, it is the largest possible representable value for this type."

    Reference:
    https://www.cplusplus.com/reference/string/string/npos/

Leave a Comment

Put all code inside code tags: [code]your code here[/code]