
8.16 — Timing your code

When writing your code, sometimes you’ll run across cases where you’re not sure whether one method or another will be more performant. So how do you tell?

One easy way is to time your code to see how long it takes to run. C++11 comes with functionality in the chrono library to do just that. However, using the chrono library directly is a bit arcane. The good news is that we can easily encapsulate all the timing functionality we need into a class that we can then use in our own programs.

Here’s the class:
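A minimal Timer class along these lines can be built on std::chrono (a sketch; details such as member names and the clock chosen may differ from the original lesson):

```cpp
#include <chrono> // for std::chrono functions
#include <ratio>  // for std::ratio

class Timer
{
private:
    // steady_clock is monotonic, which makes it suitable for measuring intervals
    using Clock = std::chrono::steady_clock;
    using Second = std::chrono::duration<double, std::ratio<1>>;

    std::chrono::time_point<Clock> m_beg{ Clock::now() };

public:
    // restart the timer from now
    void reset()
    {
        m_beg = Clock::now();
    }

    // seconds elapsed since construction (or the last reset), as a double
    double elapsed() const
    {
        return std::chrono::duration_cast<Second>(Clock::now() - m_beg).count();
    }
};
```

Construction records the start time, so no explicit start call is needed.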

That’s it! To use it, we instantiate a Timer object at the top of our main function (or wherever we want to start timing), and then call the elapsed() member function whenever we want to know how long the program took to run to that point.

Now, let’s use this in an actual example where we sort an array of 10000 elements. First, let’s use the selection sort algorithm we developed in a previous chapter:

On the author’s machine, three runs produced timings of 0.0507, 0.0506, and 0.0498. So we can say around 0.05 seconds.

Now, let’s do the same test using std::sort from the standard library.

On the author’s machine, this produced results of 0.000693, 0.000692, and 0.000699. So basically right around 0.0007 seconds.

In other words, in this case, std::sort is roughly 70 times faster than the selection sort we wrote ourselves!

A few caveats about timing

Timing is straightforward, but your results can be significantly impacted by a number of things, and it’s important to be aware of what those things are.

First, make sure you’re using a release build target, not a debug build target. Debug build targets typically turn optimization off, and that optimization can have a significant impact on the results. For example, using a debug build target, running the above std::sort example on the author’s machine took 0.0235 seconds -- 33 times as long!

Second, your timing results will be influenced by other things your system may be doing in the background. For best results, make sure your system isn’t doing anything CPU or memory intensive (e.g. playing a game) or hard drive intensive (e.g. searching for a file or running an antivirus scan).

Third, measure at least 3 times. If the results are all similar, take the average. If one or two results are different, run the program a few more times until you get a better sense of which ones are outliers. Note that seemingly innocent things, like web browsers, can temporarily spike your CPU to 100% utilization when a site you have sitting in the background rotates in a new ad banner and has to parse a bunch of JavaScript. Running multiple times helps identify whether your initial run may have been impacted by such an event.

Fourth, when doing comparisons between two sets of code, be wary of what may change between runs that could impact timing. Your system may have kicked off an antivirus scan in the background, or maybe you’re streaming music now when you weren’t previously. Randomization can also impact timing. If we’d sorted an array filled with random numbers, the results could have been impacted by the randomization. Randomization can still be used, but ensure you use a fixed seed (e.g. don’t use the system clock) so the randomization is identical each run. Also, make sure you’re not timing waiting for user input, as how long the user takes to input something should not be part of your timing considerations.

Finally, note that results are only valid for your machine’s architecture, OS, compiler, and system specs. You may get different results on other systems that have different strengths and weaknesses.


13 comments to 8.16 — Timing your code

  • Peter Baum

1. In the Timer class, why not use uniform initialization?  If the argument is that you want it to compile regardless of the compiler used, that same argument suggests we shouldn't bother learning any of the new C++ capabilities and should just stick to the old formats.  Perhaps the way forward is to teach and use the new formats and make a separate note telling people how to make do if they aren't willing to obtain an up-to-date compiler.

    2. The code in main() is missing the units.  Because of the duration cast used, it should be

  • Peter Baum

    First sentence: "performant" isn't a word.  You want to talk about execution speed here.

  • J Gahr

    Hello! In Visual Studio 2017, I ran your selection sort example, and then replaced std::array<>() with array[] to see how their times compared:

    In release mode, the time taken by array[] was identical to std::array<>()'s time (roughly 0.12 seconds).

    In debug mode, however, using array[] resulted in a consistent time of 0.12 seconds, and when I used std::array<>() the time taken was always hovering around 2.1 seconds.

    I was wondering if this is a typical result that happens across most compilers and machines, or if this was specific to my compiler/machine. What do you guys think?

    • nascardriver

      Hi J!

      Debug mode is always slower, because compiler optimizations are usually disabled and your compiler might have added some extra checks.

  • Pierce

I'm assuming that if the timer outputs 0, then the timer doesn't have enough precision to actually hold the number of seconds it took.

    • Alex

      C++ doesn't require the time to have a specific resolution, so maybe yours simply doesn't have a good enough resolution. You can see your resolution as follows:

      This shows you how many ticks per second your clock has.

  • Sébastien

I was not able to run the code above although it compiles without any warning. std::chrono crashes the program even before it reaches the first line of main (I used gdb to discover that). I use a Win 10 x64 machine (Core i5-4200H).

  • Benjamin

    Some thoughts about this section:
std::sort deploys a better algorithm than yours. Although std::sort's underlying algorithm differs from compiler to compiler, the ISO C++ standard requires it to have an average complexity of O(n*log(n)). However, your implementation is a kind of selection sort, which has O(n^2) complexity in the average (also best and worst) case. Roughly comparing, with the constant assumed to be 1, the complexity ratio in this case may be (yours):(std) = 10^8:40000. I think that would be one explanation for why your implementation was slow, plus the compiler might have optimized the standard function itself.
Also, I think your array initialization was not good, as you rely on some garbage values which may not be garbage values on some machine, compiler, or even compilation mode. Thus, I think it is wise to fix your array initialization using a random function from <random>.
Finally, I would appreciate it if you publicized the name of the compiler used for the compilation, as it would help everyone compare the algorithms.

    • Alex

You're correct in that my selection sort algorithm is O(n^2), whereas std::sort is O(n log n). This should account for the bulk of the difference in timing.

My array initialization looks fine to me. I initialize elements 0 through length-1 with integer values length-1 through 0 (essentially, the array starts reverse sorted). Originally I was randomizing the array, but you have to be careful with that to ensure you get the same randomization across both runs -- that adds a bit of complexity that I thought muddled the example slightly.

      This was done on Visual Studio 2017. I'm not sure what algorithm it's using. It might be https://en.wikipedia.org/wiki/Introsort
