
5.10 — std::cin, extraction, and dealing with invalid text input

Most programs that have a user interface of some kind need to handle user input. In the programs that you have been writing, you have been using std::cin to ask the user to enter text input. Because text input is so free-form (the user can enter anything), it’s very easy for the user to enter input that is not expected.

As you write programs, you should always consider how users will (unintentionally or otherwise) misuse your programs. A well-written program will anticipate how users will misuse it, and either handle those cases gracefully or prevent them from happening in the first place (if possible). A program that handles error cases well is said to be robust.

In this lesson, we’ll take a look specifically at ways the user can enter invalid text input via std::cin, and show you some different ways to handle those cases.

std::cin, buffers, and extraction

In order to discuss how std::cin and operator>> can fail, it first helps to know a little bit about how they work.

When we use operator>> to get user input and put it into a variable, this is called an “extraction”. operator>> is accordingly called the extraction operator when used in this context.

When the user enters input in response to an extraction operation, that data is placed in a buffer inside of std::cin. A buffer (also called a data buffer) is simply a piece of memory set aside for storing data temporarily while it’s moved from one place to another. In this case, the buffer is used to hold user input while it’s waiting to be extracted to variables.

When the extraction operator is used, the following procedure happens:

  • If there is data already in the input buffer, that data is used for extraction.
  • If the input buffer contains no data, the user is asked to input data for extraction (this is the case most of the time). When the user hits enter, a ‘\n’ character will be placed in the input buffer.
  • operator>> extracts as much data from the input buffer as it can into the variable (ignoring any leading whitespace characters, such as spaces, tabs, or ‘\n’).
  • Any data that cannot be extracted is left in the input buffer for the next extraction.

Extraction succeeds if at least one character can be extracted from the input buffer. Any unextracted input is left in the input buffer for future extractions. For example:

If the user enters “5a”, 5 will be extracted, converted to an integer, and assigned to variable x. “a\n” will be left in the input stream for the next extraction.

Extraction fails if the input data does not match the type of the variable being extracted to. For example:

If the user were to enter ‘b’, extraction would fail because ‘b’ cannot be extracted to an integer variable.

Validating input

The process of checking whether user input conforms to what the program is expecting is called input validation.

There are three basic ways to do input validation:

  • Inline (as the user types)
    • Prevent the user from typing invalid input in the first place.
  • Post-entry (after the user types)
    • Let the user enter whatever they want into a string, then validate whether the string is correct, and if so, convert the string to the final variable format.
    • Let the user enter whatever they want, let std::cin and operator>> try to extract it, and handle the error cases.

Some graphical user interfaces and advanced text interfaces will let you validate input as the user enters it (character by character). Generally speaking, the programmer provides a validation function that accepts the input the user has entered so far, and returns true if the input is valid, and false otherwise. This function is called every time the user presses a key. If the validation function returns true, the key the user just pressed is accepted. If the validation function returns false, the character the user just input is discarded (and not shown on the screen). Using this method, you can ensure that any input the user enters is guaranteed to be valid, because any invalid keystrokes are discovered and discarded immediately. Unfortunately, std::cin does not support this style of validation.

Since strings do not have any restrictions on what characters can be entered, extraction is guaranteed to succeed (though remember that std::cin stops extracting at the first non-leading whitespace character). Once a string is entered, the program can then parse the string to see if it is valid or not. However, parsing strings and converting string input to other types (e.g. numbers) can be challenging, so this is only done in rare cases.
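As a rough sketch of the string-first approach, the hypothetical function parseNonNegativeInt() below accepts a string only if it consists entirely of digits, and converts it with std::stoi. The function name, the digits-only rule, and the 9-digit limit (which assumes a 32-bit int) are assumptions made for this example, not part of the lesson.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: if `text` is a non-empty run of at most 9 digits,
// stores its value in `value` and returns true. Otherwise returns false.
bool parseNonNegativeInt(const std::string& text, int& value)
{
    // Assumption: int is 32 bits, so any 9-digit number fits without overflow.
    if (text.empty() || text.size() > 9)
        return false;

    for (char c : text)
    {
        // The cast avoids undefined behavior for negative char values.
        if (!std::isdigit(static_cast<unsigned char>(c)))
            return false;
    }

    value = std::stoi(text); // safe: the string is known to hold only digits
    return true;
}
```

The string itself could be read with std::getline(std::cin, text), which takes the whole line, including any spaces.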

Most often, we let std::cin and the extraction operator do the hard work. Under this method, we let the user enter whatever they want, have std::cin and operator>> try to extract it, and deal with the fallout if it fails. This is the easiest method, and the one we’ll talk more about below.

A sample program

Consider the following calculator program that has no error handling:

This simple program asks the user to enter two numbers and a mathematical operator.

Enter a double value: 5
Enter one of the following: +, -, *, or /: *
Enter a double value: 7
5 * 7 is 35

Now, consider where invalid user input might break this program.

First, we ask the user to enter some numbers. What if they enter something other than a number (e.g. ‘q’)? In this case, extraction will fail.

Second, we ask the user to enter one of four possible symbols. What if they enter a character other than one of the symbols we’re expecting? We’ll be able to extract the input, but we don’t currently handle what happens afterward.

Third, what if we ask the user to enter a symbol and they enter a string like “*q hello”. Although we can extract the ‘*’ character we need, there’s additional input left in the buffer that could cause problems down the road.

Types of invalid text input

We can generally separate input text errors into four types:

  • Input extraction succeeds but the input is meaningless to the program (e.g. entering ‘k’ as your mathematical operator).
  • Input extraction succeeds but the user enters additional input (e.g. entering ‘*q hello’ as your mathematical operator).
  • Input extraction succeeds but the user overflows a numeric value.
  • Input extraction fails (e.g. trying to enter ‘q’ into a numeric input).

Thus, to make our programs robust, whenever we ask the user for input, we ideally should determine whether each of the above can possibly occur, and if so, write code to handle those cases.

Let’s dig into each of these cases, and how to handle them using std::cin.

Error case 1: Extraction succeeds but input is meaningless

This is the simplest case. Consider the following execution of the above program:

Enter a double value: 5
Enter one of the following: +, -, *, or /: k
Enter a double value: 7

In this case, we asked the user to enter one of four symbols, but they entered ‘k’ instead. ‘k’ is a valid character, so std::cin happily extracts it to variable op, and this gets returned to main. But our program wasn’t expecting this to happen, so it doesn’t properly deal with this case (and thus never outputs anything).

The solution here is simple: do input validation. This usually consists of 3 steps:

1) Check whether the user’s input was what you were expecting.
2) If so, return the value to the caller.
3) If not, tell the user something went wrong and have them try again.

Here’s an updated getOperator() function that does input validation.

As you can see, we’re using a while loop to continuously loop until the user provides valid input. If they don’t, we ask them to try again until they either give us valid input, shut down the program, or destroy their computer.

Error case 2: Extraction succeeds but with extraneous input

Consider the following execution of the above program:

Enter a double value: 5*7

What do you think happens next?

Enter a double value: 5*7
Enter one of the following: +, -, *, or /: Enter a double value: 5 * 7 is 35

The program prints the right answer, but the formatting is all messed up. Let’s take a closer look at why.

When the user enters “5*7” as input, that input goes into the buffer. Then operator>> extracts the 5 to variable x, leaving “*7\n” in the buffer. Next, the program prints “Enter one of the following: +, -, *, or /:”. However, when the extraction operator is called, it sees “*7\n” waiting in the buffer to be extracted, so it uses that instead of asking the user for more input. Consequently, it extracts the ‘*’ character, leaving “7\n” in the buffer.

After asking the user to enter another double value, the “7” in the buffer gets extracted without asking the user. Since the user never had an opportunity to enter additional data and hit enter (causing a newline), the output prompts all get run together on the same line, even though the output is correct.

Although the above program works, the execution is messy. It would be better if any extraneous characters entered were simply ignored. Fortunately, that’s easy to do:

Since the last character the user entered must be a ‘\n’, we can tell std::cin to ignore buffered characters until it finds a newline character (which is removed as well).

Let’s update our getDouble() function to ignore any extraneous input:

Now our program will work as expected, even if we enter “5*7” for the first input -- the 5 will be extracted, and the rest of the characters will be removed from the input buffer. Since the input buffer is now empty, the user will be properly asked for input the next time an extraction operation is performed!

Error case 3: Extraction fails

Now consider the following execution of the calculator program:

Enter a double value: a

You shouldn’t be surprised that the program doesn’t perform as expected, but how it fails is interesting:

Enter a double value: a
Enter one of the following: +, -, *, or /: Enter a double value: 

and the program suddenly ends.

This looks pretty similar to the extraneous input case, but it’s a little different. Let’s take a closer look.

When the user enters ‘a’, that character is placed in the buffer. Then operator>> tries to extract ‘a’ to variable x, which is of type double. Since ‘a’ can’t be converted to a double, operator>> can’t do the extraction. Two things happen at this point: ‘a’ is left in the buffer, and std::cin goes into “failure mode”.

Once in “failure mode”, future requests for input extraction will silently fail. Thus in our calculator program, the output prompts still print, but any requests for further extraction are ignored. The program simply runs to the end and then terminates (without printing a result, because we never read in a valid mathematical operation).

Fortunately, we can detect whether an extraction has failed and fix it:

That’s it!

Let’s integrate that into our getDouble() function:

Error case 4: Extraction succeeds but the user overflows a numeric value

Consider the following simple example:

What happens if the user enters a number that is too large (e.g. 40000)?

Enter a number between -32768 and 32767: 40000
Enter another number between -32768 and 32767: The sum is: 0

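In the above case, std::cin goes immediately into “failure mode”. What ends up in x depends on the language standard. Prior to C++11, a failed extraction does not assign a value to x, so x is left with the initialized value of 0 (this is the run shown above). Since C++11, an out-of-range value is instead replaced by the closest value the type can hold (32767 here), so the sum would print as 32767. Either way, additional inputs are skipped, leaving y with the initialized value of 0. We can handle this kind of error in the same way as a failed extraction.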

Putting it all together

Here’s our example calculator with full error checking:

Conclusion

As you write your programs, consider how users will misuse your program, especially around text input. For each point of text input, consider:

  • Could extraction fail?
  • Could the user enter more input than expected?
  • Could the user enter meaningless input?
  • Could the user overflow an input?

You can use if statements and boolean logic to test whether input is expected and meaningful.

The following code will test for and fix failed extractions or overflow:

The following will also clear any extraneous input:

Finally, use loops to ask the user to re-enter invalid input.

Note: Input validation is important and useful, but it also tends to make examples more complicated and harder to follow. Accordingly, in future lessons, we will generally not do any kind of input validation unless it’s relevant to something we’re trying to teach.


50 comments to 5.10 — std::cin, extraction, and dealing with invalid text input

  • Abe

    I don’t get it how this number 32767 leads to ignore invalid input?

    And it is kinda unlucky choosen when "ignore" leads to "removing" too.
    Why not just call it "ignorendremove" so it is easy comprehensible?

    • Alex

      std::cin.ignore(32767,’\n’); means “start discarding characters until you reach a ‘\n’, or discard 32767 characters”.

      The 32767 is an arbitrarily chosen large number that should be bigger than just about anything you have buffered in std::cin.

      As to why they named it “ignore”, I have no idea. The Standard Template Library can be funny that way.

    • Darren

      A general truism - programmers are lazy typist hence, ‘ignore’ is quicker to type than ‘ignore_and_remove’ plus ignore does sum up what it does quite succinctly.

    • Mila

      you can use this instead of the number

  • Bernhard

    May I point out a few errors in the text i noticed (missed words and such)? 🙂

    " Because text input so free-form " -> " Because text input is so free-form "
    "and either handle those cases gracefully (or prevent them from happening in the first place, if possible)" -> "and either handle those cases gracefully or prevent them from happening in the first place (if possible)."
    "Since strings do not have any restrictions on what characters be entered" -> "Since strings do not have any restrictions on what characters are entered"

    Thank you for the extraordinary tutorials, I really feel like I’m making progress 🙂

  • Jim

    Alex,
    Spelling error expecting.   The process of checking whether user input conforms to what the program is expectating is called input validation.

  • Brandon

    Something crossed my mind… in the spirit of being robust.

    What if the user asks to / by 0.  Both the / and 0 are valid inputs that would pass every test.

    Would a test at the beginning of the print function for (op == / && y == 0) work?

    • Alex

      Yes, you could check inside the printResult() function to ensure y != 0 if op == ‘/’. However, even better would be to disallow the user from entering a value of 0 in the first place if they’d previously selected operator ‘/’.

  • J3ANP3T3R

    I understand that .fail() should only be used on inputs that REALLY needed to be verified however i can see how this could be a problem when we use cin for minor situations. we have to use this block of code to validate input every time. is there a way to automatically have cin clear itself and use ignore every time it encounters itself in error mode ?

    • Alex

      Not that I’m aware of. One way to work around that is to write your own input functions that contain all the needed failure detection logic, and then reuse them in your program.

  • Raghu

    Excellent to learn and prepare for the interviews even for experience programmers.

    Thanks a lot
    Raghu

  • Luat

    I think that in this sentence "Since the last character the user entered must be a ‘\n’, we can tell std::cout to ignore buffered characters until it finds a newline character (which is removed as well).", "std::cout" should be changed to "std::cin".

  • #include <iostream>

    double getDouble()
    {
        std::cout << "Enter a double value: ";
        double x;
        std::cin >> x;
        return x;
    }

    char getOperator()
    {
        std::cout << "Enter one of the following: +, -, *, or /: ";
        char op;
        std::cin >> op;
        return op;
    }

    void printResult(double x, char op, double y)
    {
        if (op == '+')
            std::cout << x << " + " << y << " is " << x + y << '\n';
        else if (op == '-')
            std::cout << x << " - " << y << " is " << x - y << '\n';
        else if (op == '*')
            std::cout << x << " * " << y << " is " << x * y << '\n';
        else if (op == '/')
            std::cout << x << " / " << y << " is " << x / y << '\n';
    }

    int main()
    {
        double x = 'a'; // instead of getDouble()
        char op = getOperator();
        double y = 'a'; // again instead of getDouble()

        printResult(x, op, y);

        return 0;
    }

    In the above program, if i enter-   a or ‘a’ on being asked to enter my 1st double value, i am getting a blank result.

    Instead if i initialize both the doubles in main() to ‘a’ , implicit conversion does takes place in this case(as expected) and my output console screen reads:  97 +97 =194.

    Why didn’t the implicit type conversion happen in the 1st case when i entered- a or ‘a’ on being asked to enter my 1st double value.

    In case of int inputs,implicit type conversions to doubles are working fine,but not working for char inputs.

    • Alex

      I believe this is happening because in the case where you’re directly assigning ‘a’ to a double, the compiler is doing an implicit conversion on your behalf. However, when you’re trying to extract ‘a’ to a double variable, there are no implicit conversions happening, instead the extraction operator>> is left to deal with the input, and it does not handle this case.

  • Shekhar

    Hi Alex, I have a query regarding error case 2 in the example shown here. If this is something I should have got from earlier lessons please just direct me to the relevant one.

    What happens in case 2:
    Enter a double value: 5*7
    Enter one of the following: +, -, *, or /: Enter a double value: 5 * 7 is 35

    1) The user enters "5*7\n". So the input buffer now contains "5*7\n". The first extraction assigns 5 to x.
    2) The next output "Enter one of the following: +, -, *, or /:" happens on a new line
    3) In the next extraction, "*" is extracted and assigned to op and user is not asked for any input. "7\n" still remains in the buffer.
    4) The next output "Enter a double value:" does not happen on a new line. 7 is assigned to y and the final output again does not happen on a new line

    Queries
    What I understand is that for any new output from cout to happen on a new line, one needs a "\n" or an endl statment after the previous one. If I do not have a "\n" or an endl between 2 cout operations but the user is asked for an input, that also makes the output from the 2nd cout to be displayed on a new line. Why does this happen ?. If the user enters something like "23 324 \n", then that newline is at the last of the buffer. It cannot be the one forcing the next output on a new line and also it is in the input buffer not the output buffer. Or is it just that any time an input is received, the next output is forced on a new line as per the rules of the language ?

    Thanks
    Shekhar

    • Alex

      Consoles normally echo input back to the user, so the user can see what they’re typing. Consequently, as soon as you hit “enter” to submit your input, the echo of that \n gets sent to the console and forces a newline. Does that make sense, and does it answer your question?

  • Ritter G

    May seem like a dumb question but

    What does cin.clear() and cin.fail() do exactly? I mean behind the scenes to be more specific.

    • Alex

      cin.fail() returns a boolean indicating whether the internal state of cin has been set to “failure mode”.
      cin.clear() clears “failure mode” so you can resume normal operation.

  • Pip

    Hi Alex,

    I was wondering if there was a way to prevent extraneous input (printing an error and prompting the user to try again) instead of just ‘ignoring’ it, while using post-entry validation?
    In your example (Error case 2) where the user enters "5*7\n", 5 is assigned to x and the rest is ignored - the program continues without the user being notified that their input was truncated.

    Thank you very much for the excellent tutorials.

    • Alex

      The easiest way to do this is to put your variable input into a loop. If there is no ignorable input, exit the loop. Otherwise, print an error message and loop so the user can try again.

      • Pip

        I understand the idea of the loop, but am not sure how you can detect that there has been an ignorable input? Since if there isn’t any such input, there will still be the newline character hanging around.

        • Pip

          I’ve been trying a few different ways to be able to detect extraneous input, and I think I’ve managed it! This method (see below code) using a istringstream isn’t particularly elegant but it appears to be fit for purpose, unless you can see any flaws or have a far more efficient way of achieving the same check?

          Cheers,

          Pip

        • Alex

          It’s definitely possible, I’m not sure what the best way to do it is, since C++ doesn’t exactly make it easy.

          I think you could do an std::cin.ignore(1, ‘\n’) to remove the newline. Then the question is whether there is any more input. I think this might work:

          But C++ I/O isn’t my specialty, so there might be edge cases where this doesn’t work as intended.

  • Dwaze

    Enter a number between -32768 and 32767: 40000
    Enter another number between -32768 and 32767: The sum is: -26216
    In the above case, we got both overflow and …

    When I try this, I don’t get overflow. The value of x is whatever happens to be in memory (uninitialized variable).
    In this case, visual studio initializes x to -13108 while in debug build configuration. x + y = -26216.
    Or am I wrong?

    • Alex

      You’re totally right -- not sure how I missed that. I’ve updated the example slightly to make it easier to explain, removing the reference to overflowing. Nice catch.

      • Dwaze

        No problem and thank you for the tutorials. They’re really helpful!

      • Turya

        Hi!
        In that case, I am not getting 0 as output. I am getting 32767. In fact, if I do, cout<<x<<" "<<y;  I am getting 32676 0. (Using cpp.h online compiler)

        • Alex

          That output is actually consistent with C++11 and later: when extraction fails because the value is out of range, the variable is set to the largest (or smallest) value it can hold. Before C++11, the original value of the argument was left alone.

  • Omar Farrag

    In problem two, when the user entered 5*7, shouldn’t I be telling him that this is wrong input instead of extracting ‘5’ and ignoring the rest? I mean can we check if there is data in the input buffer, and by their existence we know that the user’s input was wrong ?

    • Alex

      Sure, you could do this as well, but trying to validate input tends to be a little more challenging. We cover input validation in lesson 13.5.

  • KIRPAL SINGH

    Can you help me to explain what is wrong with my code…
    it gets stuck at enter your job line….

    • Alex

      When you enter your age, you’re typing something like “23\n”. The 23 gets extracted to variable age, but the “\n” stays in the stream. getline() then reads the “\n” and assumes you intended job to be just a newline.

      To get this to work, after you read in age, add the following line:

  • KIRPAL SINGH

    how to type a name with more than one words? if i use char, it will only display the first word and not the rest.
    if i input “Micheal Jackson” … the variable will only store “Micheal” …

  • bert

    Great section!

    Typo:
    If they don’t, we ask them to try again until their(they) either give us valid input, shutdown the program, or destroy their computer.

  • Dragos

    Man, I love your little jokes. They are like easter eggs to me, always puts a smile on my face and gives me a little mental break.

    Awesome tutorials, all the lessons are on point. Keep up the good work!

  • Matt

    Under  "Error case 4: Extraction succeeds but the user overflows a numeric value",
    you wrote: "… std::cin goes immediately into “failure mode”, so it assigns a value to variable x. Consequently, x is left with the initialized value of 0."

    I am really confused by this. If std::cin is able to extract the larger-than-expected number into the variable x in some form, why does std::cin enter failure mode? Also, why does std::cin assign x a value after it enters failure mode?(does std::cin always assign the target variable a value when it enters failure mode, or is that act specific to this example for some reason?) Lastly, how can x be "left with" it’s original value of zero, after having just received a value from std::cin?

    • Alex

      Yes, you should be confused because the sentence as written made no sense. It should be “so it does NOT assign a value to variable x”. I’ve updated the text. Sorry about that.

      I guess the inventors of std::cin thought it would be better to let the user explicitly know that the user entered a value that could not be handled than fail silently and return a value that isn’t what was expected. Your other questions should be answered by the updated sentence. Again, my apologies for the typo -- that was a bad one.

  • Matt

    The last sentence in this lesson doesn’t compile for me.

    • Alex

      I’m not sure which sentence you’re referring to: The std::cin.ignore(32767, ‘\n’); code snippet? The note about the lack of input validation for future lessons? Something else? Can you clarify? Thanks!

      • Matt

        You wrote:
        "This is because input validation had a tendency to obscure the logic, we try to keep the examples as simple and to the point as possible."

        Maybe I’m misinterpreting, but this sentence doesn’t make sense to me. I think it would be better if you changed "had" to "has", and replace "this is because" with "since". For example:

        "Since input validation has a tendency to obscure the logic, we try to keep the examples as simple and to the point as possible."

  • alex1997

    " shutdown the program, or ‘destroy their computer’." the last part made me laugh so hard. Just try to imagine it ;))
