
D.2.4 — Integers

An integer type (sometimes called an integral type) is a type whose variables can only hold non-fractional numbers (e.g. -2, -1, 0, 1, 2). C++ has five different fundamental integer types available for use:

Category    Type        Minimum Size    Note
character   char        1 byte
integer     short       2 bytes
            int         2 bytes         Typically 4 bytes on modern architectures
            long        4 bytes
            long long   8 bytes         C99/C++11 type

Char is a special case, in that it falls into both the character and integer categories. We’ll talk about the special properties of char later. In this lesson, you can treat it as a normal integer.

The key difference between the various integer types is that they have varying sizes -- the larger integers can hold bigger numbers. Note that C++ only guarantees that integers will have a certain minimum size, not that they will have a specific size. See lesson 2.3 -- variable sizes and the sizeof operator for information on how to determine how large each type is on your machine.

Defining integers

Defining some integers:

While short int, long int, and long long int are valid, the shorthand versions short, long, and long long should be preferred. Besides requiring more typing, the int suffix makes the type harder to tell apart from plain int, which can lead to mistakes if the short or long modifier is overlooked.

Identifying integers

Because the size of char, short, int, and long can vary depending on the compiler and/or computer architecture, it can be instructive to refer to integers by their size rather than name. We often refer to integers by the number of bits a variable of that type is allocated (e.g. “32-bit integer” instead of “long”).

Integer ranges and sign

As you learned in the last section, a variable with n bits can store 2^n different values. But which specific values? We call the set of specific values that a data type can hold its range. The range of an integer variable is determined by two factors: its size (in bits), and its sign, which can be “signed” or “unsigned”.

A signed integer is a variable that can hold both negative and positive numbers. To explicitly declare a variable as signed, you can use the signed keyword:

By convention, the keyword “signed” is placed before the variable’s data type.

A 1-byte signed integer has a range of -128 to 127. Any value between -128 and 127 (inclusive) can be put in a 1-byte signed integer safely.

Sometimes, we know in advance that we are not going to need negative numbers. This is common when using a variable to store the quantity or size of something (such as your height -- it doesn’t make sense to have a negative height!). An unsigned integer is one that can only hold non-negative values (zero and positive numbers). To explicitly declare a variable as unsigned, use the unsigned keyword:

A 1-byte unsigned integer has a range of 0 to 255.

Note that declaring a variable as unsigned means that it cannot store negative numbers, but it can store positive numbers that are roughly twice as large.

Now that you understand the difference between signed and unsigned, let’s take a look at the ranges for different sized signed and unsigned variables:

Size/Type          Range
1 byte signed      -128 to 127
1 byte unsigned    0 to 255
2 byte signed      -32,768 to 32,767
2 byte unsigned    0 to 65,535
4 byte signed      -2,147,483,648 to 2,147,483,647
4 byte unsigned    0 to 4,294,967,295
8 byte signed      -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
8 byte unsigned    0 to 18,446,744,073,709,551,615

For the math inclined, an n-bit signed variable has a range of -(2^(n-1)) to 2^(n-1)-1. An n-bit unsigned variable has a range of 0 to (2^n)-1. For the non-math inclined… use the table. :)

New programmers sometimes get signed and unsigned mixed up. The following is a simple way to remember the difference: in order to differentiate negative numbers from positive ones, we typically use a negative sign. If a sign is not provided, we assume a number is positive. Consequently, an integer with a sign (a signed integer) can tell the difference between positive and negative. An integer without a sign (an unsigned integer) assumes all values are non-negative.

Default signs and integer best practices

So what happens if we do not declare a variable as signed or unsigned?

Category    Type        Default Sign          Note
character   char        Signed or Unsigned    Usually signed
integer     short       Signed
            int         Signed
            long        Signed
            long long   Signed

All integer variables except char are signed by default. Char can be either signed or unsigned by default (but is usually signed for conformity).

Generally, the signed keyword is not used (since it’s redundant), except on chars (when necessary to ensure they are signed).

Best practice is to avoid use of unsigned integers unless you have a specific need for them, as unsigned integers are more prone to unexpected bugs and behaviors than signed integers.

Rule: Favor signed integers over unsigned integers

Overflow

What happens if we try to put a number outside of the data type’s range into our variable? Overflow occurs when bits are lost because a variable has not been allocated enough memory to store them.

In lesson D.2.1 -- Fundamental variable definition, initialization, and assignment, we mentioned that data is stored in binary format.

In binary (base 2), each digit can only have 2 possible values (0 or 1). We count from 0 to 15 like this:

Decimal Value Binary Value
0 0
1 1
2 10
3 11
4 100
5 101
6 110
7 111
8 1000
9 1001
10 1010
11 1011
12 1100
13 1101
14 1110
15 1111

As you can see, the larger numbers require more bits to represent. Because our variables have a fixed number of bits, this puts a limit on how much data they can hold.

Overflow examples

Consider a hypothetical unsigned variable that can only hold 4 bits. Any of the binary numbers enumerated in the table above would fit comfortably inside this variable (because none of them are larger than 4 bits).

But what happens if we try to assign a value that takes more than 4 bits to our variable? We get overflow: our variable will only store the 4 least significant (rightmost) bits, and the excess bits are lost.

For example, if we tried to put the decimal value 21 in our 4-bit variable:

Decimal Value Binary Value
21 10101

21 takes 5 bits (10101) to represent. The 4 rightmost bits (0101) go into the variable, and the leftmost (1) is simply lost. Our variable now holds 0101, which is the decimal value 5.

Note: At this point in the tutorials, you’re not expected to know how to convert decimal to binary or vice-versa. We’ll discuss that in more detail in section 3.7 -- Converting between binary and decimal.

Now, let’s take a look at an example using actual code, assuming a short is 16 bits:

What do you think the result of this program will be?

x was: 65535
x is now: 0

What happened? We overflowed the variable by trying to put a number that was too big into it (65536), and the result is that our value “wrapped around” back to the beginning of the range.

For advanced readers, here’s what’s actually happening behind the scenes: the number 65,535 is represented by the bit pattern 1111 1111 1111 1111 in binary. 65,535 is the largest number an unsigned 2 byte (16-bit) integer can hold, as it uses all 16 bits. When we add 1 to the value, the new value should be 65,536. However, the bit pattern of 65,536 is represented in binary as 1 0000 0000 0000 0000, which is 17 bits! Consequently, the highest bit (which is the 1) is lost, and the low 16 bits are all that is left. The bit pattern 0000 0000 0000 0000 corresponds to the number 0, which is our result.

Similarly, we can overflow the bottom end of our range as well, resulting in “wrapping around” to the top of the range.

x was: 0
x is now: 65535

Overflow results in information being lost, which is almost never desirable. If there is any suspicion that a variable might need to store a value that falls outside its range, use a larger variable!

Also note that the results of overflow are only predictable for unsigned integers, which are guaranteed to wrap around. Overflowing a signed integer is undefined behavior in C++, and overflowing non-integers (e.g. floating point numbers) may produce different results on different systems.

Rule: Do not depend on the results of overflow in your program.

Integer division

When dividing two integers, C++ works like you’d expect when the result is a whole number:

This produces the expected result:

5

But let’s look at what happens when integer division causes a fractional result:

This produces a possibly unexpected result:

1

When doing division with two integers, C++ produces an integer result. Since integers can’t hold fractional values, any fractional portion is simply dropped (not rounded!).

Taking a closer look at the above example, 8 / 5 produces the value 1.6. The fractional part (0.6) is dropped, and the result of 1 remains.

Rule: Be careful when using integer division, as you will lose any fractional parts of the result.

What is size_t?

Consider the following code:

Pretty simple, right? We can infer that operator sizeof returns an integral value -- but what type of integer is that value? An int? A short? The answer is that sizeof (and many functions that return a size or length value) returns a value of type “size_t”. size_t is an unsigned integral type that is typically used to represent the size or length of objects.

Amusingly, we can use sizeof (which returns a value of type size_t) to ask for the size of size_t itself:

Compiled as a 32-bit (4 byte) console app on the author’s system, this prints:

4

Much like an integer can vary in size depending on the system, size_t also varies in size. size_t is guaranteed to be unsigned and at least 16 bits, but on most systems will be equivalent to the address-width of the application. That is, for a 32-bit application, size_t will typically be a 32-bit unsigned integer, and for a 64-bit application, size_t will typically be a 64-bit unsigned integer. size_t is defined to be big enough to hold the size of the largest object creatable on your system (in bytes). For example, if size_t is 4 bytes, the size of the largest object creatable on your system can’t exceed the largest number representable by a 4-byte unsigned integer (per the table above, 4,294,967,295).

By definition, any object whose size in bytes is larger than the largest value size_t can hold is considered ill-formed (and will cause a compile error), as the sizeof operator would not be able to return the size without overflow.

Incidentally, the _t suffix means “type”, and it is common to see this naming convention applied to the newly defined types from newer iterations of C and C++.


196 comments to D.2.4 — Integers

  • Alireza

    Hello dear,
    What is the ' size_t ' exactly ?
    I've read the above texts, but I haven't understood what the 'size_t' does exactly.
    Can it be a data type ?

    Output is 0, it's a 4-byte unsigned type.

    • nascardriver

      Hi!

      See my answer here
      https://www.learncpp.com/cpp-tutorial/24-integers/comment-page-3/#comment-390187

      > it's a 4-byte unsigned type
      On your system, yes. It might not be somewhere else.

  • > Is that [] considered overflow
    Yes

    > Is that technically considered overflow
    No. Unsigned types don't overflow, they explicitly wrap around.

    • Alex

      Integer overflow is defined by Wikipedia as, "[occurring] when an arithmetic operation attempts to create a numeric value that is outside of the range that can be represented with a given number of digits."

      Based on that definition, unsigned types do overflow. They just exhibit guaranteed wrap-around behavior when they do.

      • The range of representable values for the unsigned type is 0 to 2^N − 1
        (inclusive); arithmetic for the unsigned type is performed modulo 2^N. [Note: Unsigned arithmetic does not overflow. Overflow for signed arithmetic yields undefined behavior (7.1). — end note]

        N4791 § 6.7.1 2

  • Jeremy

    So if i understand correctly (from reading multiple definitions of size_t), size_t returns the maximum width an integer data type can store in bytes in memory? and this is used to understand the maximum value that an integer can be stored can be across unique systems before overflowing occurs giving us an undesired result? and is mainly used for portability?

    • @std::size_t doesn't return anything, it's a type, like int, float, etc.
      It can store the number that describes the size of the largest type.
      Let's say your compiler allows an object to be 4294967295 bytes (maximum on a 32bit system) in size.
      @std::size_t can store that number.
      This is unrelated to integers. You can check how large an integer is using @sizeof

      which probably prints 4.
      @std::size_t is required, because you need a type that can store sizes. Looking at @sizeof, you need a type that can store the return value. This is @std::size_t.

      • Jeremy

        Okay, i got that one really wrong.

        So i think i understand it better. size_t is an integer data type that can store the largest object in size(in bytes) that the compiler will allow. It's useful as a return value for sizeof because it's unsigned(memory can only be positive integer value) and can work with the largest objects allowed so it's guaranteed to safely return a value of all data types. Plus it is also useful for array indexing and for loops in place of int as it can store the largest value in bytes so there is less chance of overflow?

        Is this correct? Is there a way to check for integer overflow just in case or is that not really needed if using size_t?

        • Right.
          Be careful when using it as an iterator in loops. It's less likely to overflow, but if you don't watch out, it might underflow (go below 0).

          > Is there a way to check for integer overflow just in case or is that not really needed if using size_t?
          If you're working close to min/max values, you should verify that the over/underflow won't occur _before_ doing the calculation.

      • cren

        Hi,
        I have a question on size_t. I am sure my system is 64 bits, I expected to see std::cout << sizeof(size_t); print out 8 on console.

        But when I ran above line in Eclipse, the output is 8, the output from code blocks is 4, what I am doing wrong here?

        Thanks!

        • On a 64 bit system, you can compile in 32 or 64 bit. If you're using g++ or clang++, the flag you're looking for is -m64
          If you're using something else or don't know how to set compiler flags, you'll have to search through code::block's settings.

          • cren

            Thank you for the reply.

            Yes, if I run g++ with -m64 from command line. it works OK.
            In CodeBlocks, I set /Build options/Compiler settings/Compiler Flags/"Target x_86_64(64bit) [-m64]" to true.
            It turn out compile error - "sorry, unimplemented: 64-bit mode not compiled in".

            Are there other settings need to turn on? Looks like CodeBlocks issue.

            • It looks like your Code::Blocks is using a different compiler or version. If it works on the command line, it should work in your IDE (Unless the compiler is different).
              I don't have Code::Blocks so I won't be of much help from here on.

  • Jeremy

    I understand that using an unsigned short and going from its max limit of 65,535 (1111 1111 1111 1111) to 65,536 (1 0000 0000 0000 0000) results in only being able to keep the rightmost 16 bits ( 1 [0000 0000 0000 0000] ), subsequently returning a 0.

    But how do we end up with the binary value of 1111 1111 1111 1111 (65,535) from the binary value of 0 if we have an unsigned short assigned as -1?

    Why does it “wrap around” the top of the range like that?

    • Hi Jeremy!

      Lesson 3.7 covers binary representation of integers (You're looking for two's complement).

      • Jeremy

        Thanks. I finally made it to 3.7

        So if i understand correctly, it's not that it wraps around, it's just -1 is signed and represented as 1111 1111 1111 1111 in binary when using two's complement.

        So from the overflow example above:

        Is that technically considered overflow since the compiler is just storing -1 (1111 1111 1111 1111) into memory? It's only since the data type x in the example is declared unsigned that the compiler interprets the stored binary value of 1111 1111 1111 1111 as 65,535?

        EDIT** I get why the example is considered overflow now, it's that we overflowed/stepped out of our range when we went below 0 decrementing by 1. It's not that we directly assigned a negative number to an unsigned data type.

        Two's complement is really making my brain melt. But i think i am starting to get it.

        • After editing a comment, the syntax highlighter will work after refreshing the page.
          My reply to your original comment is https://www.learncpp.com/cpp-tutorial/24-integers/comment-page-3/#comment-391303

  • Dimitri

    Hi, dear teacher!

    Please correct

    "per the table above, 4,294,967,295 bytes" - it's not bytes

    Thanks for great site!

  • Bastiaan

    Hello,
    I just have one quick question:
    Why would we use long if the normal int is as well 4 bytes? Is it because the size of int is unsure? If so should we then stop using int, or is int fine for the learning part of C++?
    Thanks!

    *edit: With the next chapter I see that the long double can be(and is with my labtop) as long as the double. So am I right to assume that there is more difference between the variable?
    again Thanks!

    • > normal int
      There is no such thing as a "normal" int. The size of an int is decided by your compiler. "long" suggests that the compiler should use a larger type, but it doesn't have to.
      Your compiler might use the same type for long int/int and long double/double, another compiler or other compiler settings might behave differently.

  • Hi
    I am wondering about the value for 1 byte: shouldn't the minimum be -127 instead of -128?

    1 byte signed     -128 to 127
    I think it should be from -127 to +127.
    -111 1111 TO +111 1111
    -127      TO 127
    can you please show how you get these ranges and tell me which bit is used for sign ?
    Thanks

    • -127 to 127 (inclusive) means that there are 255 possible numbers. There must be 2^8 = 256 numbers, making your statement impossible.
      0000'0000 to 0111'1111 (inclusive) are used to represent non-negative numbers (0 to 127 (inclusive))
      1000'0000 to 1111'1111 (inclusive) are used to represent negative numbers (-1 to -128 (inclusive))

      Lesson 3.7 goes into more detail about binary representations of integers.

  • Clapfish

    Since I'm coding a 32bit Console application, that means size_t = 4, as you also demonstrated in this lesson.

    Shouldn't that then mean long long integers (being 64bit) can't be used in such an application?

    I tested it however, and it seems like they can...

    • Clapfish

      On further investigation, interestingly:

      ... compiles and runs fine. However, if I go over this number (which, as far as I understand it, should be half of the range available to this variable):

      ... I get the following compile error:

      error: integer constant is so large that it is unsigned
           unsigned long long x { 9223372036854775808 };
                                  ^
      Is this because of the 32bit limitation, and size_t being 4?

      Using an implicitly 'signed' long long seems to work fine though:

      ... and ...

      ... both compile without issue.

      This, to me, suggests that the full 8 byte range of long long is available to this 32bit console application, but only half the range of the 8 byte 'unsigned' long long is available (4 bytes worth).

      Is it because the 'negative version' of the number can be stored in 4 bytes, just like the positive version, but with the 'two's complement flipping' of the bit values? I suppose that would explain it, because going over this number (either positive or negative) would then need that extra bit which is unavailable in this 32bit application...

      I'd really appreciate an explanation of what's going on here!

      Cheers :)

      • Hi!

        > Shouldn't that then mean long long integers (being 64bit) can't be used in such an application?
        No. You can use variables as big as you like, but using 64bit variables in a 32bit application is slower than using 64bit variables in a 64bit application or using 32bit variables in a 32bit application.

        > integer constant is so large that it is unsigned
        > Is this because of the 32bit limitation, and size_t being 4?
        No. @x is an unsigned long long, but the number you're initializing @x with is a long, or at least that's what it's supposed to be, but it can't, because it's too large.
        C++ determines the type of the value before it considers the type of the variable you're initializing with said value.
        You need to explicitly make the number an unsigned long long by appending "ull"

        This is covered in lesson 2.8

        • Clapfish

          Hi nascardriver,

          Many thanks for your reply! I've tested the 'ull' appendage and that works as you suggest.

          I'm still a little confused though, since the lesson material seems to suggest that 8 byte objects shouldn't be usable when size_t is 4:

          "size_t is an unsigned, integral value that is typically used to represent the size or length of objects, and it is defined to be big enough to hold the size of the largest object createable on your system."

          and

          "By definition, any object larger than the largest value size_t can hold is considered ill-formed (and will cause a compile error), as the sizeof operator would not be able to return the size without overflow."

          Isn't using a long long, of size 8 bytes, considered 'ill-formed' (as per the above statement) in a 32-bit application where size_t = 4, and therefore shouldn't I be getting a compile error as the statement suggests?

          That's why I wanted to test it in the first place, and being able to use long long integers made me want to investigate further...

          Cheers!

          • I see why you're confused as the quoted text can be understood two ways. What it means to say is that the size of any object in bytes cannot exceed the maximum value @std::size_t can store.
            Let's say the maximum value of @std::size_t is (2^64)-1 = 18446744073709551615, which is the case in 64bit applications. Then the maximum size of an object is (2^64)-1 bytes = 16EiB.

            • Clapfish

              Thanks for the clarification! I think I see now...

              So, in a 64-bit application the maximum size of a single object is over 18 billion Gigabytes?

              Can one even make an object that large? To me that seems to be a limit with no practical application or meaning, at least from my current (very possibly naive) position...

              • > size of a single object is over 18 billion Gigabytes?
                Yes

                > Can one even make an object that large?
                If you have enough and fast memory, sure.

                > that seems to be a limit with no practical application or meaning
                It doesn't mean that you're supposed to create objects that large.
                64 bit memory can address (2^64)-1 bytes. We're far from reaching that limit, but theoretically a 64 bit system could have that much memory, meaning that it'd be possible to create an object that large (assuming nothing else is taking up memory). When an object of that size can be created, there needs to be a variable that can store the size of that object. Hello @std::size_t.

          • Alex

            Good feedback. I've updated the lesson text to try and clarify this better. Let me know if it's still unclear.

  • vbot

    Hi,

    So, there are no any compiler-errors / warnings when an int overflow occurs, just undesired values, right?

    I did find some information on methods concerning detecting / testing for overflows though, what do you think about those in general? How common are such techniques?

    Thanks! :)

  • lucusgod

    "size_t is guaranteed to be unsigned and at least 16 bits, but on most systems will be equivalent to the largest unsigned integer that your architecture supports. For 32-bit applications, size_t will typically be a 32 bit unsigned integer, and for a 64-bit application, size_t will typically be a 64-bit unsigned integer."

    Largest unsigned integer on 32-bit platform is 64 bits(long long), but the size of size_t is 32 bits?

    • Alex

      Good catch. I've updated the lesson to remove the reference to the largest unsigned integer your architecture supports, as the introduction of long long makes this untrue on many 32-bit systems.

  • Hi Aakash!

    16 bytes would be huge, it should be 16 bits (or 2 bytes). What worries me is the 16, I can't find any source stating this number.

  • Aakash

    "Much like an integer can vary in size depending on the system, size_t also varies in size. size_t is guaranteed to be unsigned and at least 16 bites,"

    at end, it must be "bytes" resulting in

    "Much like an integer can vary in size depending on the system, size_t also varies in size. size_t is guaranteed to be unsigned and at least 16 bytes,"
