Search

D.2.3 — Variable sizes and the sizeof operator

As you learned in the lesson D.2.1 -- Fundamental variable definition, initialization, and assignment, memory on modern machines is typically organized into byte-sized units, with each unit having a unique address. Up to this point, it has been useful to think of memory as a bunch of cubbyholes or mailboxes where we can put and retrieve information, and variables as names for accessing those cubbyholes or mailboxes.

However, this analogy is not quite correct in one regard -- most variables actually take up more than 1 byte of memory. Consequently, a single variable may use 2, 4, or even 8 consecutive memory addresses. The amount of memory that a variable uses is based on its data type. Fortunately, because we typically access memory through variable names and not memory addresses, the compiler is largely able to hide the details of working with different sized variables from us.

There are several reasons it is useful to know how much memory a variable takes up.

First, the more memory a variable takes up, the more information it can hold. Because each bit can only hold a 0 or a 1, we say that bit can hold 2 possible values.

2 bits can hold 4 possible values:

bit 0 bit 1
0 0
0 1
1 0
1 1

3 bits can hold 8 possible values:

bit 0 bit 1 bit 2
0 0 0
0 0 1
0 1 0
0 1 1
1 0 0
1 0 1
1 1 0
1 1 1

To generalize, a variable with n bits can hold 2n (2 to the power of n, also commonly written 2^n) possible values. With an 8-bit byte, a byte can store 28 (256) possible values.

The size of the variable puts a limit on the amount of information it can store -- variables that utilize more bytes can hold a wider range of values. We will address this issue further when we get into the different types of variables.

Second, computers have a finite amount of free memory. Every time we declare a variable, a small portion of that free memory is used for as long as the variable is in existence. Because modern computers have a lot of memory, this often isn’t a problem, especially if only declaring a few variables. However, for programs that need a large amount of variables (eg. 100,000), the difference between using 1 byte and 8 byte variables can be significant.

The size of C++ basic data types

The obvious next question is “how much memory do variables of different data types take?”. You may be surprised to find that the size of a given data type is dependent on the compiler and/or the computer architecture!

C++ guarantees that the basic data types will have a minimum size:

Category Type Minimum Size Note
boolean bool 1 byte
character char 1 byte May be signed or unsigned
Always exactly 1 byte
wchar_t 1 byte
char16_t 2 bytes C++11 type
char32_t 4 bytes C++11 type
integer short 2 bytes
int 2 bytes
long 4 bytes
long long 8 bytes C99/C++11 type
floating point float 4 bytes
double 8 bytes
long double 8 bytes

However, the actual size of the variables may be different on your machine (particularly int, which is more often 4 bytes). In order to determine the size of data types on a particular machine, C++ provides an operator named sizeof. The sizeof operator is a unary operator that takes either a type or a variable, and returns its size in bytes. You can compile and run the following program to find out how large some of your data types are:

Here is the output from the author’s x64 machine (in 2015), using Visual Studio 2013:

bool:           1 bytes
char:           1 bytes
wchar_t:        2 bytes
char16_t:       2 bytes
char32_t:       4 bytes
short:          2 bytes
int:            4 bytes
long:           4 bytes
long long:      8 bytes
float:          4 bytes
double:         8 bytes
long double:    8 bytes

Your results may vary if you are using a different type of machine, or a different compiler. Note that you can not take the sizeof the void type, since it has no size (doing so will cause a compile error).

If you’re wondering what ‘\t’ is in the above program, it’s a special symbol that inserts a tab (in the example, we’re using it to align the output columns). We will cover ‘\t’ and other special symbols when we talk about the char data type.

You can also use the sizeof operator on a variable name:

x is 4 bytes

We’ll discuss the size of different types in the upcoming lessons, as well as a summary table at the end.

D.2.4 -- Integers
Index
D.2.2 -- Void

173 comments to D.2.3 — Variable sizes and the sizeof operator

  • Jules

    I hope someone like nascardriver or Alex sees this,
    so I quite don't understand the relationship between number of bytes that a datatype has v/s actual values it can hold.
    for eg:
    An standard int is of 4 bytes, so the number of values it can hold are 2^4 = 16.
    but the value range an integer is -
    -32,768 to 32,767 or -2,147,483,648 to 2,147,483,647

    Also what is up with there being different datatypes of char?,if a char holds only one character why are there different types of chars like - char,wchar_t,char16_t and char32_t, If i wanted to store a string wont i just declare an array?

    i'm kinda lost here.

    • Hi Jules!

      > An standard int is of 4 bytes, so the number of values it can hold are 2^4 = 16.
      Both nope. An int doesn't have to be 4 bytes and your calculation is off.
      Assuming a 4 byte int:
      4 bytes are 32 bits. That makes 2^32 possible values. The range is -(2^31) to (2^31)-1

      > char
      1 byte

      > wchar_t
      2 bytes (No guarantees, I'm doing this from memory)

      > char16_t
      2 bytes

      > char32_t
      4 bytes

      The larger char types are required to store non-ascii characters, because those can take up more than 1 byte.
      You can store unicode strings in char arrays, but the individual characters will be split into multiple chars.

      • Jules

        Hi nascar!

        thanks for replying on such a short notice, it seems i mistook bits for bytes here.
        as you said assuming that an integer has 4 bytes, the possible values would be 2^32 = 4294967296
        now wont the range be: -2147483648 to +2147483647, your definition would have the range as -4294967296 to +4294967296, wont it?

        also what do you mean by -
        >but the individual characters will be split into multiple chars.
        thanks for your time.

        ~Jules.

        • > your definition would have the range as -4294967296 to +4294967296, wont it?
          It won't. I used 2^31 in my range, not 2^32

          > also what do you mean by

          The lightning symbol has a value of 0x21AF ( https://unicode-table.com/en/21AF/ ). Too much for a single char.

          Running the code and giving a lightning as input results in

          input: ↯
          length: 3
          0: � (-30)
          1: � (-122)
          2: � (-81)

          We input a single character, but we need 3 chars to store it. Each char on it's own doesn't make a whole lot of sense. But my terminal (and presumably yours too) knows how to print the 3 successive chars as a single character.

  • Quyết

    Hi, i have a question...
    look gt1 function and gt2 function...

    which good ???
    Thanks!

  • Hi Alex!

    I have doubts about your "C++ guarantees that the basic data types will have a minimum size" table.
    The standard only states
    "There are five standard signed integer types : “signed char”, “short int”, “int”, “long int”, and “long long int”. In this list, each type provides at least as much storage as those preceding it in the list."
    N4762 § 6.7.1 (2)

    To me this sounds like a long long int could be a 1 byte sized type.

    • Alex

      The C++ standard does only explicitly state as you say. However, the C++ standard apparently references the C standard in this regard, and the C standard implies a minimum range of numbers that each type must be able to hold. Implicitly, that implies a minimum size.

      Here's the minimum sizes from the C standard: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf (see page 22)

      Also see https://stackoverflow.com/questions/50930891/what-is-the-minimum-size-of-an-int-in-c for some discussion about this.

      Finally, note that https://en.cppreference.com/w/cpp/language/types corroborates this understanding.

Leave a Comment

Put all code inside code tags: [code]your code here[/code]