2.3 — Variable sizes and the sizeof operator

As you learned in the lesson on basic addressing, memory on modern machines is typically organized into byte-sized pieces, with each piece having a unique address. Up to this point, it has been useful to think of memory as a bunch of cubbyholes or mailboxes where we can put and retrieve information, and variables as names for accessing those cubbyholes or mailboxes.

However, this analogy is not quite correct in one regard: most variables actually take up more than 1 byte of memory. Consequently, a single variable may use 2, 4, or even 8 consecutive memory addresses. The amount of memory that a variable uses is based on its data type. Fortunately, because we typically access memory through variable names and not memory addresses, the compiler is largely able to hide the details of working with differently sized variables from us.

There are several reasons it is useful to know how much memory a variable takes up.

First, the more memory a variable takes up, the more information it can hold. Because each bit can only hold a 0 or a 1, a single bit can store 2 different values. 2 bits can store 4 different values:

bit 0 bit 1
0 0
0 1
1 0
1 1

3 bits can store 8 values. n bits can store 2^n values. Because a byte is 8 bits, a byte can store 2^8 (256) values.

The size of the variable puts a limit on the amount of information it can store — variables that are bigger can hold larger numbers. We will address this issue further when we get into the different types of variables.

Second, computers have a finite amount of free memory. Every time we declare a variable, a small portion of that free memory is used for as long as the variable exists. Because modern computers have a lot of memory, this often isn’t a problem, especially when declaring only a few variables. However, for programs that need a large number of variables (e.g. 100,000), the difference between using 1-byte and 8-byte variables can be significant.

The obvious next question is “how much memory do variables of different data types take?”. The size of a given data type is dependent on the compiler and/or the computer architecture. On most 32-bit machines (as of this writing), a char is 1 byte, a bool is 1 byte, a short is 2 bytes, an int is 4 bytes, a long is 4 bytes, a float is 4 bytes, and a double is 8 bytes.

In order to determine the size of data types on a particular machine, C++ provides an operator named sizeof. The sizeof operator is a unary operator that takes either a type or a variable, and returns its size in bytes. You can compile and run the following program to find out how large your data types are:

#include <iostream>

int main()
{
    using namespace std;
    cout << "bool:\t\t" << sizeof(bool) << " bytes" << endl;
    cout << "char:\t\t" << sizeof(char) << " bytes" << endl;
    cout << "wchar_t:\t" << sizeof(wchar_t) << " bytes" << endl;
    cout << "short:\t\t" << sizeof(short) << " bytes" << endl;
    cout << "int:\t\t" << sizeof(int) << " bytes" << endl;
    cout << "long:\t\t" << sizeof(long) << " bytes" << endl;
    cout << "float:\t\t" << sizeof(float) << " bytes" << endl;
    cout << "double:\t\t" << sizeof(double) << " bytes" << endl;
    cout << "long double:\t" << sizeof(long double) << " bytes" << endl;
    return 0;
}

Here is the output from the author’s Pentium 4 machine, using Visual Studio 2005 Express:

bool:           1 bytes
char:           1 bytes
wchar_t:        2 bytes
short:          2 bytes
int:            4 bytes
long:           4 bytes
float:          4 bytes
double:         8 bytes
long double:    8 bytes

Your results may vary if you are using a different type of machine, or a different compiler.

If you’re wondering what \t is in the above program, it’s a special symbol that inserts a tab. We will cover \t and other special symbols when we talk about the char data type.

Interestingly, the sizeof operator is one of a handful of operators in C++ that are spelled as a word instead of a symbol. Others include new, delete, and typeid.

You can also use the sizeof operator on a variable name:

    int x;
    cout << "x is " << sizeof(x) << " bytes" << endl;

On the author’s machine, this prints:

x is 4 bytes

Now you know enough about variables that we can start discussing the different data types!

58 comments to 2.3 — Variable sizes and the sizeof operator

  • Abhishek

    Never heard of wchar_t before… is that a new data type? What kind of data does it hold?

    • There’s more information about wchar_t on wikipedia. In short, it was meant to be used to hold “wide characters” (eg. those that take more than 8 bits to represent). However, the size varies depending on platform (and can be as small as 8 bits), so I’m not sure I see the practical use.

      • chris

        Pocket PC/Windows Mobile only uses wide characters. Beyond that the practical use is for foreign languages. Some languages (like Chinese) have a lot more than 128 characters.

  • Nikki

    “On most 32-bit machines …. ”
    I am not sure if this is the right place to ask this question .
    What is a “32-bit machine ” ?
    Thanks,
    Nikki

    • Computers work by moving binary digits (bits) around. However, most computers do not work with individual bits — rather, they move data around in chunks. This chunk size is called a “word”. Typically, when we speak of the bit-ness of a machine, we speak of the size of a word. Thus, a 32-bit machine has a 32-bit word size, which means it moves information around 32-bits at a time.

      Typically, modern computers use one word to address memory. With a 32-bit word, this means there are about 2^32 (4 billion) unique memory addresses that can be addressed. This is why 32-bit machines generally can’t make use of more than 4GB of memory.

      • Frederik

        The amount of memory a machine can address has nothing to do with the “bitness” of the machine — in fact there is no real consensus on what it means to be of a certain “bitness”.

        Some 32 bits machines (such as the Pentium Pro and later 32 bit x86s) can address more than 4 GiB, but they are still only considered 32 bit computers.

        I (personally) only consider a machine to be an n bit machine, if it is able to (at least) address up to 2^n individual bytes, is able to hold at least n bits in all registers and can do all operations on n bit (or larger) registers.

      • JD

        So, if I understand this correctly, the computer moves data around in 32-bit chunks (4 bytes), but in my C++ program I can assign variables to a single byte of memory. This seems contradictory to me.

        I guess my question boils down to: Say I have a program that uses a large number of variables, so space matters. One byte will suffice for the data I need to carry so to save space I assign them as chars (1 byte). But, if the smallest “chunk” that the computer passes around is 4 bytes, does this actually save any space? Or would this effectively use the same amount of memory as if I made my variables ints (4 bytes, or one full word)?

        This is a somewhat subtle question, so let me know if I’m not being clear.

        PS – Thanks for a clear, organized, and well-written tutorial! I’ve really enjoyed it so far.

      • zingmars

        It would be easier to explain the whole thing using the architecture name x86(x86_64) rather than calling them ’32-bit’ machines. Confuses people.

  • vader347

    my long double takes 12 bytes

  • John

    In the code above, all the “\t” characters are showing as “\\t” in my browser. This makes it show up as “\\t\\t” when you run the program instead of tabs. I don’t know if this is a browser displaying that code wrong (I’m running Firefox) or if you just typed that in incorrectly. Easily fixed by deleting the extra backslash.

    • earthHunter

      Looks like it’s just a mistake. I’d expect anything containing “\\t” or “\\n” to purposely have two backslashes so they don’t become actual tabs or newlines, but the apparent “output” is there. I still think it’s a mistake.

      • Greg

        Each \\t above should actually be \t, otherwise a literal ‘\t’ will be printed rather than the intended tab character.

        • anonymous

          Greg, means you display the symbol of ‘\’, like ? or ! if ? or ! don’t work already. t shows , then the letter t. The reason is because ‘\’ isn’t a valid symbol for a string; the compiler seems to ignore it.

          And also, a makes a beating sound. :) /

          Sorry, can’t figure out why the backslash isn’t showing….

          • rameye

            (backslash)a translates to the ASCII value for an alert beep. How this is implemented depends on your operating system. On my computer

            cout << '\a';

            does nothing. On yours it triggered the default system alert sound.

  • Syed

    Hi, I would like to know what is the sizeof(long double); ? In this tutorial it is mentioned 8 bytes. Where as i tried in Dev C++ compiler it is giving 12.

    Thanks,
    syed

    • The size of a long double can vary from machine to machine. Common sizes are 8, 12, and 16 bytes. The only way to know for sure is to use sizeof(long double) just as you have.

  • Mitul Golakiya

    My int takes 2 bytes.
    I am working with Turbo C.
    My PC is 32-bit.

  • Justin

    for some reason when trying to compile this I get

    Permission denied

    ld returned 1 exit status

  • Julian

    Wow, imagine how many variables I could store in a kilobyte :D

    I know that years ago when they used to use punchtape in stuff like CNC machines, having a strip of tape that was over 12 kilobytes of information was very impractical :P

    Now that I think about it, punchtape was just a lot like bytes, where each row on the tape would have 8 circles, some of which were punched out (which I guess would be the equivalent to a bit with value 1) and the computer would read the values off it… At least I think that’s how it worked ;S

    • rameye

      Back in the late 1970s we had an air show at the Air Force base I was stationed at. We had a paper tape punch machine set up so that attendees could type in their name or whatever and have it punched out on the tape as the actual letters, not the binary. Was a hit.

  • Adam

    What do you mean when you use ^ and n in your equations? Are these standard operators and variables?

  • James

    Copy this code and use it on windows to keep the console screen up long enough for you to see the size of the data types… Also, I removed one \t from “wchar_t:\t\t” cuz a second tab caused that line to be out of alignment with the rest. y is the number of seconds you want the program to wait before it ends.

    #include <iostream>
    #include <windows.h> // for Sleep()

    int main()
    {
        using namespace std;
        cout << "bool:\t\t" << sizeof(bool) << " bytes" << endl;
        cout << "char:\t\t" << sizeof(char) << " bytes" << endl;
        cout << "wchar_t:\t" << sizeof(wchar_t) << " bytes" << endl;
        cout << "short:\t\t" << sizeof(short) << " bytes" << endl;
        cout << "int:\t\t" << sizeof(int) << " bytes" << endl;
        cout << "long:\t\t" << sizeof(long) << " bytes" << endl;
        cout << "float:\t\t" << sizeof(float) << " bytes" << endl;
        cout << "double:\t\t" << sizeof(double) << " bytes" << endl;
        cout << "long double:\t" << sizeof(long double) << " bytes" << endl;
        int y = 30, x = y * 1000; // y is the sleep time in seconds; x is the same time in milliseconds
        Sleep(x); // Sleep() takes its argument in milliseconds
        return 0;
    }
    

    Oh, and alex, thanks for this tutorial. It’s the most complete, comprehensive, explanatory, and easy to follow tut on c++ I’ve ever ran across. Thanks for the time u spent in creating it, making it public, and taking time to answer people’s comments. You rock!

  • Ranjan

    What is the reason behind it? Why can’t you overload the sizeof operator?

    REgards,
    Ranjan

    • Why would you want to? :)

      Stroustrup says here:

      Sizeof cannot be overloaded because built-in operations, such as incrementing a pointer into an array implicitly depends on it. Consider:

      X a[10];
      X* p = &a[3];
      X* q = &a[3];
      p++; // p points to a[4]
      // thus the integer value of p must be
      // sizeof(X) larger than the integer value of q

      Thus, sizeof(X) could not be given a new and different meaning by the programmer without violating basic language rules.

  • Anon

    So does it really matter what variable type you use? Dont get me wrong i want to use the right one, but it seems a task to remember all of the maximum values etc.

    is there a trick to it?

    There’s an awful lot to choose from.

    • PReinie

      It depends on the use or application of the code and your engineering design. (You can’t put 5# of @&*!*^ in a 1# bag.) As Alex said above “However, for programs that need a large amount of variables (eg. 100,000), the difference between using 1 byte and 8 byte variables can be significant.” that’s an 8-times increase of memory.

      If you use the maximum word size for each piece of memory, your “product” may cost more because you need more memory to hold or run your program. The additional memory may also require larger circuit boards to hold the memory chips and power for the boards/memory which increases the weight and may require a fan (which takes power) to cool the components. This might be important when go to market or have to carry it to a space station, a battlefield, hiking up a mountain or as a cell phone or iPod.

      If you’re just providing code for existing computers it may not be that big a deal.

      (Sorry for the long-winded explanation. I’ve programmed to the bit level in assembly and in C for limited memory machines.)

  • This tutorial is so easy. It makes learning
    programing so easy. The problem comes once
    you get out of C++ and try to understand
    msdn help. These guys from Microsoft are from
    outer space. Just try looking up some of the
    things you learn here and see if you can get
    anything that explains it in common english.
    I could sure use a tutorial on how to use
    Microsoft help, once you get to the help
    you need. Some times they just go round and
    round.

  • kemawalker

    I am on a MAC (64bit) so my values are significantly larger for 4 of the types.

    QUESTION – how do you manage the risk that you might code for too large a type that won’t run on other machines? In other words, if I code using “long double” which is 16 bytes on my machine, but only 8 bytes on yours, will there be an issue when the program runs or no?

  • prafull.badyal

    some token error is given at cout statement …relating end1; in below progg.

    int x;
    cout << "x is " << sizeof(x) << " bytes"<<endl;

  • daksh


    bool: 1 bytes
    char: 1 bytes
    wchar_t: 2 bytes
    short: 2 bytes
    int: 4 bytes
    long: 4 bytes
    float: 4 bytes
    double: 8 bytes
    long double: 8 bytes

    Your results may vary if you are using a different type of machine, or a different compiler.

    I can understand that it depends on the machine.. But how does it depend on the compiler??

    • rameye

      Most modern compilers have options you can specify on the command line to target different architectures.

      There is a lot of information in the g++ documentation about this.

  • sanjeev_e

    Hi Alex/Members,

    Is there any other way to find the size of a variable without using sizeof(operator).

    I have seen in some of the websites that the below way we can find the same.

    int i = 1;

    size_t size1 = (char*)(&i+1)-(char*)(&i);

    size_t size2 = (int*)(&i+1)-(int*)(&i);

    cout<<size1<<"\t"<<size2<<endl;

    Output: 4 1

    Why it is varying here by typecasting with different datatypes?

    I want to know where it is restricted to use sizeof() operator anywhere? and also whether the above code internally uses sizeof() operator like

    size1 = address-diff/sizeof(char); size2 = address-diff/sizeof(int);

    Please clarify.

    Thanks,
    Sanjeev.

    • rameye

      Any time you use mathematical operators with pointer types, you then are working with pointer arithmetic, and the expressions will evaluate differently depending on the size of the type pointed to.


      int i;
      char* p1 = (char*)&i + 1; // p1 is the address of the next byte in the storage space of the integer i
      int* p2 = &i + 1; // p2 is the address of the next following integer, as if i were in an array

      In your posting, both subtractions measure the same distance between &i and &i+1. size1 subtracts char* pointers, so the result is counted in bytes (4, the size of an int on your machine). size2 subtracts int* pointers, so the result is counted in ints (1).

  • jimbo

    First of all thanks Alex for this amazing tutorial…you’re awesome!!

    I was just wondering if one bit can store a 0 or a 1 so two values and two bits can store 4 different values i.e. 0101. How come it says 3 bits can store 8 values i.e. 01010101? Surely 3 bits would store 6 different values if 1 bit stores 2 different values. 3*2 = 6. If I am missing the point can someone please try and explain where I am going wrong!?
    Many thanks in advance.

  • jimbo

    It’s ok I just worked it out. one bit can store two values i.e a 0 or a 1 i.e 2^1 2*1=2. 2 bits can store 4 different values i.e 2^2 2*2= 4 and 3 bits can store 8 different values i.e 2^3 2*2*2= 8.

    One more thing..just wondering what to the nth power means? is it like 2^9 or 2*2*2*2*2*2*2*2*2? Many thanks in advance.

    • tsdrifter

      n doesn’t stand for 9, it just stands for any integer. what he is saying is if you have a given number of bits, you can store 2^(number of bits). if you had nine bits, then n would equal 9, but this is not always the case.

    • rameye

      Since ^ is being so freely used here in the comments, I must mention to be careful using ^ as an operator in C++ expressions.

      C++ has NO built-in operator for exponentiation. The ^ operator in C++ is a binary (operating on two values) operator performing a bitwise XOR.

      For example what does 2^3 resolve to in C++?

      int x = 2^3;

      x will not contain the value 8, it will contain the value 1

      Why is this?

      Assuming 8 bit wide integers this is what happened:


      00000010 ^
      00000011
      --------
      00000001 <== bitwise XOR of the two integers results in value of 1
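      For anyone who wants an actual power in code, here is a small sketch that prints the XOR result next to a real exponentiation. The helper intPow is a made-up name for this example, not a standard function. It assumes a non-negative exponent and a result small enough to fit in a long long.

      ```cpp
      #include <iostream>

      // Hypothetical helper: raises base to a non-negative integer power by
      // repeated multiplication. (The standard library also offers std::pow
      // in <cmath>, which works on floating point values.)
      constexpr long long intPow(long long base, unsigned int exponent)
      {
          return exponent == 0 ? 1 : base * intPow(base, exponent - 1);
      }

      int main()
      {
          using namespace std;
          cout << "2 ^ 3 (bitwise XOR):\t" << (2 ^ 3) << endl;
          cout << "intPow(2, 3):\t\t" << intPow(2, 3) << endl;
          return 0;
      }
      ```

      The first line prints 1 and the second prints 8.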

  • bacia

    Using long doubles I made a program which can compute factorials up to 170!, more than my TI-831!

  • machello

    Hi Alex!

    I must say you have highly systematic tutorials which are very pedagogical written!

    I just have one remark which is worth to mention.
    If trying to determine the size of a structure or in general any object, you probably
    may not get the amount of bytes used by the object as you probably expected.
    Example:

    struct Something
    {
    int nX, nY, nZ;
    double dX;
    };

    Something sSomeStuff;

    You expect (64bit, gcc 4.7.2): sizeof( sSomeStuff ) == 20 (= 3*4 + 8)
    (i.e. sizeof(int) == 4, sizeof(double) == 8 ).
    The object uses 20 bytes of memory, or at least
    that much is allocated for the variables in the structure. However, you won’t get that value!
    Instead you get: sizeof( sSomeStuff ) == 24.

    I won’t explain in detail (the internet can be used for this purpose) but it is related to
    how the CPU can most efficiently fetch variables (blocks) or, more technically, to
    data structure alignment and internal padding.

    In the example, the largest variable size (double) is sizeof( double ) == 8. Total
    effective size of the structure (without padding) is 20. Structure block size must
    be (or should, for most systems) a multiple of the largest variable size in that structure,
    in the example, a multiple of 8: 8, 16, 24, 32,…
    Optimal structure size is not 16, which is too small,
    but 24, which is larger than the effective structure size.

    With proper order of type declarations, additional padding bytes could be minimized.
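    A rough sketch of that last point: the program below prints the size of the structure from this comment, plus two made-up structures (Loose and Tight are hypothetical names) that hold the same members in a different order. The exact numbers depend on the compiler and platform, so treat any figures as typical rather than guaranteed.

    ```cpp
    #include <iostream>

    // The structure from the comment above.
    struct Something
    {
        int nX, nY, nZ;
        double dX;
    };

    // Two illustrative structures with identical members in a different order.
    struct Loose
    {
        char a;   // usually followed by padding so that d is aligned
        double d;
        char b;   // usually followed by padding at the end of the struct
    };

    struct Tight
    {
        double d; // largest member first
        char a;
        char b;   // the two chars can share one padded tail
    };

    int main()
    {
        using namespace std;
        cout << "Something:\t" << sizeof(Something) << " bytes" << endl;
        cout << "Loose:\t\t" << sizeof(Loose) << " bytes" << endl;
        cout << "Tight:\t\t" << sizeof(Tight) << " bytes" << endl;
        return 0;
    }
    ```

    With gcc on a typical 64-bit machine this commonly reports 24, 24, and 16 bytes, so Tight saves 8 bytes over Loose just by reordering.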

    With regards,
    M
