As you learned in the lesson on basic addressing, memory on modern machines is typically organized into byte-sized pieces, with each piece having a unique address. Up to this point, it has been useful to think of memory as a bunch of cubbyholes or mailboxes where we can put and retrieve information, and variables as names for accessing those cubbyholes or mailboxes.
However, this analogy is not quite correct in one regard: most variables actually take up more than 1 byte of memory. Consequently, a single variable may use 2, 4, or even 8 consecutive memory addresses. The amount of memory that a variable uses is based on its data type. Fortunately, because we typically access memory through variable names and not memory addresses, the compiler is largely able to hide the details of working with different sized variables from us.
There are several reasons it is useful to know how much memory a variable takes up.
First, the more memory a variable takes up, the more information it can hold. Because each bit can only hold a 0 or a 1, a single bit can store 2 values. 2 bits can store 4 different values:
| bit 0 | bit 1 |
|---|---|
| 0 | 0 |
| 0 | 1 |
| 1 | 0 |
| 1 | 1 |
3 bits can store 8 values. n bits can store 2^n values. Because a byte is 8 bits, a byte can store 2^8 (256) values.
The size of the variable puts a limit on the amount of information it can store — variables that are bigger can hold larger numbers. We will address this issue further when we get into the different types of variables.
Second, computers have a finite amount of free memory. Every time we declare a variable, a small portion of that free memory is used for as long as the variable exists. Because modern computers have a lot of memory, this often isn’t a problem, especially if you are only declaring a few variables. However, for programs that need a large number of variables (e.g. 100,000), the difference between using 1-byte and 8-byte variables can be significant.
The obvious next question is “how much memory do variables of different data types take?”. The size of a given data type is dependent on the compiler and/or the computer architecture. On most 32-bit machines (as of this writing), a char is 1 byte, a bool is 1 byte, a short is 2 bytes, an int is 4 bytes, a long is 4 bytes, a float is 4 bytes, and a double is 8 bytes.
In order to determine the size of data types on a particular machine, C++ provides an operator named sizeof. The sizeof operator is a unary operator that takes either a type or a variable, and returns its size in bytes. You can compile and run the following program to find out how large your data types are:
#include <iostream>

int main()
{
    using namespace std;
    cout << "bool:\t\t" << sizeof(bool) << " bytes" << endl;
    cout << "char:\t\t" << sizeof(char) << " bytes" << endl;
    cout << "wchar_t:\t" << sizeof(wchar_t) << " bytes" << endl;
    cout << "short:\t\t" << sizeof(short) << " bytes" << endl;
    cout << "int:\t\t" << sizeof(int) << " bytes" << endl;
    cout << "long:\t\t" << sizeof(long) << " bytes" << endl;
    cout << "float:\t\t" << sizeof(float) << " bytes" << endl;
    cout << "double:\t\t" << sizeof(double) << " bytes" << endl;
    cout << "long double:\t" << sizeof(long double) << " bytes" << endl;
    return 0;
}
Here is the output from the author’s Pentium 4 machine, using Visual Studio 2005 Express:
bool: 1 bytes
char: 1 bytes
wchar_t: 2 bytes
short: 2 bytes
int: 4 bytes
long: 4 bytes
float: 4 bytes
double: 8 bytes
long double: 8 bytes
Your results may vary if you are using a different type of machine, or a different compiler.
If you’re wondering what \t is in the above program, it’s a special symbol that inserts a tab. We will cover \t and other special symbols when we talk about the char data type.
Interestingly, the sizeof operator is one of the few operators in C++ that is a word instead of a symbol. Others include new, delete, and typeid.
You can also use the sizeof operator on a variable name:
int x;
cout << "x is " << sizeof(x) << " bytes"<<endl;
x is 4 bytes
Now you know enough about variables that we can start discussing the different data types!
Never heard of wchar_t before... is that a new data type? What kind of data does it hold?
There’s more information about wchar_t on wikipedia. In short, it was meant to be used to hold “wide characters” (eg. those that take more than 8 bits to represent). However, the size varies depending on platform (and can be as small as 8 bits), so I’m not sure I see the practical use.
Pocket PC/Windows Mobile only uses wide characters. Beyond that the practical use is for foreign languages. Some languages (like Chinese) have a lot more than 128 characters.
“On most 32-bit machines …. ”
I am not sure if this is the right place to ask this question.
What is a “32-bit machine”?
Thanks,
Nikki
Computers work by moving binary digits (bits) around. However, most computers do not work with individual bits — rather, they move data around in chunks. This chunk size is called a “word”. Typically, when we speak of the bit-ness of a machine, we speak of the size of a word. Thus, a 32-bit machine has a 32-bit word size, which means it moves information around 32-bits at a time.
Typically, modern computers use one word to address memory. With a 32-bit word, this means there are about 2^32 (4 billion) unique memory addresses that can be addressed. This is why 32-bit machines generally can’t make use of more than 4GB of memory.
The amount of memory a machine can address has nothing to do with the “bitness” of the machine — in fact there is no real consensus on what it means to be of a certain “bitness”.
Some 32-bit machines (such as the Pentium Pro and later 32-bit x86s) can address more than 4 GiB, but they are still only considered 32-bit computers.
I (personally) only consider a machine to be an n bit machine, if it is able to (at least) address up to 2^n individual bytes, is able to hold at least n bits in all registers and can do all operations on n bit (or larger) registers.
So, if I understand this correctly, the computer moves data around in 32-bit chunks (4 bytes), but in my C++ program I can assign variables to a single byte of memory. This seems contradictory to me.
I guess my question boils down to: Say I have a program that uses a large number of variables, so space matters. One byte will suffice for the data I need to carry so to save space I assign them as chars (1 byte). But, if the smallest “chunk” that the computer passes around is 4 bytes, does this actually save any space? Or would this effectively use the same amount of memory as if I made my variables ints (4 bytes, or one full word)?
This is a somewhat subtle question, so let me know if I’m not being clear.
PS – Thanks for a clear, organized, and well-written tutorial! I’ve really enjoyed it so far.
It would be easier to explain the whole thing using the architecture name x86(x86_64) rather than calling them ’32-bit’ machines. Confuses people.
my long double takes 12 bytes
I bet you say that to all the ladies.
now THAT is funny!!
This page should have a like button! :D
As does mine.
In the code above, all the “\t” characters are showing as “\\t” in my browser. This makes it show up as “\\t\\t” when you run the program instead of tabs. I don’t know if this is a browser displaying that code wrong (I’m running Firefox) or if you just typed that in incorrectly. Easily fixed by deleting the extra backslash.
Looks like it’s just a mistake. I’d expect anything containing “\\t” or “\\n” to purposely have two backslashes so they don’t become actual tabs or newlines, but the apparent “output” is there. I still think it’s a mistake.
Each "\\t" above should actually be "\t", otherwise a literal ‘\t’ will be printed rather than the intended tab character.
Greg, means you display the symbol of ‘\’, like ? or ! if ? or ! don’t work already. t shows , then the letter t. The reason is because ‘\’ isn’t a valid symbol for a string; the compiler seems to ignore it.
And also, a makes a beating sound. :) /
Sorry, can’t figure out why the backslash isn’t showing….
Hi, I would like to know what sizeof(long double) is. This tutorial says 8 bytes, whereas when I tried it in the Dev-C++ compiler it gives 12.
Thanks,
syed
The size of a long double can vary from machine to machine and from compiler to compiler. Common values are 8, 12, and 16 bytes. The only way to know for sure is to use sizeof(long double) just as you have.
My int takes 2 bytes.
I am working with Turbo C.
My PC is 32-bit.
Turbo C only produces 16-bit DOS programs and is not C++ compliant. Consider upgrading to DJGPP, MinGW, or any other modern compiler.
For some reason, when trying to compile this I get:
Permission denied
ld returned 1 exit status
Wow, imagine how many variables I could store in a kilobyte :D
I know that years ago when they used to use punchtape in stuff like CNC machines, having a strip of tape that was over 12 kilobytes of information was very impractical :P
Now that I think about it, punchtape was just a lot like bytes, where each row on the tape would have 8 circles, some of which were punched out (which I guess would be the equivalent to a bit with value 1) and the computer would read the values off it… At least I think that’s how it worked ;S
What do you mean when you use ^ and n in your equations? Are these standard operators and variables?
2^n means 2 to the nth power. eg. 2^3 = 2 * 2 * 2 = 8. 2^4 = 2 * 2 * 2 * 2 = 16.
Copy this code and use it on Windows to keep the console screen up long enough for you to see the size of the data types… Also, I removed one \t from “wchar_t:\t\t” because a second tab caused that line to be out of alignment with the rest. y is the number of seconds you want the program to wait before it ends.
#include <iostream>
#include <windows.h>

int main()
{
    using namespace std;
    cout << "bool:\t\t" << sizeof(bool) << " bytes" << endl;
    cout << "char:\t\t" << sizeof(char) << " bytes" << endl;
    cout << "wchar_t:\t" << sizeof(wchar_t) << " bytes" << endl;
    cout << "short:\t\t" << sizeof(short) << " bytes" << endl;
    cout << "int:\t\t" << sizeof(int) << " bytes" << endl;
    cout << "long:\t\t" << sizeof(long) << " bytes" << endl;
    cout << "float:\t\t" << sizeof(float) << " bytes" << endl;
    cout << "double:\t\t" << sizeof(double) << " bytes" << endl;
    cout << "long double:\t" << sizeof(long double) << " bytes" << endl;
    int y = 30, x = y * 1000; // y is the sleep time in seconds; x converts it to milliseconds for Sleep()
    Sleep(x);
    return 0;
}
Oh, and Alex, thanks for this tutorial. It’s the most complete, comprehensive, explanatory, and easy to follow tut on C++ I’ve ever run across. Thanks for the time you spent in creating it, making it public, and taking time to answer people’s comments. You rock!
Why couldn’t you say;
Or better yet, just;
system("pause");
return 0;
I present a better solution to this problem in lesson 0.7 (a few common cpp problems).
What is the reason you can’t overload the sizeof operator?
Regards,
Ranjan
Why would you want to? :)
Stroustrup says here:
So does it really matter what variable type you use? Don’t get me wrong, I want to use the right one, but it seems a task to remember all of the maximum values etc.
is there a trick to it?
There’s an awful lot to choose from.
It depends on the use or application of the code and your engineering design. (You can’t put 5# of @&*!*^ in a 1# bag.) As Alex said above “However, for programs that need a large amount of variables (eg. 100,000), the difference between using 1 byte and 8 byte variables can be significant.” that’s an 8-times increase of memory.
If you use the maximum word size for each piece of memory, your “product” may cost more because you need more memory to hold or run your program. The additional memory may also require larger circuit boards to hold the memory chips and power for the boards/memory, which increases the weight and may require a fan (which takes power) to cool the components. This might be important when you go to market or have to carry it to a space station, a battlefield, hiking up a mountain, or as a cell phone or iPod.
If you’re just providing code for existing computers it may not be that big a deal.
(Sorry for the long-winded explanation. I’ve programmed to the bit level in assembly and in C for limited memory machines.)
This tutorial is so easy. It makes learning programming so easy. The problem comes once you get out of C++ and try to understand MSDN help. These guys from Microsoft are from outer space. Just try looking up some of the things you learn here and see if you can get anything that explains it in common English. I could sure use a tutorial on how to use Microsoft help, once you get to the help you need. Sometimes they just go round and round.
I am on a Mac (64-bit), so my values are significantly larger for 4 of the types.
QUESTION – how do you manage the risk that you might code for too large a type that won’t run on other machines? In other words, if I code using “long double” which is 16 bytes on my machine, but only 8 bytes on yours, will there be an issue when the program runs or no?
I get a token error at the cout statement, relating to end1, in the program below.
int x;
cout << "x is " << sizeof(x) << " bytes" << endl;
bool: 1 bytes
char: 1 bytes
wchar_t: 2 bytes
short: 2 bytes
int: 4 bytes
long: 4 bytes
float: 4 bytes
double: 8 bytes
long double: 8 bytes
Your results may vary if you are using a different type of machine, or a different compiler.
I can understand that it depends on the machine.. But how does it depend on the compiler??
Hi Alex/Members,
Is there any other way to find the size of a variable without using the sizeof operator?
I have seen on some websites that it can be done the following way:
int i = 1;
size_t size1 = (char*)(&i+1)-(char*)(&i);
size_t size2 = (int*)(&i+1)-(int*)(&i);
cout<<size1<<"\t"<<size2<<endl;
Output: 4 1
Why does the result vary when casting to different pointer types?
I want to know whether there is anywhere the sizeof operator is not allowed, and also whether the above code internally uses the sizeof operator, like
size1 = address-diff/sizeof(char); size2 = address-diff/sizeof(int);
Please clarify.
Thanks,
Sanjeev.
First of all, thanks Alex for this amazing tutorial… you’re awesome!!
I was just wondering: if one bit can store a 0 or a 1 (so two values) and two bits can store 4 different values, how come it says 3 bits can store 8 values? Surely 3 bits would store 6 different values if 1 bit stores 2 different values: 3*2 = 6. If I am missing the point, can someone please try and explain where I am going wrong!?
Many thanks in advance.
It’s ok, I just worked it out. One bit can store two values (a 0 or a 1), i.e. 2^1 = 2. 2 bits can store 4 different values, i.e. 2^2 = 2*2 = 4, and 3 bits can store 8 different values, i.e. 2^3 = 2*2*2 = 8.
One more thing..just wondering what to the nth power means? is it like 2^9 or 2*2*2*2*2*2*2*2*2? Many thanks in advance.