7.14 — Ellipses (and why to avoid them)

In all of the functions we’ve seen so far, the number of parameters a function will take must be known in advance (even if they have default values). However, there are certain cases where it would be useful to be able to pass a variable number of arguments to a function. C++ provides a special specifier known as ellipses (aka “…”) that allows us to do precisely this.

Because ellipses are rarely used, dangerous, and we strongly recommend avoiding their use, this section can be considered optional reading.

Functions that use ellipses take the form return_type function_name(argument_list, ...). argument_list is one or more fixed parameters, just like normal functions use. The ellipses (which are represented as three periods in a row) must always be the last parameter in the function. Any arguments passed to the function must match the argument_list. The ellipses capture any additional arguments (if there are any). Though it is not quite accurate, it is conceptually useful to think of the ellipses as an array that holds any additional parameters beyond those in the argument_list.

The best way to learn about ellipses is by example. So let’s write a simple program that uses ellipses. Let’s say we want to write a function that calculates the average of a bunch of integers. We’d do it like this:

#include <cstdarg> // needed to use ellipses
#include <iostream> // needed for cout and endl

using std::cout;
using std::endl;

// The ellipses must be the last parameter
double FindAverage(int nCount, ...)
{
    long lSum = 0;

    // We access the ellipses through a va_list, so let's declare one
    va_list list;

    // We initialize the va_list using va_start.  The first parameter is
    // the list to initialize.  The second parameter is the last non-ellipse
    // parameter.
    va_start(list, nCount);

    // Loop nCount times
    for (int nArg=0; nArg < nCount; nArg++)
         // We use va_arg to get parameters out of our ellipses
         // The first parameter is the va_list we're using
         // The second parameter is the type of the parameter
         lSum += va_arg(list, int);

    // Cleanup the va_list when we're done.
    va_end(list);

    return static_cast<double>(lSum) / nCount;
}

int main()
{
    cout << FindAverage(5, 1, 2, 3, 4, 5) << endl;
    cout << FindAverage(6, 1, 2, 3, 4, 5, 6) << endl;

    return 0;
}

This code prints

3
3.5

As you can see, this function takes a variable number of parameters! Now, let’s take a look at the components that make up this example.

First, we have to include the cstdarg header. This header defines the type va_list and the macros va_start, va_arg, and va_end, which we need in order to access the parameters that are part of the ellipses.

We then declare our function that uses the ellipses. Remember that the argument list must be one or more fixed parameters. In this case, we’re passing in a single integer that tells us how many numbers to average. The ellipses always come last.

Note that the ellipses parameter has no name! Instead, we access the values in the ellipses through a special type known as va_list. It is conceptually useful to think of va_list as a pointer that points to the ellipses array. First, we declare a va_list, which we’ve called “list” for simplicity.

The next thing we need to do is make list point to our ellipses parameters. We do this by calling va_start(). va_start() takes two parameters: the va_list itself, and the name of the last non-ellipse parameter in the function. Once va_start() has been called, va_list points to the first parameter in the ellipses.

To get the value of the parameter that va_list currently points to, we use va_arg(). va_arg() also takes two parameters: the va_list itself, and the type of the parameter we’re trying to access. Note that va_arg() also moves the va_list to the next parameter in the ellipses!

Finally, to clean up when we are done, we call va_end(), with va_list as the parameter.
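The same four steps (declare a va_list, va_start, va_arg, va_end) apply to other argument types. As a minimal sketch, here is a hypothetical companion function, SumDoubles(), that is not part of the original example. It reads each ellipses argument as a double. Note that arguments passed through the ellipses undergo default argument promotions, so a float argument arrives as a double and must be read with va_arg(list, double).

```cpp
#include <cstdarg>

// Hypothetical helper (not from the lesson): sums nCount double arguments.
double SumDoubles(int nCount, ...)
{
    double dSum = 0.0;

    va_list list;            // step 1: declare the va_list
    va_start(list, nCount);  // step 2: point it at the first ellipses argument

    for (int nArg = 0; nArg < nCount; ++nArg)
        dSum += va_arg(list, double); // step 3: read the next argument as a double

    va_end(list);            // step 4: clean up

    return dSum;
}
```

A call such as SumDoubles(3, 1.5, 2.5, 3.0) adds the three values after the count. The caller is still responsible for passing doubles and for getting the count right.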

Why ellipses are dangerous

Ellipses offer the programmer a lot of flexibility to implement functions that can take a variable number of parameters. However, this flexibility comes with some very dangerous downsides.

With regular function parameters, the compiler uses type checking to ensure the types of the function arguments match the types of the function parameters (or can be implicitly converted so they match). This helps ensure you don’t pass a function an integer when it was expecting a string, or vice versa. However, note that ellipses parameters have no type declarations. When using ellipses, the compiler completely suspends type checking for ellipses parameters. This means it is possible to send arguments of any type to the ellipses! However, the downside is that the compiler will no longer be able to warn you if you call the function with ellipses arguments that do not make sense. When using the ellipses, it is completely up to the caller to ensure the function is called with ellipses arguments that the function can handle. Obviously that leaves quite a bit of room for error (especially if the caller wasn’t the one who wrote the function).

Let’s look at an example of a mistake that is pretty subtle:

    cout << FindAverage(6, 1.0, 2, 3, 4, 5, 6) << endl;

Although this may look harmless enough at first glance, note that the second argument (the first ellipse argument) is a double instead of an integer. This compiles fine, and produces a somewhat surprising result:

1.78782e+008

which is a REALLY big number. How did this happen?

As you have learned in previous lessons, a computer stores all data as a sequence of bits. A variable’s type tells the computer how to translate that sequence of bits into a meaningful value. However, you just learned that the ellipses throw away the variable’s type! Consequently, the only way to get a meaningful value back from the ellipses is to manually tell va_arg() what the expected type of the next parameter is. This is what the second parameter of va_arg() does. If the actual parameter type doesn’t match the expected parameter type, bad things will usually happen.

In the above FindAverage program, we told va_arg() that our variables are all expected to have a type of int. Consequently, each call to va_arg() will return the next sequence of bits translated as an integer.

In this case, the problem is that the double we passed in as the first ellipse argument is 8 bytes, whereas va_arg(list, int) will only return 4 bytes of data with each call. Consequently, the first call to va_arg will only read the first 4 bytes of the double (producing a garbage result), and the second call to va_arg will read the second 4 bytes of the double (producing another garbage result). Thus, our overall result is garbage.
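To see why reading a double 4 bytes at a time produces garbage, we can split a double into two 32-bit halves ourselves. The sketch below uses a hypothetical helper, SplitDouble(), and assumes an 8-byte IEEE 754 double. It copies the bytes with std::memcpy, which is well-defined, rather than misusing va_arg, which is undefined behavior when the types don’t match. It only illustrates the stack-based picture described above. On some platforms doubles and ints are passed to variadic functions in different registers, so the actual garbage you see may differ.

```cpp
#include <cstdint>
#include <cstring>
#include <utility>

// Hypothetical helper (not from the lesson): returns the two 32-bit halves
// of a double's object representation, in the order they sit in memory.
std::pair<std::int32_t, std::int32_t> SplitDouble(double dValue)
{
    static_assert(sizeof(double) == 2 * sizeof(std::int32_t),
                  "this sketch assumes an 8-byte double");

    std::int32_t halves[2];
    std::memcpy(halves, &dValue, sizeof(dValue)); // copy the raw bytes

    return { halves[0], halves[1] };
}
```

For 1.0, one half is 0 and the other is 1072693248 (which half comes first depends on the machine’s byte order). Neither value has anything to do with the number 1, which is why the average comes out wrong.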

Because type checking is suspended, the compiler won’t even complain if we do something completely ridiculous, like this:

    int nValue = 7;
    cout << FindAverage(6, 1.0, 2, "Hello, world!", 'G', &nValue, &FindAverage) << endl;

Believe it or not, this actually compiles just fine, and produces the following result on the author’s machine:

1.79766e+008

This result epitomizes the phrase “Garbage in, garbage out”, a popular computer science phrase “used primarily to call attention to the fact that computers, unlike humans, will unquestioningly process the most nonsensical of input data and produce nonsensical output” (Wikipedia).

So, in summary, type checking on the parameters is suspended, and we have to trust the caller to pass in the right type of parameters. If they don’t, the compiler won’t complain — our program will just produce garbage (or maybe crash).

As if that wasn’t dangerous enough, we run into a second potential problem. Not only do the ellipses throw away the type of the parameters, they also throw away the number of parameters in the ellipses! This means we have to devise our own solution for keeping track of the number of parameters passed into the ellipses. Typically, this is done in one of two ways:

  1. One of the fixed parameters is used as a parameter count (this is the solution we use in the FindAverage example above)
  2. The ellipse parameters are processed until a sentinel value is reached. A sentinel is a special value that is used to terminate a loop when it is encountered. For example, we could pick a sentinel value of 0, and continually process ellipse parameters until we find a 0 (which should be the last value). Sentinel values only work well if you can find a sentinel value that is not a legal data value.
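The FindAverage example above already shows the first approach. As a rough sketch of the second, here is a hypothetical sentinel-based variant, FindAverageSentinel(). It assumes that -1 is never a legal data value (that is, we only average non-negative integers) and uses it as the sentinel. Because va_start needs a named parameter, the first value to average is passed as a fixed parameter.

```cpp
#include <cstdarg>

// Hypothetical sentinel-based variant (not from the lesson).
// Averages the int arguments that come before the sentinel value -1.
double FindAverageSentinel(int nFirst, ...)
{
    const int nSentinel = -1; // assumption: -1 is never a legal data value

    // If the very first argument is the sentinel, there is nothing to average
    if (nFirst == nSentinel)
        return 0.0;

    long lSum = nFirst;
    int nCount = 1;

    va_list list;
    va_start(list, nFirst);

    // Keep reading ints until we hit the sentinel
    while (true)
    {
        int nArg = va_arg(list, int);
        if (nArg == nSentinel)
            break;

        lSum += nArg;
        ++nCount;
    }

    va_end(list);

    return static_cast<double>(lSum) / nCount;
}
```

A call looks like FindAverageSentinel(1, 2, 3, 4, 5, -1). The caller no longer has to count the arguments, but must remember to end the list with -1.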

However, even here we run into trouble. For example, consider the following call:

    cout << FindAverage(6, 1, 2, 3, 4, 5) << endl;

On the author’s machine at the time of writing, this produced the result:

699773

What happened? We told FindAverage() we were going to give it 6 values, but we only gave it 5. Consequently, the first five values that va_arg() returns were the ones we passed in. The 6th value it returns was a garbage value somewhere in the stack. Consequently, we got a garbage answer.

When using a sentinel value, if the caller forgets to include the sentinel, the loop will run continuously until it runs into garbage that matches the sentinel (or crashes).

Recommendations for safer use of ellipses

First, if possible, do not use ellipses at all! Oftentimes, other reasonable solutions are available, even if they require slightly more work. For example, in our FindAverage() program, we could have passed in a dynamically sized array of integers instead. This would have provided strong type checking (to make sure the caller doesn’t try to do something nonsensical) while preserving the ability to pass a variable number of integers to be averaged.
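As a minimal sketch of that alternative, here is one way it could look using std::vector as the dynamically sized array. The name FindAverageVec() is hypothetical, and the brace-initialized call shown afterwards assumes a C++11 or later compiler.

```cpp
#include <vector>

// Hypothetical ellipses-free alternative (not from the lesson):
// averages the integers stored in a std::vector.
double FindAverageVec(const std::vector<int>& values)
{
    // Avoid dividing by zero when there is nothing to average
    if (values.empty())
        return 0.0;

    long lSum = 0;
    for (int nValue : values)
        lSum += nValue;

    return static_cast<double>(lSum) / values.size();
}
```

A call such as FindAverageVec({1, 2, 3, 4, 5}) still accepts any number of integers, but the vector knows its own size, so there is no count to get wrong. Passing something like "Hello, world!" as an element is now a compile error instead of a garbage result.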

Second, if you do use ellipses, do not mix expected argument types within your ellipses if possible. Doing so vastly increases the possibility of the caller inadvertently passing in data of the wrong type and va_arg() producing a garbage result.

Third, using a count parameter as part of the argument list is generally safer than using a sentinel as an ellipses parameter. This forces the user to pick an appropriate value for the count parameter, which ensures the ellipses loop will terminate after a reasonable number of iterations even if it produces a garbage value.

16 comments to 7.14 — Ellipses (and why to avoid them)

  • Tom

    Hello Alex -

    Interesting. Could you elaborate on what is meant by:

    “Finally, to clean up when we are done, we call va_end(), with va_list as the parameter.” ??

    What does va_end() actually do, does it just reset the pointer into the list to the default value? (Wouldn’t that be the same as resetting the pointer va_list to 0? Or, is there more to it?) What is the practical effect of calling va_end()? If we reference the list pointed to by va_list after calling va_end(), do we just get the first element in the ellipses again?

    For example:

    va_end();
    va_start(list, nCount);
    int nX = va_arg(list, int);
    

    Would that set nX equal to the first element of the ellipses?

    • My understanding (and I may be wrong about this) is that the implementation of va_start() and va_arg() is left up to the compiler. If that’s actually the case, then va_end() could do any necessary cleanup.

      I looked at how va_end() was implemented in Microsoft Visual Studio and this is how it is defined:

      #define va_end(ap) ap = (va_list)0
      

      As you can see, it’s actually a macro function that just sets the ap parameter to 0. So, at least with Microsoft Visual Studio, there’s no real practical effect of calling va_end(), outside of maybe NULLing your list in case you inadvertently try to use it again without calling va_start().

      In your example, you’d have to pass list into va_end(), but va_start() should cause the list to start at the beginning of the ellipses again — so yes, nX would be the first element of the ellipses.

  • Ben

    I think it might be worth mentioning the ellipses’ value in formatted-string functions such as the printf() family.
    It would be ideal for implementing a date() function similar to that in PHP, for instance, where the number and type of parameters is explicitly defined in the format string. Obviously this is also dangerous (possibly even more so) but it would demonstrate handling mixed-type variable arguments. Of course it might also demonstrate How To Break Your Stack (TM).

  • a

    “Consequently, the first call to va_arg will only read the first 4 types of the double (producing a garbage result), ”

    shouldn’t that read “first 4 bytes“?

  • baldo

    Hey Alex, I suggest cleaning many of the spam or useless comments because they clutter the rest of comments that are really useful. Thanks!

  • AndresJak

    Why is it that when I use big numbers, for example FindAverage(71, 73, 85), or any other big numbers, it gives me bizarre answers every time I run the program? At first I thought my program’s code was wrong, but then I used your code and it still gave me weird answers. Or did I miss something you mentioned?

    • abhishekchauhan

      I know this is a very late reply, but here’s the answer for benefit of those who are visiting this page for the first time…

      The first argument to your function FindAverage() should have been the number of variable list arguments. You’ve passed 71 there.

      This means, the program will keep on looking at 71*sizeof(int) bytes of memory and will do your job, the exact issue pointed out in this article.

      Really guys, don’t use “ellipsed” functions as far as possible. Try to use the concept of passing an array of pointers instead(the way they are passed in main()).

  • earl_k

    So, do we conclude that the use of ellipsis in creating functions with a variable list of parameters is really dangerous? Personally, if I base my conclusion on the author’s article, I will still use the ellipsis as far as it offers me convenience and flexibility, because the dangers cited here are all due to the recklessness of the user calling the function (and stupidity if you are the one who created that specific function and you don’t know what you are doing).

    cout << FindAverage(6, 1, 2, 3, 4, 5) << endl;

    And if I was the one who created this function and let the public use it without me explaining how to correctly use it, then that's irresponsibility.

    Calling printf and its cousins without really understanding it, well, I no longer know what you are.

    But anyway, with or without ellipsis, misuse of functions really has its perils.
