Suggested speed improvement when defining string with value immediately, instead of delaying


I'm currently reading "The C++ Programming Language: Special Edition" by Bjarne Stroustrup, and on page 133 it states the following:

For user-defined types, postponing the definition of a variable until a suitable initializer is available can also lead to better performance. For example:

string s;  /* .... */ s = "The best is the enemy of the good.";

can easily be much slower than

string s = "Voltaire";

I know it says "can easily", which means it won't necessarily be so, but let's assume it does occur.

Why would this make a potential performance increase?

Is it only so with user-defined types (or even STL types) or is this also the case with int, float, etc?

There are 7 best solutions below

1
On

I'd say this is mainly about types with non-trivial default constructors, at least as far as performance is concerned.

The difference between the two approaches is that:

  • In the first version, an empty string is first constructed (using the default constructor); then the assignment operator is used to effectively throw away the work done by the default constructor, and to assign the new value to the string.
  • In the second version, the required value is set right away, at the point of construction.

Of course, it is really hard to tell a priori how big a performance difference this would make.
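
To make the two bullet points concrete, here is a minimal sketch that counts which functions run in each version. Tracer is a hypothetical stand-in for string (it is not part of any library), and the function names are made up for illustration.

```cpp
#include <string>

// Hypothetical instrumented type: counts which member functions run.
struct Tracer {
    static int default_ctor_calls;
    static int value_ctor_calls;
    static int assign_calls;

    std::string text;

    Tracer() { ++default_ctor_calls; }
    Tracer(const char* t) : text(t) { ++value_ctor_calls; }
    Tracer& operator=(const char* t) {
        text = t;
        ++assign_calls;
        return *this;
    }
};

int Tracer::default_ctor_calls = 0;
int Tracer::value_ctor_calls = 0;
int Tracer::assign_calls = 0;

// First version: define now, assign later (two operations).
void define_then_assign() {
    Tracer s;
    s = "The best is the enemy of the good.";
}

// Second version: define with the initializer (one operation).
void define_with_value() {
    Tracer s = "Voltaire";
}
```

Calling define_then_assign() bumps the default-constructor and assignment counters once each, while define_with_value() only bumps the value-constructor counter. How much time that saves depends entirely on what those functions do for the real type.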

1
On
  1. It takes time to execute a default constructor. Overriding what it initialized the string to in the subsequently invoked assignment operator takes time, too.

  2. Execution might never reach the assignment if the function is left (through a return statement or an exception) between the default constructor call and the assignment. In that case, the object was default-initialized unnecessarily.

  3. Implementations may spend extra work making sure the object's destructor is called if an exception is thrown. If the object is instead defined in a later scope that is never reached, that work isn't needed either.
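
Point 2 can be sketched like this. Counted and both function names are hypothetical, chosen only to show that the early definition pays for a construction even on the path that returns before the value is needed.

```cpp
#include <string>

// Hypothetical type that counts how many objects were constructed.
struct Counted {
    static int constructed;
    std::string text;

    Counted() { ++constructed; }
    Counted(const char* t) : text(t) { ++constructed; }
    Counted& operator=(const char* t) { text = t; return *this; }
};

int Counted::constructed = 0;

// Early definition: the object is built even when we return before using it.
bool early_definition(bool bail_out) {
    Counted s;
    if (bail_out) return false;
    s = "needed after all";
    return !s.text.empty();
}

// Late definition: nothing is constructed on the early-return path.
bool late_definition(bool bail_out) {
    if (bail_out) return false;
    Counted s = "needed after all";
    return !s.text.empty();
}
```

With bail_out set to true, early_definition constructs (and destroys) one Counted for nothing, while late_definition constructs none.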

0
On

Because:

string s;  /* .... */ s = "The best is the enemy of the good.";    

Involves two operations: Construction and Assignment

While:

string s = "Voltaire";   

Involves only construction.

This is analogous to choosing member initializer lists over assignment in the constructor body.
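
A rough sketch of that analogy, using a hypothetical Label member type with counters so the difference is observable (none of these names come from a real library):

```cpp
#include <string>

// Hypothetical member type that records how it was set up.
struct Label {
    static int default_ctor_calls;
    static int assign_calls;

    std::string text;

    Label() { ++default_ctor_calls; }
    Label(const char* t) : text(t) {}
    Label& operator=(const char* t) {
        text = t;
        ++assign_calls;
        return *this;
    }
};

int Label::default_ctor_calls = 0;
int Label::assign_calls = 0;

// Assignment in the constructor body: `name` is default-constructed first,
// then assigned.
struct AssignInBody {
    Label name;
    AssignInBody() { name = "Voltaire"; }
};

// Member initializer list: `name` is constructed directly from the value.
struct InitInList {
    Label name;
    InitInList() : name("Voltaire") {}
};
```

Creating an AssignInBody costs one default construction plus one assignment of the member; creating an InitInList costs a single construction.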

1
On

There are three ways to define the string:

string  s;         // Default constructor
string  s = "..."; // Copy-initialization: calls the const char* constructor (no assignment)
string  s("...");  // Direct-initialization: calls the const char* constructor

The string class needs to allocate memory. It is better to allocate once, at a point where it already knows how much memory it needs.
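
Here is a sketch of the allocation argument. Buffer is a hypothetical string-like class and is NOT how std::string works; in particular, the eager 16-byte allocation in its default constructor is an assumption made only to make the cost visible (many real implementations allocate nothing for an empty string).

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical string-like class that counts its heap allocations.
class Buffer {
public:
    static int allocations;

    // Assumption for illustration: the default constructor allocates eagerly.
    Buffer() : data_(allocate(kDefaultCapacity)), capacity_(kDefaultCapacity) {
        data_[0] = '\0';
    }

    // Knows the required size up front, so it allocates exactly once.
    Buffer(const char* s)
        : data_(allocate(std::strlen(s) + 1)), capacity_(std::strlen(s) + 1) {
        std::strcpy(data_, s);
    }

    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;
    ~Buffer() { delete[] data_; }

    Buffer& operator=(const char* s) {
        std::size_t needed = std::strlen(s) + 1;
        if (needed > capacity_) {  // default buffer too small: allocate again
            char* bigger = allocate(needed);
            delete[] data_;
            data_ = bigger;
            capacity_ = needed;
        }
        std::strcpy(data_, s);
        return *this;
    }

    const char* c_str() const { return data_; }

private:
    static const std::size_t kDefaultCapacity = 16;

    static char* allocate(std::size_t n) {
        ++allocations;
        return new char[n];
    }

    char* data_;
    std::size_t capacity_;
};

int Buffer::allocations = 0;
```

For this particular class, default-constructing and then assigning the 34-character quote allocates twice, while constructing from the quote allocates once.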

2
On

That's a good question. You're right, this only occurs with complex types, i.e. classes and structs; std::string is such a type. The real issue here has to do with the constructor.

When an object is created, i.e.

std::string s;

Its constructor is called; it may allocate some memory, do some other variable initialization, and get itself ready for use. In fact, a large amount of code could be executed at this point.

Later on you do:

s = "hello world!";

This causes the class to throw away most of what it has done and get ready to replace its contents with a new string.

This is actually reduced to a single operation if you set the value when the variable is defined, i.e.:

std::string s = "Hello world";

will in fact, if you watch the code in a debugger, execute a different constructor once instead of constructing the object and then, separately, setting a value. In fact, the previous code works out to be the same as:

std::string s("Hello world");

I hope that helped to clear things up a bit.
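
If you don't have a debugger handy, one way to convince yourself is a type that has neither a default constructor nor an assignment operator. ConstructOnly and make_greeting are hypothetical names for this sketch; the point is only that the "=" form still compiles, so it cannot be default construction plus assignment.

```cpp
#include <string>

// Hypothetical type: no default constructor, and assignment is deleted.
struct ConstructOnly {
    std::string text;

    ConstructOnly(const char* t) : text(t) {}
    ConstructOnly(const ConstructOnly&) = default;
    ConstructOnly& operator=(const ConstructOnly&) = delete;
};

std::string make_greeting() {
    // Compiles: this initialization calls ConstructOnly(const char*),
    // not a default constructor followed by operator=.
    ConstructOnly s = "Hello world";

    // By contrast, this would not compile for this type:
    //   ConstructOnly t;        // error: no default constructor
    //   t = "Hello world";      // error: assignment is deleted

    return s.text;
}
```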

0
On

Consider what happens in both cases. In the first case:

  • default constructor called for "s"
  • assignment operator called for "s"

In the second case, first consider that with copy elision this is equivalent to string s("Voltaire"), thus:

  • c-string constructor called

Logically, the first approach requires the abstract machine to do more work. Whether this actually translates to more real code depends on the actual type and how much the optimizer can do. Note, though, that for all but trivial user types the optimizer might have to assume the default constructor has side effects, and thus can't simply remove it.

This additional cost should apply only to user-defined types, as the cost is in the default constructor. For any primitive type like int, or in fact any type with a trivial constructor/copy, there is no cost for the default constructor -- the data simply won't be initialized (when in a function scope).
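
The "trivial constructor" distinction can be checked at compile time with the standard type traits. This is only a sketch; assign_later and init_now are made-up names.

```cpp
#include <string>
#include <type_traits>

// Scalars have a trivial default constructor: `int i;` at function scope
// runs no initialization code.
static_assert(std::is_trivially_default_constructible<int>::value,
              "int has a trivial default constructor");
static_assert(std::is_trivially_default_constructible<float>::value,
              "float has a trivial default constructor");

// std::string has a non-trivial default constructor that must run.
static_assert(!std::is_trivially_default_constructible<std::string>::value,
              "std::string has a non-trivial default constructor");

// For an int, both forms end up storing the same value once.
int assign_later() {
    int i;   // left uninitialized, no work done
    i = 42;
    return i;
}

int init_now() {
    int i = 42;
    return i;
}
```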

0
On

Why would this make a potential performance increase?

The first case involves default initialisation, followed by assignment; the second involves initialisation from the value. Default initialisation might do some work that later has to be redone (or even undone) by assignment, and so the first case might involve more work than the second.

Is it only so with user-defined types (or even STL types) or is this also the case with int, float, etc?

It's only so with user-defined types; and then it depends on what the constructors and assignment operator actually do. For scalar types, default initialisation does nothing, and assignment does the same thing as initialisation from a value, so both alternatives will be equivalent.