share
Stack OverflowHidden Features and Dark Corners of STL?
[+104] [30] vehomzzz
[2009-10-20 17:29:26]
[ c++ stl boost tips-and-tricks ]
[ http://stackoverflow.com/questions/1596139/hidden-features-and-dark-corners-of-stl ] [DELETED]

C++ developers, all know the basics of C++: Declarations, conditionals, loops, operators, etc.

Some of us even mastered the stuff like templates, object model, complex I/O [1], etc.

But what are the most hidden features or tricks or dark corners of C++/ STL [2] that even C++ fans, addicts, and experts barely know?

I am talking about a seasoned C++ programmer (be she/he a developer, student, fan, all three, etc), who thinks (s)he knows something 99% of us never heard or dreamed about. Something that not only makes his/her work easier, but also cool and hackish. After all, C++ is one of the most used programming languages in the world, thus it should have intricacies that only a few privileged know about and want to share with us.

Boost [3] is welcome too!

One per post with an example please

P.S Examples are important for other developers to copy and paste!

(1) mine is more specific to STL... a good pointer! - vehomzzz
(6) Hidden feature: it's not called the STL. - Lightness Races in Orbit
(1) @TomalakGeret'kal Unless he really is talking about just the STL. :) - muntoo
(1) @muntoo: Seems unlikely, with the accepted answer referencing the namespace std. - Lightness Races in Orbit
[+180] [2009-10-21 20:09:05] vehomzzz [ACCEPTED]

You wouldn't believe how many C++ programmers don't know this simple trick, even though it's rather obvious when you know how iterators and how initializing a container with an iterator range work.

int main(int argc, char* argv[])
{
    std::vector<std::string> params(argv, argv+argc);
    // Now all the command-line parameters are in the 'params' vector

}

In fact, I have seen quite some C++ programs out there which don't take advantage of initializing (or assigning) containers with iterator ranges even though they could. It seems that many C++ programmers simply haven't understood the usefulness of this simple thing.


(46) I'd extend that argument to saying: "It seems that many C++ programmers simply haven't understood the concept of iterators". - MP24
(4) Nice trick for simple cases, but generally I would use Boost.Program_options for parsing command line parameters, it has many useful features. - parti3an
Be more specific on "but generally I would use Boost.Program_options". Does your company allow the use of Boost? - vehomzzz
(1) @Andrei: In my experience, we use Boost. I've been asked about Boost experience on interviews. Is your experience in places where you're not allowed to use it? Do you know why not? - David Thornley
cuz I work in a fund, where every third-party code/libraries go through a rigorous compliance scrutiny. And to have Boost meet the criteria would be an almost impossible task, coupled with enormous time and effort invested. That's my life! In a personal programming endeavors, which is now rather limited, I employ boost. - vehomzzz
(2) While I think Boost is a bloated pig, it is still very useful to have at your disposal, provided you really need all of it's functionality. Otherwise I would go for the minimal library approach. - Brock Woolf
(7) You can just cut out the parts of Boost that you want, and discard those that you don't want. That's part of the beauty of Boost. - coppro
You can and you can't. There's a lot of cross-compiler "crud" in Boost that if you only target one OS makes reading through Boost a PITA. In an environment supporting many Platforms, it's a godsend, I'm sure. But in the general case, it is just annoying. I'd love to see a streamlined version for popular platforms. - jmucchiello
1
[+54] [2009-10-20 18:55:05] the_mandrill

I'd vote for std::bind [1] (or boost::bind() [2] if you can't use C++11) as a very powerful idiom that many people don't tend to come across. Conceptually it's a means to convert a function into an object that can be copied around and called at a later point, so you can use it for deferred callbacks. For instance you can use it with boost::thread to wrap the thread function:

void MyStaticThreadFunc(Data* someData) {
...
}

Data* myData=new Data;
t=new std::thread(std::bind(&MyStaticThreadFunc,myData));

Another very powerful use becomes evident when combined with its overloads for operator== so that you can combine with find_if (or copy_if, remove_if, etc) to call any member function of a class in a container:

struct Record {
   string  GetEmployeeName();
   int     GetWage();
};

using namespace std::placeholders;

std::vector<Record>  records;
std::vector<Record>::iterator it;
it=std::find_if(records.begin(), records.end(), std::bind(&Record::GetEmployeeName, _1) == "John Smith");
[1] http://en.cppreference.com/w/cpp/utility/functional/bind
[2] http://www.boost.org/doc/libs/1_51_0/libs/bind/bind.html

+10 Great example!!!! - vehomzzz
(11) boost::bind isn't part of the STL... - Tom
(16) > "Boost is welcome too! " - the_mandrill
Didn't even think of this. +1 to you sir. - wheaties
The threading tip rocks. Sure boost is not part of the STL, but I think it warrants being on this list because boost.bind is simply a shortcut to a wondrous implementation using STL. - Allbite
I find that bind is an enabling technology that can be used to get the most out of the STL algorithms, eg see the example with find_if() above. It transforms regular functions into function object so they can be used with most of the STL algorithm functions. - the_mandrill
I wouldn't call this hidden, since that (along with lambdas) is one of the most important things to learn when looking at the algorithms library. People who don't know std::bind probably also haven't heard about std::find, yet. - Christian Rau
2
[+36] [2009-10-20 17:41:52] Loki Astari

Though not techncially part of the STL.
The streams library is part of the standard C++ libs.

For streams:

Locales.

Very few people actually bother to learn how to correctly set and/or manipulate the locale of a stream.

The second coolest thing is the iterator templates.
Most specifically for me is the stream iterators, which basically turn the streams into very basic containers that can then be used in conjunction with the standard algorithms.

Examples:

  • Did you know that locales will change the '.' in a decimal number to any other character automatically.
  • Did you know that locales will add a ',' every third digit to make it easy to read.
  • Did you know that locales can be used to manipulate the text on the way through (ie conversion from UTF-16 to UTF-8 (when writting to a file).

etc.

Examples:

[1] http://stackoverflow.com/questions/3479485/is-there-a-built-in-function-that-comma-separates-a-number-in-c-c-or-javascri/3479520#3479520
[2] http://stackoverflow.com/questions/2648364/how-to-print-a-number-with-a-space-as-thousand-separator
[3] http://stackoverflow.com/questions/1422151/how-to-print-a-double-with-a-comma
[4] http://stackoverflow.com/questions/3291440/how-to-write-only-regularly-spaced-items-from-a-char-buffer-to-disk-in-c/3292451#3292451
[5] http://stackoverflow.com/questions/2226045/c-decimal-output-formatting
[6] http://stackoverflow.com/questions/2066126/counting-the-number-of-characters-that-are-output/2067723#2067723
[7] http://stackoverflow.com/questions/1391746/how-to-easily-indent-output-to-ofstream/1397588#1397588
[8] http://stackoverflow.com/questions/207662/writing-utf16-to-file-in-binary-mode/208431#208431

(9) An example would help. I'd add one if I had ever dealt with locales myself... :p - Chris Charabaruk
I second the motion - what specific aspect of locales are you after? - coppro
(2) The question was what are the dark corners. ie what is not used often. Nearly every text book shows you how to use a stream but very few show you how to inbue a stream with local or how to build your own local settings. The fact that you want an example shows the truth nobdy knows much on that area of the streams library. - Loki Astari
3
[+26] [2009-10-20 17:40:28] 280Z28

Using std::fill and std::partial_sum, you can generate a range of numbers.

size_t count = 10;
// **This is assuming the vector was already constructed somewhere earlier**
std::vector<int> values(count);
std::fill(values.begin(), values.end(), 1);
// if the first number is not 1:
// values[0] = start;
std::partial_sum(values.begin(), values.end(), values.begin());

Edit: obviously if you are constructing the vector specifically for the range, you can use the following and leave out the call to std::fill:

std::vector<int> values(count, 1);

(2) what is std::partial_sum do? - vehomzzz
(1) +1 great example!!! - vehomzzz
(1) The SGI implementation defines an iota() function that does this. Your version is nice, though, in that you can use a different value for fill() to get different increments. - Boojum
(4) I had to downvote this. What on earth is this useful for, vs. a simple: for(int i=0; i<count; i++) values[i] = i; - Andy Ross
(37) @Andy Ross: "had to downvote"? Really? Someone needs to freshen up on the theory behind the STL imo... To give an answer to your question, a for loop like you suggest would only be useful for the subset of containers that can provide counts in O(1) time and have random access to elements. - Matthew 'Cogwheel' Orlando
(2) If you can use the container's constructor for specifying count elements with value 1 then you don't need the std::fill() call. Even if you can't (i.e. the container is already constructed) it might be the case that you only have input iterators (as opposed to forward iterators), so you only have a single pass. Then, generate() with a stateful functor might be a better solution: you fill the values in a single pass. - wilhelmtell
(1) In C++11, fill + partial_sum (two passes!) for this purpose is not needed - there is std::iota. - Evgeny Panasyuk
4
[+25] [2009-10-20 17:32:21] Doug T.

swap()

Allows your code to avoid needless copies by swapping the internal representation of 2 containers.

 std::vector<unsigned char> src;
 std::vector<unsigned char> dest;

 //... fill in src

 // Destination takes on source's buffer
 dest.swap(src);

(8) I think that in case of swap more important then preventing needless copying is non-throw guarantee. Canonical assignment operator implementation relays on this. And I don't think that swap is a "hidden feature". - Adam Badura
(9) I suppose "hidden feature" is relative. - Doug T.
+1; I recently 'discovered' this - it was certainly hidden to me since I never thought of using swap and then discarding the source object, totally ignoring what dest originally contained. - Frerich Raabe
5
[+23] [2009-10-26 22:34:28] fmuecke

A feature that is often forgotten (and therefore implemented by hand in various ways) are the little iterator functions distance and advance.

They both operate on all standard containers and container adapters and use the most efficient implementation for each container, depending on the supported iterator type (random access, bidirectional, forward, ...).

  • distance calculates the distance of two iterators
  • advance can be used to advance the iterator by x elements. This is extremely useful for example, if you want to access the nth element of a list<>. (Although this may indicate that you should use an other container...)

Example:

  typedef list<int> MyList;
  MyList myList;
  myList.push_back(5);
  myList.push_back(1);
  myList.push_back(7);
  myList.push_back(9);
  myList.push_back(3);

  // print the 3rd element from the end
  MyList::const_iterator iter = myList.end();
  advance(iter, -3);
  if(myList.end() != iter)
      cout << *iter << endl; // prints "7"

(7) advance is nice, but don't forget its siblings boost::next and boost::prior, which don't modify the iterator, but simply return the next/previous one. I've found those to be much more useful than advance. - jalf
For random access iterators, iter -= 3 also works. - Felix Dombek
All of these are great points. One sidenote: while I like distance, I think it's sometimes more concise not to use distance. To be pedantic, distance(it1, it2) is more characters than it2 - it1. - solvingPuzzles
6
[+22] [2009-10-20 18:53:40] quark

Streambufs and streambuf iterators [1]. Most people don't know about streambufs, only about streams. The latter is in fact a high-level wrapper around the former that provide formatting and locale-specific functions. While formatting is often what you want, for low-level I/O where speed is essential you want to bypass the stream and use the stream buffer. (In C terms, this is like the difference between printf/scanf and fputc/fgetc.)

An example of use from Dr. Dobb's Journal (the source of the above link). Note that this can be considerably faster than using istream::operator<<. It is for byte-level control, not for text-processing.

#include <istream>
#include <fstream>

using namespace std;

void read_buffer(streambuf * buffer)
{
    istreambuf_iterator<char> i(buffer);
    istreambuf_iterator<char> end;
    while (i != end)
    {
        char c = *i;
        // Do something with c
    }

}

int main()
{
    // Use a stream
    ifstream   in("filename.txt");
    streambuf* buf = in.rdbuf();
    read_buffer(buf);

    // Directly use a buffer
    filebuf    buf("filename.txt");
    read_buffer(&buf);
}
[1] http://www.ddj.com/cpp/184401357

(2) It's worth noting that one doesn't need an ifstream at all - filebuf can be created directly, and associated with a file via its member open. - Pavel Minaev
(4) Actually, it should rather be the buffer that’s not needed: there’s an istreambuf_iterator constructor that takes an arbitrary input stream as its parameter, hence the iterator can be initialized and used directly: for(isb_i<char> i(in); i != isb_i<char>(); ) { char c = *i; … }. Shorter and arguably more idiomatic. - Konrad Rudolph
I expanded the example (from the DDJ original) to show both stream use and direct filebuf use. Feel free to add further cases :) - quark
It's also worth noting that if you truly care about performance, you'll be writing to the syscall layer and not using an I/O abstraction library at all. - Andy Ross
(2) Depending, of course, on whether you "truly care" about portability as well. - Matthew 'Cogwheel' Orlando
7
[+21] [2009-10-30 13:07:47] vehomzzz

I like the little-known operator "-->", also know as "goes to."

Here's an example:

#include <stdio.h>
int main()
{
     int x = 10;
     while( x-->0 ) // x goes to 0
     {
       printf("%d ", x);
     }

}

(5) It's "goes down to" really, unfortunately. Other than that, a pretty creative idea. :-) - Frerich Raabe
(4) +1 because it made me smile - McTrousers
Silly thats not an operator. - thyrgle
Even though I have seen this numerous times it stills makes me smile. It's just so clever in every way! :) - monoceres
(5) On par with the WTF operator: sizeof(int) == 3 ??!??! puts("this is quite a non-standard system"); - Christoffer
(2) That would be so much clearer if it were while ( x-- > 0 ). This is equivalent to while ( x++ < 20 ) - Jeff Ferland
(6) This has nothing to do with the STL. - FredOverflow
8
[+16] [2009-10-20 19:09:09] Konrad Rudolph

I really like the inner_product [1] function. Normally, it calculates the dot product of a vector and a transposed vector, given as iterator ranges (what else?) but it can be properly parametrized to perform about any accumulation on two ranges.

Although, to be honest, I am not sure using it is a good idea for code clarity. Still, it has a true ring of function-oriented programming.

Consider this one-line1) implementation of the Hamming distance [2]:

unsigned int hamming_distance(std::string const& a, std::string const& b) {
    return std::inner_product(
        a.begin(), a.end(), b.begin(), 0, std::plus<unsigned int>(),
        compose<unsigned int>(
            std::bind1st(std::minus<unsigned int>(), 1),
            std::equal_to<unsigned int>()));
}

Admittedly, I had to create a (trivially implemented) compose function to handle function composition – and unfortunately I had to make it use two parameters (which isn’t quite proper) because C++ doesn’t understand currying.

For comparison, here’s the almost exact equivalent in Python (taken from the abovementioned article), and it’s easy to see that the C++ method is absolutely superior in every aspect:

def hamming_distance(s1, s2):
    return sum([ch1 != ch2 for ch1, ch2 in zip(s1, s2)])


1) Well, it’s really a “one-expression” implementation.

[1] http://www.cppreference.com/wiki/stl/algorithm/inner%5Fproduct
[2] http://en.wikipedia.org/wiki/Hamming%5Fdistance

+ 1 I've just had a discussion in the complier's inlining of std::inner_product -- stackoverflow.com/questions/1596053/inlining-stdinnerproduct ... which in effect prompt me to start this thread. thx - vehomzzz
(2) Damn, didn’t see that Boost was welcome. Obviously, Boost would make this a bit nicer. - Konrad Rudolph
9
[+13] [2009-10-21 11:11:29] Matthieu M.

I would say that in general, the use of meta-programming technics is quite foreign to most of the C++ developers I know...

But a library that changed my life is incontestably Boost.MultiIndex [1] (and its adapter, Boost.BiMap [2]). Before that trying to maintain multiple indexes on a structure was really hackish and error-prone.

EDIT: example (shamelessly taken from Bimap Reference [3])

struct id   {}; // Tag for the identification number
struct name {}; // Tag for the name of the person

typedef bimap <
       set_of < tagged< int        , id   > >,
  multiset_of < tagged< std::string, name > >
> People;

People people;

// ...

int user_id;
std::cin >> user_id;

People::map_by<id>::const_iterator id_iter = people.by<id>().find(user_id);
if( id_iter != people.by<id>().end() )
{
  std::cout << "name: " << id_iter->get<name>() << std::endl
            << "id:   " << id_iter->get<id>()   << std::endl;
}
else
{
  std::cout << "Unknown id, users are:" << std::endl;

  for( People::map_by<name>::const_iterator
        name_iter = people.by<name>().begin(),
        iend      = people.by<name>().end();
       name_iter != iend; ++name_iter )
  {
    std::cout << "name: " << name_iter->get<name>() << std::endl
              << "id:   " << name_iter->get<id>()   << std::endl;
  }
}

As you can see, I can easily traverse the Bimap either ordered by id or by name.

Also note the various ways of indexing:

  • set_of: ordered, unique
  • multiset_of: ordered, non-unique
  • unordered_set_of: non-ordered, unique (hash...)
  • unordered_multiset_of: non-ordered, non-unique (hash...)
  • list_of: sequence (ordered by your insertions / removal)
  • vector_of: same as list_of, but provide random access
  • unconstrained_set_of: no constraint at all, makes your bimap mimics a classic map

Of course, this uses MPL behind the covers, but it's so much more practical (notably the 'tagged' option), and then you also get some really nice possibilities due to the interaction with other Boost libraries (such as Boost.Range), for example, taken from another page of the reference [4]:

typedef bimap< std::string, unconstrained_set_of<int> > bm_type;
typedef bm_type::left_map map_type;

bm_type bm;
map_type & m = bm.left;

// ... filling the map

typedef map_type::const_iterator const_iterator;
typedef std::pair<const_iterator,const_iterator> const_range;

const_range r = m.range( "one" <= _key, _key <= "two" );
for( const_iterator i = r.first; i != r.second; ++i )
{
  std::cout << i->first << "-->" << i->second << std::endl;
}

The range is very nicely defined, using a 'lambda' like syntax, which makes for effortless 'equal_range'.

[1] http://www.boost.org/doc/libs/1%5F40%5F0/libs/multi%5Findex/doc/index.html
[2] http://www.boost.org/doc/libs/1%5F40%5F0/libs/bimap/doc/html/index.html
[3] http://www.boost.org/doc/libs/1%5F40%5F0/libs/bimap/doc/html/boost%5Fbimap/the%5Ftutorial/bimaps%5Fwith%5Fuser%5Fdefined%5Fnames.html
[4] http://www.boost.org/doc/libs/1%5F40%5F0/libs/bimap/doc/html/boost%5Fbimap/the%5Ftutorial/unconstrained%5Fsets.html

that sounds interesting, would you please provide an example? thx - vehomzzz
10
[+12] [2009-10-20 17:55:02] pythonic metaphor

Once you start using the STL algorithms and need unary functions, knowing how to use bind is incredibly helpful. It lets you easily convert binary functions into the unary functions.

Edit Suppose you want to multiply all the numbers is a container by 5. Rather than looping or defining your one off functor, use bind1st:

transform(container.begin(), container.end(), container.begin(),
          bind1st(multiplies<double>(),5.0));

(1) Please show us an example with some rudimentary explanation. thx - vehomzzz
+1 - really useful and cool :) - Jacob
Currying in C++! - Viktor Dahl
11
[+12] [2009-10-20 17:59:55] Jerry Coffin

Locales/facets and valarray/slice/gslice/ are probably the most obvious, at least to me.

In the other direction, I think mem_fun/mem_fun_ref/bind1st/bind2nd, etc., usually produce code that looks obscure and is often nearly impossible to read, but it seems that nearly everybody who starts using algorithms overuses these severely for at least a while. There are really two culprits: for_each (which is severely overused itself), and lack of lambdas (in the current language). The FOR_EACH from boost can help quite a bit, and the lambda support in C++ 0x will help a lot more.

Edit to add examples:

Locales: to get output in readable format with iostreams:

#include <iostream>
#include <locale>

int main() { 
    std::locale loc("");
    std::cout.imbue(loc);

    std::cout << 1234.56;
    return 0;
}

Produces:
1,234.56
for me, but for a typical German user (for example) would produce:
1.234,56
As he'd expect.

valarray: for example, to compute the variance of elements in an array:

// Warning: only minimally tested
template <class T>
T const variance(T *values, size_t size) {
        std::valarray<T> const v(values, size);
        T average = v.sum() / static_cast<T>(v.size());
        std::valarray<T> diffs = v-average;   
        diffs *= diffs;

        // unsigned arithmetic is fine here -- all numbers are positive.
        return diffs.sum()/diffs.size();

}

Doing the same with std::vector and the standard algorithms (e.g. std::inner_product, std::accumulate) produces code that's substantially longer and harder to read or understand.

Edit: more about for_each (in response to comment). The position of for_each among algorithms is similar to the position of goto among control structures. If you advocate it, each can be seen as flexible, versatile, and usable in almost every possible circumstance.

That flexibility becomes the Achilles heal of each: using it tells the reader virtually nothing about what you're really trying to accomplish. Its usage imparts essentially no information to the reader -- the actual intent of each can be ascertained only by careful examination of the surrounding code to figure out what it does.


(2) I find the use of *for_each more intuitive and easier to work with than rolling my own loops with iterators...there is a lot of room for abuse, but if you know what you're doing, for_each would make things better, nicer and visually gratifying. - vehomzzz
T average = v.sum() / v.size() - ouch, this is a classic example of a signed/unsigned arithmetic mismatch! size() is unsigned, therefore if T is signed, the result will still be unsigned. So if valarray consists of, say, two elements -2 and 1, the average will be computed as (unsigned)-1 / 2... - Pavel Minaev
(2) Hand-rolled loops are rarely the right alternative to for_each. From your comment, right now your at the level of deciding whether to use an algorithm or not. From that perspective, for_each may well be the right answer. The next step, however, is progressing from simply replacing hand-rolled loops, to choosing the right algorithm for the job. At the risk of sounding condescending, when you start to do that, you'll find that for_each is almost never it. - Jerry Coffin
@Pavel: good point, at least in theory. In reality, we expect that T is floating point. Nonetheless, I've added the cast to fix the potential problem. IMO, though, this demonstrates how much good it can do to use even a relatively obscure library -- I doubt you use valarray often at all, but I suspect it still took less time to spot that problem than it would have using most alternatives... - Jerry Coffin
12
[+12] [2009-10-27 18:43:24] Jerry Coffin

Differences between std::sort (which many people use anytime they need sorting done) and std::stable_sort. Ostensibly, the primary difference is that stable_sort guarantees that elements that compare equal remain in their original order. In reality, there's another important difference: stable_sort also gives guarantees on worst-case performance that sort does not -- stable_sort is never worse than O(N lg N) complexity. sort gives no explicit guarantee on worst case performance, though it would certainly be surprising it it were worse than O(N^2).

This lack of a guarantee was chosen to allow std::sort to be implemented as a Quicksort, which has extremely good average performance, but can have O(N^2) complexity in a few pathological cases. Most recent implementations of std::sort, however, are implemented with the Introsort algorithm. Introsort starts out as basically a Quicksort, but keeps track of its recursion depth. When/if the recursion depth exceeds a chosen constant (indicating that it may be encountering a pathological case), it switches to using another sort algorithm instead.

Tracking the recursion depth adds minuscule overhead, giving an equally minuscule speed loss, but ensures that complexity remains roughly O(N lg N) even in the worst case.


(1) There's several other sort functions that are really handy. - David Thornley
13
[+11] [2009-10-27 16:17:01] twk

size() on a list is O(N) on some platforms. Use empty() if you can, it should be O(1).


14
[+10] [2009-10-20 19:18:53] Martin Beckett

The stupid 'swap' trick to actually free memory when you clear a vector.
I really want to find the guy that thought that was a better alternative to a free() function.

std::vector<int> vi;
vi.reserve(1000000); // don't ever do this
std::copy(
    std::istream_iterator<int>(cout),
    std::istream_iterator(),
    std::back_inserter(vi) );

vi.size(); // let's say 10
vi.capacity(); // 1000000

std::vector<int>(vi).swap(vi); // almighty swap trick
// vi.free() or vi.reserve(0) both would have been clearer

vi.size(); // still 10
vi.capacity(); // 10 Q.E.D.

Are you referring to the same trick that is used to force a vector to release its memory (e.g. to reduce its capacity)? Perhaps a free()-like function doesn't make sense in some cases, particularly for objects that manage their own memory. Please provide some context. - Void
I suppose the C++0X std::vector<>::shrink_to_fit() could help in this case (it's not a binding request), although one of course would still need the swap trick to force complete deallocation of the underlying buffer if destruction of the vector itself is not an option. That said, when would complete deallocation of the underlying buffer but not the vector itself ever be necessary? Such a requirement seems rather brittle, but I suppose it could be necessary in some cases. Admittedly, I can't think of any, however. BTW, the "Q.E.D." was a nice touch. :) - Void
15
[+9] [2009-10-22 11:12:16] vehomzzz

This one is coming from James Kanze [1]

Hardly a dark corner, but I find a lot of programmers aren't aware of std::accumulate (perhaps because it's in the math section of the standard, rather than the algorithms). Define a custom accumulator type, and it can be used for a lot of things. For example, if I want to append an MD5 checksum to a string, I'll use something like:

   s += std::accumulate(s.begin(), s.end(), MD5Hasher());

(where s is a string, and MD5Hasher is an accumulator which generates the MD5 checksum, and has an implicit conversion to string which returns the hash generated so far).

Practically, accumulate is usable just about any time you want to generate a single value which depends on a sequence of values.

[1] http://groups.google.com/groups/profile?enc%5Fuser=ldWUIxUAAABueGoyXwiEvsOCcmTGwZe%5F9h3i3SmjGmAJbX05nZ-8fQ

(1) +1. I just used boost's acumulator_set to do some statistics on sets of values. Had I known about this, I'd never have brought boost into that project. Of course, it is all wrapped in #ifdef DEBUG sections. - KitsuneYMG
(1) +1 SO really needs a scrapbook feature - McTrousers
16
[+9] [2009-10-27 20:44:27] parti3an

Keyword mutable.

Sometimes you need to change a member variable inside a const member function. Usually, this happens when you want to do lazy initialization of a value on first access. Consider example:

struct File {
   explicit File(const std::string& name) : name_(name), size_(-1ul) {...}

   // The member is declared to be const because logically taking size of an object
   // doesn't modify it's state. We could calculate the size in constructor as well, 
   // but that would be waste of time in cases when application doesn't want to get
   // size of constructed File object.
   unsigned long size() const {
      if (size_ == -1ul) {
          size_ = /* use some API call to calculate size of the file */;
      }
      return size_;
   }
private:
   std::string name_;
   ...
   mutable unsigned long size_; // Using mutable keyword here !!!
};

This is very simplified example that came first to my mind, but it allows to grasp the idea. Sometimes contract requires a method to be const, but internal implementation due to some reasons needs to modify state of the object, in this case mutable is your friend.


@Mankarse: It appears you are right, using const_cast like I suggested is actually a bad idea; I'll remove my original comment to avoid misleading readers. - Frerich Raabe
17
[+9] [2009-10-30 14:51:55] Frerich Raabe

You can often use a std::vector instead of new'ing or malloc'ing an array dynamically. If you need pass a pointer to the array around, use the address to the first element in the vector (&v[0]). This makes code easier to write, easier to read and safer. What's not to like? Consider:

const int len = computeLength( inputData );
char *convertedData = new char[ len ];
if ( !checkDataForValidity( convertedData ) ) {
    delete [] convertedData;
    return false;
}
processData( convertedData );
delete [] convertedData;
return true;

vs.

const int len = computeLength( inputData );
vector<char> buf( len );
if ( !checkDataForValidity( &convertedData[0] ) )
    return false;
processData( &convertedData[0] );
return true;

The second variant using std::vector needs less code, releases the acquired memory automatically (even if exceptions are thrown), and it auto-initializes the array to zero automatically.


Yes and no. Arguably this is much more confusing unless the staff is use to calling legacy C/WinAPI interfaces with vector<char>s. - jmucchiello
@jmucchiello care to elaborate? What is the relevance of WinAPI? - McTrousers
+1 I love this "trick", and I use it all the time. - luke
18
[+8] [2009-10-25 02:13:29] KitsuneYMG

Allocators

This article [1] has some information about allocators. I don't understand them too well myself, so ,for me, they are a dark corner. This could allow you to use std::vector but force it to create shared_ptrs or use a preallocated/specific (and possible shared/file-mapped/etc) block of memory.

Edit: found another SO post [2] relevant to my answer.

[1] http://www.codeguru.com/Cpp/Cpp/cpp%5Fmfc/stl/article.php/c4079/
[2] http://stackoverflow.com/questions/826569/compelling-examples-of-custom-c-stl-allocators

19
[+7] [2009-10-28 22:53:31] Splinter of Chaos

Sometimes when programming algorithms on sequences (with iterators), you can write a more efficient algorithm if the iterators are random access (std::vector, std::dequeue) as apposed to sequential (std::list, std::map). Instead of going with the most general form, you can use a technique called tag dispatching to decide at compile time which version can be used.

struct random_access_iterator_tag {}; // Defined in the standard.
struct bidirectional_iterator_tag {};

template< class Iterator, class Distance >
Iterator advance( Iterator it, Distance d, random_access_iterator_tag )
{
    return it + d; // Would be illegal for bidirectional iterators.
}

template< class Iterator, class Distance >
Iterator advance( Iterator it, Distance d, bidirectional_iterator_tag )
{
    // This would all be unnecessary with random access iterators.
    while( d > 0 ) {
        it++; d--;
    }
    while( d < 0 ) {
        it--; d++;
    }
    return it;
}

template< class Iterator, class Distance >
Iterator advance( Iterator it, Distance d )
{
    // Use iterator traits to find out what category an iterator is. 
    typename iterator_traits<Iterator>::iterator_category category;
    advance( it, d, category ); // No instance of category is created.
}

It's also worth noting that if you write your own iterator class, you should have it inherit from std::iterator so iterator_traits is done for you and you can call functions like this with no trouble. Otherwise, specializing iterator_traits is not a bad idea.


20
[+5] [2009-10-21 11:19:40] Test

A C++ one, placement

CObject *p = (CObject*)malloc(sizeof *p);
...
p = new(p) CObject;
p->DoSomthing();
...

what is CObject? - vehomzzz
@Andrei: it's irrelevant to the example - Matthew 'Cogwheel' Orlando
(look up "placement new") - Matthew 'Cogwheel' Orlando
(9) Don’t forget p->~CObject(); and free(p);! - Konrad Rudolph
Is there any benefit to the above example over doing regular new/delete? Or is the example just showing the syntax, and not meant to demonstrate a real world use? - Jeremy Friesner
@Jeremy: Istantiating in shared memory, serialization, any case where you want to control exactly where the bits of CObject are in memory. - KitsuneYMG
You can do this without malloc. - Cam
(4) The example should replace malloc() with a pseudo allocation function like mySharedMemoryAlloc() - jmucchiello
21
[+5] [2009-11-11 12:17:28] Krisztian

Using std::auto_ptr for pimpl design pattern: Just assume a simple class which uses pimpl mechanism to hide its implementation. The pointer to impl is implemented by std::auto_ptr which will ensure to delete implementation after it is no longer to be needed.

The header file looks like this:

#include <memory>

class SomeClass
{
 struct Impl;
 std::auto_ptr<Impl> pimpl;
public:
 SomeClass();
 void someFunction();
};

The source file:

#include "SomeClass.h"
#include <iostream>

struct SomeClass::Impl
{
 ~Impl()
 {
    std::cout << "destructed";
 }
};

SomeClass::SomeClass()
: pimpl(new Impl())
{}

void SomeClass::someFunction()
{}

However, if we would like to use this class, g++ won't compile it. At least, we get the error message:

memory:259: note: neither the destructor nor the class-specific operator delete will be called, even if they are declared when the class is defined.

The main problem is that the compiler is unable the generate the ~SomeClass destruction. Because we do not provide it, the compiler-generated default destruction will be used. In this case the deletion of pimpl provided by the auto_ptr destructor. Because only the forward declaration of Impl is present in the header file, the compiler won't be able to generate the destruction.

The solution is to add an empty destructor of SomeClass. In this case, the compiler does not have to generate it. Just let the body of destruction empty, the pimpl's destructor will be called, and that point (compiling SomeClass.cpp) the Impl is known.

And how is it connection with STL? If we use e.g. boost's shared_ptr instead of auto_ptr, this error will not occur, because the shared_ptr has its own structure for deletion.


22
[+4] [2009-10-20 19:06:39] quark

I usually don't like to just list books, but there's one that really targets the "dark corners" of the STL, and that's Scott Meyer's "Effective STL". [1]. Lots of gotchas and unexpected features covered, for example:

  • Copying containers
  • Iteration invalidation
  • Erasing containers (you wouldn't think this was tricky but...)
  • Swaps
  • Subtleties of memory management (many, many subtleties, sigh)

Check out the table of contents.

[1] http://rads.stackoverflow.com/amzn/click/0201749629

23
[+3] [2009-10-22 11:30:32] sep

Those who know what allocator::rebind does, please put your hands up.

I write my own allocator, and I override the allocator::pointer so that it points to my object instead of a normal pointer. And I use this when I construct an STL container.

I use shared memory to maintain a look-up table that can be accessed from multiple applications. Since different processes will map the shared memory to a different base address, pointer code in STL containers don't work. I wrote my own pointer class that actually stores the offset and when dereferenced will automatically apply the base address.

While, this is within the standard, this is so obscure that it only worked with MSVC7.0-7.1 and icpc7-9. g++ never supported it and even the latest version of MSVC don't support it. Most STL libraries simply need to assume that you are working with a regular pointer.

This is what a custom allocator looks like (M::ptr contains the base address):

template<class T, class M>
class var_allocator
{
public:
   typedef T value_type;
   typedef size_t size_type;
   typedef ptrdiff_t difference_type;
   typedef relative_ptr<T,M> pointer;
   typedef const_relative_ptr<T,M> const_pointer;
   typedef T& reference;
   typedef const T& const_reference;

   var_allocator() {}
   var_allocator(const var_allocator<T,M> &) {}
   template<class U>
   var_allocator(const var_allocator<U,M> &) {}
   pointer address(reference ref) const { return &ref; }
   const_pointer address(const_reference ref) const { return &ref; } 
   void* raw_allocate( size_type size , size_type n , void* )
   {
      return mm_malloc(M::Ptr, size * n);
   }
   void raw_deallocate( void* ptr )
   {
      mm_free(M::Ptr, ptr);
   }
   pointer allocate( size_type n , const void* )
   {
      return (pointer)raw_allocate( sizeof(T), n, NULL);
   }
   pointer allocate( size_type n )
   {
      return (pointer)raw_allocate( sizeof(T), n, NULL);
   }
   void* _Charalloc(size_type size)
   {
     return raw_allocate(size, 1, NULL);
   }
   void deallocate( pointer ptr, size_type )
   {
      raw_deallocate(ptr);
   }
   void deallocate( pointer ptr )
   {
      raw_deallocate(ptr);
   }
   void deallocate( void* ptr, size_type )
   {
      raw_deallocate(ptr);
   }
   void deallocate( void* ptr )
   {
      raw_deallocate(ptr);
   }
   void construct(const relative_ptr<T,M> &ptr, const T& val)
   {
      new ((void *)ptr) T (val);
   }
   template <class U> void construct(U* ptr, const U& val) 
   { 
      new (ptr) U (val); 
   }
   template <class U> void construct(U *ptr, int)
   {
      new (ptr) U;
   }
   template <class U> void construct(relative_ptr<U,M> &ptr, const U& val)
   {
      new ((void *)ptr) U (val);
   }
   template <class U> void construct(relative_ptr<U,M> &ptr, int)
   {
      new ((void *)ptr) U;
   }
#ifdef _WIN32
#pragma warning(disable: 4100)
#endif
   template <class U> void destroy(U *ptr)
   {
      (*ptr).~U();
   }
#ifdef _WIN32
#pragma warning(default: 4100)
#endif
   template <class U> void destroy(relative_ptr<U,M> &ptr)
   {
      ptr->~U();
   }
   size_type max_size() const
   {
      return mm_core_size(M::Ptr) / sizeof(T);
   }
   template <class U>
   struct rebind 
   {
      typedef var_allocator<U,M> other;
   }; 
};

template<class M>
class var_allocator<void,M>
{
  typedef relative_ptr<void,M> pointer;
};

(1) example please...... - vehomzzz
Wow this is sick. Basically this is tapping into an "extra layer of indirection" built into the standard library, which otherwise goes unused. - chisophugis
24
[+2] [2009-10-28 12:55:41] vehomzzz

when a temporary is assigned to a const reference inside a {} block, it's lifetime is extended to the lifetime of a reference.That made ScopeGuard possible. ScopeGuard FTW


25
[+1] [2009-10-26 22:51:59] fmuecke

std::vector can be used to create a dynamically sized buffer on the stack.

Like in this example (which shows how to convert a std::string into a std::wstring)

#include <vector>
#include <locale>
#include <string>
using namespace std;

string narrowStr("narrow");
wstring wideStr;
locale loc("german");

vector<wchar_t> wBuffer( narrowStr.length() );  // = wchar_t[narrowStr.length()]

use_facet<ctype<wchar_t> >(loc)._Widen_s( 
    narrowStr.c_str(),                       
    narrowStr.c_str() + narrowStr.length(),
    &wBuffer[0],
    wBuffer.size()
);
wideStr = wstring( &wBuffer[0], wBuffer.size() );

How do you know that it's on the stack and not abstracting away a pointer to a buffer allocated on the heap? - Joseph Garvin
What i know is that the vector will be created on the stack and therefore no delete or anything must be called to free the actual buffer. - fmuecke
(1) While the vector is on the stack, the memory it allocates isn't. Only the bookkeeping data associated with the vector is on the stack. If you want a std::vector to allocate stack memory, you'd need to write a custom allocator in terms of alloca or similar. - jskinner
std::transform might be a better choice for string/wstring conversions - DanDan
26
[+1] [2009-10-27 17:06:08] IvyMike

One thing that surprised me in the last year was discovering the block allocation size for the deque class. It's obviously implementation-specific, but I believe the default Visual Studio size is 128 bytes, with a minimum of one object. You should track down the value your deque implementation uses.

The reason this can matter: If you create a deque with a lot of objects, it may internally make a ton of calls to new/delete, enough so that performance suffers. It is easy enough to create a modified deque class with a different block size, although the tradeoff is that it will waste more space.


27
[+1] [2009-10-31 05:57:37] Drahakar

When you want to clone a container of object that are polymorphic, and that exposes a Clone() methode, here is how a procede :

std::transform(e.begin(), e.end(), std::back_inserter(exp_), std::mem_fun(&PolymorficObj::Clone));

where e is the source container of polymorphic items and exp_ is the dest of cloned object.


28
[0] [2009-10-20 18:38:50] DevFred

Google "most vexing parse" [1]. C++ doesn't get much darker than that ;-)

[1] http://www.gotw.ca/gotw/075.htm

(3) Haha... good find. How does it relate to STL? hmm - vehomzzz
A complete program that describes this "vexing parse" <code> #include <iostream> using namespace std; struct S { S(){} S(const S&){} }; S s(S()) { cout << "hello" << endl; } S f() {}; int main(int argc, char** argv) { s(f); return 0; } </code> - Murali VP
It's explained in Scott Meyer's "Effective STL" book? - DevFred
29
[-9] [2009-10-20 18:17:30] alex tingle

Programming is not about proving how clever you are. It's rarely even about how well your compiled code performs. Usually, the most important thing is how well it can be understood by the next guy who has to extend or debug it.

In general you shouldn't use unknown corners of libraries. If they are unknown, then they are little used, and readers of your code will struggle to understand them.


(6) Good advice but it doesn't really address the OP's question. Certain technique, if used properly, can significantly increase productivity and code... we are sharing 'em. - vehomzzz
(1) Andrei: You are right of course, but I think it needed to be said. - alex tingle
(1) This has a minus score? This is the correct answer. - wheaties
(23) It's not a correct answer, it's an offtopic comment. The topic is definitely interesting, and I do not see how this is related (after all, nowhere does the question suggests that you should use any of these techniques, and it specifically uses the word "hackish"; caveat emptor). - Pavel Minaev
(2) I haven't voted it down, but IMO, its score should be zero or negative. While it contains a bit of wisdom, it's far from universally true. Good use of even the most obscure library can make code much more readable and understandable. I'd point to the code I posted using valarray to compute variance. Even though I use vector and such much more often than valarray, I still find this code much easier to read and understand. I'd expect that anybody who understands C++ even a little bit, and knows the algorithm can verify its correctness almost at a glance. - Jerry Coffin
Wheaties, the green check means the person who asked the question has accepted the answer as the best. That doesn't mean it's the correct answer, especially for questions like this, which have no correct answer at all. Furthermore, the person who asks the question isn't always a good judge of what the best answer is, as we've all witnessed with this question. - Rob Kennedy
(1) What we're discussing are surely pearls of wisdom. Things that are not obvious, but should be. - olive
(2) Knowing dark corners implies deeper knowledge of the language, that means there will be less code out there that you won't understand. Also every developer should know where usage of certain pattern/syntax is justified and where it is not. - parti3an
(4) So what this answer is basically saying is "never try to learn your language. Because if you do so, then people who didn't won't be able to read your code". Wonderful logic. Makes me wonder why we don't all write in C now and forever. - jalf
@vehomzzz The writer of this comment likes how the OP talks of themselves as they would of someone else. - muntoo
30