templates in C++

eLiu

Diamond Member
Jun 4, 2001
6,407
1
0
Hi all,
Just wanted to get some discussion/opinions going on templates in C++. I'm thinking about functions but the same stuff applies to classes. Until like an hour ago, I (and as far as I can tell, a bunch of code example sites on the internet) was under the impression that template functions must have declaration/definition in the same file. Moreover this file must be #include'd by any file that wishes to use the template. This is b/c C++ only generates code for templated functions upon instantiation. So if I have
template<typename T> void foo(){ ...code...};
then code is only generated once I call foo():
foo<double>();
Thus the compiler has to be able to see both the definition and the call site in order to know what the template body is & what to fill in for the type parameter.
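Concretely, here's the kind of failure I mean (toy file names, sketch only). If the definition is hidden away in a .cpp like an ordinary function, the other translation unit compiles fine but nothing ever generates the body:

Code:
// foo.h -- declaration only; definition lives in foo.cpp
template<typename T> void foo(T x);

// foo.cpp -- has the definition, but never instantiates anything
#include "foo.h"
#include <iostream>
template<typename T> void foo(T x) { std::cout << x << "\n"; }

// main.cpp -- only sees the declaration, so it just emits a call to
// foo<double> and hopes the linker finds a body... which nobody ever
// generated, so you get an "undefined reference" at link time.
#include "foo.h"
int main() {
  foo<double>(3.14);
  return 0;
}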

Putting template definitions in "header" files seems to lead to 2 related problems:
1) every time you #include a file containing source code (e.g., template defn), the compiler has to recompile that code. Include it 100 times? Compile it 100 times. This is bad if you only ever use like 1 or 2 different typenames for your template b/c then you're compiling 100 identical things.
2) Compiling 100 identical things then results in 100 copies of the same object code (see the sketch below). Suddenly a million code caches cried out in terror and were suddenly silenced.
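To picture 2), imagine the usual setup where the whole definition sits in the header (made-up names again):

Code:
// foo.h -- the entire definition is in the header
#include <iostream>
template<typename T> void foo(T x) { std::cout << x << "\n"; }

// a.cpp
#include "foo.h"
void a() { foo<double>(1.0); }  // a.o compiles & carries its own foo<double>

// b.cpp
#include "foo.h"
void b() { foo<double>(2.0); }  // b.o compiles the exact same foo<double> again

Same instantiation, compiled once per translation unit.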

It's also not necessary. Enter "explicit instantiation":
http://www.parashift.com/c++-faq-lite/templates.html#faq-35.13
http://publib.boulder.ibm.com/infoc...language/ref/clrc16explicit_instantiation.htm

It solves 1) & 2). But now your template isn't so general b/c explicit instantiation requires you to specify every possible typename usage of your template. If you template over multiple parameters, this could be a lot of things. As pointed out here,
http://yosefk.com/c++fqa/templates.html#fqa-35.13
this solution makes templates more like glorified macros.
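(For anyone who hasn't seen the syntax, it's basically this, with my toy foo again:)

Code:
// wherever the definition is visible (header or .cpp)
template<typename T> void foo(T x) { /* ...code... */ }

// force code generation for exactly these versions, right here, once:
template void foo<double>(double);
template void foo<int>(int);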

So I dunno. Have you guys used/heard of explicit instantiation? Are there any other pros/cons? Do people avoid this or are they ignorant of it? The object code duplication seems not so bad when you aren't in performance-critical regions (but if you don't care about performance, might as well be using managed code??). But increased compile times are always at least annoying.


Lastly, I think basically all of this discussion is just a specialized case of the issue griped about here:
http://yosefk.com/c++fqa/defective.html#defect-3
As you can see this guy clearly dislikes C++, haha. But anyway I was wondering why is compilation in C++ broken down into translation units that are basically individual files & also basically independent of other translation units? (Not counting linking b/c the compiler never sees the source code nor object code of externally linked functions.) I'm guessing this is leftover from the old days when huge multifile projects were just not a thing.

What is hard about having the compiler be able to associate declarations with unique, global definitions? When the compiler runs across some class X in this translation unit & has no idea what X is, it has to be able to see the definition of X so it can compile the code & do its thing. It may have done the exact same work in hundreds of other source files, but the compiler has no way of knowing about that!

yosefk also mentions that "modern" languages do have this "look up" ability. Anyone know which languages those are & how they've implemented it?

-Eric
 

degibson

Golden Member
Mar 21, 2008
1,389
0
0
If you need to instantiate templates, my first thought is that you're probably doing them wrong.

- Including something you don't use is no big deal. Slows down the compile, creates a symbol in the namespace. Hard to get excited about.

- If you write a templatized method and call it from 100 different places with the same type, why did you templatize? You can *always* do this, if you like:

Code:
template<typename T> void myTemplatizedFunction(T* t) { ... }

// "T" here is a stand-in for whichever concrete type you actually use:
void ThatFunctionHardSpecializedForT(T* t) {
  myTemplatizedFunction<T>(t);
}
... if you really do have 100 call sites with identical types.

- If you actually specialize a template with 100 different types in 100 different places, you really do need 100 different copies. Templates are macros. Not even glorified. They allow you to create different object code based on a type.

Also ignore that parashift link. I'm not exactly sure why it's written that way, but it's either ignorance or horrible disregard for convention. Either way, that's the dark side. My 2c's.

.. as for the other questions, well, that's one of the great things about C/C++. You never need to look in any other file (after pre-processing) to compile a file. It's a clean transformation from single source file to single object file (ok, usually single object file).

One example of a "modern" language with "look up" ability is Python. But python does it by exposing the names of other source files.
Code:
import my_awesome_package    # this had better be in my_awesome_package.py, or else

... this in turn forces the python interpreter to go hunt down my_awesome_package.py at runtime. Compiled C++ never has to do this, nor does the C++ compiler itself.
 

eLiu

Diamond Member
Jun 4, 2001
6,407
1
0
If you need to instantiate templates, my first thought is that you're probably doing them wrong.

- Including something you don't use is no big deal. Slows down the compile, creates a symbol in the namespace. Hard to get excited about.

- If you write a templatized method and call it from 100 different places with the same type, why did you templatize? You can *always* do this, if you like:

Code:
template<typename T> void myTemplatizedFunction(T* t) { ... }

// "T" here is a stand-in for whichever concrete type you actually use:
void ThatFunctionHardSpecializedForT(T* t) {
  myTemplatizedFunction<T>(t);
}
... if you really do have 100 call sites with identical types.
first bullet: agree
2nd bullet: "100" was an exaggeration, lol. If I use my template function in 100 different places and I have 10 different usages, that's still quite a bit of duplication. On a finite element code I've had some interaction with, I was told that removing some oft-used templates' definitions from the header files & using explicit instantiation reduced compile time from 30min to 20min. Pretty good savings for a very simple operation if you ask me. (Evidently most of the rest of that 20min can be attributed to Boost.)

code example: I fail to see how this is any different than explicit instantiation. Just seems like more code to get the same result.

- If you actually specialize a template with 100 different types in 100 different places, you really do need 100 different copies. Templates are macros. Not even glorified. They allow you to create different object code based on a type.

Also ignore that parashift link. I'm not exactly sure why it's written that way, but it's either ignorance or horrible disregard for convention. Either way, that's the dark side. My 2c's.

first bullet: Yeah I know. I wouldn't object to having 100 different usages generate 100 different versions of object code, even similar object code. They're different. What I was complaining about was recompiling the same instantiation repeatedly. At the time, I also didn't realize that modern compilers (=icc and prob gcc too) are good enough to remove this object duplication at link time. In my old "example," each of the 100 object files might carry its own copy of the same template function, but an executable linking all of them will only have 1 (or at least, few) copies.

But regardless of whether or not the compiler/linker can remove duplicate object code, it seems like explicit instantiation is a good way for you to help it make that happen. I've always been of the mindset that I should do as much as possible to make the compiler's job as easy as possible.

2nd bullet: so I'm basically a C++ newb, having never really programmed a major project in C++. So I don't yet have the ability to fully discern people who do/don't know what they're talking about. What exactly is wrong with the parashift page? What convention/dark side? lol

.. as for the other questions, well, that's one of the great things about C/C++. You never need to look in any other file (after pre-processing) to compile a file. It's a clean transformation from single source file to single object file (ok, usually single object file).

One example of a "modern" language with "look up" ability is Python. But python does it by exposing the names of other source files.
Code:
import my_awesome_package    # this had better be in my_awesome_package.py, or else

... this in turn forces the python interpreter to go hunt down my_awesome_package.py at runtime. Compiled C++ never has to do this, nor does the C++ compiler itself.

I dunno, for my money, as soon as your header files start to have anything beyond extern declarations, you're dealing with multiple source files. Seeing as how you can't even access cout or "<<" without doing this, single source -> single object might be a bit of a stretch?

And I'm not convinced that single source -> single object is always a good thing. I mean gcc has that whole 'wpo' thing (i think that's what it's called, whole program optimization?) and icc has ipo (interprocedural optimization). As I understand it, the purpose of those features is to provide cross-source-file optimization akin to what the compiler could do if all the code were pasted into 1 translation unit. Having some kind of 'look up' ability sure would make the rules for inlining a lot less confusing, amongst other things. Like I might have an external math library with some matrixy things or whatnot; as it stands, even if I could tell the compiler that my array sizes would only ever be 3x3, 4x4, or 5x5, it couldn't do anything w/that info since the function call is external & at compile time, it knows nothing about it.

If what python is doing is the equivalent of #include'ing all your .c files into the file containing "main()", then maybe this "look up" ability isn't the awesome power I hoped it would be.

edit: oh yeah, but if templates are just glorified macros, then good. Gives me a familiar way to think about using them, heh.
 
Last edited:

Markbnj

Elite Member, Moderator Emeritus
Moderator
Sep 16, 2005
15,682
14
81
www.markbetz.net
I dunno, for my money, as soon as your header files start to have anything beyond extern declarations, you're dealing with multiple source files. Seeing as how you can't even access cout or "<<" without doing this, single source -> single object might be a bit of a stretch?

No, it really is single source -> single object file, in all but certain specialized cases. Headers should in the vast majority of cases contain only declarations, not definitions, and regardless of how many you include by the time the compiler processes the source they are all in scope. I always referred to this "virtual" source file, comprising the .cc/.cpp source and all included headers, as the "translation unit" but I don't know if that term is still used. If the headers refer to external symbols from other modules those modules don't get included at compile time. That's dealt with at link time.
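A trivial example of what I mean (made-up names):

Code:
// widget.h -- declarations only
int widget_count();

// user.cpp -- compiles to user.o knowing nothing but the declaration above
#include "widget.h"
int twice_the_widgets() { return 2 * widget_count(); }

// widget.cpp -- compiles to widget.o in its own translation unit;
// the call in user.o is only resolved when the two objects are linked
int widget_count() { return 42; }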
 

degibson

Golden Member
Mar 21, 2008
1,389
0
0
Eliding some stuff for brevity with ...
first bullet: agree
2nd bullet: "100" was an exaggeration, lol. ... Pretty good savings for a very simple operation if you ask me. (Evidently most of the rest of that 20min can be attributed to Boost.)
Depends on who you talk to. I knew a guy who insisted on using templates instead of inheritance, for speed. I never saw any data either way.

code example: I fail to see how this is any different than explicit instantiation. Just seems like more code to get the same result.
With my example, you know exactly what you're getting. One single global function, in that case, with a specific definition. Who knows what you're getting with template instantiation? E.g., if you instantiate the same template in multiple sources, then link together? Either they are the same name (link error) or they're not (repeated code).

... What I was complaining about was recompiling for the same instantiation repeatedly.
It's not really "recompiling". The compiler reads the template once, stores it in a symbol table. The heavy lifting of compiling comes in subsequent phases, after things like templates have been materialized on-demand.

And again, I find it hard to get excited about making the compiler's life easier. The compiler is evil. It must be punished.

At the time, I also didn't realize that modern compilers (=icc and prob gcc too) are good enough to remove this object duplication at link time...
Careful here. Some of those optimizations might be against strict interpretations of C's, or even C++'s, ABI. I.e., you can't rely on them in general.

2nd bullet: so I'm basically a C++ newb, having never really programmed a major project in C++. So I don't yet have the ability to fully discern people who do/don't know what they're talking about. What exactly is wrong with the parashift page? What convention/dark side? lol
After reading it, and noticing the laissez-faire attitude about things like #include "meh.cpp", I suspect that perhaps the author doesn't understand that file extensions don't matter in C++, or maybe the author simply doesn't understand preprocessor invocation.

The convention to which I was referring was to name files that contain declarations and templates as .h. #including a .cpp means the same thing, but tends to complicate linking in nasty ways. Simple examples work for toys.

Sorry, it's hard to be both brief and concrete on this topic.

I dunno, for my money, as soon as your header files start to have anything beyond extern declarations, you're dealing with multiple source files. Seeing as how you can't even access cout or "<<" without doing this, single source -> single object might be a bit of a stretch?

Headers can tell you what your link environment will look like, before your object ever reaches the linker. That's the point. For cout, <iostream>'s header tells you that ostream objects have a method called operator<< -- that's all you need to know to make an object file. You don't need to know what the heck operator<< does.

This does have a point: you can start doing things like dynamically linking pre-built binaries together from different vendors, different languages (that have ways to specify C-style linkage), even different toolchains, just by having a simple header definition. Pretty powerful stuff. Other languages have to jump through more hoops to do this: JNI for instance, or even SWIG (shudder).
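E.g., something along these lines (vendor/library names invented) is enough to call into a prebuilt binary from C++:

Code:
// fastmath.h -- header shipped with some prebuilt library; you never see its source
#ifdef __cplusplus
extern "C" {              // C-style linkage: no C++ name mangling on these symbols
#endif
double fm_dot(const double* a, const double* b, int n);
#ifdef __cplusplus
}
#endif

// my_code.cpp -- compiles against the header alone; the body shows up at
// link time as libfastmath.a or libfastmath.so
#include "fastmath.h"
double norm_squared(const double* v, int n) { return fm_dot(v, v, n); }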

And I'm not convinced that single source -> single object is always a good thing... it couldn't do anything w/that info since the function call is external & at compile time, it knows nothing about it.
C has a very well-defined set of rules for when code and data are emitted, what their format is, etc. Other languages trade off this control for interprocedural optimization, JIT, and other things. Pick your poison. Somebody will think you're wrong, no matter what you pick, and often for no reason better than preference :p

edit: oh yeah, but if templates are just glorified macros, then good. Gives me a familiar way to think about using them, heh.
I wouldn't call them glorified. Templates are horrible. C++ really missed the boat there.
 

iCyborg

Golden Member
Aug 8, 2008
1,344
61
91
With my example, you know exactly what you're getting. One single global function, in that case, with a specific definition. Who knows what you're getting with template instantiation? E.g., if you instantiate the same template in multiple sources, then link together? Either they are the same name (link error) or they're not (repeated code).
Wouldn't compiling the .cpp that has the template generate the symbols for that particular instantiation, just like for any regular class/function, and then the linker would link the same one for any source that includes it? It does look like the same thing as what you're doing, but somewhat less "hacky".

The part that is ugly/unconventional is including .cpp file. He doesn't like putting definition in .h, so he puts it into .cpp and then includes .cpp :p. Ok, he still gets .cpp with source and .h with interface which is something.

I wouldn't call them glorified. Templates are horrible. C++ really missed the boat there.
I think they're useful for containers. Templates were designed for the STL, and I can't see a better way to write it without them, or even any other feasible way, really. Like std::vector or std::map for any type, including your own that they can't know about in advance. What's the alternative - C/P-ing the same implementation every time you need some fairly standard container for your class?
And it's not like only C++ missed the boat, C# and Java have generics, pretty much the same thing conceptually.
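I mean, it really is as simple as (toy type, obviously):

Code:
#include <map>
#include <string>
#include <vector>

struct Particle { double x, y, z; };   // a type the STL authors never heard of

int main() {
  std::vector<Particle> cloud;                          // works anyway
  cloud.push_back(Particle());
  std::map<std::string, std::vector<Particle> > groups; // and it composes
  groups["left"] = cloud;
  return 0;
}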
 

degibson

Golden Member
Mar 21, 2008
1,389
0
0
Wouldn't compiling the .cpp that has the template generate the symbols for that particular instantiation, just like for any regular class/function, and then the linker would link the same one for any source that includes it? It does look like the same thing as what you're doing, but somewhat less "hacky".
What I'm not sure about here is what would happen with a template explicit instantiation in file A, and template non-explicit instantiation in file B. To make a legal object file for B, the compiler has to generate code for the template, which would make it pretty hard to change that code at link time, when A's definition shows up. What if peephole analysis of B has changed the bytes?

My admittedly hacky approach has a consistent entry point to a type-specialized template.

The part that is ugly/unconventional is including .cpp file. He doesn't like putting definition in .h, so he puts it into .cpp and then includes .cpp :p. Ok, he still gets .cpp with source and .h with interface which is something.
Those names are only, only for programmer convenience. Breaking that convention isn't changing the way C++ works, only the way it's perceived.

Specifically, this approach doesn't change the fact that the template has to be properly defined before it can be instantiated. It just changes the name of the file. "Compiling" a .cpp file that contains only a template definition shouldn't produce any useful object code.

I think they're useful for containers. Templates were designed for the STL, and I can't see a better way to write it without them, or even any other feasible way, really. Like std::vector or std::map for any type, including your own that they can't know about in advance. What's the alternative - C/P-ing the same implementation every time you need some fairly standard container for your class?
And it's not like only C++ missed the boat, C# and Java have generics, pretty much the same thing conceptually.

Sure, templates are powerful. No denying that. They're just ugly.

Sorry, I don't know Java or C# well enough to comment on generics.
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,665
4,604
75
C++ templates can get really crazy. In college, I found something about "template programming", and made at least one program with absurdly complicated templates. Although they seemed really elegant to me at the time.

Anyway, at least in Java, generics are much simpler. Back in the v1.4 days when I learned Java, Containers stored Objects. Of course, you could put any subclass of Object into a Container, but on retrieving it you had to cast it back to that same subclass again. (And preferably check that it was an instanceof that class first.)

Generics simply provide a way to pass a class name to a Container. This allows the Container to verify that any Object it's passed is an instanceof that class (which is better than checking it on retrieval later), and to return Objects pre-cast back to the generic class. There's no crazy template programming in Java. I gather C# is similar.
 

Markbnj

Elite Member, Moderator Emeritus
Moderator
Sep 16, 2005
15,682
14
81
www.markbetz.net
Generics simply provide a way to pass a class name to a Container. This allows the Container to verify that any Object it's passed is an instanceof that class (which is better than checking it on retrieval later), and to return Objects pre-cast back to the generic class. There's no crazy template programming in Java. I gather C# is similar.

Runtime type information is a useful thing. Most of what makes C++ templates so nuts is that they have to be instantiated at compile time.
 

iCyborg

Golden Member
Aug 8, 2008
1,344
61
91
What I'm not sure about here is what would happen with a template explicit instantiation in file A, and template non-explicit instantiation in file B. To make a legal object file for B, the compiler has to generate code for the template, which would make it pretty hard to change that code at link time, when A's definition shows up. What if peephole analysis of B has changed the bytes?

My admittedly hacky approach has a consistent entry point to a type-specialized template.
I'm not an expert on this, so maybe I'm misunderstanding something here. Here's my view of this:
Compiling the file that holds the template (call it T) doesn't produce object code for some instantiation until someone actually instantiates it. What this guy suggests is basically moving the instantiation into T itself, so compiling T must produce object code for it, just as compiling A normally would if A instantiated it. So by the time we come to A, the object code has already been generated, and the linker can find it later. If some B wants to use the same template instantiation, it sees that same one. If B wants some other instantiation, it will be created when B is compiled (if it wasn't explicitly instantiated in T), but there's no duplication since it's a different instantiation.
Of course, you must know that someone somewhere will eventually use it when you instantiate in T, or you'll just have useless object code.

As for name convention, yeah, I know it's just a convention, but it's still ugly. You could place source into .h and define interface in .cpp, but it would be ugly even if it's just names :)
 

iCyborg

Golden Member
Aug 8, 2008
1,344
61
91
Generics simply provide a way to pass a class name to a Container. This allows the Container to verify that any Object it's passed is an instanceof that class (which is better than checking it on retrieval later), and to return Objects pre-cast back to the generic class. There's no crazy template programming in Java. I gather C# is similar.
I see. I have to admit I didn't know how it worked "under the hood", I just knew it existed, and the usage seemed the same: List<int> in C# or vector<int>, pretty much the same thing to me. Even writing simple generic classes looks very similar, although all I ever did was use System.Collections.Generic and stuff like that.

I guess this lack of RTTI places additional burden on the programmer. Just skimmed over MSDN article for generics, and the implementation is quite different. It still seems like a tradeoff, as they say:
Compared to C++ templates, C# generics can provide enhanced safety but are also somewhat limited in capabilities.
 

eLiu

Diamond Member
Jun 4, 2001
6,407
1
0
Depends on who you talk to. I knew a guy who insisted on using templates instead of inheritance, for speed. I never saw any data either way.
That sounds like overkill to me. As far as I understand things, the only really expensive part of inheritance is polymorphism. The virtual function table can add significant overhead in "performance critical" regions. That and I guess if you have unnecessarily expensive constructors 'high up' in the class hierarchy, that'll hurt too.
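i.e., roughly the tradeoff between these two (my own toy sketch, so take the details with a grain of salt):

Code:
#include <cstddef>

// runtime polymorphism: every call goes through the vtable, and a call made
// through a Base reference generally can't be inlined
struct Base { virtual double eval(double x) const = 0; virtual ~Base() {} };
struct Square : Base { double eval(double x) const { return x * x; } };

double sum_virtual(const Base& f, const double* xs, std::size_t n) {
  double s = 0.0;
  for (std::size_t i = 0; i < n; ++i) s += f.eval(xs[i]);  // indirect call every iteration
  return s;
}

// template version: the concrete type is known at compile time, so the call
// can be resolved statically and inlined
struct SquareT { double eval(double x) const { return x * x; } };

template<typename F>
double sum_template(const F& f, const double* xs, std::size_t n) {
  double s = 0.0;
  for (std::size_t i = 0; i < n; ++i) s += f.eval(xs[i]);  // direct, inlinable call
  return s;
}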

With my example, you know exactly what you're getting. One single global function, in that case, with a specific definition. Who knows what you're getting with template instantiation? E.g., if you instantiate the same template in multiple sources, then link together? Either they are the same name (link error) or they're not (repeated code).
With explicit instantiation, you get only 1 global function too with a specific definition. Maybe I'm mistaken on how templates are compiled, but I believe the process goes like this:
say I have:
template<typename T> void foo(){ ...code...};
...myclass defn...
...myclass2 defn...

void somefun(){
...code...
foo<myclass>(); //(a)
foo<myclass2>(); //(b)
...code...
}

As it goes thru this file, the compiler sees the template function foo instantiated w/"myclass". It then:
1) generates code for foo, replacing "T" with "myclass". If this already exists then presumably the compiler will use the existing stuff. Probably uses name mangling or some other system to track which version of foo<> goes with what typename(s).
2) compiles the code generated in 1) and sets up the appropriate function call at location (a).
Same thing for foo<myclass2>.

With explicit instantiation, I might have:
template<typename T> void foo(){ ...code...};
...myclass defn...
...myclass2 defn...
template void foo<myclass>();
template void foo<myclass2>();

Now step 1) above doesn't have to happen again at the call sites. But I have no idea why you'd ever do this--explicit instantiation in multiple source files.

Instead, in foo.h:
template<typename T> extern void foo();

In foo.c:
template<typename T> void foo(){ ...code...};
template void foo<myclass>();
template void foo<myclass2>();

Then when you need it, #include "foo.h". Now there's only 1 instantiation, ever. Presumably since it's extern, when you compile foo.c, code will be generated for foo<myclass> and foo<myclass2> and compiled. One instantiation with a unique name (for each usage) will be generated, which can then be matched up by the linker. This naming system must already exist even to deal w/cases where the template definition is visible.

The only downside I can see is that you lose the generality of templates in that you have to specify all the possible instantiations. So it's not always possible.

But if we're thinking about templates like macros, then just like macros, the most complex thing I'd write in a header file would be similar to the same kind of simple/short functions that I'd put in a header with the "inline" keyword.

It's not really "recompiling". The compiler reads the template once, stores it in a symbol table. The heavy lifting of compiling comes in subsequent phases, after things like templates have been materialized on-demand.
Huh? How is it not? Say #include "bar.h" in 100 different source files. bar.h contains a template definition. Regardless of how the template in bar.h is used in my source files (and in particular, even if all usages are the same), I will compile some instantiation of the template at least 100 times. This is the "heavy lifting" on the compile side. Since the 100 source files are in different translation units & there's no "look-up" capability, the compiler shouldn't have any way of knowing what work it has already done.

And again, I find it hard to get excited about making the compiler's life easier. The compiler is evil. It must be punished.
Haha. Unless you enjoy writing assembly, after a point, helping the compiler out is the only thing you can do to make your implementation of your algorithm go faster. That and I'd just as soon spend as little time as possible waiting for compilation.

Careful here. Some of those optimizations might be against strict interpretations of C's, or even C++'s, ABI. I.e., you can't rely on them in general.
:( I never know what to make of compiler-specific optimizations like this. Yeah it's helpful in that there's 1 less thing I need to think about. But it's annoying b/c I never know which operations are compiler specific. Now that I'm trying to run code on some supercomputers, the most often suggested choices are like cray's package or pgi or whatnot. I don't want to have to sit down and make sure all of my pre-existing assumptions still hold (not that I could even enumerate them all for checking).

After reading it, and noticing the laissez-faire attitude about things like #include "meh.cpp",...

...Sorry, it's hard to be both brief and concrete on this topic.
Eh that website skips a lot of details. The target audience appears to be people who are just getting started in C++, have little understanding of computer architecture, etc. I guess this can be dangerous if you start believing the simplified version of life is reality. But at the same time, explaining everything in full detail to the point of information overload for a noob isn't helpful either.

I've found that site to be reasonably useful in terms of helping get started w/understanding why X didn't compile or whatnot. Although unless I'm doing something I don't care about at all, I'd almost never take 1 person's word for anything. That goes double for random websites.

Headers can tell you what your link environment will look like, before your object ever reaches the linker. That's the point. For cout, <iostream>'s header tells you that ostream objects have a method called operator<< -- that's all you need to know to make an object file. You don't need to know what the heck operator<< does.

This does have a point: you can start doing things like dynamically linking pre-built binaries together from different vendors, different languages (that have ways to specify C-style linkage), even different toolchains, just by having a simple header definition. Pretty powerful stuff. Other languages have to jump through more hoops to do this: JNI for instance, or even SWIG (shudder).
Oh I'm not saying that headers are a bad idea or anything. I know they're very useful for the exact reason that you stated. Being able to do "everything" just based on function signatures and no other info is valuable. This is easily one of the things I miss most when working on a GPU (which has no linker!), god.

But to obtain this level of functionality, you should never have to write source code in header files. The only time I could see including source code in header files is when you want to inline. B/c then recompilation & code-growth are the name of the game. For everything else (like templates), it's more a matter of convenience... to me anyway.

"Hello world" in C++ with "cout" is something like 700KB of source code to be compiled (even with dynamic linking) or so I'm told. That's... a lot. My understanding is that most of this comes from templates... which since the definition is included in header files, results in recompilation.

... Somebody will think you're wrong, no matter what you pick, and often for no reason better than preference :p
Everything I know about programming and computers has been self taught through trial/error, reading, and posting questions on forums like this (I still have you to thank for giving me a place to start w/how the CPU ticks at a high level, lol). On the outside, it has always seemed that programming has the potential to be an 'elegant' realm since the possibility exists for a "right" way to do everything.

Not so. At all. D: haha

I wouldn't call them glorified. Templates are horrible. C++ really missed the boat there.
Ah yes, I was waiting for someone to say this! I've heard (=read) this comment more than once. I've also heard people gushing their love of templates. As far as templates being a prettier looking macro for generating a certain kind of code, I think they're great. Fuck multi-line macros straight to hell.

But what would've been *better*? What functionality or design principle or whatever did templates skip over that would have made them even more useful?



No, it really is single source -> single object file, in all but certain specialized cases. Headers should in the vast majority of cases contain only declarations, not definitions, and regardless of how many you include by the time the compiler processes the source they are all in scope. I always referred to this "virtual" source file, comprising the .cc/.cpp source and all included headers, as the "translation unit" but I don't know if that term is still used. If the headers refer to external symbols from other modules those modules don't get included at compile time. That's dealt with at link time.
I mean, you could #include every source file in your project into the file containing main(). Who even needs multi-file projects?

Since people don't often do that, I'm more inclined to define a "source file" as the file(s) containing the definitions compiled in some translation unit (afaik that term is still common). Most of the time that means (to me) that a 'source file' is each of your .cpp files. If for whatever reason you have all/most definitions in a .h file, then I would argue that the .h file counts as a source file too.

Runtime type information is a useful thing. Most of what makes C++ templates so nuts is that they have to be instantiated at compile time.

It's also expensive. Avoiding that runtime cost is part of what makes C++ fast (or at least gives it the potential to be fast). I like at least having the option to figure out as much stuff at compile-time as possible. But that might be b/c I work in scientific computing.

My understanding of RTTI in C++ is that it's what makes "dynamic_cast<>" and "typeid" possible. At the same time, outside of maybe the debugging phase, I avoid these.
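(For reference, the sort of thing I mean -- toy example:)

Code:
#include <iostream>
#include <typeinfo>

struct Shape  { virtual ~Shape() {} };
struct Circle : Shape { double r; };

void inspect(Shape* s) {
  // typeid: runtime type name (the exact string is implementation-specific)
  std::cout << typeid(*s).name() << "\n";

  // dynamic_cast: runtime-checked downcast; returns null if *s isn't a Circle
  if (Circle* c = dynamic_cast<Circle*>(s))
    std::cout << "circle, r = " << c->r << "\n";
}

int main() {
  Circle c;
  c.r = 2.0;
  inspect(&c);
  return 0;
}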
 

degibson

Golden Member
Mar 21, 2008
1,389
0
0
I got curious to see what g++ would do:
Code:
  1 #include <iostream>
  2 
  3 template<typename T> void foo(T t) {
  4   std::cout << t << std::endl;
  5 }
  6 
  7 template void foo<int>();
  8 
  9 void foo_materialized_for_int(int t) {
 10   foo<int>(t);
 11 }
Code:
hms-victory(4)% g++ file1.cc -S -o file1.S
file1.cc:7: error: template-id ‘foo<int>’ for ‘void foo()’ does not match any template declaration

What am I doing wrong? Is it that this instantiation feature isn't supported by g++?
 

degibson

Golden Member
Mar 21, 2008
1,389
0
0
Warning: long post is loooooooooooong.

That sounds like overkill to me...
Definitely. People often make strange claims; they're worth questioning.

With explicit instantiation, you get only 1 global function too with a specific definition. ... ... functions that I'd put in a header with the "inline" keyword.

I wish I could get it to compiler. I'm curious to see what the generated symbols and functions really are. I'm not yet 100% clear on how to properly forward-declare an instantiated template in a way that informs the compiler to create a function call instead of an inline instantiation. extern might be it, but per above, I can't get the basic thing to compiler.

Huh? How is it not? Say #include "bar.h" in 100 different source files. bar.h contains a template definition. Regardless of how the template in bar.h is used in my source files (and in particular, even if all usages are the same), I will compile some instantiation of the template at least 100 times. This is the "heavy lifting" on the compile side. Since the 100 source files are in different translation units & there's no "look-up" capability, the compiler shouldn't have any way of knowing what work it has already done.
I'm just thinking about the internal phases of compilation. Reading the file is trivial and fast. After the symbol table is built, unused templates don't matter anymore. And templates that are used don't go straight to code generation. They're converted into abstract syntax trees, after which, optimization layers take over and iterate on the ASTs until they're well-optimized, then the code generator runs, then the peephole optimizer runs...

So the 'heavy lifting' in my context is everything that happens after constructing the symbol table -- AST manipulation and code generation. Unused templates in pre-processed source stop mattering at that stage, and used templates are just inlined code by the time they become ASTs.

Haha. Unless you enjoy writing assembly, after a point, helping the compiler out is the only thing you can do to make your implementation of your algorithm go faster. That and I'd just as soon spend as little time as possible waiting for compilation.
Sometimes hoops must be jumped, sure. I try to make the compiler jump more hoops for me than I for it. As for waiting for compilation:
- Make small edits,
- Use a build infrastructure that doesn't recompile the world unnecessarily,
- make -j 100,
- and if you HAVE to recompile the world, go get coffee.

Changing code to speed up compiles is making the wrong tradeoff -- engineering time is more important than CPU time.

Eh that website skips a lot of details. The target audience appears to be people who are just getting started in C++, have little understanding of computer architecture, etc....
I'm not worried about you, eLiu. I'm worried about the poor learner who reads that drivel and forms all the wrong internal models of C++.

Oh I'm not saying that headers are a bad idea or anything. ... you should never have to write source code in header files. The only time I could see including source code in header files is when you want to inline. ...
This is a feature I really love (degibson loves something low-level? *shock*). Code in headers often has no legal place to be instantiated other than inline; code in a source file has no legal place to be instantiated other than in that source's object file. That's a lot of control. Optimizing compilers get around this, but it's nice strictly from a documentation angle. A programmer can quickly scan the definition of SomeRandomMethod() if it's in the header, and know that it's cheap to call.
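E.g., the sort of header-resident code I have in mind (sketch):

Code:
// vec3.h -- tiny, obviously-cheap methods defined right in the header, so they
// can be inlined at every call site and the reader can see exactly what a call costs
struct Vec3 {
  double x, y, z;
  double dot(const Vec3& o) const { return x*o.x + y*o.y + z*o.z; }  // implicitly inline
};

inline double norm_squared(const Vec3& v) { return v.dot(v); }  // 'inline' keyword for a free function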

"Hello world" in C++ with "cout" is something like 700KB of source code to be compiled (even with dynamic linking) or so I'm told. That's... a lot. My understanding is that most of this comes from templates... which since the definition is included in header files, results in recompilation.
Code:
#include <cstdlib>   // EXIT_SUCCESS lives here
#include <iostream>
int main(int argc, char* argv[]) {
  std::cout << "Hello, world!" << std::endl;
  return EXIT_SUCCESS;
}
146 bytes source, +/- formatting du jour. Are you sure you meant source? 8365 bytes compiled and linked as an executable on Linux.

But what would've been *better*? What functionality or design principle or whatever did templates skip over that would have made them even more useful?
Aside from horrible error messages, which are technically the tools' fault, I'd like to templatize on anything -- not just on types. Templatize on values, other templates, etc. I basically would like macros that aren't pre-processor macros.


... I'm more inclined to define a "source file" as the file(s) containing the definitions compiled in some translation unit (afaik that term is still common). Most of the time that means (to me) that a 'source file' is each of your .cpp files. If for whatever reason you have all/most definitions in a .h file, then I would argue that the .h file counts as a source file too.
I'm inclined to call a source file, in C++ anyway, something that is compiled to an object and then linked against other objects. This is not to say that headers don't contain source, of course. :) But anything that appears in builds only as a consequence of preprocessing is a header to me.
 

iCyborg

Golden Member
Aug 8, 2008
1,344
61
91
I got curious to see what g++ would do:
Code:
  1 #include <iostream>
  2 
  3 template<typename T> void foo(T t) {
  4   std::cout << t << std::endl;
  5 }
  6 
  7 template void foo<int>();
  8 
  9 void foo_materialized_for_int(int t) {
 10   foo<int>(t);
 11 }
Code:
hms-victory(4)% g++ file1.cc -S -o file1.S
file1.cc:7: error: template-id ‘foo<int>’ for ‘void foo()’ does not match any template declaration

What am I doing wrong? Is it that this instantiation feature isn't supported by g++?
Your template foo has an argument of type T, and your forced instantiation has no arguments. You should do:
template void foo<int>(int t);
 

degibson

Golden Member
Mar 21, 2008
1,389
0
0
Your template foo has an argument of type T, and your forced instantiation has no arguments. You should do:
template void foo<int>(int t);

Thanks. I would have thought that foo<int>(int) would have been redundant. Silly me.

After compiling, I found the thing that un-confuses me! The compiler emits .weak for template instantiations:
Code:
  .file "file1.cc"
  .section  .text._Z3fooIiEvT_,"axG",@progbits,_Z3fooIiEvT_,comdat
  .align 2
  .weak _Z3fooIiEvT_
  .type _Z3fooIiEvT_, @function
_Z3fooIiEvT_:
.LFB3:
  pushq %rbp
.LCFI0:
...

http://ps-2.kev009.com:8081/wisclib...fo/en_US/a_doc_lib/aixassem/alangref/weak.htm

Salient point:
"The binder ignores duplicate definitions for symbols with the same name that are weak"
 

eLiu

Diamond Member
Jun 4, 2001
6,407
1
0
Warning: long post is loooooooooooong.
I think I'm just as guilty of this... lol

I wish I could get it to compiler... ...I can't get the basic thing to compiler.
Looks like this was addressed. Also, "compiler" might not be a verb. Just sayin' ;)

I'm just thinking about the internal phases of compilation. Reading the file is trivial and fast. After the symbol table is built, unused templates don't matter anymore. And templates that are used don't go straight to code generation. They're converted into abstract syntax trees, after which, optimization layers take over and iterate on the ASTs until they're well-optimized, then the code generator runs, then the peephole optimizer runs...

So the 'heavy lifting' in my context is everything that happens after constructing the symbol table -- AST manipulation and code generation. Unused templates in pre-processed source stop mattering at that stage, and used templates are just inlined code by the time they become ASTs.
Looks like I might've been talking out of my ass a bit then since my mental image of how a compiler does its thing involves way fewer steps than this. :/ Doh.

But I'm still confused though. Does the compiler keep track of the symbol table across all translation units? That seems unlikely. If not, then (back to this again) w/100 #includes of a header file containing template+definition (each in a different translation unit), wouldn't the compiler have to go through all of these steps 100 different times? Possibly to generate the exact same function call (let's say inlining is off)? I'm not concerned about unused templates; I know compilers should be smart enough to throw those out. But I'm curious about templates that are re-used repeatedly in different translation units. Since you don't have any work carried over across translation units, it seems like you can be opening the door to redundant work.

And as far as I know this doesn't just go for templates; it goes for any source code that you #include (and actually call) in multiple files.

Also are you saying that compilers always inline the code generated from templates? That seems pretty surprising to me.

Sometimes hoops must be jumped, sure. I try to make the compiler jump more hoops for me than I for it. As for waiting for compilation:
- Make small edits,
- Use a build infrastructure that doesn't recompile the world unnecessarily,
- make -j 100,
- and if you HAVE to recompile the world, go get coffee.

Changing code to speed up compiles is making the wrong tradeoff -- engineering time is more important than CPU time.
Well if compilation is fast enough, it opens the door to a lot of "lazy" debugging "techniques" where I can insert prints, screw around with parameters and whatever through editing source. And when you hand off the code to users, they get bitchy if they have to wait too long.

Does "-j 100" help? Maybe you have 100 cores but I only have 4.

Also, going back to what we've been talking about, doesn't including lots of source code in headers lead to long compilation times b/c you would very well be recompiling all of that code once per #include?

I'm not worried about you, eLiu. I'm worried about the poor learner who reads that drivel and forms all the wrong internal models of C++.

146 bytes source, +/- formatting du jour. Are you sure you meant source? 8365 bytes compiled and linked as an executable on Linux.
Nope... got that statistic from someone telling me. Quite possibly they were wrong or mis-stated or who knows what. Nevermind, lol. I'll just stick to the explicit instantiation decreasing compile time from 30 to 20min as my example since I'm certain of that one.

Aside from horrible error messages, which are technically the tools' fault, I'd like to templatize on anything -- not just on types. Templatize on values, other templates, etc. I basically would like macros that aren't pre-processor macros.
Ah, I see. I can see how this would be useful. Though it's also starting to sound more like a functional programming thing. Also I think you can effectively template on values by using inline functions?

I'm inclined to call a source file, in C++ anyway, something that is compiled to an object and then linked against other objects. This is not to say that headers don't contain source, of course. :) But anything that appears in builds only as a consequence of preprocessing is a header to me.

Heh, very well. Still seems weird to me that if you just glue all your source into 1 huge file, that's 1 source file.

After compiling, I found the thing that un-confuses me! The compiler emits .weak for template instantiations:
...
http://ps-2.kev009.com:8081/wisclib...fo/en_US/a_doc_lib/aixassem/alangref/weak.htm

Salient point:
"The binder ignores duplicate definitions for symbols with the same name that are weak"

Errrr excuse my even greater ignorance but I just want to make sure I get what's going on here...
1) Normally, template instantiation gets the ".weak" tag. So if multiple 'copies' of the same template are instantiated (w/same type params), only the first one is used.
A) Does this matter if you're only dealing with 1 translation unit (no linking)? I don't see any reason the compiler would wait until this late to throw away duplicate code.
B) If I have multiple files, does this come into play b/c linkers now will often work even if you don't explicitly declare your function signatures? Then when linking to many object files with the same template instantiations, you can run into a boatload of duplicate code... so ".weak" says just take the first one you see & screw it (as long as .globl doesn't exist).

2) If I give an "extern" template declaration in a header & then explicitly instantiate in the source file, does that cause those functions to get the ".globl" tag? Then regardless of what other code is emitted for that template, only those .globl ones matter?
A) This shouldn't come into play normally right? Like if I only make the template declaration visible, then the only code that can be emitted as a result of the template is code from my explicit instantiations?
 

degibson

Golden Member
Mar 21, 2008
1,389
0
0
Looks like this was addressed. Also, "compiler" might not be a verb. Just sayin' ;)
You're right. I meant compilerization.

But I'm still confused though... Since you don't have any work carried over across translation units, it seems like you can be opening the door to redundant work.
Yes, you are. But you also open the door to some new optimizations. Here's a silly contrived example:
Code:
#include <iostream>
using std::cout;

template <typename T> void foo(T& t) {
  cout << t++;
}

void bar() {
  int x = 0;
  foo<int>(x);
}
... by not instantiating foo, the compiler can turn this into:
Code:
void bar() {
  cout << 0;
}
... which couldn't happen if 'foo' had been instantiated, and is an example of an optimization that can only happen once foo has been de-templatized.

Not that I know for sure, but I imagine that a compiler will expand templates fairly early in the compile cycle, so that optimizations like the above become possible during AST manipulation.

One other thing to keep in mind is that optimizing compilers can do dead code elimination. I.e., they can drop code that isn't used from their internal representation -- e.g., templates that are #included but never used. So yes, a compiler has to read the file, but no, the compiler does not have to do anything else to an unused template (except, of course, discover that it's not used).

A template that is actually used becomes dead code once the code has been instantiated at all of its call sites.

And as far as I know this doesn't just go for templates; it goes for any source code that you #include (and actually call) in multiple files.

Indeed. I'm not saying including extra stuff doesn't slow down the compile. It does, no doubt. But early optimization passes can eliminate dead code, so subsequent phases need only operate on the important parts.

Also are you saying that compilers always inline the code generated from templates? That seems pretty surprising to me.

Implementation dependent. I suppose it could make a file-scope function call for each instantiated type...

Well if compilation is fast enough, it opens the door to a lot of "lazy" debugging "techniques" where I can insert prints, screw around with parameters and whatever through editing source. And when you hand off the code to users, they get bitchy if they have to wait too long.
Well of course. If your compile isn't fast enough to do this, you have bigger problems, my friend.

Does "-j 100" help? Maybe you have 100 cores but I only have 4.
This ends up depending a lot on your environment. If you're building off disk or off a network file system, -j100 might be the right approach. I/O bound jobs don't need -j for CPU parallelism. If all your changes are entirely local on a small tree, you're building out of memory and -j [cpus]+1/2 is probably what you want.

Errrr excuse my even greater ignorance but I just want to make sure I get what's going on here...
1) Normally, template instantiation gets the ".weak" tag. So if multiple 'copies' of the same template are instantiated (w/same type params), only the first one is used.
Multiple copies are generated and compiled. All but the lucky 'first' one are discarded at link-time.

A) Does this matter if you're only dealing with 1 translation unit (no linking)? I don't see any reason the compiler would wait until this late to throw away duplicate code.
Compilers can throw away dead code that they can prove is dead. That's pretty hard, e.g., no global symbol is ever dead. And there's *always* a link step, even if you only input one translation unit, that still has to be linked against libstdc++.

B) If I have multiple files, does this come into play b/c linkers now will often work even if you don't explicitly declare your function signatures?
Linkers just care about symbol names. It's compilers that like to know about signatures... even then, you can often get away without them if you define functions in the right order.

Then when linking to many object files with the same template instantiations, you can run into a boatload of duplicate code... so ".weak" says just take the first one you see & screw it (as long as .globl doesn't exist).
Yup.

2) If I give an "extern" template declaration in a header & then explicitly instantiate in the source file, does that cause those functions to get the ".globl" tag? Then regardless of what other code is emitted for that template, only those .globl ones matter?
extern in a header probably won't cause some specific instantiation to get a .globl. It might inhibit local instantiation, but I'd have to check the output on that to be sure (I'm stranded without my trusty compiler right now).

A) This shouldn't come into play normally right? Like if I only make the template declaration visible, then the only code that can be emitted as a result of the template is code from my explicit instantiations?

All the compiler can do here is create a function call for the right symbol and hope that the definition shows up at link-time.
 
Last edited: