C++ 'Type Erasure' Explained
October 2014
I recently stumbled across this pattern on a Hacker News post. It’s a neat toy, but I had a hard time finding a good explanation (most of the information I found jumped straight into examples before really motivating what was going on). In this post, I’ll try to derive the pattern from first principles instead.
Polymorphism with Interfaces
If you’re fluent in C++, this section should be pretty obvious :)
The typical way to achieve polymorphism in C++ is to define an interface consisting of the pure-virtual methods you want to be able to call. Then, for each implementation that you want to use polymorphically, you create a subclass that inherits from the interface and implements those methods.
As an example, let’s implement See ‘n Say. We start with an interface class:
class Animal
{
public:
    // Virtual destructor, so a derived object can be deleted through an Animal*
    virtual ~Animal() {}
    virtual const char *see() const = 0;
    virtual const char *say() const = 0;
};
And add a few concrete implementations:
class Cow : public Animal
{
public:
    const char *see() const { return "cow"; }
    const char *say() const { return "moo"; }
};
class Pig : public Animal
{
public:
    const char *see() const { return "pig"; }
    const char *say() const { return "oink"; }
};
class Dog : public Animal
{
public:
    const char *see() const { return "dog"; }
    const char *say() const { return "woof"; }
};
Now we can use these implementations generically, by coding against the interface:
void seeAndSay(Animal *animal)
{
    printf("The %s says '%s!'",
           animal->see(),
           animal->say());
}
Not rocket science, right?
Polymorphism with Templates
Inheritance is a good solution to problems that require polymorphism, as long as the concrete types you’re working with (Cow, Pig and Dog in the example above) all inherit from a common base (Animal), which exposes all the required functionality.
But sometimes the concrete types you’re trying to make polymorphic can’t inherit from a common base. You may not have control of the concrete types (e.g. think STL types like std::string), or it may not even be possible for the concrete type to inherit (e.g. built-ins like int).
If you’re in this situation, however, you’re not out of luck! Even if the concrete types don’t share a common base, if they conform to a common interface (that is, they can be used the same way by a caller), we can instead use a template to make the types polymorphic:
template <typename T>
void seeAndSay(T *animal)
{
    printf("The %s says '%s!'",
           animal->see(),
           animal->say());
}
You can call the method above on Cows, Pigs, Dogs, and anything else that has zero-argument see() and say() methods that return strings.
This works due to the way templates are compiled: when you invoke a template on a type, the compiler generates a new instantiation of the method, specialized for the concrete type you’re passing in. Thus, as long as the method would compile with T replaced with the concrete type (say, Dog), the template invocation is valid.
To illustrate this, when you call:
Dog dog;
seeAndSay(&dog);
The compiler compiles the method seeAndSay<Dog>, by more or less replacing T with Dog.
The body of that method would look something like this:
void seeAndSay<Dog>(Dog *animal)
{
    printf("The %s says '%s!'",
           animal->see(),
           animal->say());
}
If you tried to pass in a type that doesn’t conform to the ‘interface’ (say, std::string), the compiler would hit an error when you tried to compile the method call, complaining that std::string doesn’t have see or say methods.
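To make both sides of that concrete, here’s a minimal sketch (mine, not part of the original toy). Robot is a made-up type that shares no base class with anything but still conforms, while std::string doesn’t. ConformsToAnimal and describe are hypothetical names introduced for illustration: describe has the same shape as seeAndSay but returns a std::string instead of printing. It assumes C++11 or later.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical type for illustration: unrelated to Animal, but it has
// zero-argument see() and say() methods.
struct Robot
{
    const char *see() const { return "robot"; }
    const char *say() const { return "beep"; }
};

// Hypothetical helper: true only if see() and say() can be called with no
// arguments on a const T. The return types are not checked.
template <typename T, typename = void>
struct ConformsToAnimal : std::false_type {};

template <typename T>
struct ConformsToAnimal<T, decltype(void(std::declval<const T &>().see()),
                                    void(std::declval<const T &>().say()))>
    : std::true_type {};

// Same shape as the seeAndSay template, but builds a std::string so the
// result can be inspected.
template <typename T>
std::string describe(const T *animal)
{
    static_assert(ConformsToAnimal<T>::value,
                  "T needs zero-argument see() and say() methods");
    assert(animal != nullptr);
    return std::string("The ") + animal->see() + " says '" + animal->say() + "!'";
}

// Checked at compile time: Robot conforms, std::string does not.
static_assert(ConformsToAnimal<Robot>::value, "Robot conforms");
static_assert(!ConformsToAnimal<std::string>::value, "std::string does not");
```

Calling describe(&robot) on a Robot yields "The robot says 'beep!'". Calling it on a std::string* fails to compile, with the static_assert message showing up alongside the compiler’s own complaints about the missing methods.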
Drawbacks to Template Polymorphism
Although achieving polymorphism with templates is a neat trick, there are two drawbacks:
First, we can’t shove disparate types into an array. When we were using interfaces, we could store an instance of each of Cow, Pig and Dog in an array of Animal pointers:
void pullTheString()
{
    Animal *animals[] = { new Cow(), new Pig(), new Dog() };
    size_t len = sizeof(animals) / sizeof(Animal*);
    size_t index = rand() % len;
    seeAndSay(animals[index]);
}
However, with the template-based polymorphism approach, we couldn’t create this array, because there is no common base type to use for the array’s elements:
??? animals[] = { new Cow(), new Pig(), new Dog() };
The second drawback is a little more subtle. Anybody who uses the template-based seeAndSay() method has two options:
- If the concrete type is known, the caller can explicitly specify the concrete type, non-polymorphically.
- Otherwise, the caller must also be a template, to pass along the template type (typename T) to seeAndSay().
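Here’s a rough sketch of those two options, assuming C++11. The names describe, describeCow and describeTwice are hypothetical; describe mirrors seeAndSay but returns a std::string so the result is easy to check.

```cpp
#include <cassert>
#include <string>

// A concrete type with no base class.
struct Cow
{
    const char *see() const { return "cow"; }
    const char *say() const { return "moo"; }
};

// Stand-in for the template-based seeAndSay.
template <typename T>
std::string describe(const T *animal)
{
    assert(animal != nullptr);
    return std::string("The ") + animal->see() + " says '" + animal->say() + "!'";
}

// Option 1: the caller knows the concrete type. It can be an ordinary
// function, but it only works for Cow.
std::string describeCow(const Cow *cow)
{
    return describe<Cow>(cow);
}

// Option 2: the caller wants to stay generic, so it has to be a template
// as well, just to pass T along.
template <typename T>
std::string describeTwice(const T *animal)
{
    return describe<T>(animal) + " " + describe<T>(animal);
}
```

Every function that sits between describeTwice and the code that finally knows the concrete type has to make the same choice, which is how templates end up spreading through a codebase.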
Since you’re employing polymorphism in the first place, most callers will likely fall into the second group, meaning large swathes of your program will need to be implemented in templates. This can get out of hand quickly, making your program hard to read and hard to organize. Overuse of this technique can lengthen compile times and bloat the size of your binary, wasting space and making your program take longer to start.
Yuck!
Kernel of an Idea
Pretend, for some reason, Cow, Dog and Pig are set in stone, and the designers originally did not give them a common base class. We would like to unite them under some common base class ourselves. But since we don’t control the implementation of Cow, Pig and Dog, it’s not possible for us to simply change them to inherit from a base interface.
Here’s a basic plan for fixing this: if we don’t have the inheritance chain we want, and we can’t change the objects to make them inherit, then we can build our own inheritance chain out of wrapper objects. That is, we define our own interface, and implement it multiple times. Each implementation of the interface wraps a Cow, Dog or Pig, and calls into that for all the virtual methods.
In this example, our common interface might be:
class MyAnimal
{
public:
    virtual ~MyAnimal() {}
    virtual const char *see() const = 0;
    virtual const char *say() const = 0;
};
Then we create wrapper objects which inherit from MyAnimal. Each wrapper does nothing except call into the ‘real’ underlying object:
class MyCow : public MyAnimal
{
    Cow m_cow;
public:
    const char *see() const { return m_cow.see(); }
    const char *say() const { return m_cow.say(); }
};
class MyPig : public MyAnimal
{
    Pig m_pig;
public:
    const char *see() const { return m_pig.see(); }
    const char *say() const { return m_pig.say(); }
};
class MyDog : public MyAnimal
{
    Dog m_dog;
public:
    const char *see() const { return m_dog.see(); }
    const char *say() const { return m_dog.say(); }
};
Now we can work with instances of MyAnimal, each of which wraps one of Cow, Pig or Dog:
// Defined first so that pullTheString() can call it
void seeAndSay(MyAnimal *animal)
{
    printf("The %s says '%s!'",
           animal->see(),
           animal->say());
}
void pullTheString()
{
    MyAnimal *animals[] =
    {
        new MyCow(),
        new MyPig(),
        new MyDog()
    };
    size_t len = sizeof(animals) / sizeof(MyAnimal*);
    size_t index = rand() % len;
    seeAndSay(animals[index]);
}
This works, but there’s a glaring drawback: we have to define one wrapper class (like MyCow) for every concrete type we want to wrap (like Cow).
Holy boilerplate, Batman!
However, we’ve already seen an easy way to have the compiler do this work for us: by using templates for polymorphism …
template <typename T>
class AnimalWrapper : public MyAnimal
{
    const T *m_animal;
public:
    AnimalWrapper(const T *animal)
        : m_animal(animal)
    { }
    const char *see() const { return m_animal->see(); }
    const char *say() const { return m_animal->say(); }
};
Now we can use the single AnimalWrapper template in lieu of MyCow, MyPig and MyDog, to have the compiler generate the derived class for us:
// Defined first so that pullTheString() can call it
void seeAndSay(MyAnimal *animal)
{
    printf("The %s says '%s!'",
           animal->see(),
           animal->say());
}
void pullTheString()
{
    // Template arguments are spelled out; omitting them requires C++17
    MyAnimal *animals[] =
    {
        new AnimalWrapper<Cow>(new Cow()),
        new AnimalWrapper<Pig>(new Pig()),
        new AnimalWrapper<Dog>(new Dog()),
    };
    size_t len = sizeof(animals) / sizeof(MyAnimal *);
    size_t index = rand() % len;
    seeAndSay(animals[index]);
}
The Type Erasure Idiom
What we built above is the basis of the ‘type erasure’ idiom. All that’s left is to hide all this machinery behind another class, so that callers don’t have to deal with our custom interfaces and templates:
class SeeAndSay
{
    // The interface
    class MyAnimal
    {
    public:
        virtual ~MyAnimal() {}
        virtual const char *see() const = 0;
        virtual const char *say() const = 0;
    };
    // The derived type(s)
    template <typename T>
    class AnimalWrapper : public MyAnimal
    {
        const T *m_animal;
    public:
        AnimalWrapper(const T *animal)
            : m_animal(animal)
        { }
        const char *see() const { return m_animal->see(); }
        const char *say() const { return m_animal->say(); }
    };
    // Registered animals
    std::vector<MyAnimal*> m_animals;
public:
    template <typename T>
    void addAnimal(T *animal)
    {
        m_animals.push_back(new AnimalWrapper<T>(animal));
    }
    void pullTheString()
    {
        size_t index = rand() % m_animals.size();
        MyAnimal *animal = m_animals[index];
        printf("The %s says '%s!'",
               animal->see(),
               animal->say());
    }
};
That’s all there really is to it!
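As a quick sanity check, here’s a compact, self-contained sketch of the class in use with two types that share no base class. A few things are my own assumptions rather than part of the pattern: Robot is a made-up type, describe(index) is a hypothetical deterministic stand-in for pullTheString(), and the destructor that frees the wrappers (plus the deleted copy operations, which need C++11) is added so the sketch doesn’t leak.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Two unrelated types: no shared base class, just matching methods.
struct Cow
{
    const char *see() const { return "cow"; }
    const char *say() const { return "moo"; }
};
struct Robot
{
    const char *see() const { return "robot"; }
    const char *say() const { return "beep"; }
};

class SeeAndSay
{
    // The interface
    class MyAnimal
    {
    public:
        virtual ~MyAnimal() {}
        virtual const char *see() const = 0;
        virtual const char *say() const = 0;
    };

    // The derived type(s)
    template <typename T>
    class AnimalWrapper : public MyAnimal
    {
        const T *m_animal;  // not owned by the wrapper
    public:
        AnimalWrapper(const T *animal) : m_animal(animal) { }
        const char *see() const { return m_animal->see(); }
        const char *say() const { return m_animal->say(); }
    };

    std::vector<MyAnimal*> m_animals;

public:
    SeeAndSay() {}
    SeeAndSay(const SeeAndSay &) = delete;
    SeeAndSay &operator=(const SeeAndSay &) = delete;

    // Frees the wrappers only; the wrapped animals belong to the caller.
    ~SeeAndSay()
    {
        for (std::size_t i = 0; i < m_animals.size(); ++i)
            delete m_animals[i];
    }

    template <typename T>
    void addAnimal(const T *animal)
    {
        assert(animal != nullptr);
        m_animals.push_back(new AnimalWrapper<T>(animal));
    }

    // Hypothetical: like pullTheString(), but for a chosen index, and it
    // returns the sentence instead of printing it.
    std::string describe(std::size_t index) const
    {
        assert(index < m_animals.size());
        const MyAnimal *animal = m_animals[index];
        return std::string("The ") + animal->see() + " says '" + animal->say() + "!'";
    }
};
```

Note that the wrappers only store pointers, so any Cow or Robot you add has to outlive the SeeAndSay it was added to.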
This pattern is known as the ‘type erasure’ idiom because we managed to ‘erase’ the concrete types we unified (Cow, Pig, Dog) by hiding them behind a custom interface (MyAnimal). The key to doing so was to implement this interface with a template (AnimalWrapper) that forwards the interface’s methods to the wrapped concrete type. Since the wrapper is templated, we can generate a wrapper automatically for any type that conforms to the correct interface.
Also note that, even though in the example above AnimalWrapper::see and AnimalWrapper::say forward to methods called see() and say() respectively, there’s no need for MyAnimal to have the same interface as the concrete types. For example, we could instead expose see() as ‘getAnimalName()’ and say() as ‘getAnimalSound()’:
class SeeAndSay
{
    // The interface
    class MyAnimal
    {
    public:
        virtual ~MyAnimal() {}
        virtual const char *getAnimalName() const = 0;
        virtual const char *getAnimalSound() const = 0;
    };
    // The derived type(s)
    template <typename T>
    class AnimalWrapper : public MyAnimal
    {
        const T *m_animal;
    public:
        AnimalWrapper(const T *animal)
            : m_animal(animal)
        { }
        const char *getAnimalName() const
        {
            return m_animal->see();
        }
        const char *getAnimalSound() const
        {
            return m_animal->say();
        }
    };
    // Registered animals
    std::vector<MyAnimal*> m_animals;
public:
    template <typename T>
    void addAnimal(T *animal)
    {
        m_animals.push_back(new AnimalWrapper<T>(animal));
    }
    void pullTheString()
    {
        size_t index = rand() % m_animals.size();
        MyAnimal *animal = m_animals[index];
        printf("The %s says '%s!'",
               animal->getAnimalName(),
               animal->getAnimalSound());
    }
};
Naming
Both MyAnimal and AnimalWrapper have accepted standard names.
MyAnimal is an example of a type erasure concept. That is, MyAnimal captures the concept of an animal, which is shared among all the concrete types we accept (Cow, Dog and Pig). In the end, a concept is just the interface we program against internally (for example, in SeeAndSay::pullTheString()).
AnimalWrapper is an example of a type erasure model. That is, AnimalWrapper models the concrete types as instances of the concept. The model is a templated wrapper object, which implements the concept interface and forwards all concept methods to the underlying concrete type.
In parting, let’s rewrite our original SeeAndSay type erasure example to use the standard parlance. Nothing needs to be changed except a few type names:
class SeeAndSay
{
    class AnimalConcept
    {
    public:
        virtual ~AnimalConcept() {}
        virtual const char *see() const = 0;
        virtual const char *say() const = 0;
    };
    template <typename T>
    class AnimalModel : public AnimalConcept
    {
        const T *m_animal;
    public:
        AnimalModel(const T *animal)
            : m_animal(animal)
        { }
        const char *see() const { return m_animal->see(); }
        const char *say() const { return m_animal->say(); }
    };
    std::vector<AnimalConcept*> m_animals;
public:
    template <typename T>
    void addAnimal(T *animal)
    {
        m_animals.push_back(new AnimalModel<T>(animal));
    }
    void pullTheString()
    {
        size_t index = rand() % m_animals.size();
        AnimalConcept *animal = m_animals[index];
        printf("The %s says '%s!'",
               animal->see(),
               animal->say());
    }
};
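One caveat with the pointer-based version is that AnimalModel only stores a pointer, so the caller has to keep every animal alive for as long as the SeeAndSay exists. A common variation, sketched here under my own assumptions rather than taken from the code above, has the model hold the animal by value and keeps the models in std::unique_ptr, so ownership takes care of itself. OwningSeeAndSay and describe(index) are hypothetical names, and the sketch assumes C++11 and copyable (or movable) animal types.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Cow
{
    const char *see() const { return "cow"; }
    const char *say() const { return "moo"; }
};
struct Dog
{
    const char *see() const { return "dog"; }
    const char *say() const { return "woof"; }
};

class OwningSeeAndSay
{
    class AnimalConcept
    {
    public:
        virtual ~AnimalConcept() {}
        virtual const char *see() const = 0;
        virtual const char *say() const = 0;
    };

    template <typename T>
    class AnimalModel : public AnimalConcept
    {
        T m_animal;  // owned copy; lives exactly as long as the model
    public:
        explicit AnimalModel(T animal)
            : m_animal(std::move(animal))
        { }
        const char *see() const override { return m_animal.see(); }
        const char *say() const override { return m_animal.say(); }
    };

    // unique_ptr deletes each model when the vector is destroyed.
    std::vector<std::unique_ptr<AnimalConcept>> m_animals;

public:
    // Takes the animal by value, so temporaries and locals are both fine.
    template <typename T>
    void addAnimal(T animal)
    {
        m_animals.push_back(
            std::unique_ptr<AnimalConcept>(new AnimalModel<T>(std::move(animal))));
    }

    std::size_t size() const { return m_animals.size(); }

    // Hypothetical: like pullTheString(), but for a chosen index, and it
    // returns the sentence instead of printing it.
    std::string describe(std::size_t index) const
    {
        assert(index < m_animals.size());
        const AnimalConcept &animal = *m_animals[index];
        return std::string("The ") + animal.see() + " says '" + animal.say() + "!'";
    }
};
```

The trade-off is a copy (or move) per added animal, in exchange for not having to think about who keeps the originals alive.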
More Information
For more information on C++’s type erasure idiom, try …
- Andrzej’s series on type erasure (Part 1, Part 2, Part 3, Part 4)
- This cplusplus.com article
- Type Erasure with Merged Concepts
Special Thanks
Thanks to Šimon Bařinka for pointing out an object lifetime bug in a previous draft!