Alex Stepanov — the creator of the STL — had a simple litmus test for any programming language:

Look at how it implements swap.

Three lines. The simplest function imaginable. But what C++ lets those three lines become is a beautiful piece of engineering.

The test: a smooth gradient of implementations

Stepanov’s point was not about swap itself. It was about generic programming.

You define what swap means — exchange the values of two objects. Then you ask the language: can you give me a smooth gradient of implementations, from the most generic to the most specialized, without losing efficiency at any level?

One name. One contract. Many optimal implementations. That is the test.

Most languages fail it. They give you the generic version and stop there, or they give you specialization but only through runtime dispatch. C++ gives you the full gradient — and it does it at compile time, with zero overhead at each level.

Layer 1: The generic swap

Here is the swap most people take for granted:

#include <utility>  // std::move

template <typename T>
void swap(T& a, T& b) {
    T tmp = std::move(a);
    a = std::move(b);
    b = std::move(tmp);
}

Works for int. Works for std::string. Works for your custom types, as long as they are move-constructible and move-assignable. One function, any such type, zero overhead.

This already requires something non-trivial: templates that generate real, type-specific code at compile time. When the compiler sees swap(x, y) where x and y are std::string, it instantiates a version of swap specifically for std::string. Not a generic function that operates on opaque pointers. Not a function that boxes the values. Real, specialized machine code for that exact type.

A single definition that works for every type without erasing what the type actually is. In many languages this would be the end of the story. In C++ this is just the first layer.
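As a rough illustration of that claim, the sketch below puts the same three-move swap in a hypothetical namespace demo and instantiates it for an int, a std::string, and a move-only std::unique_ptr. The namespace and the helper swap_demo are assumptions made for this example only.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

namespace demo {  // hypothetical namespace, used only for this sketch

// The same three-move swap as above.
template <typename T>
void swap(T& a, T& b) {
    T tmp = std::move(a);
    a = std::move(b);
    b = std::move(tmp);
}

}  // namespace demo

// Hypothetical helper: exercises demo::swap with three unrelated types.
inline void swap_demo() {
    int i = 1, j = 2;
    demo::swap(i, j);  // instantiates demo::swap<int>
    assert(i == 2 && j == 1);

    std::string s = "left", t = "right";
    demo::swap(s, t);  // instantiates demo::swap<std::string>
    assert(s == "right" && t == "left");

    // A move-only type compiles too, because the body never copies.
    auto p = std::make_unique<int>(10);
    auto q = std::make_unique<int>(20);
    demo::swap(p, q);  // instantiates demo::swap<std::unique_ptr<int>>
    assert(*p == 20 && *q == 10);
}
```

Calling swap_demo() from main runs all three instantiations. The calls are qualified with demo:: on purpose, so that only this one template is considered.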

Layer 2: ADL — the type gets a vote

Now write a generic sort. Somewhere inside, the algorithm needs to swap two elements. Which swap gets called?

If you write std::swap(a, b), you restrict the call to the overloads declared in namespace std. A type defined in any other namespace never gets a say. But in C++, there is an idiom that changes this:

using std::swap;
swap(a, b);

That two-line pattern enables Argument-Dependent Lookup. Here is what happens:

  1. The using declaration makes std::swap visible as a fallback.
  2. The unqualified call swap(a, b) triggers ADL: the compiler also searches the namespaces associated with the types of a and b for functions named swap.
  3. Overload resolution then picks the best match among all the candidates. A non-template swap written for the type beats the std::swap template. If the type provides nothing, std::swap handles it.

The type itself gets a vote on how it is swapped. The algorithm does not need to know about it.

This is a remarkable design. The generic algorithm remains generic — it never mentions your type, never includes your headers, never changes when you add a new type. But the type can still inject its own optimal implementation into the algorithm’s behavior. The coupling is zero. The customization is total.

You can see ADL in action with a simple example:

#include <utility>  // std::swap

namespace geometry {
    struct Point {
        double x, y, z;
    };

    // Ordering by x only: an assumption added so that b < a compiles below
    bool operator<(const Point& a, const Point& b) noexcept {
        return a.x < b.x;
    }

    // Point's vote: swap by swapping members directly
    void swap(Point& a, Point& b) noexcept {
        std::swap(a.x, b.x);
        std::swap(a.y, b.y);
        std::swap(a.z, b.z);
    }
}

// Generic code: knows nothing about geometry::Point
template <typename T>
void sort_pair(T& a, T& b) {
    if (b < a) {
        using std::swap;
        swap(a, b);  // ADL finds geometry::swap for Point
    }
}

The sort function never mentions geometry. It never includes the header that defines Point. But it still calls the right swap.
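One way to check that claim is to instrument the example. The sketch below repeats the code above with a hypothetical call counter (swap_calls) and assumes an operator< that orders points by x, so that the comparison compiles. Neither is part of the original design; they exist only to make the dispatch observable.

```cpp
#include <cassert>
#include <utility>

namespace geometry {
    struct Point {
        double x, y, z;
    };

    // Hypothetical counter, added only to observe which swap runs (C++17).
    inline int swap_calls = 0;

    void swap(Point& a, Point& b) noexcept {
        ++swap_calls;
        std::swap(a.x, b.x);
        std::swap(a.y, b.y);
        std::swap(a.z, b.z);
    }

    // Assumed ordering (by x only) so that b < a compiles for Point.
    bool operator<(const Point& a, const Point& b) noexcept {
        return a.x < b.x;
    }
}

template <typename T>
void sort_pair(T& a, T& b) {
    if (b < a) {
        using std::swap;
        swap(a, b);
    }
}

// Hypothetical driver for the experiment.
inline void adl_demo() {
    geometry::Point p{3.0, 0.0, 0.0};
    geometry::Point q{1.0, 0.0, 0.0};
    sort_pair(p, q);  // out of order, so geometry::swap runs once
    assert(geometry::swap_calls == 1);
    assert(p.x == 1.0 && q.x == 3.0);

    int a = 5, b = 2;
    sort_pair(a, b);  // int has no associated namespace, so std::swap runs
    assert(a == 2 && b == 5);
    assert(geometry::swap_calls == 1);  // counter untouched by the int case
}
```

The counter moves only for Point. For int, ADL contributes nothing and the using declaration supplies std::swap as the fallback.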

Layer 3: Specialization without losing generality

And types should vote. The generic swap uses three moves — for most well-behaved types with cheap move constructors, that is already fast. But some types can do better.

std::vector provides its own swap:

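// Simplified from the standard library's declaration
template <typename T, typename Alloc>
void swap(vector<T, Alloc>& a, vector<T, Alloc>& b) noexcept(noexcept(a.swap(b))) {
    a.swap(b);  // typically swaps three pointers: begin, end, end of capacity
}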

Constant time. Regardless of how many elements the vectors contain. Regardless of whether the element type has a noexcept move constructor, because no element is ever moved. The generic swap would perform three vector moves, and each move transfers ownership of the internal buffer by copying its pointers and resetting the source, which is already fast. But the specialized version skips the temporary and just swaps three pointers. It is the difference between O(1) with a small constant and O(1) with a tiny constant.

ADL finds it automatically. The generic algorithm never changed. And this pattern scales: std::string, std::map, std::unordered_map, std::list — all of them provide their own swap overloads. Every standard container gets a vote.
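A small sketch can make the pointer exchange visible. It relies on the standard guarantee that pointers into a vector stay valid across swap and afterwards refer to elements of the other vector. The helper name vector_swap_exchanges_buffers is hypothetical.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: returns true if swapping exchanged the buffers
// rather than moving or copying the elements.
inline bool vector_swap_exchanges_buffers() {
    std::vector<int> small{1, 2, 3};
    std::vector<int> large(1'000'000, 7);

    const int* small_data = small.data();
    const int* large_data = large.data();
    assert(small_data != large_data);  // two distinct buffers to begin with

    using std::swap;
    swap(small, large);  // resolves to the std::vector overload

    // Each vector now owns the buffer the other one started with.
    return small.data() == large_data && large.data() == small_data
        && small.size() == 1'000'000 && large.size() == 3;
}
```

The million-element buffer changes owners without a single int being touched, which is why the cost does not depend on the sizes.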

Layer 4: Concepts — the compiler gets smarter

C++20 pushed this further. With concepts, you can constrain a swap overload for an entire category of types, not just a single type:

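#include <concepts>  // std::integral

template <std::integral T>
void swap(T& a, T& b) noexcept {
    if (&a == &b) return;  // XOR-ing an object with itself would zero it
    a ^= b;
    b ^= a;
    a ^= b;
}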

Now the compiler sees multiple candidates for any given swap call:

  • The fully generic swap<T> — works for anything
  • The vector<T>-specific swap — works for vectors
  • The std::integral-constrained swap — works for int, long, char, and friends

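Overload resolution selects the most specialized, most constrained viable candidate. For an int, the concept-constrained version wins because it is more constrained than the generic one. For a vector<int>, the vector-specific version wins because it is more specialized. For a custom Widget, the generic version handles it, unless Widget provides its own via ADL.

One caveat: for this to work, the three overloads must be declared together in a namespace you own. User code may not add overloads to namespace std, and ADL cannot find a swap for int, because fundamental types have no associated namespace.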

This is the language expressing what a type is and choosing an implementation to match — at compile time, at zero cost. No virtual dispatch. No type erasure. No runtime branching. The selection happens entirely in the compiler, and the generated code is exactly what you would write by hand for each specific case.

(A note on the XOR swap: on modern hardware, the three-XOR trick is typically slower than a move-through-temporary for register-sized types, because it introduces data dependencies between operations. It also zeroes the value when both references name the same object, unless the function guards against that case. The point here is the mechanism, concept-constrained overload resolution, not the XOR trick itself.)

The full gradient

Three lines of code. Four layers of the language working together:

  • Templates give you generality — one definition that works for any type.
  • ADL gives you customization — the type itself injects its optimal implementation.
  • Specialization gives you efficiency — containers swap pointers instead of elements.
  • Concepts give you precision — constrain implementations to categories of types at compile time.

Stepanov’s test asks: does the language let you move smoothly from the most generic to the most specialized, without losing efficiency at any level? C++ passes.

Pick a type in your codebase. Call swap on two instances. Now ask: what would a specialized version look like? What does the gap between the generic version and the optimal one cost you? Write a benchmark. The answer might change how you think about every generic algorithm you use.
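A starting point for such a benchmark might look like the sketch below. Everything in it is an assumption made for illustration: the Samples type, the generic_swap and time_swaps helpers, and the vector sizes. A serious measurement would also need warm-up runs, repeated trials, and a way to stop the optimizer from eliding the work, which a dedicated benchmarking library handles better.

```cpp
#include <cassert>
#include <chrono>
#include <utility>
#include <vector>

// Hypothetical type for the experiment: a thin wrapper around a vector.
struct Samples {
    std::vector<double> values;

    // Specialized version, found by ADL: delegate to the vector's member swap.
    friend void swap(Samples& a, Samples& b) noexcept {
        a.values.swap(b.values);
    }
};

// The generic three-move version under a different name, so both can be timed.
template <typename T>
void generic_swap(T& a, T& b) {
    T tmp = std::move(a);
    a = std::move(b);
    b = std::move(tmp);
}

// Runs do_swap(a, b) `iterations` times and returns the elapsed nanoseconds.
template <typename T, typename F>
long long time_swaps(T& a, T& b, int iterations, F do_swap) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        do_swap(a, b);
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}

struct Timings {
    long long generic_ns;
    long long specialized_ns;
};

// Times both versions on the same pair of objects.
inline Timings compare_swaps(int iterations) {
    assert(iterations % 2 == 0);  // an even count leaves a and b as they started
    Samples a{std::vector<double>(1000, 1.0)};
    Samples b{std::vector<double>(10, 2.0)};
    Timings t{};
    t.generic_ns = time_swaps(a, b, iterations,
                              [](Samples& x, Samples& y) { generic_swap(x, y); });
    t.specialized_ns = time_swaps(a, b, iterations,
                                  [](Samples& x, Samples& y) { swap(x, y); });
    assert(a.values.size() == 1000 && b.values.size() == 10);
    return t;
}
```

Calling compare_swaps with an even iteration count returns both timings for comparison. The numbers depend heavily on compiler, flags, and hardware, so treat them as a prompt for investigation, not a verdict.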