An operator fusion primer01 Apr 2020
(Assumed audience: folks familiar with Combine and its operators.)
Tim followed up and asked a question I should’ve covered more directly,
I was hoping to also read why Combine fuses operators, any idea?
I get that it simplifies types, but I have no reason to assume that that gains you anything at runtime […]
Sans working at Apple or poking at the framework with performance tests, it’s hard to tell. But, my guess is back pressure handling. Each intermediary
Publisher conformance likely has its own buffering mechanism and fusion removes redundant overhead in honoring downstream subscribers’ demand.
Nuclear physicists 🤝 Engineers at Apple
Kidding. Well, only sort of.
If anything, the goal of this primer is to be the resource younger Jasdev wish he had.
Fusion—as the word hints—fuses two structures. Which, by extension, means “operator fusion” in the Combine context joins two operators to optimize away an intermediary type.
Publishers in the framework bear two names:
- Their type-erased nickname,
- Or, their fuller, non-erased names like
Publishers.Concatenate<Publishers.Map<Self, Sequence.Element>, Publishers.Sequence<Sequence, Failure>>.
The erased and full names interact with chained operators in nuanced ways—e.g. how does
(2) in the snippet below retain its type after mapping whereas
To illustrate the nuance and performance and aesthetic tradeoffs, let’s walk through an operator I’m contributing to CombineExt—a community overlay of Combine extensions—,
Combine ships with a three deduplicating operators:
They all nix pairwise duplicates from an upstream publisher by comparing the previous and next value events along an
Equatable conformance or (non-)throwing predicate. If they match, the value will be dropped, if not, it’ll be published downstream.
But, what if we need distinct values across all seen so far?
Of course, this operator has a warning label. Deduplicating across known value events means the operator has to track distinct values in memory, which can be problematic for sequences that publish a high number of unique elements. The operator can only free resources after a cancellation or completion event.
We’ll focus in on a specific implementation2 constraining
First, we need to record incoming value events and suppress those we’ve seen already. Maybe we can lean on
We could store all value events in a
Set<Output>, say under the variable name
seen, and determine uniqueness with
seen.insert(incoming).inserted (for an
If you’ve been compiling along, here’s the remaining error:
Cannot convert return expression of type
to return type
We’ve arrived at the “type erase?”, “fuse??”, or “???” juncture.
“To return the full type, erase, or be opaque? That is the question.”
— Wayne Gretzky
— — Michael Scott
There’s two-ish3 routes we can take (I’ll explain the “-ish” soon).
Return the full type
There’s no dancing around it. Returning
Publishers.Filter<Self> leaks implementation details. But, is that such a bad thing?
The only properties on
Publishers.Filter (and likewise, for other publishers) is its upstream and
isIncluded predicate, both of which call sites can’t do much with—and if they do, they’d sidestep reaching for
removeKnownDuplicates in the first place.
Maybe an opaque result type?
Could SE-0244’s opaque result types help?
AnyPublisher<Output, Failure> for
…compiles? That’s fishy—and understandably so. The issue shows up when we interface against the opaqueness.
Opaque return types are a sort of “reverse generics,” in that the callee (
removeKnownDuplicate’s implementation, in our case)—instead of the caller (
deduplicatedEvens) determines the resulting type.
There’s a cycle though. Our operator is stating it’s in control of the resulting type, covered by
some Publisher, yet the associated type for the
Output, is chosen by the caller, causing a stalemate.
This exceptional case is tucked away in the backing evolution proposal. And that’s the “-ish” I hinted at above.
We could stay course with erasure. There’s a tradeoff through, and that’s fusion.
(Sorry it took almost a thousands words to get here. Nuclear physics is no joke.)
Type erasure—as the name implies—removes detail from a specific protocol conformance, leaving behind a type that only covers the protocol’s requirements. The technique is used frequently:
AnyView, and so on.
In Combine, we can ask if that removed detail is useful—to answer this, let’s imagine a consumer tacks on a
filter call to both non-erased and erased forms of
filter calls normally nest the upstream publisher a
Publishers.Filter level. Yet, when filtering
deduplicatedEvens preserved its type.
erasedRemoveKnownDuplicates wipes the type information, Combine has no other choice than to wrap it in another intermediary. On the other hand,
Publishers.Filter can specialize its
filter implementation by returning another instance of the same type, reducing overhead.
We can’t peek behind Apple’s closed source. But, we can guess at the implementation.
“[Tradeoffs] rule everything around me.”
Fusion is wicked. However, it’s not always needed. We have to think about our publisher’s intended use, or at least guess when exposing a public-facing API.
If the operator is meant to be used in between other, chained operators, then it’s better to return the full type and let fusion happen.
However, if the publisher is meant to be used wholesale—i.e. its implementation covers all expected transformations—, then the aesthetic benefit of shortening to an
AnyPublisher (or if possible, an opaque return type) is probably the way to go.
There you have it. We covered building our own composed operator, type erasure, opaque return types, a few memes, and nuclear physics in under 1,200 words. Fusion always felt out of reach and writing this primer distilled it from orbit and into Combine and Swift for me.
I hope it does for others, too.
- if your operator is an intermediary, try to return its full type to benefit from fusion’s optimizations.
- if your publisher ends a chain, lean on either an opaque return type or erasure, in that order.
Related reading, hat tips, and footnotes
⇒ Thomas Visser’s “Why Combine has so many
⇒ Tim Ekl wrote a wonderful post detailing the evolution of Swift generics.
⇒ A diary entry on type erasure with a postfix operator.
⇒ Point-Free subscribers might’ve noticed that Episode #13’s mention that “composition of maps are equivalent to the map of the composition” seems similar to fusion. And in a sense, it is! That’s because the functors we encounter in programming are strictly endofunctors.
I also implemented a predicate-based version
Publisher.removeKnownDuplicates(by:)for when constraining
Hashableisn’t feasible. ↩
It could be argued that there are three-ish, if we introduce a
Publishers.RemoveKnownDuplicatestype. However, that’d preclude fusion from kicking in unless we repeated the work Apple did by extending our new type with fused operators. ↩