Writing a Combine operator: variadic zipping

24 Mar 2020

(Assumed audience: folks familiar with Combine’s syntax. Prior knowledge of zipping isn’t necessary, since I’ll build intuition and link to resources—but, if you’re already familiar with the topic, feel free to skip to “a variadic Publisher.zip.”)

Updates:

5/23/20:

Daniel Williams and I stepped through a crash he ran into with this post’s implementation. Notes from the conversation can be found here.

Zipping is backed by a metaphor that fits quite naturally. If you’ve used a physical zipper, your intuition is already primed.

Specifically, when a type supports a zip implementation, two (or more) of its instances can be paired in the same way teeth on a zipper meet when closing. Let’s ground this with a Standard Library example, Sequence.

Two-ary `Sequence` zipping with `zip(_:_:)`

Sequences are lists of any size, including empty and possibly infinite lengths. And if we have two in hand, what’re some ways to pair them?

(Pausing, so you can brainstorm.)

Maybe return all possible pairs, channeling an inner-Cartesian?

Maybe losing information and pairing the first with a specific member of the second?

Or, maybe walking across both in lockstep.

While each has its merits, the Standard Library ships with the third variant under the name zip(_:_:). It walks over two sequences, index-wise, and stops once it reaches the end of the shorter.
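A quick sketch of that lockstep walk. The sample values are made up for illustration, and ≈ in the comments reads as "has the same elements as."

```swift
let letters = ["a", "b", "c"]
let numbers = [1, 2]

// zip(_:_:) returns a lazy Zip2Sequence rather than an Array.
let pairs = zip(letters, numbers)
// pairs ≈ [("a", 1), ("b", 2)]; "c" has no partner, so it's dropped.

// The second sequence can even be infinite, since the walk stops with the shorter side.
let indexed = zip(letters, 1...)
// indexed ≈ [("a", 1), ("b", 2), ("c", 3)]
```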

Zip2Sequence bookkeeps this traversal (which is why I opted for ≈ in the snippets above to avoid Array-casting).

Let’s search for “zip” across Apple’s documentation.

…aside from the above free function and results from Combine, there isn’t anything else.

And that’s where we’ll fill in some gaps to exercise our intuition. Starting with an arity three zip overload—i.e. a zip(_:_:_:) that accepts three sequences (and for simplicity, returns an AnySequence of triples).

Here’s some scaffolding we’ll work on.

If we apply Swift’s provided zip(_:_:) twice, we can start sketching an implementation.

To transform the unflattened—unsplatted?—form into an (Sequence1.Element, Sequence2.Element, Sequence3.Element) we need to map before finally wrapping everything in an AnySequence.init call.
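Pulling those steps together, here's one way the overload could look. Treat it as a sketch: the parameter names are my own, and since Sequence.map is eager, this version shouldn't be handed three infinite sequences.

```swift
func zip<Sequence1: Sequence, Sequence2: Sequence, Sequence3: Sequence>(
    _ sequence1: Sequence1,
    _ sequence2: Sequence2,
    _ sequence3: Sequence3
) -> AnySequence<(Sequence1.Element, Sequence2.Element, Sequence3.Element)> {
    AnySequence(
        // Two applications of the two-ary zip nest the elements as ((first, second), third).
        Swift.zip(Swift.zip(sequence1, sequence2), sequence3)
            // $0 is the inner pair and $1 is the third element, so this flattens to a triple.
            .map { ($0.0, $0.1, $1) }
    )
}

let triples = zip([1, 2, 3], ["a", "b"], [true, false, true])
// triples ≈ [(1, "a", true), (2, "b", false)]
```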

The ($0.0, $0.1, $1) expression is dense—don’t second-guess yourself if you need to squint at it for a while. I certainly did.

All right, roll call. We’ve read through an existing zip in the Standard Library and extended it. What about writing our own? That’s next before crossing over to Combine.

Implementing `Result.zip`

Swift packs its own Result type.

(For the unfamiliar, the linked SE-0235 proposal is an introduction to comb through before reading on.)

If we have two Result values in hand, then, like we did with Sequence, we can consider “pairing” them. To begin, we have to pick a case, Result.success or .failure. Let’s consider the success path, since we often favor (and map over) it.

So, assuming we have a Result<Success, Failure> and Result<OtherSuccess, Failure> we need to return something in the form Result<_, Failure>.

A tuple is a natural fit for the paired, success generic. That is, Result.zip (in method form) would have the following shape:

Switching over (self, other) guides the implementation.

Now we can pair up Results when they need to align along .success—e.g. imagine we need to check both a username and password with two validation functions, say checkUsername: (String) -> Result<String, CredentialError> and checkPassword: (String) -> Result<String, CredentialError>. Result.zip lets us chain checkUsername(someUsername).zip(checkPassword(somePassword)), returning a Result<(String, String), CredentialError>.
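Here's a sketch of both the method and the credential check. The cases on CredentialError and the bodies of checkUsername and checkPassword are placeholders I made up so the snippet compiles on its own.

```swift
enum CredentialError: Error, Equatable {
    case invalidUsername
    case invalidPassword
}

extension Result {
    func zip<OtherSuccess>(
        _ other: Result<OtherSuccess, Failure>
    ) -> Result<(Success, OtherSuccess), Failure> {
        switch (self, other) {
        case let (.success(value), .success(otherValue)):
            return .success((value, otherValue))
        case let (.failure(error), _):
            // Also matches (.failure, .failure), in which case other's error is dropped.
            return .failure(error)
        case let (.success, .failure(error)):
            return .failure(error)
        }
    }
}

// Hypothetical validators: the rules below are stand-ins, not real policy.
func checkUsername(_ username: String) -> Result<String, CredentialError> {
    if username.isEmpty { return .failure(.invalidUsername) }
    return .success(username)
}

func checkPassword(_ password: String) -> Result<String, CredentialError> {
    if password.count < 8 { return .failure(.invalidPassword) }
    return .success(password)
}

let valid = checkUsername("someUsername").zip(checkPassword("somePassword"))
// valid is .success(("someUsername", "somePassword"))

let invalid = checkUsername("").zip(checkPassword("short"))
// invalid is .failure(.invalidUsername); the password error never surfaces.
```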

You may have noticed a downside to this approach (that’s also commented in (4)). We’re dropping checkPassword’s error if checkUsername is also in the .failure case.

The Point-Free duo covers this in Episode #24 and presents a generalization of Result, Validated, that allows for errors to be accumulated across zippings.

A variadic `Publisher.zip`

Roll call number two.

So far we’ve extended Swift’s free-function zip(_:_:) an argument-length to zip(_:_:_:) and implemented zip on Result.

Now we’re ready for some Combine.

Let’s see which overloads the framework ships with:

Publisher.zip(_:) (and a transform variant).
Publisher.zip(_:_:) (again, a transform variant).
Publisher.zip(_:_:_:) (you know the drill).

And this is a solid set! Having a few, finite zips lets us create higher arity overloads and the transforming versions save us a .map(/* (_, …, _) -> Transformed */) call, while better colocating transformations.
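To make the "save us a .map" point concrete, here's a small comparison using Just publishers. The values and variable names are only for illustration.

```swift
import Combine

let first = Just(2)
let second = Just(3)

var mapped: [Int] = []
var transformed: [Int] = []

// Zip first, then map over the emitted tuple.
let mappedCancellable = first
    .zip(second)
    .map { $0.0 * $0.1 }
    .sink { mapped.append($0) }

// Same result, with the transform handed to zip directly.
let transformedCancellable = first
    .zip(second) { $0 * $1 }
    .sink { transformed.append($0) }

// Both arrays now hold a single value, 6.
```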

Still, there’s a missing piece. What if we don’t know the publisher count ahead of time?

We ran into this quite often at Peloton, especially when dealing with older, strictly-RESTful APIs.

An endpoint would return a list of identifiers, and then per identifier, we’d have to fire off another request to fetch metadata.

We’d usually treat these requests as all-or-nothing (if one fails, we’d bottom out the attempt) to simplify loading views.

To start the implementation of a variadic zip, we’ll need to extend the Publisher namespace.

When adding Combine operators, we have two options: composing existing operators¹ or, when needed, filling out a Publisher conformance.

If we zoom in on the parameter and return types, clearing the syntactic fog around AnyPublisher and re-writing Publisher’s associated types as a kind of generic, here’s what we’re dealing with:

[Publisher<Output, Failure>] -> Publisher<[Output], Failure>

An array of publishers to be zipped into a single publisher emitting an array of Outputs.

Other communities call this “flipping of containers” or the Haskell synonym, sequence (no worries if that link is incomprehensible, I added it so you can recognize the term when working across languages).

We have an array of publishers and need to…(spoiler alert)…reduce down to a single publisher. Sequence.reduce to the rescue!

The method takes both an initialResult and a nextPartialResult argument, which brings us to the next question: when zipping arbitrarily many publishers, what’s the “initial result?”

In our case, it’s self (with some form adjustment to match the return type).

Now to reduce the remaining publishers. In pseudocode shorthand, if we have Publisher<[Output], Failure> and Publisher<Output, Failure> arguments, we need to fold the lone publisher into the array-emitting publisher.

Maybe we can two-zip?

Almost there! There’s a wrinkle to iron out: the AnyPublisher<([Output], Output), Failure> and AnyPublisher<[Output], Failure> mismatch.

And it’s a map away.
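Stitching the seed, the two-zip, and the map together, here's roughly where we land. It's a sketch rather than the exact code that shipped: the zip(with:) label and the Others generic are my naming assumptions.

```swift
import Combine

extension Publisher {
    func zip<Others: Publisher>(
        with others: [Others]
    ) -> AnyPublisher<[Output], Failure>
    where Others.Output == Output, Others.Failure == Failure {
        // Initial result: self, adjusted to emit single-element arrays.
        let seed: AnyPublisher<[Output], Failure> = map { [$0] }.eraseToAnyPublisher()

        return others.reduce(seed) { zipped, next in
            zipped
                .zip(next)             // Output here is ([Output], Output)
                .map { $0.0 + [$0.1] } // and this folds it back down to [Output]
                .eraseToAnyPublisher()
        }
    }
}

var received: [[Int]] = []
let cancellable = [1, 2, 3].publisher
    .zip(with: [[4, 5, 6].publisher, [7, 8].publisher])
    .sink { received.append($0) }
// received is [[1, 4, 7], [2, 5, 8]]; the shortest publisher caps the count.
```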

⌘ + B’ing should finally work again.

Time to recap the path we took.

Combine’s Publisher.zips top out at arity three, which works well when we know the publisher count at compile time.
When we don’t, we can lean on the existing overloads and reduce and flatten to build a variadic version.

Now, how would call sites look? Starting with a fabricated—and then more practical—example.

Imagine we had five timer publishers, each emitting with intervals ranging from 0.1–0.5 seconds and we wanted to lockstep emissions to the longest interval².

Even though the publishers in others are faster than first, zipping gates them to emit in tandem. The nth array the sink receives holds the nth value from first, followed by the nth value from each publisher in others.
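Timers are awkward to reason about on the page, so here's a deterministic stand-in for the same gating behavior: PassthroughSubjects that we drive by hand, with first playing the slowest timer. The zip(with:) extension is an assumed shape, included only so the snippet compiles on its own.

```swift
import Combine

extension Publisher {
    // Assumed signature for the variadic zip.
    func zip<Others: Publisher>(
        with others: [Others]
    ) -> AnyPublisher<[Output], Failure>
    where Others.Output == Output, Others.Failure == Failure {
        let seed: AnyPublisher<[Output], Failure> = map { [$0] }.eraseToAnyPublisher()
        return others.reduce(seed) { zipped, next in
            zipped.zip(next).map { $0.0 + [$0.1] }.eraseToAnyPublisher()
        }
    }
}

let first = PassthroughSubject<String, Never>() // stands in for the slowest timer
let others = [
    PassthroughSubject<String, Never>(),
    PassthroughSubject<String, Never>()
]

var received: [[String]] = []
let cancellable = first
    .zip(with: others)
    .sink { received.append($0) }

// The faster publishers race ahead...
others[0].send("a1")
others[0].send("a2")
others[1].send("b1")
// ...but nothing reaches the sink until first emits.
let countBeforeFirst = received.count

first.send("f1")
// received is now [["f1", "a1", "b1"]]; "a2" stays buffered until the rest catch up.
```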

Or, less abstractly, assume we have an array of user identifiers and want to (in an all-or-nothing fashion) make an API request per ID then collect the results.

To make our overload more ergonomic, let’s add a [Publisher].zipped variant.
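One possible shape for it, sketched as a computed property on Collection. Two assumptions to flag: whether zipped is a property or a method is my guess, and so is the choice to have an empty collection complete immediately without emitting. The Just publishers stand in for per-identifier API requests.

```swift
import Combine

extension Collection where Element: Publisher {
    var zipped: AnyPublisher<[Element.Output], Element.Failure> {
        // Assumption: with nothing to zip, finish right away.
        guard let first = first else {
            return Empty<[Element.Output], Element.Failure>().eraseToAnyPublisher()
        }

        let seed: AnyPublisher<[Element.Output], Element.Failure> =
            first.map { [$0] }.eraseToAnyPublisher()

        // Same reduce as before, starting from the collection's first publisher.
        return dropFirst().reduce(seed) { partial, next in
            partial.zip(next).map { $0.0 + [$0.1] }.eraseToAnyPublisher()
        }
    }
}

let userIDs = [1, 2, 3]

var fetched: [[String]] = []
let cancellable = userIDs
    .map { Just("user-\($0)") } // placeholder for a real request publisher
    .zipped
    .sink { fetched.append($0) }
// fetched is [["user-1", "user-2", "user-3"]]
```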

Looking forward

“Now what?” - Those fish from Finding Nemo, after reading this far.

I landed these overloads in a community collection of Combine extensions, CombineExt. So, it’s a CocoaPod-, Carthage-, or SPM-installation away.

We can then turn towards related combining operators and ask if they have variadic overloads.

Publisher.merge? Turns out it’s tucked away in the Publishers namespace.

Publisher.combineLatest? Like zip, it has an arity three ceiling. But hang tight—that’s what I’m working on next!

■

⇒ It’s no coincidence that Sequence, Result, and Publisher all support a zip operation. A type that supports it is an applicative functor. Here are a few links on the topic, for the curious:

Type Classes’ Functortown: Applicative series
Julie Moronuki’s “Applicatives are monoidal”
“Why applicatives are monoidal” notebook entry
Lecture 16 (timestamped) of Programming with Categories

¹ Matt Neuburg aptly called this route “composed operators” in their recent book, Understanding Combine. An added benefit of composing existing operators is that back pressure handling comes for free, since each inner operator respects the Publisher contract.
² Borrowing from Shai’s example usage.
