Separate as a Pattern

Here’s another one of those situations that comes up often enough at work that it seems like it should be a recognized pattern. But then when I try to think how to really generalize it, I come up empty-handed, so I suppose that’s why it isn’t. Feels like a pattern; might not be general enough to be.

Anyway, it’s where you have a list of items that are already in sequence, and you only want a subset of those items. So far so filter, amirite? Or … so far so list comprehension.

So here’s the catch. Rather than returning a single list of all the items that passed my filter, I want the separations preserved. That is, I want a list of lists, each member of which is a list of consecutive items that passed the filter. That’s why it isn’t as simply as using a filter. Or a list comprehension. Because I need to "keep ‘em separated."

I’m going to call this separate, by the way. As an illustrative example, say I have a list of integers (all examples are in Coffeescript and may make use of the excellent RamdaJS library):

nums = [1,2,4,6,3,5,7,9,8,10]

And say I want only the evens. With a normal filter, I’d get this:

nums = [1,2,4,6,3,5,7,9,8,10]
evens = R.filter(even, nums)
# [2,4,6,8,10]

And of course it’s just as easy with a list comprehension:

nums = [1,2,4,6,3,5,7,9,8,10]
evens = (e for e in nums when even(e))
# [2,4,6,8,10]

What I want is a function separate that does this instead:

nums = [1,2,4,6,3,5,7,9,8,10]
evens = separate(even, nums)
# [[2,4,6],[8,10]]

Right out of the gate, we can see this is some form of reduce operation – because it doesn’t just exclude values, it also transforms the output. And it’s not a compose, because it isn’t like we first apply the filter and then the transformation. Nope – the transformation is part of the filter. Or something.

Really, what we need is some kind of stateful reduce – a reduce function that behaves differently when we transition from items in the seuqence that pass the filter to items in the sequence that don’t.

Welp, here’s the obvious first pass:

nums = [1,2,4,6,3,5,7,9,8,10]
make_reducer = (filter_function) ->
  (accumulator, item) ->
    if filter_function(item)
      accumulator.push item
    else
      accumulator

reduce_function = make_reducer(even)
evens = R.reduce(reduce_function, [], nums)
# [2,4,6,8,10]

But of course, this is just filter in another form. To add the state(fulness), I need my accumulator to keep track of two things, really. It needs to keep a record of the final list of lists that will be returned at the end, and it also needs to keep a record of the current series of items that pass the filter, if there is one. Well, that sounds easy! Instead of a single list, the accumulator should be two lists – one that holds the final result, and one that holds the temporary accumulator for subsequences.

nums = [1,2,4,6,3,5,7,9,8,10]
make_reducer = (filter_function) ->
  (accumulator, item) ->
    if filter_function(item)
      accumulator[1].push item
    else if accumulator[1].length > 0
      accumulator[0].push accumulator[1]
      accumulator[1] = []
    accumulator

reduce_function = make_reducer(even)
evens = R.reduce(reduce_function, [[],[]], nums)
# [[[2,4,6]],[8,10]]

Now the logic is a little more complicated. Just like before, if we encounter an item that passes the filter, we simply slap it onto the accumulator. Only now, "the accumulator" is "the temporary accumulator for subsequences," which I’ve arbitrarily decided is the second list.

If we encounter an item that does not pass the filter, well, one of two things could be true. Either we’ve come to the end of a sequence of items that needs to be pushed onto the final accumulator … OR, we haven’t, and we do nothing. The else if clause handles for the case where we’ve just reached the end of a sequence of consecutive filter-passing items.

But this isn’t quite finished, and if you look at the output (in the comment at the bottom) you’ll probably be able to guess why.

Indeed … this code only pushes the temporary accumulator onto the return accumulator after we transition to a sequence of items that don’t pass the filter. Implication being: if you end on an item that passes the filter, it never gets glommed onto the final return accumulator. Oops.

So, we actually need to wrap all this in a function that checks for that.

Here’s my final version:

nums = [1,2,4,6,3,5,7,9,8,10]
make_reducer = (filter_function) ->
  (accumulator, item) ->
    if filter_function(item)
      accumulator[1].push item
    else if accumulator[1].length > 0
      accumulator[0].push accumulator[1]
      accumulator[1] = []
    accumulator

final_reduction = (acc) ->
  if acc[1].length > 0
    acc[0].push acc[1]
  acc[0]

reduce_function = make_reducer(even)
evens = final_reduction(R.reduce(reduce_function, [[],[]], nums))
# [[2,4,6],[8,10]]

Now the answer’s right!

To make it a bit more idiomatic RamdaJS, you might do this:

nums = [1,2,4,6,3,5,7,9,8,10]
make_reducer = (filter_function) ->
  (accumulator, item) ->
    if filter_function(item)
      accumulator[1].push item
    else if accumulator[1].length > 0
      accumulator[0].push accumulator[1]
      accumulator[1] = []
    accumulator

final_reduction = (acc) ->
  if acc[1].length > 0
    acc[0].push acc[1]
  acc[0]

reduce_function = make_reducer(even)
even_sequences = R.compose(final_reduction, R.reduce(reduce_function, [[],[]]))
evens = even_sequences(nums)
# [[2,4,6],[8,10]]

And of course that makes it obvious how to abstract away the particulars to get separate:

separate = (filter_function) -> R.compose(final_reduction, R.reduce(make_reducer(filter_function), [[],[]]))

Now it’s in a form where we just pass in whatever the filter is, and it constructs the function we need! So, the following code would print out the answer from before:

nums = [1,2,4,6,3,5,7,9,8,10]
evens = separate(even)
console.log evens(nums)

(The definition of even was never given and is left as an exercise for the reader. Heh, heh. Don’t stay up too late…)

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>