Pipeable data in Ruby
This post comes from some playing around after seeing Hadley Wickham speak about pipeable data in R. In it I try to explore different ways of serially applying a set of transformations to a piece of data.
Say we want to tell a story like the following:
“the bunny Foofoo went to the forest and ate some grass”
We build up the pieces to tell the story:
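A minimal sketch of what those pieces could look like. The method names (the_bunny, went_to_the_forest, and_ate_some_grass) are assumptions made for illustration: each one takes a string and returns a longer string.

```ruby
# Hypothetical story pieces (the names are assumptions, not the original listing).
def the_bunny(name)
  "the bunny #{name}"
end

def went_to_the_forest(story)
  "#{story} went to the forest"
end

def and_ate_some_grass(story)
  "#{story} and ate some grass"
end
```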
And then what? We have some choices.
Use nested function calls:
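Roughly like this, assuming three hypothetical one-argument methods for the story pieces. Note that the calls read inside-out, in the opposite order to the story itself.

```ruby
# Hypothetical story pieces (names are assumptions).
def the_bunny(name)
  "the bunny #{name}"
end

def went_to_the_forest(story)
  "#{story} went to the forest"
end

def and_ate_some_grass(story)
  "#{story} and ate some grass"
end

# The innermost call runs first, so the code reads backwards.
story = and_ate_some_grass(went_to_the_forest(the_bunny('Foofoo')))
puts story
# prints: the bunny Foofoo went to the forest and ate some grass
```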
But this is hard to read. What if we broke it out?
Use separate variables for each state:
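Something along these lines, again with hypothetical method names, and with intermediate variable names invented for the sketch:

```ruby
# Hypothetical story pieces (names are assumptions).
def the_bunny(name)
  "the bunny #{name}"
end

def went_to_the_forest(story)
  "#{story} went to the forest"
end

def and_ate_some_grass(story)
  "#{story} and ate some grass"
end

# One variable per intermediate state.
bunny = the_bunny('Foofoo')
bunny_in_forest = went_to_the_forest(bunny)
story = and_ate_some_grass(bunny_in_forest)
puts story
```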
Not much better. The variable names are either redundant with the method names or nondescriptive.
Let’s try using one variable to hold the story as it builds:
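A sketch of that version, with the same hypothetical pieces and a single variable reassigned at each step:

```ruby
# Hypothetical story pieces (names are assumptions).
def the_bunny(name)
  "the bunny #{name}"
end

def went_to_the_forest(story)
  "#{story} went to the forest"
end

def and_ate_some_grass(story)
  "#{story} and ate some grass"
end

# The same variable is overwritten as the story grows.
story = the_bunny('Foofoo')
story = went_to_the_forest(story)
story = and_ate_some_grass(story)
puts story
```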
This is better, but it looks contrived with ‘story’ repeated everywhere. What if we want to tell the same story several times with different names? We’d have to copy and paste all three lines.
So we make a method:
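For instance, a hypothetical tell_story method that wraps the three steps (all names here are assumptions for the sketch):

```ruby
# Hypothetical story pieces (names are assumptions).
def the_bunny(name)
  "the bunny #{name}"
end

def went_to_the_forest(story)
  "#{story} went to the forest"
end

def and_ate_some_grass(story)
  "#{story} and ate some grass"
end

# Wrap the whole sequence so it can be reused with any name.
def tell_story(name)
  story = the_bunny(name)
  story = went_to_the_forest(story)
  and_ate_some_grass(story)
end

puts tell_story('Foofoo')
puts tell_story('Peter')
```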
Which is great, but what if you want the option to just use a piece of your story?
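One way that could go is to bolt optional flags onto the method. This is only a guess at the shape of the problem; the keyword arguments forest: and grass: are made up for the sketch.

```ruby
# Hypothetical story pieces (names are assumptions).
def the_bunny(name)
  "the bunny #{name}"
end

def went_to_the_forest(story)
  "#{story} went to the forest"
end

def and_ate_some_grass(story)
  "#{story} and ate some grass"
end

# Every optional piece needs its own flag and its own conditional.
def tell_story(name, forest: true, grass: true)
  story = the_bunny(name)
  story = went_to_the_forest(story) if forest
  story = and_ate_some_grass(story) if grass
  story
end

puts tell_story('Foofoo', grass: false)
# prints: the bunny Foofoo went to the forest
```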
Ugh. What if there are many possible sub-stories?
Maybe use lambdas with a pipeline:
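A sketch of the idea, assuming each piece is rewritten as a lambda and the pipeline is just an array folded over the starting value with reduce:

```ruby
# Hypothetical story pieces as lambdas (names are assumptions).
the_bunny = ->(name) { "the bunny #{name}" }
went_to_the_forest = ->(story) { "#{story} went to the forest" }
and_ate_some_grass = ->(story) { "#{story} and ate some grass" }

# The pipeline is an ordered list of steps; any subset can be used.
pipeline = [the_bunny, went_to_the_forest, and_ate_some_grass]

# Feed the result of each step into the next one.
story = pipeline.reduce('Foofoo') { |memo, step| step.call(memo) }
puts story
```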
Ruby syntax starts getting in the way. We can at least hide it away:
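For example, the reduce call could move into a small helper. The name pipe is hypothetical, as are the lambda names.

```ruby
# Hypothetical helper: thread a value through a list of callables.
def pipe(value, *steps)
  steps.reduce(value) { |memo, step| step.call(memo) }
end

# Hypothetical story pieces as lambdas (names are assumptions).
the_bunny = ->(name) { "the bunny #{name}" }
went_to_the_forest = ->(story) { "#{story} went to the forest" }
and_ate_some_grass = ->(story) { "#{story} and ate some grass" }

story = pipe('Foofoo', the_bunny, went_to_the_forest, and_ate_some_grass)
puts story
```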
But this is still sort of all over the place. We can tidy it up by wrapping it in a class:
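One possible shape for such a class, sketched under the assumption that each step updates the text and returns self so calls can be chained. The class name Story and its methods are invented for the example.

```ruby
# Hypothetical wrapper class: each step extends the text and returns self.
class Story
  def initialize(name)
    @text = name
  end

  def the_bunny
    @text = "the bunny #{@text}"
    self
  end

  def went_to_the_forest
    @text = "#{@text} went to the forest"
    self
  end

  def and_ate_some_grass
    @text = "#{@text} and ate some grass"
    self
  end

  def to_s
    @text
  end
end

# The chain now reads in the same order as the story.
story = Story.new('Foofoo')
             .the_bunny
             .went_to_the_forest
             .and_ate_some_grass
puts story
```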
Which is actually pretty great in terms of readability. But what if there’s another ending, which this class doesn’t know about?
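To illustrate, suppose a chainable Story class (hypothetical, trimmed to two steps here) meets an ending that lives outside it as a plain method. The outside piece has to wrap the whole chain:

```ruby
# Hypothetical chainable class, trimmed to two steps for brevity.
class Story
  def initialize(name)
    @text = name
  end

  def the_bunny
    @text = "the bunny #{@text}"
    self
  end

  def went_to_the_forest
    @text = "#{@text} went to the forest"
    self
  end

  def to_s
    @text
  end
end

# An ending defined elsewhere, which the class knows nothing about.
def and_took_a_nap(story)
  "#{story} and took a nap"
end

# The outside method cannot join the chain, so we are back to nesting.
story = and_took_a_nap(Story.new('Foofoo').the_bunny.went_to_the_forest)
puts story
# prints: the bunny Foofoo went to the forest and took a nap
```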
The nice readability of our story in code is gone, especially if there’s more than one of these building blocks.
But suppose we skip all this superstructure and use our original methods plus a small glue method in the data class?
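A minimal sketch of such a glue method, assuming the data is a String and the pieces are plain top-level methods. It reopens String to define |, which is a monkey patch and an assumption about how the glue might be written. The the_bunny method is a hypothetical story piece.

```ruby
# Hypothetical story piece (name is an assumption).
def the_bunny(name)
  "the bunny #{name}"
end

class String
  # Glue: pass this string as the only argument to the named method.
  # send is used because top-level methods are private methods of Object.
  def |(method_name)
    send(method_name, self)
  end
end

puts 'Foofoo' | :the_bunny
# prints: the bunny Foofoo
```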
Voila:
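Put together, it could look like the sketch below. The method names and the String#| patch are assumptions for illustration; the trailing | on each line is what lets Ruby continue the expression onto the next line.

```ruby
# Hypothetical story pieces (names are assumptions).
def the_bunny(name)
  "the bunny #{name}"
end

def went_to_the_forest(story)
  "#{story} went to the forest"
end

def and_ate_some_grass(story)
  "#{story} and ate some grass"
end

class String
  # Glue: pass this string as the only argument to the named method.
  def |(method_name)
    send(method_name, self)
  end
end

# Reads top to bottom, in story order, with any pieces we like.
story = 'Foofoo' |
        :the_bunny |
        :went_to_the_forest |
        :and_ate_some_grass
puts story
```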
This doesn’t allow extra arguments, so if a method like :the took a parameter like :bunny, you couldn’t do 'Foofoo' | :the, :bunny. You could accomplish this with 'Foofoo' .| :the, :bunny (calling the operator directly), but this doesn’t work with the multiline format above. It’s an open question for me whether Ruby could be made to support this like Elixir or Clojure.
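As a rough sketch of the direct-call variant, the glue method could accept extra arguments with a splat. The method the and its animal parameter are hypothetical, and the call uses explicit parentheses to keep the parsing unambiguous.

```ruby
# Hypothetical piece that takes an extra argument.
def the(name, animal)
  "the #{animal} #{name}"
end

class String
  # Glue with optional extra arguments, forwarded after the string itself.
  def |(method_name, *args)
    send(method_name, self, *args)
  end
end

# Infix syntax can only pass one argument, so the operator is called as a method.
puts 'Foofoo'.|(:the, :bunny)
# prints: the bunny Foofoo
```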