Arel with Wharel
Object Relational Mappers (ORM) are complex things. As the term implies, an ORM provides a mapping from objects in an object-oriented language like Ruby to a relational persistence layer. In theory, this makes it possible to abstract away details of this layer behind objects; in practice, of course, things are not so simple.
In this post, I’ll show you how I’ve leveraged a couple very simple features of Ruby’s object model to simplify building queries in ActiveRecord, in the form of a teeny tiny gem called Wharel. Wharel allows you to create Arel predicates in blocks passed to ActiveRecord query methods, without all the standard boilerplate.
Before I explain Wharel, though, let’s take a few steps back and consider the problem this gem tries to solve.
Querying with ActiveRecord
Part of the challenge of building and extending an ORM is striking a good
balance between convenience and control. ActiveRecord offers you by default a
fairly minimal interface for making queries which (I suppose) satisfies about
90% of users' basic needs. That interface is made up of the handful of
methods like where
and not
, which allow you to create queries built up of
simple equality predicates, optionally negated:
Post.where(title: "foo").where.not(content: "bar") #=> SELECT "posts".* FROM "posts" WHERE "posts"."title" = 'foo' AND "posts"."content" != 'bar'
ActiveRecord also supports SELECT
, HAVING
, GROUP BY
, along with the
combinators AND
and (recently) OR
, as well as inner and outer JOIN
, and a few
others I forget just now.
Which is all fine and good, and may be adequate to power queries in your little CRUD
app. But very quickly you will find that where
and friends are not enough.
Suppose, for example, you want to search for partial matches on string-valued
data. A Rails developer would typically drop down to SQL to do this:
Post.where("title LIKE ?", "%#{params[:title]}%") #=> SELECT `posts`.* FROM `posts` WHERE (title LIKE '%foo%')
… where params[:title]
is a title from user input, and the application is
using MySQL as its database.
In this case, we’re not really using the ORM much at all, other than to
sanitize the data (params[:title]
) interpolated into the SQL string. We
can’t extend or re-use this query in other contexts. We can’t take it apart,
build it up, or compose it. We can’t use it with a different database which
uses a different name for LIKE
(like PostgreSQL, which uses ILIKE
for
case-insensitive matching).
In short, we can either use this SQL snippet as-is, or re-write it to be something different.
Enter Arel
For one-off situations, this may well be enough. But for anything else, ActiveRecord offers another layer in the form of Arel, a relational algebra for building queries abstracted from the particulars of any particular database. If ActiveRecord is a mapping between Ruby model objects and a relational database, Arel is the fine-grained mapping between Ruby query objects and SQL query fragments.
Let’s get concrete. Convert the LIKE
query above to Arel and we get:
posts = Post.arel_table Post.where(posts[:title].matches("%#{params[:title]}%")) #=> SELECT `posts`.* FROM `posts` WHERE (`posts`.`title` LIKE '%foo%')
Here, we needed to first call Post.arel_table
to get an instance of
Arel::Table
corresponding to the posts
table for the model. We can then
call posts[:title]
to get an arel node (actually, an Arel::Attribute
) for
the title
column.
We can then use this title
node to build a query using the matches
method,
which corresponds to LIKE
in MySQL.
This may not seem like such a big win, since anyway you could have written the query in raw SQL with less lines of code. But what if you wanted to re-use that snippet in combination with another condition?
You could certainly do this with a new inline SQL, but a more scalable way is to use Arel. For example:
class Post < ApplicationRecord scope :with_title, ->(str) { where(matches_title(str)) } scope :with_title_or_subtitle, ->(str) { where(matches_title(str).or(matches_subtitle(str))) } class << self def matches_title(title) arel_table[:title].matches("%#{title}%") end def matches_subtitle(subtitle) arel_table[:subtitle].matches("%#{subtitle}%") end end end
In this case, we’ve used the arel predicates returned by matches_title
and
matches_subtitle
to build a combined predicate using the pattern a.or(b)
,
which generates SQL like:
# Post.with_title_or_subtitle("foo") SELECT `posts`.* FROM `posts` WHERE (`posts`.`title` LIKE '%foo%' OR `posts`.`subtitle` LIKE '%foo%')
As you can imagine, with more complex queries, using Arel building blocks can be quite useful. But it should also be apparent that Arel’s “wordiness” can be a real turn-off, and partly explains why Rails developers tend to prefer one-off SQL fragments over Arel predicates.
Where + Arel = Wharel
Recently I’ve been looking a lot at the internals of Sequel, a Ruby ORM developed by Jeremy Evans that is superficially similar in many ways to ActiveRecord, but which has a vastly different internal implementation and much broader feature set.
One of the things that fascinated me about Sequel is its interface for building queries, which allows you to use blocks in the form of so-called “virtual rows”, like this:
Artist.where{id > 5}
Here, we are querying on a column id
. But notice that rather than passing a hash of
keys and values to where
, we are instead passing a block which references
the method id
and builds it into a predicate with id > 5
.
This type of query would not be possible passing a hash to ActiveRecord’s
where
, since a hash can only generate equality predicates. Instead, we would
need to build the appropriate predicate with Arel, like this:
artists = Artist.arel_table Artist.where(artists[:id].gt(5))
For such a simple query, I find the ActiveRecord/Arel code to be much harder to read. Unsurprisingly, others have felt the same way.
Many years ago, Ernie Miller set out to improve ActiveRecord with an interface similar to Sequel, in the form of a gem called Squeel. Squeel was popular, but ultimately the monkey-patching magic it required to bend ActiveRecord into shape burnt out its maintainer, and the gem is now abandoned. More recently, a gem named BabySqueel has attempted to recreate some of this interface without Squeel’s monkey-patching baggage.
While working on Mobility, I recently had a need to incorporate Arel predicates with ActiveRecord relations in a way that went beyond AR’s standard query method interface. I decided to implement something like the block-based query interface from Sequel, but vastly simplified.
In the process, it became apparent to me how simple it is to do this, so I extracted the core logic into a separate gem I’ve called Wharel, a portmanteau of “Where” and “Arel”.
What does Wharel do? Well, add it to your Gemfile and you can simplify queries like the one above to:
Artist.where { id.gt(5) }
Wharel also supports combining nodes from different models by optionally using a block argument, like this:
Artist.joins(:posts).where { |a| Post.where { |p| p.title.matches(a.name) } }
(This query matches artists who have at least one post with a title that matches their name.)
As you can see, Wharel adds an optional block format like Sequel, within which
any attributes called as methods are converted to their corresponding Arel
node. In this case, id
becomes the arel node Artist.arel_table[:id]
.
That’s it. There’s no additional magic to support predicates like id >
5
, which Sequel, Squeel and BabySqueel offer, because I wanted to keep Wharel
to an absolute minimum level of complexity.
And Wharel is indeed very simple: its core is a mere 31 lines of code. How can that be? Let’s have a look inside.
Virtual Rows and BasicObjects
Here is the Wharel
module:
module Wharel module QueryMethods def where(opts = :chain, *rest, &block) block_given? ? super(VirtualRow.build_query(self, &block)) : super end module WhereChain def not(*args, &block) block_given? ? super(VirtualRow.build_query(@scope, &block)) : super end end end class VirtualRow < BasicObject def initialize(klass) @klass = klass end def method_missing(m, *) @klass.column_names.include?(m.to_s) ? @klass.arel_table[m] : super end class << self def build_query(klass, &block) row = new(klass) query = block.arity.zero? ? row.instance_eval(&block) : block.call(row) ::ActiveRecord::Relation === query ? query.arel.constraints.inject(&:and) : query end end end end
There is really not much here. The QueryMethods
module is required to patch
ActiveRecord to handle blocks; nothing special there. What is special is the
VirtualRow
class, which subclasses BasicObject
.
Why BasicObject
? You may have seen this esoteric, minimalist object somewhere in a
diagram of Ruby’s object model, and recall that every Object
inherits
from it. Indeed, BasicObject
is a very simple object with almost no
methods at all, and for most contexts not something you’d really want to lean
on too much.
But in this case, the fact that BasicObject
is so minimal happens to be very
useful, because it allows us to create what’s called a Clean
Room. Take a look at
build_query
, which you’ll see we call from where
or not
to convert a
block to an Arel predicate:
VirtualRow.build_query(self, &block)
Here, self
is the class or relation (for not
it is @scope
, which is
basically the same), and block
is the block passed to where
or not
.
In build_query
, we first create an instance of VirtualRow
with a reference
to the class (or relation) (this is the new(klass)
line). We then check the
arity
of the block (don’t be scared, “arity” is just another word for
“number of arguments”).
If the block has no arguments (block.arity.zero?
), like this:
Post.where { title.eq("foo") }
… then we call instance_eval
on the virtual row and pass it the block.
Now what happens?
Well, instance_eval
is a bit like that movie, Being John
Malkovich, when you go
through the door and enter the mind of actor John Malkovich. Except in this
case, the door takes you to the mind of whatever object you’re calling
instance_eval
on. Any method you call from this context is received by that
object, just like everything you see when you go though the door is what John
Malkovich is seeing.
Here, the object is the virtual row. And this is where the fact that
VirtualRow
subclasses BasicObject
comes in useful. Since VirtualRow
has
virtually no methods (no pun intended), anything you call on it will fall through
to method_missing
. So if you call title
, method_missing
will be called
with :title
as its first argument.
What do we do then? Well, just define a method_missing
to catch these method
calls and “route” them to the model’s Arel table, which Wharel does in three
lines of code:
def method_missing(m, *) @klass.column_names.include?(m.to_s) ? @klass.arel_table[m] : super end
Here, @klass
would be Artist
, so we’re calling
Artist.arel_table[:title]
, which will get us the Arel node for title
.
Perfect! That’s exactly what we wanted.
Since this works for any attribute (as
long as the attribute name does not overlap with a method on BasicObject
,
which is exceedingly unlikely), we can build up any arbitrary Arel predicate
in this block without ever having to explicitly call arel_table
on any
model.
That Object Model Thing
Although this post has Arel in its title, the real point I’m trying to make
with Wharel is to show how some of Ruby’s seemingly esoteric features, like
BasicObject
and instance_eval
, can actually be very powerful. With only
thirty lines of code, we’ve created an interface that eliminates a huge amount
of Arel boilerplate code in any Rails app. I find this really remarkable.
Although I don’t think many Rubyists are very familiar with it, Ruby’s Object Model is actually very powerful and I highly recommend digging deeper into its details. Books like Metaprogramming Ruby by Paolo Perrotta are a great place to start, but it’s also a good idea to just pry open your favourite gem (like I did with Sequel) and see what it’s doing under the hood. If it’s doing something powerful, it’s probably leveraging this model to do it.
Posted on May 30th, 2018.