Tracing the Arel AST to the adapter, prepared statement traps on PostgreSQL, and the query cache pitfalls that actually bite in production.
I learned ActiveRecord the second time at the London agency I was a founding engineer at, debugging someone else’s Rails app at 1 a.m. A User.where(id: ids).pluck(:email) was hammering the writer with a fresh plan on every call. p95 had drifted from 180 ms to 2.4 s over a week and nobody noticed because the homepage was fine.
The fix took six minutes once I knew where to look. Getting there took two hours. That’s why I stopped treating ActiveRecord as the friendly DSL on top of SQL and started treating it as a four-stage pipeline I own all the way down.
Relation builder, Arel AST, connection adapter, PostgreSQL. Every weird query problem I’ve debugged at the creator economy platform I worked at lived in one of those layers. Knowing which one to suspect is most of the work.
User.where(active: true).order(:created_at) does not run SQL. It builds an ActiveRecord::Relation that wraps an Arel tree. Arel is the AST. ActiveRecord is the builder. The SQL adapter is the printer that walks the tree and emits a string.
You can poke the tree directly. Single most useful debugging trick I know, and almost nobody uses it.
relation = User.where(active: true).where("created_at > ?", 7.days.ago).order(:id)
arel = relation.arel
puts arel.to_sql
# Walk the tree
arel.ast.children.each do |node|
puts node.class
end
# Inspect the where clauses
relation.where_clause.ast.children.each do |predicate|
puts predicate.class
puts predicate.left.name if predicate.respond_to?(:left)
end
Two things jump out the first time you do this. where("created_at > ?", ...) becomes an opaque Arel::Nodes::SqlLiteral, while where(active: true) becomes a typed Arel::Nodes::Equality. The literal is invisible to merging, predicate stripping, and most static analysis. And order(:id) is a separate child of the AST root, not attached to the where node. If you ever wondered why .unscope(:order) works the way it does, that’s why.
Prefer hash form over string form. Not for style. For introspection.
Rails 7 enables prepared statements on PostgreSQL by default. The pitch is “save a parse on repeated queries.” The reality cost me two hours back at that agency.
PostgreSQL prepares the statement against the first execution and reuses the plan. For balanced data, fine. For skewed data, the planner can pick a plan that suits the first input and punishes the next ten thousand. Classic case: WHERE tenant_id = $1, first call comes in for a tiny tenant, planner picks a nested loop. Then the same prepared statement runs for the largest tenant on the platform and does a million row nested loop.
That’s what hit me on that agency Rails app. The Rails console run that morning had primed the statement with a small tenant. The booking flow ran it with a large one. We were paying sequential-scan cost on a query that index scanned beautifully on its own.
The fix is two layers.
# config/database.yml
production:
adapter: postgresql
prepared_statements: true
statement_limit: 200
variables:
plan_cache_mode: force_custom_plan
force_custom_plan tells PostgreSQL to re-plan on every execution. You lose the parse-cache benefit. You gain a planner that sees the actual parameter every time. On a tenant-keyed workload, that trade is a no-brainer.
For queries that are wildly skewed and you don’t want the global setting, disable prepared statements for just that call:
class Tenant < ApplicationRecord
scope :with_active_bookings, ->(tenant_id) {
connection
.unprepared_statement do
where(id: tenant_id)
.joins(:bookings)
.where(bookings: { state: "active" })
.to_a
end
}
end
unprepared_statement is a per-block escape hatch. The query goes out as a one-shot, the planner sees the real values, and it picks a sane plan. Use it surgically. Don’t paper your codebase with it.
The ActiveRecord query cache is per-connection, per-request, keyed on the SQL string. That last part is the trap.
# These two hit different cache entries
User.where(id: [1, 2, 3]).to_a
User.where(id: [3, 2, 1]).to_a
# These two hit the same cache entry
User.where(id: 1).to_a
User.where(id: 1).to_a
Ordering of an IN list changes the string and therefore the cache entry. That’s not wrong, but it’s surprising the first time you watch your hit rate drop because someone reordered an array.
The bigger trap. The cache does not know about transactions on other connections. Your request caches SELECT count(*) FROM users early. A background worker on a different connection inserts ten rows. Your request still returns the stale count.
# Most controllers don't need this. The ones doing aggregates do.
class ReportsController < ApplicationController
def dashboard
ActiveRecord::Base.connection.uncached do
@daily_signups = User.where(created_at: Date.current.all_day).count
@active_subscriptions = Subscription.where(state: "active").count
end
end
end
uncached is the right tool here. Skip the cache for the queries that must see live data. Don’t disable it globally. The cache saves real work on the queries that get repeated.
plan_cache_mode = force_custom_plan is the sane default at multi-tenant scale.uncached for aggregates that must read live.Thanks for reading. If you’ve got thoughts, send them my way.