ActiveRecord Query Internals

Tracing the Arel AST to the adapter, prepared statement traps on PostgreSQL, and the query cache pitfalls that actually bite in production.

I learned ActiveRecord the second time at the London agency I was a founding engineer at, debugging someone else’s Rails app at 1 a.m. A User.where(id: ids).pluck(:email) was hammering the writer with a fresh plan on every call. p95 had drifted from 180 ms to 2.4 s over a week and nobody noticed because the homepage was fine.

The fix took six minutes once I knew where to look. Getting there took two hours. That’s why I stopped treating ActiveRecord as the friendly DSL on top of SQL and started treating it as a four-stage pipeline I own all the way down.

Relation builder, Arel AST, connection adapter, PostgreSQL. Every weird query problem I’ve debugged at the creator economy platform I worked at lived in one of those layers. Knowing which one to suspect is most of the work.

From relation to AST

User.where(active: true).order(:created_at) does not run SQL. It builds an ActiveRecord::Relation that wraps an Arel tree. Arel is the AST. ActiveRecord is the builder. The SQL adapter is the printer that walks the tree and emits a string.

You can poke the tree directly. Single most useful debugging trick I know, and almost nobody uses it.

relation = User.where(active: true).where("created_at > ?", 7.days.ago).order(:id)
arel = relation.arel
puts arel.to_sql

# Walk the tree
arel.ast.children.each do |node|
  puts node.class
end

# Inspect the where clauses
relation.where_clause.ast.children.each do |predicate|
  puts predicate.class
  puts predicate.left.name if predicate.respond_to?(:left)
end

Two things jump out the first time you do this. where("created_at > ?", ...) becomes an opaque Arel::Nodes::SqlLiteral, while where(active: true) becomes a typed Arel::Nodes::Equality. The literal is invisible to merging, predicate stripping, and most static analysis. And order(:id) is a separate child of the AST root, not attached to the where node. If you ever wondered why .unscope(:order) works the way it does, that’s why.

Prefer hash form over string form. Not for style. For introspection.

Prepared statements lie about safety

Rails 7 enables prepared statements on PostgreSQL by default. The pitch is “save a parse on repeated queries.” The reality cost me two hours back at that agency.

PostgreSQL prepares the statement against the first execution and reuses the plan. For balanced data, fine. For skewed data, the planner can pick a plan that suits the first input and punishes the next ten thousand. Classic case: WHERE tenant_id = $1, first call comes in for a tiny tenant, planner picks a nested loop. Then the same prepared statement runs for the largest tenant on the platform and does a million row nested loop.

That’s what hit me on that agency Rails app. The Rails console run that morning had primed the statement with a small tenant. The booking flow ran it with a large one. We were paying sequential-scan cost on a query that index scanned beautifully on its own.

The fix is two layers.

# config/database.yml
production:
  adapter: postgresql
  prepared_statements: true
  statement_limit: 200
  variables:
    plan_cache_mode: force_custom_plan

force_custom_plan tells PostgreSQL to re-plan on every execution. You lose the parse-cache benefit. You gain a planner that sees the actual parameter every time. On a tenant-keyed workload, that trade is a no-brainer.

For queries that are wildly skewed and you don’t want the global setting, disable prepared statements for just that call:

class Tenant < ApplicationRecord
  scope :with_active_bookings, ->(tenant_id) {
    connection
      .unprepared_statement do
        where(id: tenant_id)
          .joins(:bookings)
          .where(bookings: { state: "active" })
          .to_a
      end
  }
end

unprepared_statement is a per-block escape hatch. The query goes out as a one-shot, the planner sees the real values, and it picks a sane plan. Use it surgically. Don’t paper your codebase with it.

Query cache is sneakier than you think

The ActiveRecord query cache is per-connection, per-request, keyed on the SQL string. That last part is the trap.

# These two hit different cache entries
User.where(id: [1, 2, 3]).to_a
User.where(id: [3, 2, 1]).to_a

# These two hit the same cache entry
User.where(id: 1).to_a
User.where(id: 1).to_a

Ordering of an IN list changes the string and therefore the cache entry. That’s not wrong, but it’s surprising the first time you watch your hit rate drop because someone reordered an array.

The bigger trap. The cache does not know about transactions on other connections. Your request caches SELECT count(*) FROM users early. A background worker on a different connection inserts ten rows. Your request still returns the stale count.

# Most controllers don't need this. The ones doing aggregates do.
class ReportsController < ApplicationController
  def dashboard
    ActiveRecord::Base.connection.uncached do
      @daily_signups = User.where(created_at: Date.current.all_day).count
      @active_subscriptions = Subscription.where(state: "active").count
    end
  end
end

uncached is the right tool here. Skip the cache for the queries that must see live data. Don’t disable it globally. The cache saves real work on the queries that get repeated.

Takeaways

ActiveRecord is a four-stage pipeline. Relation, Arel, adapter, Postgres. Know which layer your bug lives in.
Prefer hash conditions over string conditions. Hash form is introspectable. Strings are opaque to merging.
Prepared statements with skewed data can pick a plan for the first input and punish the rest. plan_cache_mode = force_custom_plan is the sane default at multi-tenant scale.
The query cache keys by literal SQL and does not invalidate across connections. Use uncached for aggregates that must read live.
Runbooks beat instinct. One sentence in the right place saves every future on-call.

Thanks for reading. If you’ve got thoughts, send them my way.