A senior Rails engineer's take on Discard, Paranoia, and PaperTrail, GDPR right-to-deletion conflicts, and keeping version tables from eating your Aurora writer.
Late evening deploy at the creator economy platform I worked at. We were shipping a schema change to the Rails monolith hitting Aurora PostgreSQL. The migration looked harmless: an add_column on :users for :soft_deleted_at, a :datetime declared non-null with a default. I’d reviewed it that morning and ack’d it as safe. It wasn’t. The default backfill grabbed an ACCESS EXCLUSIVE lock on a table with hundreds of millions of rows and held login down for about 85 seconds. Half the senior California engineers got paged. I’m the reason that incident has a runbook.
Here’s how I think about soft deletes and audit trails in Rails. Don’t use Paranoia. Use Discard for soft delete, PaperTrail for audit, and keep them as separate concerns. The rest of this post explains why, and where the dragons live.
Paranoia adds a default_scope to the model that silently filters out soft-deleted rows. It’s seductive. You add acts_as_paranoid, and every existing query Just Works. Until it doesn’t.
The problem is the silent override. Every join, every preload, every find_by now carries a hidden deleted_at IS NULL predicate. A SQL view or analytics extract that bypasses ActiveRecord gives a different answer. A Sidekiq job that finds a soft-deleted user by id silently does nothing. A junior engineer writes User.unscoped.where(...) to “fix” a bug and has now bypassed every other scope on the model too.
Discard is the opposite. There’s no default_scope. You opt in to filtering.
# Gemfile
gem "discard", "~> 1.3"
# app/models/concerns/soft_deletable.rb
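module SoftDeletable
  extend ActiveSupport::Concern
  included do
    include Discard::Model
    self.discard_column = :discarded_at
    scope :active, -> { kept }
  end
  # Run discard! inside the PaperTrail request block so the version it writes
  # is attributed to the actor rather than to the ambient request.
  def soft_delete!(actor: nil)
    transaction do
      PaperTrail.request(whodunnit: actor&.id) { discard! }
    end
  end
end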
# app/models/user.rb
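class User < ApplicationRecord
  include SoftDeletable
  # Each meta key needs a matching column on the versions table. Symbol values
  # are sent to the model, so current_request_ip must be a method on User.
  has_paper_trail meta: { tenant_id: :tenant_id, ip: :current_request_ip }
end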
Now every query is explicit. User.kept.where(...) for active users. User.discarded.where(...) for the soft-deleted ones. User.with_discarded.where(...) for the union. When a teammate reviews a PR, they can read the scope and know exactly what rows are coming back. No hidden predicate.
People reach for Paranoia because it gives both soft delete and “history” in one gem. The history Paranoia gives you is “the row used to exist.” That isn’t an audit trail. An audit trail tells you who changed what, when, from what IP, on which version of the app. You want that for compliance, for support, for “why does this customer’s subscription say active but their card was refunded three days ago.”
PaperTrail does this well. The pattern that’s worked for me in production:
# config/initializers/paper_trail.rb
PaperTrail.config.version_limit = 50
# app/models/version.rb
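class Version < PaperTrail::Version
  self.table_name = "versions"
  # tenant_id is written to its own versions column by the model's meta option.
  # object_changes only holds attributes that changed, so it can't filter by tenant.
  scope :for_tenant, ->(t) { where(tenant_id: t.id) }
  scope :in_window, ->(from, to) { where(created_at: from..to) }
end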
# app/controllers/application_controller.rb
class ApplicationController < ActionController::Base
  before_action :set_paper_trail_whodunnit
  before_action :set_request_context
  private
  def set_paper_trail_whodunnit
    PaperTrail.request.whodunnit = current_user&.id
  end
  def set_request_context
    PaperTrail.request.controller_info = {
      ip: request.remote_ip,
      release_sha: ENV["GIT_SHA"],
      user_agent: request.user_agent
    }
  end
end
Two things matter here. First, version_limit = 50. Per-record cap. Without it, the versions table will eat your writer. I’ll come back to this. Second, controller_info. The release SHA lives in the audit record so when someone asks “why did this attribute flip on Tuesday afternoon”, you can correlate it with the deploy that shipped Tuesday morning.
Back to that 85-second outage. The migration was a non-null column with a default on a hot table. It used the strong_migrations gem’s add_column_with_default helper, which is “safer than raw ActiveRecord.” Safer. Not safe.
Here’s what actually went wrong. The helper acquired an ACCESS EXCLUSIVE lock and held it while applying the backfill. On Aurora at our row count, that meant about 90 seconds of blocked writes, and the login error rate sat at 100% for 85 of them. First instinct in the war room was to roll back, but Rails doesn’t have a clean rollback for a partially-applied add_column_with_default. We were already about 60 seconds into the cascade by the time that was clear. We let it finish. The locks released at 87 seconds. Login recovered within 15 seconds because the dependent service had a tight retry loop.
The postmortem fix was the standard three-step Aurora dance.
# db/migrate/20240221_add_discarded_at_to_users.rb
class AddDiscardedAtToUsers < ActiveRecord::Migration[7.1]
  # Concurrent index builds and drops cannot run inside a transaction.
  disable_ddl_transaction!
  def up
    safety_assured do
      add_column :users, :discarded_at, :datetime, null: true
      add_index :users, :discarded_at,
        where: "discarded_at IS NOT NULL",
        algorithm: :concurrently,
        name: "idx_users_discarded_at_partial"
    end
  end
  def down
    # Drop the index concurrently too, so the rollback doesn't block writes.
    remove_index :users, name: "idx_users_discarded_at_partial", algorithm: :concurrently
    remove_column :users, :discarded_at
  end
end
A nullable column, a partial index, and no backfill. The discard column never needs a backfill because the default semantic of “missing means not discarded” is what you actually want. Same shape works for audited_at, archived_at, anything that’s a soft-state marker.
We also added a strong_migrations rule in CI that blocks any add_column with a non-null default against tables over 10M rows. The helper was safer than raw ActiveRecord. It was not safe.
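A minimal sketch of what such a rule might look like, not the one we actually shipped. The predicate is plain Ruby so it can be unit tested without a database. The names risky_add_column?, LARGE_TABLE_ROWS, and the row_estimates hash are hypothetical; in practice the estimates would come from something like pg_class.reltuples. The commented wiring uses the strong_migrations custom check hook.

```ruby
# Hypothetical threshold; tune to your own tables.
LARGE_TABLE_ROWS = 10_000_000

# True when the call is an add_column that declares NOT NULL with a default
# on a table whose estimated row count exceeds the threshold.
# row_estimates is an assumed Hash of table name => estimated rows.
def risky_add_column?(method, args, row_estimates, threshold: LARGE_TABLE_ROWS)
  return false unless method == :add_column

  table, _column, _type, options = args
  options ||= {}
  non_null_default = options[:null] == false && options.key?(:default)
  non_null_default && row_estimates.fetch(table.to_s, 0) > threshold
end

# Wiring sketch (needs the strong_migrations gem, so it stays commented here):
#
# StrongMigrations.add_check do |method, args|
#   if risky_add_column?(method, args, ROW_ESTIMATES)
#     stop! "Non-null default on a large table. Add nullable, backfill in batches."
#   end
# end
```

The nullable discarded_at migration above passes this check, and the original incident migration would not have.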
Here’s the contradiction. Soft delete keeps the row. Right-to-deletion requires the row to be gone. PaperTrail keeps a copy of every old version of the row, including PII. You will get a deletion request. You need a plan for both the live record and the audit history.
The approach that’s held up:
# app/services/gdpr/erasure.rb
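module Gdpr
  class Erasure
    PII_FIELDS = %w[email phone first_name last_name ip_address].freeze
    def initialize(user)
      @user = user
    end
    def call
      ActiveRecord::Base.transaction do
        anonymize_versions
        anonymize_user
        # discard! raises if the record is already discarded, which would roll
        # back the whole erasure for a user who was soft-deleted earlier.
        @user.discard! unless @user.discarded?
      end
    end
    private
    # Assumes object and object_changes are json/jsonb columns, so PaperTrail
    # returns them as Hashes rather than serialized strings.
    def anonymize_versions
      PaperTrail::Version
        .where(item_type: "User", item_id: @user.id)
        .find_each do |v|
          obj = v.object || {}
          changes = v.object_changes || {}
          PII_FIELDS.each do |f|
            obj[f] = "[redacted]" if obj.key?(f)
            if changes.key?(f)
              # Redact both the old and the new value of the change pair.
              changes[f] = changes[f].map { "[redacted]" }
            end
          end
          v.update_columns(object: obj, object_changes: changes)
        end
    end
    # update_columns skips callbacks, so this write creates no new version.
    def anonymize_user
      @user.update_columns(
        email: "deleted-#{@user.id}@invalid.local",
        phone: nil,
        first_name: "[redacted]",
        last_name: "[redacted]",
        ip_address: nil
      )
    end
  end
end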
Anonymize, don’t drop. The audit shape stays intact, the timestamps and the whodunnit references stay queryable, and the PII is gone. Discarding the user record itself records the erasure event in the audit log. Auditors love this because they can prove the erasure ran without exposing the original PII.
There’s a related trap. Don’t rely on a Rake task that someone runs manually when a deletion ticket comes in. Build it as a Sidekiq job triggered from your support tooling, with a unique constraint keyed by user id so it’s idempotent. Apple sent us the same renewal notification twice once and we ended up with duplicate subscription rows because the handler had no idempotency check. Same lesson applies here.
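Here is a rough sketch of the idempotency shape, in plain Ruby so it runs anywhere. ErasureLedger is a hypothetical in-memory stand-in for a table with a unique index on user_id, and ErasureJob is a stand-in for the Sidekiq worker. Neither name comes from a real library. In production the claim would be an INSERT that fails on the unique constraint, ideally inside the same transaction as the erasure.

```ruby
require "set"

# In-memory stand-in for a table with a unique index on user_id (assumption).
class ErasureLedger
  def initialize
    @claimed = Set.new
    @lock = Mutex.new
  end

  # True only for the first caller to claim a user id, mirroring an INSERT
  # that would otherwise hit a unique-constraint violation.
  def claim(user_id)
    @lock.synchronize { !@claimed.add?(user_id).nil? }
  end

  # Give the claim back so a failed run can be retried.
  def release(user_id)
    @lock.synchronize { @claimed.delete(user_id) }
  end
end

# Stand-in for the Sidekiq worker; eraser is any callable taking a user id.
class ErasureJob
  def initialize(ledger:, eraser:)
    @ledger = ledger
    @eraser = eraser
  end

  def perform(user_id)
    return :skipped unless @ledger.claim(user_id)

    begin
      @eraser.call(user_id)
    rescue StandardError
      @ledger.release(user_id)
      raise
    end
    :erased
  end
end
```

A duplicate ticket for the same user returns :skipped instead of running the erasure a second time, and a run that blows up releases its claim so the retry can go through.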
The versions table grows linearly with every audited write. On a large Rails app, that’s millions of rows per week. I’ve seen it pass the size of the actual business tables. Two things help.
First, partition by month. PostgreSQL native range partitioning works fine. Drop partitions older than your retention window in a maintenance job. Second, archive cold partitions to S3 in Parquet before drop. Queryable via Athena when an auditor asks for ancient history. Live writer stays small.
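As a sketch of the monthly bookkeeping, assuming the versions table is already declared with PARTITION BY RANGE (created_at). The versions_yYYYYmMM naming convention and both helper names are made up for illustration. The helpers only build SQL and pick names, so the maintenance job decides when to execute anything.

```ruby
require "date"

# Hypothetical naming convention: versions_y2024m03
def partition_name(month)
  format("versions_y%04dm%02d", month.year, month.month)
end

# DDL for the partition covering the calendar month that contains `month`.
# The upper bound is exclusive, which matches PostgreSQL range partitions.
def create_partition_sql(month)
  from = Date.new(month.year, month.month, 1)
  to = from.next_month
  <<~SQL
    CREATE TABLE IF NOT EXISTS #{partition_name(from)}
      PARTITION OF versions
      FOR VALUES FROM ('#{from.iso8601}') TO ('#{to.iso8601}');
  SQL
end

# Names of partitions that start before the retention cutoff.
def expired_partitions(existing_months, today:, retention_months:)
  cutoff = Date.new(today.year, today.month, 1).prev_month(retention_months)
  existing_months
    .select { |m| Date.new(m.year, m.month, 1) < cutoff }
    .map { |m| partition_name(m) }
end
```

One thing to watch: PostgreSQL requires a primary key on a partitioned table to include the partition key, so the stock versions primary key on id alone would need to become (id, created_at) or similar.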
The other thing. Be careful what you run against the versions table during peak hours. Version-table partition operations should go through a maintenance guard that refuses to execute during peak hours.
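A guard like that can be very small. This is a sketch under assumptions: MaintenanceGuard and PeakHoursError are hypothetical names, the peak window is a range of UTC hours you pick yourself, and the clock is injectable so the behaviour can be tested. It does not handle a window that wraps past midnight.

```ruby
class PeakHoursError < StandardError; end

class MaintenanceGuard
  # peak_hours: Range of UTC hours, e.g. 14...23 (the window is an assumption).
  # clock: callable returning a Time, injectable for tests.
  def initialize(peak_hours:, clock: -> { Time.now.utc })
    @peak_hours = peak_hours
    @clock = clock
  end

  # Yields only outside the peak window; raises instead of running otherwise.
  def run(description)
    now = @clock.call
    if @peak_hours.cover?(now.hour)
      raise PeakHoursError,
            "refusing to run #{description} at hour #{now.hour} UTC (peak window #{@peak_hours})"
    end
    yield
  end
end
```

The maintenance job would wrap each partition drop or detach in guard.run("drop partition") { ... }, so a misfired cron at peak fails loudly instead of taking locks.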
Partition the versions table by month. Drop old partitions. Archive to S3.
Thanks for reading. If you’ve got thoughts, send them my way.