Understanding the Problem
Memory Bloat and Request Queueing
In large Rails applications, memory usage often grows gradually due to object retention, inefficient queries, or caching misconfigurations. This can result in sluggish responses, increased GC pauses, and even process crashes. Similarly, if Puma workers are saturated, incoming requests begin to queue, compounding latency and timeouts.
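A back-of-the-envelope calculation can show when workers become saturated. The sketch below divides the offered load by the number of request slots; the method name and all numbers are illustrative assumptions, not measurements from a real system.

```ruby
# Rough saturation estimate: offered load divided by available request slots.
# All names and numbers here are illustrative assumptions.
def puma_utilization(requests_per_sec:, avg_service_sec:, workers:, threads:)
  slots = workers * threads                          # requests that can run at once
  offered_load = requests_per_sec * avg_service_sec  # average requests in flight
  offered_load / slots.to_f
end

utilization = puma_utilization(
  requests_per_sec: 40, avg_service_sec: 0.25, workers: 2, threads: 5
)
puts format("utilization: %.2f", utilization)
puts "requests will queue" if utilization >= 1.0
```

Because of CRuby's global VM lock, threads only add capacity while requests wait on I/O, so the thread term is optimistic for CPU-bound work.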
Why It Happens in RoR
- ActiveRecord loading more rows or columns than needed, or issuing N+1 queries
- Unreleased file handles or database connections
- Large in-memory objects like JSON blobs or cache dumps
- Improper concurrency settings in multi-threaded environments
Architecture Considerations
Single-Threaded Legacy Design
Many older Rails apps are architected for a single-threaded runtime (like Unicorn), which doesn't leverage concurrent request processing. Transitioning to Puma or Falcon introduces concurrency, requiring thread-safe code and tuning.
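As a minimal sketch of what thread-safe code means in practice, the example below guards shared state with a Mutex. The SafeCounter class is a hypothetical illustration, not part of Rails or Puma.

```ruby
# A counter shared across threads. The Mutex makes the read-modify-write
# in #increment atomic; without it, concurrent updates could be lost.
class SafeCounter
  def initialize
    @count = 0
    @lock = Mutex.new
  end

  def increment
    @lock.synchronize { @count += 1 }
  end

  def value
    @lock.synchronize { @count }
  end
end

counter = SafeCounter.new
threads = 8.times.map do
  Thread.new { 1_000.times { counter.increment } }
end
threads.each(&:join)
puts counter.value # prints 8000
```

The same pattern applies to any class-level cache or registry that several request threads may write to at once.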
ORM Overhead with ActiveRecord
ActiveRecord can consume excessive memory when large result sets are loaded all at once or transformed into complex nested structures. Selecting only the columns you need instead of 'select *', and loading associations only when they are actually used, helps manage memory usage.
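The idea of fetching fewer columns and holding fewer rows at a time can be sketched without a database. In the example below, each_row is a hypothetical stand-in for a table, and process_in_batches is an illustrative helper; in ActiveRecord the analogous calls would be select and find_each.

```ruby
# Hypothetical row source standing in for a database table.
# Each "row" is a Hash with a large column we do not need.
def each_row(total)
  return enum_for(:each_row, total) unless block_given?

  total.times { |i| yield({ id: i, email: "user#{i}@example.com", bio: "x" * 1_000 }) }
end

# Keep only the needed columns and process in fixed-size batches,
# so at most one batch of slim rows is referenced at a time.
def process_in_batches(rows, columns:, batch_size:)
  peak = 0
  rows.lazy.map { |row| row.slice(*columns) }.each_slice(batch_size) do |batch|
    peak = [peak, batch.size].max
    yield batch
  end
  peak
end

emails = 0
peak = process_in_batches(each_row(2_500), columns: %i[id email], batch_size: 1_000) do |batch|
  emails += batch.size
end
puts "processed #{emails} rows, largest batch held #{peak}"
```

Each batch becomes garbage as soon as its block returns, which keeps the working set bounded by the batch size rather than by the table size.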
Diagnostics and Observability
Profiling Tools
Use these tools to identify memory and performance hotspots:
- derailed_benchmarks: benchmarks memory usage per request
- stackprof: samples stack frames to find bottlenecks
- rack-mini-profiler: integrates into the request cycle
Memory Leak Indicators
- Memory usage increases with each request batch
- Long GC pauses or frequent GC cycles
- Heap snapshots showing persistent or growing object counts
    gem install derailed_benchmarks
    RAILS_ENV=production bundle exec derailed exec perf:mem
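The first and third indicators can also be checked by hand with the standard library. The sketch below samples live heap slots after each batch of work; the live_slots_after helper and the deliberately leaky array are assumptions made for illustration.

```ruby
# Run a workload in batches and record live heap slots after a full GC.
# A count that keeps rising batch after batch suggests retained objects.
def live_slots_after(batches:, per_batch:)
  batches.times.map do
    per_batch.times { yield }
    GC.start                    # force a full collection before sampling
    GC.stat[:heap_live_slots]
  end
end

leaky_store = [] # deliberately retains everything (stand-in for a leak)

samples = live_slots_after(batches: 5, per_batch: 10_000) do
  leaky_store << "payload-#{leaky_store.size}"
end

growth = samples.each_cons(2).map { |a, b| b - a }
puts "live slots per batch: #{samples.inspect}"
puts "growth between batches: #{growth.inspect}"
```

A healthy workload would show roughly flat samples once the process has warmed up; steady positive growth is the signal to dig deeper with a heap dump or a profiler.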
Common Pitfalls
N+1 Query Patterns
Fetching parent objects and then lazily loading their children inside a loop issues one extra query per parent. Use .includes or .preload to load the associations up front.
    # Bad
    @users.each { |u| u.posts.first }

    # Good
    User.includes(:posts).each { |u| u.posts.first }
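To see why the query count differs, the pattern can be simulated without a database. FakeDb below is a hypothetical in-memory stand-in that counts the "queries" it receives; none of its methods are ActiveRecord APIs.

```ruby
# In-memory stand-in for a database that counts the queries it receives.
# FakeDb and its methods are hypothetical; they are not ActiveRecord APIs.
class FakeDb
  attr_reader :query_count

  def initialize(posts_by_user)
    @posts_by_user = posts_by_user
    @query_count = 0
  end

  def user_ids
    @query_count += 1
    @posts_by_user.keys
  end

  # One query per user: this is what lazy loading does inside a loop.
  def posts_for(user_id)
    @query_count += 1
    @posts_by_user.fetch(user_id)
  end

  # One query for all users: what eager loading does up front.
  def posts_for_all(user_ids)
    @query_count += 1
    @posts_by_user.slice(*user_ids)
  end
end

data = { 1 => ["a"], 2 => ["b"], 3 => ["c"] }

lazy_db = FakeDb.new(data)
lazy_db.user_ids.each { |id| lazy_db.posts_for(id).first }

eager_db = FakeDb.new(data)
ids = eager_db.user_ids
posts = eager_db.posts_for_all(ids)
ids.each { |id| posts[id].first }

puts "lazy: #{lazy_db.query_count} queries, eager: #{eager_db.query_count} queries"
# prints "lazy: 4 queries, eager: 2 queries"
```

With N users the lazy version issues N + 1 queries while the eager version stays at 2, which is where the pattern gets its name.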
Memory-Hungry Serializers
APIs that serialize massive JSON responses may hold large objects in memory longer than necessary. Paginate and stream where appropriate.
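One way to avoid building the whole response in memory is to emit JSON in small chunks. The sketch below assumes a hypothetical paginated source and an illustrative stream_json_array helper; it uses only the standard json library.

```ruby
require "json"

# Yields a JSON array piece by piece so the full document never has to
# exist as one String. `pages` is any enumerable of arrays of records.
def stream_json_array(pages)
  return enum_for(:stream_json_array, pages) unless block_given?

  yield "["
  first = true
  pages.each do |page|
    page.each do |record|
      yield "," unless first
      yield JSON.generate(record)
      first = false
    end
  end
  yield "]"
end

# Hypothetical paginated source: three pages of two records each.
pages = (0...3).lazy.map { |n| [{ id: n * 2 }, { id: n * 2 + 1 }] }

chunks = stream_json_array(pages).to_a
puts chunks.join
```

Collecting the chunks with to_a is only for demonstration. In a real endpoint each chunk would be written to the response as it is produced, so only one page of records is alive at a time.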
Improper Puma Configuration
Puma defaults may not suit high-load systems. Ensure workers and threads are tuned for CPU cores and request latency.
    workers Integer(ENV['WEB_CONCURRENCY'] || 2)
    threads_count = Integer(ENV['RAILS_MAX_THREADS'] || 5)
    threads threads_count, threads_count
Step-by-Step Fixes
1. Optimize Database Queries
Use the Bullet gem in development to catch N+1 queries and unused eager loading.
2. Limit In-Memory Object Lifespan
Avoid long-lived in-memory objects like large class-level caches or global state.
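When a cache is genuinely needed, bounding its size keeps it from growing for the lifetime of the process. The BoundedCache class below is a hypothetical minimal sketch of least-recently-used eviction. It is not thread-safe as written, so under a threaded server its methods would need to be wrapped in a Mutex.

```ruby
# A small least-recently-used cache with a fixed capacity.
# Ruby Hashes preserve insertion order, so the first key is the oldest.
class BoundedCache
  def initialize(max_size)
    @max_size = max_size
    @store = {}
  end

  def fetch(key)
    if @store.key?(key)
      value = @store.delete(key) # re-insert below to mark as most recent
    else
      value = yield
    end
    @store[key] = value
    @store.shift while @store.size > @max_size # evict oldest entries
    value
  end

  def size
    @store.size
  end

  def key?(key)
    @store.key?(key)
  end
end

cache = BoundedCache.new(2)
cache.fetch(:a) { "A" }
cache.fetch(:b) { "B" }
cache.fetch(:a) { "unused" } # touches :a, so :b is now the oldest
cache.fetch(:c) { "C" }      # evicts :b
puts cache.size      # prints 2
puts cache.key?(:b)  # prints false
```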
3. Tune Puma Settings
Balance thread count and worker processes based on available CPU and request characteristics. Avoid excessive concurrency on memory-constrained systems.
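The balance described above can be expressed as a simple heuristic: cap the worker count by CPU cores and by a memory budget. The suggested_workers method, the 512 MB reserve, and the 600 MB per-worker figure are all assumptions for illustration; the per-worker number has to come from measuring your own application.

```ruby
require "etc"

# Heuristic only: cap workers by CPU cores and by a memory budget.
# The per-worker memory figure must come from measuring your own app.
def suggested_workers(cores:, memory_mb:, per_worker_mb:, reserve_mb: 512)
  by_memory = (memory_mb - reserve_mb) / per_worker_mb # integer division
  [[cores, by_memory].min, 1].max                      # never fewer than 1
end

workers = suggested_workers(
  cores: Etc.nprocessors, memory_mb: 4096, per_worker_mb: 600
)
puts "WEB_CONCURRENCY=#{workers}"
```

On a memory-constrained host the memory term wins, which matches the advice to avoid excessive concurrency there.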
4. Monitor Garbage Collection
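Log GC stats and observe major vs minor collection trends. GC::Profiler reports the timing of each collection, while GC.stat exposes the major_gc_count and minor_gc_count counters.
    GC::Profiler.enable
    # ... run the code you want to measure ...
    puts GC::Profiler.result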
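For the major versus minor breakdown, the counters in GC.stat can be compared before and after a piece of work. The gc_delta helper below is a hypothetical sketch built on the standard GC.stat keys.

```ruby
# Measures how many minor and major GC runs a block triggers,
# plus how many objects it allocated.
def gc_delta
  before = GC.stat
  yield
  after = GC.stat
  {
    minor: after[:minor_gc_count] - before[:minor_gc_count],
    major: after[:major_gc_count] - before[:major_gc_count],
    total_allocated: after[:total_allocated_objects] - before[:total_allocated_objects]
  }
end

stats = gc_delta do
  200_000.times { |i| "temporary-#{i}" } # short-lived garbage
  GC.start                               # one explicit full (major) GC
end
puts stats.inspect
```

Wrapping a request or a background job in a helper like this and logging the result makes it possible to spot endpoints that trigger unusually many major collections.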
5. Deploy Memory Watchdogs
Use tools like memory_profiler and objspace to track memory allocation. Consider puma_worker_killer to restart bloated workers.
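As a minimal sketch of what objspace offers, the example below records allocation sites while a block runs and groups the retained strings by file and line. The string_sites helper is a hypothetical name; the ObjectSpace calls are from Ruby's standard objspace extension.

```ruby
require "objspace"

# Groups Strings returned by the block by "file:line" allocation site.
# Allocation details are looked up while tracing is still active.
def string_sites
  sites = nil
  ObjectSpace.trace_object_allocations do
    objects = yield
    sites = objects.grep(String).group_by do |obj|
      "#{ObjectSpace.allocation_sourcefile(obj)}:#{ObjectSpace.allocation_sourceline(obj)}"
    end.transform_values(&:size)
  end
  sites
end

retained = []
sites = string_sites do
  1_000.times { |i| retained << "row-#{i}" }
  retained
end

sites.each { |site, count| puts "#{count} strings from #{site}" }
puts "approx bytes held: #{retained.sum { |s| ObjectSpace.memsize_of(s) }}"
```

Allocation tracing adds overhead, so it is better suited to a local or staging investigation than to being left on in production.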
Best Practices
- Paginate all collection responses in APIs
- Use streaming for large downloads or reports
- Memoize cautiously: avoid unbounded global caches in long-lived worker processes
- Audit background jobs for memory retention
- Upgrade to recent Ruby versions (3.2+) for memory efficiency
Conclusion
Performance degradation due to memory bloat or request queueing in Ruby on Rails applications is not merely a runtime concern—it reflects architectural and code-level inefficiencies. Proactive profiling, careful database interaction, and thread-aware application design are essential for maintaining performance under real-world load. With robust observability and memory-conscious coding, even monolithic Rails apps can scale reliably in enterprise environments.
FAQs
1. Why does my Rails app memory keep growing over time?
This is typically due to objects retained in memory, often by caches or global variables, or to garbage collection that cannot keep up with allocation. Use profiling tools to pinpoint root causes.
2. How do I make my Puma server more memory efficient?
Reduce thread counts, use worker restarts with puma_worker_killer, and monitor GC activity. Also ensure requests aren't creating large temporary objects.
3. What causes ActiveRecord to use excessive memory?
Loading large datasets, deeply nested associations, or unnecessary attributes can inflate memory use. Use select to fetch only needed columns.
4. Can I use a multi-threaded server safely in Rails?
Yes, but only if your app is thread-safe. Avoid mutable global state and ensure all code, including gems, is thread-friendly.
5. How can I test memory issues locally?
Use derailed_benchmarks or memory_profiler locally to simulate production load and observe memory behavior over request cycles.