Warming Up Cache Digests Overnight


We have a fairly large Rails 3.2 website with thousands of URLs. We implemented the cache_digests gem for Russian Doll caching, and it is working well. We want to optimize further by warming up the cache overnight so that users get a better experience during the day. I have seen the answer to this question: Rails: Scheduled task to warm up the cache?

Could it be modified to warm up a large number of URLs?


There are 2 best solutions below


To warm the cache for many pages with expensive load times, just create a rake task that iteratively sends web requests to all record/URL combinations within your site. (Here is one implementation)

Iteratively requesting every site URL/record with Net::HTTP:

If you simply need every page visited, you can run a nightly Rake task so that early-morning users still get a snappy page with refreshed content.
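One way to get the nightly run is a cron entry. The sketch below assumes the whenever gem is in the project (it is not mentioned in the question) and that 2:00 am is a quiet hour for the site; both are assumptions, and the task name matches the rake task defined in this answer.

```ruby
# config/schedule.rb
# Assumes the `whenever` gem; the time of day is an arbitrary choice.
every 1.day, :at => '2:00 am' do
  rake 'visit_every_page:specializations'
end
```

Running `whenever --update-crontab` would then write the matching crontab line. A hand-written crontab entry that calls the same rake task works just as well.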

lib/tasks/visit_every_page.rake:

require 'net/http'

namespace :visit_every_page do
  include Rails.application.routes.url_helpers

  task :specializations => :environment do
    puts "Visiting specializations..."
    base_url = "http://#{APP_CONFIG[:domain]}/specialties"
    # Load the lookup tables once instead of once per specialization.
    cities = City.order(:id).to_a
    divisions = Division.order(:id).to_a

    Specialization.order(:id).each do |s|
      puts "Specialization #{s.id}"

      cities.each do |c|
        puts "Specialization City #{c.id}"
        Net::HTTP.get( URI("#{base_url}/#{s.id}/#{s.token}/refresh_city_cache/#{c.id}.js") )
      end

      divisions.each do |d|
        puts "Specialization Division #{d.id}"
        Net::HTTP.get( URI("#{base_url}/#{s.id}/#{s.token}/refresh_division_cache/#{d.id}.js") )
      end
    end
  end

  # The following methods are defined to fake out the ActionController
  # requirements of the Rails cache
  
  def cache_store
    ActionController::Base.cache_store
  end

  def self.benchmark( *params )
    yield
  end

  def cache_configured?
    true
  end
end

(If you want to directly include cache expiration/recaching into this task, check out this implementation.)
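With thousands of URLs, a single timeout or 500 response should not abort the whole run. Here is a minimal sketch of a more defensive request loop. The names `warm_urls` and `DEFAULT_FETCHER` are hypothetical helpers, not part of the task above, and the fetcher is injectable so the loop can be exercised without touching the network.

```ruby
require 'net/http'
require 'uri'

# Default fetcher: one GET per URL, returning the HTTP status code as a string.
DEFAULT_FETCHER = lambda do |url|
  Net::HTTP.get_response(URI(url)).code
end

# Requests every URL once and returns a summary hash.
# A failure on one URL is recorded and does not stop the remaining requests.
def warm_urls(urls, fetcher = DEFAULT_FETCHER, pause_seconds = 0)
  summary = { :ok => [], :failed => {} }
  urls.each do |url|
    begin
      code = fetcher.call(url)
      if code.to_s.start_with?('2')
        summary[:ok] << url
      else
        summary[:failed][url] = "HTTP #{code}"
      end
    rescue StandardError => e
      summary[:failed][url] = "#{e.class}: #{e.message}"
    end
    # Optional pause so the warm-up does not saturate the web server.
    sleep(pause_seconds) if pause_seconds > 0
  end
  summary
end
```

Inside the rake task you would build the URL list from your records, call `warm_urls(urls)`, and log `summary[:failed]` at the end so the morning's slow pages can be traced back to a failed request.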

via a Custom Controller Action:

If you need to bypass user authentication restrictions to get to your pages, and/or you don't want to skew your website's tracking analytics (too badly), you can create custom controller actions for warming cache digests that use tokens, instead of a login, to authorize the request:

app/controllers/specializations_controller.rb:

class SpecializationsController < ApplicationController
...
  before_filter :check_token, :only => [:refresh_cache, :refresh_city_cache, :refresh_division_cache]
  skip_authorization_check :only => [:refresh_cache, :refresh_city_cache, :refresh_division_cache]

...

  def refresh_cache
    @specialization = Specialization.find(params[:id])
    @feedback = FeedbackItem.new
    render :show, :layout => 'ajax'
  end

  def refresh_city_cache
    @specialization = Specialization.find(params[:id])
    @city = City.find(params[:city_id])
    render 'refresh_city.js'
  end

  def refresh_division_cache
    @specialization = Specialization.find(params[:id])
    @division = Division.find(params[:division_id])
    render 'refresh_division.js'
  end

end

Our custom controller actions render the same views as the expensive pages, which populates the fragment caches those pages read from. E.g. refresh_cache renders the same view and data as controller#show, so requests to refresh_cache warm the same cache digests as controller#show for those records.
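A rough way to see why this works: a fragment's cache key depends on the record and the template, not on which action rendered it. The sketch below is a deliberately simplified model of that idea, not the gem's actual code; `fragment_key`, the template string, and the record key are all made up for illustration.

```ruby
require 'digest/md5'

# Simplified model of a Russian Doll fragment key: the record's cache key
# combined with a digest of the template source. Not the gem's real code.
def fragment_key(record_cache_key, template_source)
  "views/#{record_cache_key}/#{Digest::MD5.hexdigest(template_source)}"
end

template   = "<% cache @specialization do %>...<% end %>"
record_key = "specializations/42-20130101120000"

# Both actions render the same template for the same record,
# so they read and write the same cache entry.
key_from_show          = fragment_key(record_key, template)
key_from_refresh_cache = fragment_key(record_key, template)
puts key_from_show == key_from_refresh_cache # prints true

# Updating the record changes its cache key, so a new entry is needed.
updated_key = fragment_key("specializations/42-20130102090000", template)
puts key_from_show == updated_key # prints false
```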

Security Note:

For security reasons, I recommend that every custom refresh_cache request carry a token, and that the controller check it against the unique token stored for that record before rendering anything. Matching URL tokens to database records (as seen above) is trivial, because your Rake task has access to each record's unique token: just pass the record's token in with each request.
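The controller above references check_token without showing it. Here is a minimal sketch of what such a filter might look like, assuming the route exposes the token as `params[:token]` and that Specialization has a `token` column (both are assumptions). The helper `secure_token_match?` is a hypothetical name; it compares in constant time so response timing does not leak how much of a guessed token was right.

```ruby
# Constant-time string comparison: every byte is examined even after a
# mismatch, so timing does not reveal how many leading characters matched.
def secure_token_match?(expected, supplied)
  return false if expected.nil? || supplied.nil?
  expected = expected.to_s
  supplied = supplied.to_s
  return false unless expected.bytesize == supplied.bytesize

  diff = 0
  expected.bytes.zip(supplied.bytes) { |a, b| diff |= a ^ b }
  diff.zero?
end

# Hypothetical before_filter for the controller shown earlier
# (params[:token] and Specialization#token are assumptions):
#
#   def check_token
#     specialization = Specialization.find(params[:id])
#     unless secure_token_match?(specialization.token, params[:token])
#       head :forbidden
#     end
#   end
```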

tl;dr:

To warm thousands of site URLs/cache digests, create a rake task that iteratively requests every record/URL combination in your site. You can bypass your app's user authentication restrictions for this task by creating a custom controller action that authenticates access via tokens instead.


I realize this question is about a year old, but I just worked out my own answer, after scouring a bunch of partial & incorrect solutions.

Hopefully this will help the next person...

Per my own utility class, which can be found here: https://raw.githubusercontent.com/JayTeeSF/cmd_notes/master/automated_action_runner.rb

You can simply run this (per its .help method) and pre-cache your pages without tying up your own web server in the process.

class AutomatedActionRunner  
  class StatusObject
    def initialize(is_valid, error_obj)
      @is_valid = !! is_valid
      @error_obj = error_obj
    end

    def valid?
      @is_valid
    end

    def error
      @error_obj
    end
  end

  def self.help
    puts <<-EOH
      Instead of tying up the frontend of your production site with:
        `curl http://your_production_site.com/some_controller/some_action/1234`
        `curl http://your_production_site.com/some_controller/some_action/4567`
      Try:
        `rails r 'AutomatedActionRunner.run(SomeController, "some_action", [{id: "1234"}, {id: "4567"}])'`
    EOH
  end

  def self.common_env
    {"rack.input"  => "", "SCRIPT_NAME" => "", "HTTP_HOST" => "localhost:3000" }
  end
  REQUEST_ENV = common_env.freeze

  def self.run(controller, controller_action, params_ary=[], user_obj=nil)
    success_objects = []
    error_objects = []
    autorunner = new(controller, controller_action, user_obj)
    Rails.logger.warn %Q|[AutomatedAction Kickoff]: Preheating cache for #{params_ary.size} #{autorunner.controller.name}##{controller_action} pages.|

    params_ary.each do |params_hash|
      status = autorunner.run(params_hash)
      if status.valid?
        success_objects << params_hash
      else
        error_objects << status.error
      end
    end

    return process_results(success_objects, error_objects, user_obj.try(:id), autorunner.controller.name, controller_action)
  end

  def self.process_results(success_objects=[], error_objects=[], user_id, controller_name, controller_action)
    message = %Q|AutomatedAction Summary|
    backtrace = (error_objects.first.try(:backtrace)||[]).join("\n\t").inspect
    num_errors = error_objects.size
    num_successes = success_objects.size

    log_message = %Q|[#{message}]: Generated #{num_successes} #{controller_name}##{controller_action}, pages; Failed #{num_errors} times; 1st Fail: #{backtrace}|
    Rails.logger.warn log_message

    # all the local-variables above, are because I typically call Sentry or something with extra parameters!
  end

  attr_reader :controller
  def initialize(controller, controller_action, user_obj)
    @controller = controller
    @controller = controller.constantize unless controller.respond_to?(:name)
    @controller_instance = @controller.new
    @controller_action = controller_action
    @env_obj = REQUEST_ENV.dup
    @user_obj = user_obj
  end

  def run(params_hash)
    Rails.logger.warn %Q|[AutomatedAction]: #{@controller.name}##{@controller_action}(#{params_hash.inspect})|
    extend_with_autorun unless @controller_instance.respond_to?(:autorun)

    @controller_instance.autorun(@controller_action, params_hash, @env_obj, @user_obj)
  end


  private

  def extend_with_autorun
    def @controller_instance.autorun(action_name, action_params, action_env, current_user_value=nil)
      self.params = action_params # suppress strong parameters exception
      self.request = ActionDispatch::Request.new(action_env)
      self.response = ActionDispatch::Response.new
      define_singleton_method(:current_user, -> { current_user_value })

      send(action_name) # do it
      return StatusObject.new(true, nil)
    rescue StandardError => e # StandardError, so Interrupt/SystemExit still stop the run
      return StatusObject.new(false, e)
    end
  end
end
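For thousands of records, the params array does not have to be built in one go. This is a small sketch of feeding the runner in slices; `each_params_batch` is a hypothetical helper, not part of the class above, and the batch size of 500 is arbitrary.

```ruby
# Yields arrays of params hashes ({ :id => "..." }) in slices, so a large
# ID list can be handed to the runner one chunk at a time.
def each_params_batch(ids, batch_size = 500)
  return enum_for(:each_params_batch, ids, batch_size) unless block_given?

  ids.each_slice(batch_size) do |slice|
    yield slice.map { |id| { :id => id.to_s } }
  end
end

# Hypothetical usage inside `rails r` (SomeModel and SomeController are
# placeholders, as in the .help text above):
#
#   each_params_batch(SomeModel.pluck(:id)) do |batch|
#     AutomatedActionRunner.run(SomeController, "some_action", batch)
#   end
```

Each call to AutomatedActionRunner.run logs its own summary, so a failure in one batch shows up in the log without hiding the results of the others.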