Why are there missing file names and lines in Ruby 2.1's ObjectSpace.dump_all heap output?

338 Views Asked by At

I recently profiled a bunch of Ruby 2.1.2 code using the new ObjectSpace.dump_all method. I got back some useful data (using the script that @tmm1 suggests in that link):

 931 /app/vendor/bundle/ruby/2.1.0/gems/polyglot-0.3.5/lib/polyglot.rb:65:ARRAY
1015 /app/vendor/bundle/ruby/2.1.0/gems/activemodel-4.1.5/lib/active_model/attribute_methods.rb:385:ARRAY
1015 /app/vendor/bundle/ruby/2.1.0/gems/activemodel-4.1.5/lib/active_model/attribute_methods.rb:385:DATA
1015 /app/vendor/bundle/ruby/2.1.0/gems/activemodel-4.1.5/lib/active_model/attribute_methods.rb:385:NODE
1054 ::REGEXP
1075 ::ICLASS
1095 /app/vendor/ruby-2.1.2/lib/ruby/2.1.0/psych/class_loader.rb:32:STRING
1753 /app/vendor/cache/aws-sdk-core-ruby-99f5012f1162/aws-sdk-core/lib/seahorse/client/net_http/connection_pool.rb:49:ARRAY
1753 /app/vendor/cache/aws-sdk-core-ruby-99f5012f1162/aws-sdk-core/lib/seahorse/client/net_http/connection_pool.rb:49:STRING
1969 /app/vendor/bundle/ruby/2.1.0/gems/polyglot-0.3.5/lib/polyglot.rb:65:NODE
2036 /app/vendor/bundle/ruby/2.1.0/gems/polyglot-0.3.5/lib/polyglot.rb:65:DATA
2507 /app/vendor/bundle/ruby/2.1.0/gems/pg-0.17.1/lib/pg/result.rb:10:STRING
3227 /app/vendor/bundle/ruby/2.1.0/gems/polyglot-0.3.5/lib/polyglot.rb:65:STRING
4592 ::OBJECT
5291 ::CLASS
15623 ::NODE
19227 ::ARRAY
25977 ::DATA
27162 ::HASH
140490 ::STRING

My question is: why are so many objects being allocated where we can't see the file name or line number (those last six lines)? It seems like my problem has to do with a lot of strings hanging around, which definitely makes sense given what this app does. But not knowing where these strings are getting built from means I can't do anything.

I'm running ObjectSpace.trace_object_allocations_start when the app starts up, well before the operations I'm trying to profile. And I am definitely calling GC.start before each invocation of ObjectSpace.dump_all. I'm calling .dump_all in some Sidekiq middleware that dumps every 100 jobs in a manner similar to @krasnoukhov's post.

I guess my real question is, is there someone here who understands Ruby object allocation enough to explain why ObjectSpace might not know the origin of those STRINGs?

thanks!

Edit: I'm using pretty much the same code as in @Krasnoukhov's blog

if ENV["PROFILE"]
  require "objspace"
  ObjectSpace.trace_object_allocations_start
  Sidekiq.logger.info "allocations tracing enabled"

  module Sidekiq
    module Middleware
      module Server
        class Profiler
          # Number of jobs to process before reporting
          JOBS = 100
      class << self
        mattr_accessor :counter
        self.counter = 0

        def synchronize(&block)
          @lock ||= Mutex.new
          @lock.synchronize(&block)
        end
      end

      def call(worker_instance, item, queue)
        begin
          yield
        ensure
          self.class.synchronize do
            self.class.counter += 1

            if self.class.counter % JOBS == 0
              Sidekiq.logger.info "reporting allocations after #{self.class.counter} jobs"
              GC.start
              ObjectSpace.dump_all(output: File.open("heap.json", "w"))
              Sidekiq.logger.info "heap saved to heap.json"
            end
          end
        end
      end
    end
  end
end


Sidekiq.configure_server do |config|
  config.server_middleware do |chain|
      chain.add Sidekiq::Middleware::Server::Profiler
    end
  end
end
0

There are 0 best solutions below