Recompiling output of Erlang's beam_disasm.file


I'm trying to modify a beam file such that local function calls are interpreted as external module calls even though a function may be defined in the same module where it is being called.

Given m as a module, I've tried several permutations of recompiling a disassembled beam file, to no avail. Here is an example, in Elixir, of one of my attempts:

IO.inspect(:file.write_file("/tmp/disasm.asm", :io_lib.fwrite("~p.\n", [:beam_disasm.file(:code.which(m))])))

IO.inspect(:compile.noenv_file("/tmp/disasm.asm", [:from_asm]))

I'd really appreciate any input on how I could easily recompile the output of :beam_disasm.file back into a .beam file.

Thanks!

EDIT: Providing more information

Assume I have an Elixir module that looks like this:

defmodule MyApp.IndirectMod do

  def value do
    1
  end

  def indirect_value do
    value()
  end

  def indirect_value_2 do
    MyApp.IndirectMod.value()
  end

end

After the application is compiled, :beam.disasm provides the following output of its beam file:

[
  {:attribute, 1, :file, {'lib/temp.ex', 1}},
  {:attribute, 1, :module, MyApp.IndirectMod},
  {:attribute, 1, :compile, :no_auto_import},
  {:attribute, 1, :export,
   [__info__: 1, indirect_value: 0, indirect_value_2: 0, value: 0]},
  {:attribute, 1, :spec,
   {{:__info__, 1},
    [
      {:type, 1, :fun,
       [
         {:type, 1, :product,
          [
            {:type, 1, :union,
             [
               {:atom, 1, :attributes},
               {:atom, 1, :compile},
               {:atom, 1, :functions},
               {:atom, 1, :macros},
               {:atom, 1, :md5},
               {:atom, 1, :module},
               {:atom, 1, :deprecated}
             ]}
          ]},
         {:type, 1, :any, []}
       ]}
    ]}},
  {:function, 0, :__info__, 1,
   [
     {:clause, 0, [{:atom, 0, :module}], [], [{:atom, 0, MyApp.IndirectMod}]},
     {:clause, 0, [{:atom, 0, :functions}], [],
      [
        {:cons, 0, {:tuple, 0, [{:atom, 0, :indirect_value}, {:integer, 0, 0}]},
         {:cons, 0,
          {:tuple, 0, [{:atom, 0, :indirect_value_2}, {:integer, 0, 0}]},
          {:cons, 0, {:tuple, 0, [{:atom, 0, :value}, {:integer, 0, 0}]},
           {nil, 0}}}}
      ]},
     {:clause, 0, [{:atom, 0, :macros}], [], [nil: 0]},
     {:clause, 0, [{:atom, 0, :attributes}], [],
      [
        {:call, 0,
         {:remote, 0, {:atom, 0, :erlang}, {:atom, 0, :get_module_info}},
         [{:atom, 0, MyApp.IndirectMod}, {:atom, 0, :attributes}]}
      ]},
     {:clause, 0, [{:atom, 0, :compile}], [],
      [
        {:call, 0,
         {:remote, 0, {:atom, 0, :erlang}, {:atom, 0, :get_module_info}},
         [{:atom, 0, MyApp.IndirectMod}, {:atom, 0, :compile}]}
      ]},
     {:clause, 0, [{:atom, 0, :md5}], [],
      [
        {:call, 0,
         {:remote, 0, {:atom, 0, :erlang}, {:atom, 0, :get_module_info}},
         [{:atom, 0, MyApp.IndirectMod}, {:atom, 0, :md5}]}
      ]},
     {:clause, 0, [{:atom, 0, :deprecated}], [], [nil: 0]}
   ]},
  {:function, 7, :indirect_value, 0,
   [{:clause, 7, [], [], [{:call, 8, {:atom, 8, :value}, []}]}]},
  {:function, 11, :indirect_value_2, 0,
   [
     {:clause, 11, [], [],
      [
        {:call, 12,
         {:remote, 12, {:atom, 0, MyApp.IndirectMod}, {:atom, 12, :value}}, []}
      ]}
   ]},
  {:function, 3, :value, 0, [{:clause, 3, [], [], [{:integer, 0, 1}]}]}
]

The piece of this output I'd like to draw your attention to is this:

{:function, 7, :indirect_value, 0,
 [{:clause, 7, [], [], [{:call, 8, {:atom, 8, :value}, []}]}]},
{:function, 11, :indirect_value_2, 0,
 [
   {:clause, 11, [], [],
    [
      {:call, 12,
       {:remote, 12, {:atom, 0, MyApp.IndirectMod}, {:atom, 12, :value}}, []}
    ]}
 ]},

indirect_value_2 is a "remote" call while indirect_value is a "local" call. What I'm trying to achieve is to have indirect_value treated as a remote call, like indirect_value_2.

I'm trying to achieve this during the compilation process. The only approach I've thought of is to disassemble the beam file, alter it appropriately, and reassemble it. I'm very much open to alternative suggestions.


There are 3 best solutions below

0
On

Again, the Erlang compiler is able to compile the intermediate results of its own compilation process. You're thinking of modifying BEAM assembly, which in Erlang you can also produce with the to_asm flag, or from the shell with the +S option, and which can be fed back to the compiler as input. I'm not sure how to do this from code, but from the shell you just call erlc file.S (erlc picks the input format from the file extension, so the listing should keep its .S suffix). So it can work in theory; all the pieces are there. It just feels a bit wrong, because BEAM assembly is the final result of compilation: it is highly optimised, and many things have been rewritten or removed.
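
For the "from code" part, here is a minimal sketch of that round trip driven from Elixir. It assumes the listing comes from the compiler's own S option rather than from :beam_disasm.file, whose output is a different term format. The module name demo_s and the temp-file paths are made up for illustration, and the .S format is undocumented and may change between OTP releases.

```elixir
# Sketch: Erlang source -> .S listing -> .beam binary, all from Elixir.
# demo_s and the temp paths are hypothetical names used only for this example.
dir = System.tmp_dir!()
erl_path = Path.join(dir, "demo_s.erl")

File.write!(erl_path, """
-module(demo_s).
-export([value/0]).
value() -> 1.
""")

# :S stops after code generation and writes demo_s.S into :outdir.
{:ok, :demo_s} =
  :compile.file(String.to_charlist(erl_path), [:S, {:outdir, String.to_charlist(dir)}])

s_path = Path.join(dir, "demo_s.S")
# (this is the point where the listing could be edited)

# :from_asm reads the listing back in; :binary returns the BEAM code
# as a binary instead of writing a .beam file.
{:ok, :demo_s, beam} =
  :compile.file(String.to_charlist(s_path), [:from_asm, :binary])

# Load the recompiled module so it can be called.
{:module, :demo_s} = :code.load_binary(:demo_s, String.to_charlist(s_path), beam)
```

This mirrors what erlc -S followed by erlc on the .S file does from the shell. :compile.noenv_file takes the same options if the ERL_COMPILER_OPTIONS environment variable should be ignored.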

Please note that the beam.disasm output you posted is printed in Elixir syntax, so the erlc compiler will not read it; maybe the Elixir compiler could.

Also note that you're deep in low-level compiler territory, and if the dragon comes to eat you, only a handful of people can actually help you, so find them on Slack or on the mailing list.

How I would do this in Erlang: I would modify the Erlang AST by writing a parse transform. See http://www.erlang-factory.com/upload/presentations/521/yrashk_parse_transformations_sf12.pdf; there is also older material from 2010-2013 linked from the question "Is there a good, complete tutorial on Erlang parse transforms available?"

I cannot imagine a possible use case for this. What are you trying to do?

1
On

Compiling with erlc -S xxx.erl produces xxx.S, which you can modify and then compile into a BEAM file with erlc xxx.S. But you did not specify the use case: where does the input come from, and where does the output go? Is it OK to use a parse transform? Is it OK to use the command-line compiler, or must it be called from code?

Is it necessary to compile from code? If so, someone more familiar with the compile module can suggest the correct call for your case.

Or maybe a parse transform would be perfectly fine for your case: you replace all local calls with ?MODULE:call(), and the result is then compiled transparently, with no further intervention needed.
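
To make the idea concrete, here is a rough sketch, written in Elixir, of the rewrite such a transform would perform on Erlang abstract forms. LocalToRemote and demo_mod are hypothetical names, the forms are written by hand for the example, and the generic walk assumes well-formed abstract code; it is an illustration, not a hardened parse transform.

```elixir
defmodule LocalToRemote do
  # Hypothetical helper, not taken from the original posts.
  # Turns every local call to a function defined in the same module
  # into a remote call on that module.
  def rewrite(forms) do
    mod =
      Enum.find_value(forms, fn
        {:attribute, _, :module, m} -> m
        _ -> nil
      end)

    # Name/arity pairs of all functions defined in this module.
    locals =
      for {:function, _, name, arity, _} <- forms, into: MapSet.new(), do: {name, arity}

    walk(forms, mod, locals)
  end

  # A local call has a bare {:atom, _, name} in the callee position.
  defp walk({:call, anno, {:atom, fanno, name}, args}, mod, locals) do
    args = walk(args, mod, locals)

    if MapSet.member?(locals, {name, length(args)}) do
      {:call, anno, {:remote, fanno, {:atom, fanno, mod}, {:atom, fanno, name}}, args}
    else
      {:call, anno, {:atom, fanno, name}, args}
    end
  end

  # Everything else is traversed generically and rebuilt unchanged.
  defp walk(term, mod, locals) when is_tuple(term) do
    term |> Tuple.to_list() |> walk(mod, locals) |> List.to_tuple()
  end

  defp walk(term, mod, locals) when is_list(term) do
    Enum.map(term, &walk(&1, mod, locals))
  end

  defp walk(term, _mod, _locals), do: term
end

# Hand-written forms for a tiny module with one local call.
forms = [
  {:attribute, 1, :module, :demo_mod},
  {:attribute, 1, :export, [value: 0, indirect_value: 0]},
  {:function, 2, :value, 0, [{:clause, 2, [], [], [{:integer, 2, 1}]}]},
  {:function, 3, :indirect_value, 0,
   [{:clause, 3, [], [], [{:call, 3, {:atom, 3, :value}, []}]}]}
]

new_forms = LocalToRemote.rewrite(forms)

# Compile the rewritten forms straight to a binary and load it.
{:ok, :demo_mod, bin} = :compile.forms(new_forms, [:return_errors])
{:module, :demo_mod} = :code.load_binary(:demo_mod, ~c"nofile", bin)
```

For an already compiled module, the forms might be recovered from the debug information with :beam_lib.chunks/2, provided the module was compiled with debug info. One caveat: a remote call only reaches exported functions, so rewriting a call to a private function this way would fail at runtime unless that function is exported as well.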

1
On

OK, so this is Elixir compile-time processing. I am not familiar with how it works in Elixir, but Elixir relies more on macros, and the kind of preprocessing before compilation that Erlang offers is deprecated, unavailable, or hard to get at. If this were Erlang, there are tools for what you want. Try asking in the Elixir chat on Slack.