Okinawa Ruby User Group

How can I use RubyVM::AbstractSyntaxTree.of in irb?

Seiei Miyagi (hanachin)

Put the following snippet in ~/.irbrc.

unless defined?(SCRIPT_LINES__)
  SCRIPT_LINES__ = {}
end

ast_happier = TracePoint.new(:call) do |tp|
  SCRIPT_LINES__['(irb)'] = tp.binding.local_variable_get(:statements).lines
end
ast_happier.enable(target: IRB::WorkSpace.instance_method(:evaluate))

It works, yay!

% irb
irb(main):001:0> pp RubyVM::AbstractSyntaxTree.of(-> { puts :hi })
(SCOPE@1:35-1:48
 tbl: []
 args:
   (ARGS@1:35-1:35
    pre_num: 0
    pre_init: nil
    opt: nil
    first_post: nil
    post_num: 0
    post_init: nil
    rest: nil
    kw: nil
    kwrest: nil
    block: nil)
 body: (FCALL@1:38-1:46 :puts (ARRAY@1:43-1:46 (LIT@1:43-1:46 :hi) nil)))
=> #<RubyVM::AbstractSyntaxTree::Node:SCOPE@1:35-1:48>

But how does it work?

SCRIPT_LINES__ is used by debug.rb (DEBUGGER__) in the standard library.
https://docs.ruby-lang.org/en/2.6.0/DEBUGGER__.html

Of course, there is no documentation for the constant itself; it is marked :nodoc:.

SCRIPT_LINES__ = {} unless defined? SCRIPT_LINES__ # :nodoc:

https://github.com/ruby/ruby/blob/v2_6_3/lib/debug.rb#L22

In ast.c, SCRIPT_LINES__ is also used to find the source code of the proc.

static VALUE
script_lines(VALUE path)
{
    VALUE hash, lines;
    ID script_lines;
    CONST_ID(script_lines, "SCRIPT_LINES__");
    if (!rb_const_defined_at(rb_cObject, script_lines)) return Qnil;
    hash = rb_const_get_at(rb_cObject, script_lines);
    if (!RB_TYPE_P(hash, T_HASH)) return Qnil;
    lines = rb_hash_lookup(hash, path);
    if (!RB_TYPE_P(lines, T_ARRAY)) return Qnil;
    return lines;
}

https://github.com/ruby/ruby/blob/v2_6_3/ast.c#L189-L201
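
If C is not your thing, the lookup above reads roughly like this in Ruby. This is only a sketch for illustration: `script_lines_for` is a hypothetical name I made up, not something Ruby provides.

```ruby
# Rough Ruby rendering of script_lines() from ast.c (v2_6_3).
# `script_lines_for` is a hypothetical helper name used only for illustration.
def script_lines_for(path)
  # The constant must be defined directly on Object.
  return nil unless Object.const_defined?(:SCRIPT_LINES__, false)

  hash = Object.const_get(:SCRIPT_LINES__, false)
  return nil unless hash.is_a?(Hash)

  # fetch with a fallback avoids triggering a Hash default proc,
  # which is closer to what rb_hash_lookup does.
  lines = hash.fetch(path, nil)
  lines.is_a?(Array) ? lines : nil
end
```

So the hash has to hold an Array of lines under exactly the path that the proc reports, which is '(irb)' for code typed into irb.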

static VALUE
rb_ast_s_of(VALUE module, VALUE body)
{
    VALUE path, node, lines;
    int node_id;
    const rb_iseq_t *iseq = NULL;

    if (rb_obj_is_proc(body)) {
        iseq = vm_proc_iseq(body);

        if (!rb_obj_is_iseq((VALUE)iseq)) {
            iseq = NULL;
        }
    }
    else {
        iseq = rb_method_iseq(body);
    }

    if (!iseq) return Qnil;

    path = rb_iseq_path(iseq);
    node_id = iseq->body->location.node_id;
    if (!NIL_P(lines = script_lines(path))) {
        node = rb_ast_parse_array(lines);
    }
    else if (RSTRING_LEN(path) == 2 && memcmp(RSTRING_PTR(path), "-e", 2) == 0) {
        node = rb_ast_parse_str(rb_e_script);
    }
    else {
        node = rb_ast_parse_file(path);
    }

    return node_find(node, node_id);
}

https://github.com/ruby/ruby/blob/v2_6_3/ast.c#L220-L253
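
The same flow can be sketched in plain Ruby. Treat it as an approximation, not the real implementation: `source_for` and `find_node` are hypothetical helpers, the registry is an ordinary Hash standing in for SCRIPT_LINES__, and the search matches on node type because I am assuming the node id is not available from Ruby on every version.

```ruby
# Sketch of the flow in rb_ast_s_of, written in Ruby for illustration.
# `source_for` and `find_node` are hypothetical helpers, not Ruby APIs.

# Pick the source text: registered lines first, then the -e script, then the file.
def source_for(path, registry, e_script = nil)
  lines = registry[path]
  return lines.join if lines.is_a?(Array)
  return e_script if path == '-e' && e_script

  File.read(path)
end

# Depth-first search for the first node accepted by the block.
def find_node(node, &matcher)
  return node if matcher.call(node)

  node.children.each do |child|
    # children also contains symbols, nil and arrays; only descend into nodes.
    next unless child.is_a?(RubyVM::AbstractSyntaxTree::Node)

    found = find_node(child, &matcher)
    return found if found
  end
  nil
end

# A plain Hash plays the role of SCRIPT_LINES__ here.
registry = { '(irb)' => ["x = 1\n", "-> { puts :hi }\n"] }
root = RubyVM::AbstractSyntaxTree.parse(source_for('(irb)', registry))
lambda_node = find_node(root) { |n| n.type == :LAMBDA }
puts lambda_node.first_lineno
```

The real code matches on node id instead of node type, but the shape is the same: get the text, parse all of it, then walk the tree.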

During RubyVM::AbstractSyntaxTree.of, Ruby does the following:

  1. Reads the file path from the RubyVM::InstructionSequence of the given proc.
  2. Reads the whole source code from one of several sources: SCRIPT_LINES__, the command line input given by -e, or the real file.
  3. Parses the whole source code again to create an AST.
  4. Reads the node id from the RubyVM::InstructionSequence of the given proc.
  5. Finds the node in the AST that has the same node id as the given proc.

When the same source is parsed again, the Ruby parser assigns the same node id to the node at the same position. That's the point.
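
You can check this determinism yourself with a small sketch. `fingerprints` is a hypothetical helper of mine. It records the type and position of every node, plus the node id where the Ruby version exposes `node_id` on nodes (I am assuming it is not available everywhere, hence the `respond_to?` guard).

```ruby
# Collect a comparable fingerprint for every node in a tree.
# `fingerprints` is a hypothetical helper used only for illustration.
def fingerprints(node, acc = [])
  entry = [node.type, node.first_lineno, node.first_column,
           node.last_lineno, node.last_column]
  # Include the node id only when this Ruby exposes it.
  entry << node.node_id if node.respond_to?(:node_id)
  acc << entry

  node.children.each do |child|
    fingerprints(child, acc) if child.is_a?(RubyVM::AbstractSyntaxTree::Node)
  end
  acc
end

source = "-> { puts :hi }\n"
first = fingerprints(RubyVM::AbstractSyntaxTree.parse(source))
second = fingerprints(RubyVM::AbstractSyntaxTree.parse(source))
puts first == second
```

Two parses of the same text should give the same list, which is why re-parsing the source is enough to find the proc again.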

I use TracePoint to store the source code in SCRIPT_LINES__ before irb evaluates the input.

The TracePoint fires here:

    # Evaluate the given +statements+ within the  context of this workspace.
    def evaluate(context, statements, file = __FILE__, line = __LINE__)
      eval(statements, @binding, file, line)
    end

https://github.com/ruby/ruby/blob/v2_6_3/lib/irb/workspace.rb#L83-L86
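
To see the trick without starting irb, here is a minimal sketch. `FakeWorkSpace` is a hypothetical stand-in I wrote for IRB::WorkSpace, and `captured` is an ordinary Hash playing the role of SCRIPT_LINES__. Only the TracePoint part matches the ~/.irbrc snippet.

```ruby
# Hypothetical stand-in for IRB::WorkSpace, with the same method signature.
class FakeWorkSpace
  def evaluate(context, statements, file = __FILE__, line = __LINE__)
    eval(statements, TOPLEVEL_BINDING, file, line)
  end
end

# A plain Hash instead of SCRIPT_LINES__, to keep the demo free of globals.
captured = {}

# On a :call event the method arguments are already bound, so the
# `statements` argument can be read through the binding of the call.
tracer = TracePoint.new(:call) do |tp|
  captured['(irb)'] = tp.binding.local_variable_get(:statements).lines
end

# target: limits the trace to this one method (available since Ruby 2.6).
tracer.enable(target: FakeWorkSpace.instance_method(:evaluate))

result = FakeWorkSpace.new.evaluate(nil, "1 + 2\n", '(irb)', 1)
tracer.disable

p result
p captured
```

The hook runs before the body of evaluate, so the lines are already registered by the time the input is evaluated.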

Have fun!

Posted on Apr 25 '19
