How to free yyptr stack in Bison?

Question

How to free yyptr stack in Bison?

146 Views Asked by rmosolgo At 12 April 2023 at 14:57

I'm using Bison to generate a parser and the lexer is written in Ragel. When I run my code with Valgrind, it gives me this output:

200,014 bytes in 2 blocks are definitely lost in loss record 139,468 of 139,600
  malloc (vg_replace_malloc.c:393)
 *yyparse (parser.c:1768)
 *GraphQL_CParser_Parser_c_parse (graphql_c_parser_ext.c:8)
  vm_call_cfunc_with_frame (vm_insnhelper.c:3268)
  vm_sendish (vm_insnhelper.c:5080)
  vm_exec_core (insns.def:820)
  rb_vm_exec (vm.c:2374)
  ...

(There's more to it -- I'm calling into the parser from Ruby. The full build output is available here, for now: https://github.com/rmosolgo/graphql-ruby/actions/runs/4556195698/jobs/8036241668?pr=4410)

It points to my (generated) yyparse function, at code like this:

# else /* defined YYSTACK_RELOCATE */
      /* Extend the stack our own way.  */
      if (YYMAXDEPTH <= yystacksize)
        YYNOMEM;
      yystacksize *= 2;
      if (YYMAXDEPTH < yystacksize)
        yystacksize = YYMAXDEPTH;

      {
        yy_state_t *yyss1 = yyss;
        union yyalloc *yyptr =
          YY_CAST (union yyalloc *,
                   YYSTACK_ALLOC (YY_CAST (YYSIZE_T, YYSTACK_BYTES (yystacksize))));
        if (! yyptr)
          YYNOMEM;
        YYSTACK_RELOCATE (yyss_alloc, yyss);
        YYSTACK_RELOCATE (yyvs_alloc, yyvs);
#  undef YYSTACK_RELOCATE
        if (yyss1 != yyssa)
          YYSTACK_FREE (yyss1);
      }

How can I address this error message from Valgrind? I'm not sure if this is a legitimate memory leak and I should find a way to free that allocated memory, or whether it's a false positive and I should suppress this message.

Is there anything I can do to debug further?

The entire generated C parser here, in case it helps: https://github.com/rmosolgo/graphql-ruby/blob/a048682e6d8468d16947ee1c80946dd23c5d91f9/graphql-c_parser/ext/graphql_c_parser_ext/parser.c

And the Bison definition is available here: https://github.com/rmosolgo/graphql-ruby/blob/a048682e6d8468d16947ee1c80946dd23c5d91f9/graphql-c_parser/ext/graphql_c_parser_ext/parser.y

Thanks for taking a look!

Original Q&A

There are 1 best solutions below

**vvv444** · Answer 1 · 2023-04-25T10:23:05.893000

I incline to believe that the behaviour you observe is a Bison bug. Below is a short summary of why I think so and what happens with memory allocations in this generated yyparse() function:

The parser uses two stacks, state stack yyss and value stack yyvs (defined here). The initial space for them is reserved as local array with depth YYINITDEPTH. If this depth isn't enough, additional allocations are performed from the heap (malloc).
When allocating larger memory, it is allocated as a single block that is used to hold both stacks together. Data from the previous memory block is copied to the new one.
The allocation us done by YYSTACK_ALLOC on line 1775. If previous allocation took place, that memory is definitely freed by YYSTACK_FREE on line 1782.
The last memory block that is used is supposed to be freed on line 3224. My guess that there is some scenario when the function returns without passing through this line.

It obviously requires some debugging to understand what's going on here. I can suggest some general points on how I would approach it:

First, check if the memory leak is deterministically reproducible. Does it happens every time when you run this pipeline?
Try to simplify the reproduction setup. Work towards making it shorter, simpler and run on your local development machine.
- Understand what is the command line command being run by the build system to invoke Valgrind. What other files does it need? Try running the same command on your dev machine. Check if the problem still reproduces.
- Then, work towards shortening the run cycle. As far as I understand, it runs a lot of tests under Valgrind, taking more than 20 minutes. Try bisecting the test set to understand if the problem is related to a specific test or just depends on the number of tests.
If you succeed to minify the reproduction setup, you can run it under debugger or add some tracing.

The tracing I would add is some naive printing to log file like this along with counting allocation/free mismatch conditions:

// my_alloc.h
#include <stdio.h>
#include <stdlib.h>

extern FILE* g_my_trace_file;
extern size_t g_alloc_count;


#define MY_TRACE(...) \
    do { \
        if (!g_my_trace_file) \
            g_my_trace_file = fopen("my_trace.txt", "a"); \
         \
        fprintf(g_my_trace_file, __VA_ARGS__); \
        fflush(g_my_trace_file); \
    } while(0)
        
inline void *my_alloc( size_t size ) { \
    p = malloc(size); \
    MY_TRACE("yyparse: Allocated block of %z bytes at addr %p\n", size, p); \
    ++g_alloc_count; \
    return p; \
}

inline void free( void *p ) { \
    MY_TRACE("yyparse: Freed block at addr %p\n", p); \
    --g_alloc_count;
    free(p); \
}

And add this in some .c file:

FILE* g_my_trace_file = NULL;
size_t g_alloc_count = 0;

In parser.c replace YYSTACK_ALLOC/YYSTACK_FREE on lines 628, 629 with my_alloc/my_free and include my_alloc.h.

In graphql_c_parser_ext.c add the following lines in GraphQL_CParser_Parser_c_parse:

VALUE GraphQL_CParser_Parser_c_parse(VALUE self) {
  g_alloc_count = 0;
  VALUE filename = rb_ivar_get(self, rb_intern("@filename"));
  MY_TRACE("Calling yyparse for file %s\n", filename);  // <-- Here I'm not sure how to obtain string from VALUE to print it... Whats the type of VALUE?
  yyparse(self, filename);
  MY_TRACE("Finished yyparse for file %s, leak count: %z\n", filename, g_alloc_count);
  return Qnil;
}

From this log you should be able to pinpoint parsing of which file causes the allocation leak.

Keep us updated on the findings for further assistance :-)

How to free yyptr stack in Bison?

There are 1 best solutions below

Related Questions in C

Related Questions in RUBY

Related Questions in VALGRIND

Related Questions in BISON

Related Questions in RAGEL

Trending Questions

Popular # Hahtags

Popular Questions