IronPython garbage collection - How does it provide compatibility with C-extensions?


In this part of the talk on the GIL by Larry Hastings, there is an explanation of how Ironclad provides C-extension compatibility for IronPython. This is the interesting part of the talk:

We implemented something called Ironclad which was an implementation of the Python C-API for IronPython and IronPython doesn't have a GIL and it doesn't use reference counting but we maintain binary compatibility with existing binaries.

What you have with those existing binaries is you have C objects (effectively Python objects implemented in C) that expect to use reference counting, so we had a hybrid system. For objects that were returned from the C extension we artificially inflated their reference count by one, so that the macros compiled into the binaries would never be triggered by getting down to zero, and then we had a proxy object: if garbage collection was entered for this object, then we would decrement the reference count to zero.

Technically we had a leak though, because if you pass references to Python objects into the C extension, the C extension could keep references to those objects alive. Essentially what Larry is saying is that if you move to something like mark and sweep for the main Python C interpreter, those C objects are opaque to the garbage collector. It can't know what internal references you have.

My questions:

1- If GC is implemented in the interpreter itself (e.g. IronPython), what are the macros he is referring to? (The ones we should care about enough to increment the ref count just for their sake!)

2- What is the role of the proxy object? Is it a proxy for a Python object implemented in a C extension? Why don't we decrement the refcount directly on the original object?


There are 2 best solutions below

2
MSalters On
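  1. The macros he is referring to are the Python C API macros used on the C side, such as Py_INCREF and Py_DECREF. They are compiled into the extension binaries, so they run no matter how the host interpreter manages memory.

  2. The role of the proxy object is to translate between the reference-counting GC of CPython and the .NET GC of IronPython.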
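To make the two points above more concrete, here is a toy model in plain Python of the hybrid scheme described in the talk. It is only a sketch and not Ironclad's actual code: NativeObject, incref, decref, Proxy and demo_proxy are hypothetical names, and weakref.finalize stands in for the .NET finalizer that would run when the managed GC collects the proxy. It also assumes the bridge holds exactly two references (the one returned by the extension plus the artificial one).

```python
import gc
import weakref


class NativeObject:
    """Stand-in for an object implemented in a C extension (hypothetical)."""

    def __init__(self):
        self.refcount = 1      # the reference returned to the caller
        self.freed = False


def incref(obj):
    """Plays the role of the Py_INCREF macro."""
    obj.refcount += 1


def decref(obj):
    """Plays the role of Py_DECREF: frees the object when the count hits zero."""
    obj.refcount -= 1
    if obj.refcount == 0:
        obj.freed = True       # a real extension would run its deallocator here


def _release(native):
    # Assumption: the bridge gives back the returned reference and the
    # artificial one, so the count can finally reach zero.
    decref(native)
    decref(native)


class Proxy:
    """Managed-side handle; its lifetime is controlled by the host GC."""

    def __init__(self, native):
        incref(native)         # artificial inflation by one
        self.native = native
        # Runs when the proxy itself is collected by the host interpreter.
        weakref.finalize(self, _release, native)


def demo_proxy():
    native = NativeObject()
    proxy = Proxy(native)      # count is now 2
    incref(native)             # extension borrows the object for a call...
    decref(native)             # ...and gives it back; count never reaches 0
    count_while_alive = native.refcount
    freed_while_alive = native.freed
    del proxy                  # managed side drops its last reference
    gc.collect()               # needed on interpreters without refcounting
    return count_while_alive, freed_while_alive, native.refcount, native.freed
```

The count is not decremented "directly on the original object" by managed code because managed code never tracks counts at all; in this model only the proxy's finalizer knows when the managed side is done with the object.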

This is an advanced subject. If you're not familiar with the two Python implementations, these answers probably won't help you. But there are no simple answers here.

0
Lee On

A module can do practically anything. Some modules hold on to objects beyond a single method or function call, for example when a calculation takes a long time. In normal Python (CPython), the module should then add one to the object's reference count until it is done with it.

At a very basic level, a Python object's reference count in CPython is incremented whenever a new reference to the object is created, and it's decremented when a reference goes away. When an object's reference count reaches 0, the memory for the object is deallocated. Your program's code can't disable Python's reference counting.
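The counting described above can be observed on CPython with sys.getrefcount. This is a small sketch that assumes CPython; the absolute numbers are an implementation detail, so only differences are compared. Payload and the two function names are made up for the illustration.

```python
import sys
import weakref


class Payload:
    """Plain object; instances support weak references."""


def refcount_deltas():
    obj = Payload()
    before = sys.getrefcount(obj)
    holder = [obj]                      # a second reference to the same object
    during = sys.getrefcount(obj)
    holder.clear()                      # that reference goes away again
    after = sys.getrefcount(obj)
    return during - before, after - before


def freed_on_last_reference():
    events = []
    obj = Payload()
    # The callback runs once the object has been deallocated.
    weakref.finalize(obj, events.append, "freed")
    del obj                             # last reference gone: count drops to 0
    return events
```

On CPython the object is released as soon as the last reference disappears, which is exactly the behaviour a C extension is compiled to expect.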

For example, suppose a module exposes these functions:

start_calculation(data)
get_finished_calculations()

Calling the first one would put data on some stack inside the module, so the module keeps a reference to a Python object, and the ref count of data should be incremented by one. When the results are finally returned by get_finished_calculations, the ref count would be decremented again. But because this stack lives on the C module's side, which is unmanaged code in .NET terms, the garbage collector can't see it and would delete data even while the module still needs it for the calculation, because IronPython has no reference count like CPython has.
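The scenario above can be sketched as a toy model in Python. Everything except the two function names is an assumption made for illustration: the _pending list plays the role of the module's C-side stack, native_refs stands for the count the C macros would maintain, and looks_like_garbage is a deliberately simplified tracing collector that only sees managed roots.

```python
class Data:
    """Stand-in for a Python object handed to the extension."""

    def __init__(self, values):
        self.values = values
        self.native_refs = 0            # references held on the C side


class CalcModule:
    """Toy extension module; _pending models its internal C-side stack."""

    def __init__(self):
        self._pending = []

    def start_calculation(self, data):
        data.native_refs += 1           # what Py_INCREF would do
        self._pending.append(data)

    def get_finished_calculations(self):
        results = []
        while self._pending:
            data = self._pending.pop()
            results.append(sum(data.values))
            data.native_refs -= 1       # what Py_DECREF would do
        return results


def looks_like_garbage(obj, managed_roots, honor_native_refs):
    """Verdict of a toy tracing GC that cannot look inside the module."""
    if honor_native_refs and obj.native_refs > 0:
        return False                    # pinned by the C side
    return not any(root is obj for root in managed_roots)


def demo_calculation():
    module = CalcModule()
    data = Data([1, 2, 3])
    module.start_calculation(data)
    managed_roots = []                  # the caller dropped its own variable
    naive = looks_like_garbage(data, managed_roots, honor_native_refs=False)
    bridged = looks_like_garbage(data, managed_roots, honor_native_refs=True)
    results = module.get_finished_calculations()
    afterwards = looks_like_garbage(data, managed_roots, honor_native_refs=True)
    return naive, bridged, results, afterwards
```

A collector that ignores the C-side count would reclaim data while the calculation is still pending; one that honours it keeps data alive until get_finished_calculations has released it.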