Why does linecache check for the length of the tuple elements in the cache?

In https://github.com/python/cpython/blob/3.6/Lib/linecache.py there are 3 places that check the length of a cache entry.

In getlines

def getlines(filename, module_globals=None):
    if filename in cache:
        entry = cache[filename]
        if len(entry) != 1:
            return cache[filename][2]

In updatecache

def updatecache(filename, module_globals=None):
    if filename in cache:
        if len(cache[filename]) != 1:
            del cache[filename]

In lazycache

def lazycache(filename, module_globals):
    if filename in cache:
        if len(cache[filename]) == 1:
            return True
        else:
            return False

I am writing my own version of linecache, and to write the tests for it I need to understand the scenario in which the tuple can be of length 1.

There was one scenario in which the statement in getlines got executed: the file was accessed for the first time and stored in the cache, and then removed before it was accessed a second time. But I still cannot figure out why the check is there in the other two functions.

It would be very helpful if someone could help me understand the purpose of using this length check.

Answer by abarnert:

Look at the places the code stores values in the cache. It can store two different kinds of values under cache[filename]:

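  • A 4-tuple of size, modification time, list of lines, and full file name (stored in two places in updatecache).
  • A 1-tuple holding a function that returns the source on demand (stored in lazycache; updatecache later splits that source into lines).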

The 1-tuple sets up lazy loading of module source. To lazy-load a normal file, you just open and read it when you need it. But module source may be available from the module's loader (found via the import system) without being available by opening and reading a file, e.g. for zipimport modules. So lazycache has to stash the loader's get_source method so that, if it later needs the lines, it can get them.
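To see the lazy entry for yourself, here is a minimal sketch. FakeLoader, the module name, and the file path are all made up for illustration (the path just has to not exist on disk), and peeking at linecache.cache relies on an undocumented implementation detail, so treat this as an experiment rather than a supported API.

```python
import linecache


class FakeLoader:
    """Hypothetical stand-in for an import-system loader (e.g. zipimport)."""

    def get_source(self, name):
        # linecache only needs this one method from the loader.
        return "x = 1\ny = 2\n"


# Hypothetical path; it must NOT exist on disk for this demo.
filename = "no_such_dir_for_demo/fake_mod.py"
module_globals = {"__name__": "fake_mod", "__loader__": FakeLoader()}

linecache.clearcache()
registered = linecache.lazycache(filename, module_globals)

# Implementation detail, inspected only for illustration.
entry = linecache.cache[filename]
source = entry[0]()  # calling the stashed function yields the source text

print(registered, len(entry), repr(source))
```

Nothing has been read or split into lines at this point; the cache only remembers how to get the source later.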

And this means that whenever it uses those cache values, it has to check which kind it has stored and do different things. If it needs the lines now and it has a lazy 1-tuple, it has to go load the lines (via updatecache); if it's checking for stale entries (in checkcache) and it finds a lazy tuple that was never evaluated, it skips it and leaves it lazy; etc.
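For your own tests, the "which kind of entry is this?" decision boils down to something like the sketch below. This is not linecache's code: lines_from_entry is a hypothetical helper, and the real module goes through updatecache and replaces the lazy entry with a full one instead of just returning the lines.

```python
def lines_from_entry(entry):
    """Return the lines for a cache entry of either shape (hypothetical helper)."""
    if len(entry) == 1:
        # Lazy entry: a 1-tuple holding a zero-argument callable
        # that returns the source text.
        source = entry[0]()
        return [line + "\n" for line in source.splitlines()]
    # Full entry: (size, mtime, lines, fullname).
    size, mtime, lines, fullname = entry
    return lines


# Two made-up entries, one of each shape.
lazy_entry = (lambda: "a\nb\n",)
full_entry = (4, None, ["a\n", "b\n"], "example.py")

lazy_lines = lines_from_entry(lazy_entry)
full_lines = lines_from_entry(full_entry)
print(lazy_lines == full_lines)
```

So a test for the length-1 case only needs an entry of the first shape to be in the cache when the function under test runs.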

Also notice that in updatecache, if it's loaded the file from a lazy 1-tuple, it doesn't have a mod time, which means that in checkcache, it can't check whether the file is stale. But that's fine for module source—even if you change the file, the old version is still the one that's imported and being used.
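You can observe the missing mod time with a small experiment. Again, FakeLoader and the path are made up (the path must not exist on disk), and reading linecache.cache directly depends on an implementation detail that could change between Python versions.

```python
import linecache


class FakeLoader:
    """Hypothetical loader; linecache only needs its get_source method."""

    def get_source(self, name):
        return "x = 1\ny = 2\n"


# Hypothetical path; it must NOT exist on disk for this demo.
filename = "no_such_dir_for_demo/lazy_mod.py"

linecache.clearcache()
linecache.lazycache(filename, {"__name__": "lazy_mod", "__loader__": FakeLoader()})

# getlines finds the lazy 1-tuple and realizes it via updatecache.
lines = linecache.getlines(filename)

# The entry is now a 4-tuple, but without a modification time.
size, mtime, cached_lines, fullname = linecache.cache[filename]

# With no mtime to compare, checkcache leaves the entry in place.
linecache.checkcache(filename)
still_cached = filename in linecache.cache

print(lines, mtime, still_cached)
```

This is the path that turns a length-1 entry into a length-4 one, which is probably the scenario you want to cover in your tests.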


If you were designing this from scratch, rather than hacking on something that's been in the stdlib since the early 1.x dark ages, you'd probably design this very differently. For example, you might use a class, or possibly two classes implementing the same interface.
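As a rough idea of what the two-class version could look like, here is a sketch. FileEntry, LoaderEntry, and their method names are my own invention, not anything from the stdlib, and the staleness rule (size and mtime must match) is just modelled on what checkcache does.

```python
import os
import tempfile


class FileEntry:
    """Entry backed by a real file; can detect staleness via size and mtime."""

    def __init__(self, path):
        self.path = path
        self._stat = None
        self._lines = None

    def lines(self):
        if self._lines is None:  # read lazily, on first use
            st = os.stat(self.path)
            with open(self.path, encoding="utf-8") as f:
                self._lines = f.readlines()
            self._stat = (st.st_size, st.st_mtime)
        return self._lines

    def is_stale(self):
        if self._stat is None:
            return False  # never loaded, nothing to invalidate
        try:
            st = os.stat(self.path)
        except OSError:
            return True  # file disappeared
        return (st.st_size, st.st_mtime) != self._stat


class LoaderEntry:
    """Entry backed by a callable such as a loader's get_source."""

    def __init__(self, get_source):
        self._get_source = get_source
        self._lines = None

    def lines(self):
        if self._lines is None:
            source = self._get_source() or ""  # treat None as "no source"
            self._lines = [line + "\n" for line in source.splitlines()]
        return self._lines

    def is_stale(self):
        return False  # no mod time to compare against


# Usage: both kinds of entry answer the same question the same way.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("one\ntwo\n")
    file_lines = FileEntry(path).lines()

loader_lines = LoaderEntry(lambda: "one\ntwo\n").lines()
print(file_lines == loader_lines)
```

With this shape the callers never look at a tuple length; each entry knows how to load itself and whether it can go stale.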

Also, notice that a huge chunk of the code in linecache is there to deal with special cases related to loading module source. Unless you're trying to build something that reflects on Python code (like traceback does), you don't need any of it. (And even if you are doing that, you'd probably want to use the inspect module rather than talking directly to the import system.)

So, linecache may not be the best sample code to base your own line cache on.