I'm writing a class that parses HTML in order to provide an interface to a profile on a webpage. It looks something like this:
class Profile(BeautifulSoup):
def __init__(self, page_source):
super().__init__(page_source)
def username(self):
return self.title.split(':')[0]
Except more complex and time consuming. Since I know that the underlying profiles aren't going to be changing during the lifetime of a Profile
object, I thought this would be a good place to cache results in order to avoid recalculating values that are already known. I implemented this with a decorator, and the result looks like this:
def cached_resource(method_to_cache):
def decorator(self, *args, **kwargs):
method_name = method_to_cache.__name__
try:
return self._cache[method_name]
except KeyError:
self._cache[method_name] = method_to_cache(self, *args, **kwargs)
return self._cache[method_name]
return decorator
class Profile(BeautifulSoup):
def __init__(self, page_source):
super().__init__(page_source)
self._cache = {}
@cached_resource
def username(self):
return self.title.split(':')[0]
When I give this code to pylint, it complains about cached_resource
having access to a protected variable of a client class.
I realize that the distinction between public and private isn't a huge deal in Python, but I'm still curious -- have I done something bad here? Is it poor style to have decorators rely on implementation details of the classes they're associated with?
EDIT: I'm unclear about how the closure in Duncan's answer works, so maybe this is a little bit kludge-y, but would this be a simpler solution?
def cached_resource(method_to_cache):
def decorator(self, *args, **kwargs):
method_name = method_to_cache.__name__
try:
return self._cache[method_name]
except KeyError:
self._cache[method_name] = method_to_cache(self, *args, **kwargs)
except AttributeError:
self._cache = {}
self._cache[method_name] = method_to_cache(self, *args, **kwargs)
finally:
return self._cache[method_name]
return decorator
There is a bit of a code smell about that, I think I would agree with pylint on this one though it is quite subjective.
Your decorator looks like it is a general-purpose decorator, but it is tied into the internal implementation detail of the class. If you tried to use it from another class it won't work without the initialisation of
_cache
in__init__
. The linkage I don't like is that the knowledge of an attribute called '_cache' is shared between both the class and the decorator.You could move the initialisation of
_cache
out of__init__
and into the decorator. I don't know if that would help pacify pylint and it still requires the class to know about and avoid using the attribute. A cleaner solution here (I think) would be to pass the name of the cache attribute into the decorator. That should break the linkage cleanly:And if you don't like a lot of decorator calls repeating the name of the attribute then:
Edit: Martineau suggests this may be overkill. It could be if you don't actually need separate access to the
_cache
attribute inside the class (e.g. to have a cache reset method). In that case you could manage the cache entirely within the decorator, but if you are going to do that you don't need a cache dictionary on the instance at all, as you can store the cache in the decorator and key on theProfile
instance: