Does rpyc cache on the remote host? How do I clear it?

I'm using Python 3 on a Linux box, with classic rpyc connecting to the same host. I have a simple Python file, tst.py, in the current directory with two lines in it:

a = {'a': 0, 'b': 1}

b = 3

Then I run the following commands:

>>> import rpyc; conn = rpyc.classic.connect('127.0.0.1')
>>> conn.execute('import tst')
>>> conn.eval('dir(tst)')
['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', 
    '__package__', '__spec__', 'a', 'b']
>>> conn.eval('tst.a, tst.b')
({'a': 0, 'b': 1}, 3)

Everything is as expected. I then close the connection with "conn.close()", exit the Python session, delete '__pycache__' from the current directory, and edit "tst.py", leaving only one line in it:

a = {'a': 0, 'b': 2}

and repeat the same commands as above from scratch in a new session:

..... (skipped) ...

>>> conn.eval('dir(tst)')
['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', 
    '__package__', '__spec__', 'a', 'b']

>>> conn.eval('tst.a, tst.b')
({'a': 0, 'b': 1}, 3)

So, surprisingly, the result stays the same even though tst.py has changed and the local Python cache has been deleted. Can somebody explain to a newbie what I did wrong and how to clear the previously loaded code? Does rpyc have its own cache? If I rename tst.py and repeat the same procedure with the new name, the result is correct. Again, this points to caching, but not in the current directory.

There are 1 best solutions below

If you are running rpyc, it means you have one server process (that you connect to, but did not show in the question) and one client process (here it seems you used a REPL).

The server keeps running from before you connect to it until well after you close the connection; that's what servers do. What you see are the Python objects living in the memory of that server process.

rpyc does not have a cache. You simply connected twice to the same process, and so saw the same thing the second time as the first.
If you change the name of the tst.py file, you will have to import it again under the new name. When Python imports a module for the first time, it creates a module object and hands it to you; later imports of the same name in the same process return that same object.

conn.eval('dir(tst)')

is asking Python to list the content of the tst module, which is whatever was defined in the corresponding file at the time it was imported (plus any change made to the module's content since then).
The second time you connect to the server process, you call dir on the exact same Python module object, living in the server process memory, so you get the exact same result.

But changing the filename, then importing it will create a second module object, that may be different from the first one.
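The same behavior can be reproduced without rpyc at all, inside a single Python process. The following is a minimal local sketch, not the asker's actual setup: the module names tst_demo and tst_demo_renamed and the temporary directory are assumptions made for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Keep __pycache__ out of the picture for this demonstration.
sys.dont_write_bytecode = True

# Throwaway directory standing in for the server's working directory.
workdir = Path(tempfile.mkdtemp())
sys.path.insert(0, str(workdir))

# Hypothetical module name, standing in for tst.py.
(workdir / "tst_demo.py").write_text("a = {'a': 0, 'b': 1}\nb = 3\n")
importlib.invalidate_caches()
import tst_demo
first = tst_demo

# Edit the source file, as in the question.
(workdir / "tst_demo.py").write_text("a = {'a': 0, 'b': 2}\n")

# The second import is served from sys.modules; the file is not read again.
import tst_demo
same_object = tst_demo is first          # True
still_has_b = hasattr(tst_demo, "b")     # True
print(tst_demo.a, tst_demo.b)            # {'a': 0, 'b': 1} 3

# Under a new name there is no module object yet, so the file is read.
(workdir / "tst_demo_renamed.py").write_text("a = {'a': 0, 'b': 2}\n")
importlib.invalidate_caches()
import tst_demo_renamed
print(tst_demo_renamed.a)                # {'a': 0, 'b': 2}
```

A long-running rpyc server is in the position of this process between the first and second import: the module object stays in its sys.modules for as long as the server lives.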

When you say you "close the Python session", you mean that you end the client process, but it does not impact the server process.

The __pycache__ folder is just a place where Python stores compiled bytecode, which reduces the time needed to import a file again. You can generally just ignore it. In any case, it has nothing to do with your current problem.

To talk about solutions, it would help to know what you are trying to achieve. But I will provide some generic answers:

  1. You could stop the server process then start it again, so that you can import the module (whose source file content you have changed). If you can afford to restart the rpyc server, I recommend this solution because it is generally simpler compared to the others.
  2. Use setters: define functions in your tst module that change the variable values (for instance a set_a function that rebinds a), and call them remotely. Example:
    >>> import rpyc; conn = rpyc.classic.connect('127.0.0.1')
    >>> conn.execute('import tst')
    >>> conn.eval('dir(tst)')
    ['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', 
        '__package__', '__spec__', 'a', 'b', 'set_a']
    >>> conn.eval('tst.a, tst.b')
    ({'a': 0, 'b': 1}, 3)
    >>> conn.eval('tst.set_a(4)')
    >>> conn.eval('tst.a, tst.b')
    (4, 3)
    
    This requires you to handle everything that needs resetting yourself, possibly notifying other parts of the server that a reset has happened and that some variables should be updated. It is more difficult than the previous option, but it does not require restarting the server process.
  3. You can use importlib.reload to force Python to re-execute the module's source file, so that recent changes to its content are taken into account. But this may cause NASTY problems: the existing module object is updated in place, objects obtained from the module before the reload are not updated, and names that were removed from the source file stay defined. Old and new content can therefore coexist in different parts of your application, causing many confusing bugs that are difficult to investigate. I do not recommend doing it.
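Options 2 and 3 can be compared in a single local process, without a server. This is only a sketch under assumptions: the module name tst_options_demo, the set_a function, and the temporary directory are hypothetical stand-ins for tst.py on the rpyc server.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Keep __pycache__ out of the picture for this demonstration.
sys.dont_write_bytecode = True

workdir = Path(tempfile.mkdtemp())
sys.path.insert(0, str(workdir))
source = workdir / "tst_options_demo.py"  # hypothetical stand-in for tst.py

source.write_text(
    "a = {'a': 0, 'b': 1}\n"
    "b = 3\n"
    "def set_a(value):\n"
    "    global a\n"
    "    a = value\n"
)
importlib.invalidate_caches()
import tst_options_demo as tst

# Option 2: a setter rebinds the value inside the live module.
tst.set_a(4)
after_setter = (tst.a, tst.b)   # (4, 3)

# Some other part of the program keeps its own reference to the value.
held_elsewhere = tst.a          # 4

# Option 3: edit the source file, then reload the module.
source.write_text("a = {'a': 0, 'b': 2}\n")
reloaded = importlib.reload(tst)

same_module = reloaded is tst   # True: the module object is reused
new_a = tst.a                   # {'a': 0, 'b': 2}, taken from the new source
stale_b = tst.b                 # 3: removed from the file, still defined
# held_elsewhere is still 4: references taken before the reload are not updated.
```

The last three lines show why a reload is hard to reason about: the module now mixes new content (a) with leftovers from the old source (b, set_a), and anything that copied a value earlier keeps the old one.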