Is there a function for condensing an expression that has many redundant terms so as to not have to recalculate them?


Consider the simple expression:

c = ((a * b) / (x * y)) * sqrt(a * b + 1 + x * y)

The computer only needs to calculate a * b and x * y once, but evaluating the expression as given would make the computer calculate them twice. It would have been better to write:

d = a * b
e = x * y
f = d / e
g = d + 1 + e
c = f * sqrt(g)

Is there a function (ideally in Python, but any language should suffice) to convert the first code block into the second? The problem is that I have a very long expression (pages of output from a Mathematica Solve) containing many distinct repeated terms (many more than the two repeated terms a * b and x * y in the expression above), so it is not reasonable for me to do the reduction by hand. In addition, a * b could appear again later when calculating another quantity. FYI, I need speed in my application, so these sorts of optimizations are useful.

I tried googling for this functionality but perhaps I don't know the right search terms.

2 Answers

Jérôme Richard

Yes, this optimization is called Common Subexpression Elimination (CSE), and it has been done by all mainstream optimizing compilers for decades (I advise you to read the associated books and research papers about it).

However, it is certainly not a good idea to apply this kind of optimization by hand in Python. Indeed, Python was not designed with speed in mind, and the default interpreter does not perform such optimizations (in fact, it performs almost no optimizations). Python code is expected to be readable, not generated: generated Python code is likely to be slower than native code while offering nearly no benefit over generated native code. There are just-in-time (JIT) compilers for Python, like PyPy, that do this optimization, so it is certainly wise to try them. Alternatively, Cython (not to be confused with CPython) and Numba can strongly reduce Python's overhead thanks to (JIT/AOT) compilation of Python code to native code. Indeed, recomputing the same expression multiple times is far from the only issue with the CPython interpreter: basic floating-point/arithmetic operations tend to be at least one order of magnitude slower than native code (mainly due to the dynamic management of objects, including their allocation, not to mention the overhead of the interpreter loop). Thus, please use a compiler that already does such optimizations very well instead of reinventing the wheel.
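For example, here is a minimal sketch of that approach (assuming Numba is installed; the function name compute_c is just for illustration): the expression is written naively and the JIT compiler is left to perform the common subexpression elimination.

import math
from numba import njit

@njit(cache=True)
def compute_c(a, b, x, y):
    # Written naively on purpose: the compiler computes a*b and
    # x*y only once in the generated machine code.
    return ((a * b) / (x * y)) * math.sqrt(a * b + 1 + x * y)

print(compute_c(2.0, 3.0, 4.0, 5.0))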

Philliph

Thank you, Jérôme. (I tried upvoting but don't yet have enough "reputation".) Your response led me to the answer: use sympy.cse, copy-paste its output, and then use a compiler module like Numba as you suggested. One note: sympy appears to reduce only to a certain extent and then decides the expression is simple enough, e.g.

import sympy as sp
a, b, x, y = sp.symbols('a b x y')
input_expr = ((a * b) / (x * y)) * sp.sqrt(a * b + 1 + x * y)
# cse returns a list of (symbol, subexpression) replacements
# and the reduced expression(s)
repl, redu = sp.cse(input_expr)
print(repl, redu)
for variable, expr in repl:
    print(f'{variable} = {expr}')
>>> [(x0, a*b)] [x0*sqrt(x*y + x0 + 1)/(x*y)]
>>> x0 = a*b

So it does not assign x1 to x*y here, but if the expression is slightly more complicated, then it does (a plausible reason is sketched after the next example):

input_expr = ((a * b) / (x * y + 2)) * sp.sqrt(a * b + 1 + x * y)
>>> [(x0, x*y), (x1, a*b)] [x1*sqrt(x0 + x1 + 1)/(x0 + 2)]
>>> x0 = x*y
>>> x1 = a*b
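A plausible explanation for the first case (my guess from the output, not something stated in the SymPy docs) is that the division is stored internally as negative powers, so the product x*y never appears literally outside of the sqrt and is therefore not a repeated subexpression. The internal form can be inspected with srepr:

import sympy as sp

a, b, x, y = sp.symbols('a b x y')
# Shows the flattened form a*b*x**(-1)*y**(-1): there is no
# Mul(x, y) node on the division side of the first expression.
print(sp.srepr((a * b) / (x * y)))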

But even for my very large expression, sympy reduced it to the point that I can finish up by hand without difficulty, so I consider my problem solved.
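One follow-up note: if copy-pasting the cse output becomes tedious, recent SymPy versions can apply CSE inside lambdify itself via the cse=True keyword (I believe this requires roughly SymPy 1.9 or newer, so treat the exact version as an assumption); the generated function can then be inspected or handed to a compiler as above.

import sympy as sp
import inspect

a, b, x, y = sp.symbols('a b x y')
input_expr = ((a * b) / (x * y + 2)) * sp.sqrt(a * b + 1 + x * y)

# Ask lambdify to run common subexpression elimination on the generated code.
f = sp.lambdify((a, b, x, y), input_expr, modules='math', cse=True)

print(f(2.0, 3.0, 4.0, 5.0))
# The generated source (with the x0, x1 temporaries) can be inspected:
print(inspect.getsource(f))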

Related: Best practice for using common subexpression elimination with lambdify in SymPy