I am working on a Shunting Yard program in Python and running into a problem with math expressions such as "3x + 4 * y" or "3 sin(x)".
The problem is that Python's tokenize module doesn't know that I need "3x" to become ["3", "*", "x"] instead of just ["3", "x"]. I can, of course, stipulate that the user must enter 3*x instead of just 3x, but that's so lame. There should be a cleaner way to get around this.
Here's the tokenizing code I am using (copied from Stack Overflow posts):
import tokenize
from io import StringIO

# io.StringIO needs a unicode string, hence the u prefix
expression = u"3x + 4*y"
print([token[1] for token in tokenize.generate_tokens(StringIO(expression).readline) if token[1]])
which gives me:
[u'3', u'x', u'+', u'4', u'*', u'y']
but I need:
[u'3', u'*', u'x', u'+', u'4', u'*', u'y']
in order for the Shunting Yard code to work properly.
Thank you for helping.
You'll need to pay attention to the types of the tokens returned, not just their strings, and preprocess the token list to get what you want. In your example, 3x results in a NUMBER token followed by a NAME token, and so does "3 sin". Wherever you see that pair, insert the implied OP token ("*") between them and you're away.
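A minimal sketch of that preprocessing step, assuming the tokens have already been reduced to (type, string) pairs. The helper name insert_implied_multiplication is made up for illustration, and the token list is written out by hand to mirror the output in the question, so the example does not depend on how a particular Python version's tokenize treats "3x":

```python
import tokenize

# Hypothetical helper: `tokens` is a list of (type, string) pairs, e.g. the
# first two fields of each tuple produced by tokenize.generate_tokens.
def insert_implied_multiplication(tokens):
    result = []
    previous_type = None
    for token_type, token_string in tokens:
        # A NUMBER directly followed by a NAME ("3x", "3 sin") implies "*".
        if previous_type == tokenize.NUMBER and token_type == tokenize.NAME:
            result.append((tokenize.OP, "*"))
        result.append((token_type, token_string))
        previous_type = token_type
    return result

# Token pairs written out by hand to mirror the output shown in the question.
tokens = [
    (tokenize.NUMBER, "3"), (tokenize.NAME, "x"), (tokenize.OP, "+"),
    (tokenize.NUMBER, "4"), (tokenize.OP, "*"), (tokenize.NAME, "y"),
]
fixed = insert_implied_multiplication(tokens)
print([string for _, string in fixed])
```

This only covers the NUMBER-then-NAME case from the question. Other implied products, such as "2(x + 1)" or "(a)(b)", would need extra rules of the same kind.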