Python: splitting a text file by a # character and summarizing the total


main(), invoice.close(), and the print function just above it all throw an "invalid syntax" exception.

I don't know anything about dictionaries or with statements (at this time), so this is the best I could come up with.

Here's my desired outcome:

[image: desired output]

Here are the contents of invoice.txt:

#### CONTENTS OF invoice.txt ###
# hammer#9.95
# saw#20.15
# shovel#35.40

EDIT: Here is the exception with the added bracket: [image]

print('{0: <10}'.format('Item'), '{0: >17}'.format('Cost'), sep = '' )

def main():
    invoice = open("invoice.txt", "r")

    count = 0
    total = 0

    hammer = invoice.readline()

    while invoice != "":
        saw = invoice.readline()
        shovel = invoice.readline()

        hammer = hammer.rstrip("\n")
        saw = saw.rstrip("\n")
        shovel = shovel.rstrip("\n")

        hammer = hammer.split("#")
        saw = saw.split("#")
        shovel = shovel.split("#")

        print('{0: <10}'.format(hammer[0]), '{0: >17}'.format('$' + hammer[1]), sep = '' )
        print('{0: <10}'.format(saw[0]), '{0: >17}'.format('$' + saw[1]), sep = '' )
        print('{0: <10}'.format(shovel[0]), '{0: >17}'.format('$' + shovel[1]), sep = '' )

        # total = total + float(hammer[1]) + float(saw[1]) + float(shovel[1])         # DOESN"T WORK
        total = total + (int(float(hammer[1])) + int(float(saw[1])) + int(float(shovel[1]))

        # total = total + (int(hammer[1])) + (int(saw[1])) + (int(shovel[1]))         # DOESN"T WORK
        print("{0: <10}".format("Total cost") + "{0: >17}".format("{0:.2f}".format(float(total))))

    invoice.close()
main()


There are 3 best solutions below


Assuming the contents are actually just

hammer#9.95
saw#20.15
shovel#35.40

this is just a DSV (delimiter-separated values) file using # as the delimiter. Fortunately, most CSV libraries support custom delimiters, including Python's csv module.

import csv
from io import StringIO 


def print_report(items, fields):
    print("\t".join(fields))
    total_cost = sum(float(item["Cost"]) for item in items) 
    for item in items:
        print(f"{item['Item']}\t${float(item['Cost']):.2f}")
    print()
    print(f"Total cost\t${total_cost:.2f}")
    print(f"Number of tools\t{len(items)}")

fields = ["Item", "Cost"]
# This StringIO is basically like opening a file
s = StringIO(u"hammer#9.95\nsaw#20.15\nshovel#35.40")
items = list(csv.DictReader(s, delimiter="#", fieldnames=fields))
print_report(items, fields)

Item    Cost
hammer  $9.95
saw     $20.15
shovel  $35.40

Total cost      $65.50
Number of tools 3

A lot of the formatting code here is redundant, and hardcoding the names from the text file isn't scalable: if you need to introduce a new item, you have to change all of the code. We should stay DRY by writing a helper function to handle printing. format can accept multiple arguments, which simplifies matters.

Everything else is a matter of reading the file (use a context manager, i.e. a with statement, so you don't need to remember to call close), matching valid lines, splitting, stripping and casting the data. Once everything is in a list, we can call the sum and len functions on that list to produce the desired totals.

I have hardcoded the column sizes; those really should be parameters, and we should loop over the data to precompute column widths. But that's probably overkill.

Use a CSV library if your data is actually a CSV, but the rest of the code is pretty much the same.

import re

def fmt(x, y):
    if isinstance(y, float):
        y = "$" + "{:1,.2f}".format(y).rjust(5)

    return '{0:<18}{1:>10}'.format(x, y)

if __name__ == "__main__":
    items = []

    with open("invoice.txt", "r") as f:
        for line in f:
            if re.search(r"#\d+\.\d+$", line):
                items.append([x.strip() for x in line.split("#") if x])

    print(fmt("Name", "Cost"))

    for name, price in items:
        print(fmt(name, float(price)))

    print()
    print(fmt("Total cost", sum([float(x[1]) for x in items])))
    print(fmt("Number of tools", len(items)))

Output:

Name                    Cost
hammer                $ 9.95
saw                   $20.15
shovel                $35.40

Total cost            $65.50
Number of tools            3
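
The column sizes in fmt above are hardcoded. As a rough sketch of how they could be computed from the data instead (format_table and its gap parameter are hypothetical names introduced for illustration, not part of the answer's code), something like the following might work:

```python
def format_table(rows, gap=2):
    """Format (name, value) string pairs using widths computed from the data.

    `rows` is assumed to be a non-empty list of (str, str) tuples.
    """
    # The widest entry in each column decides that column's width.
    name_width = max(len(name) for name, _ in rows) + gap
    value_width = max(len(value) for _, value in rows)
    return "\n".join(
        "{0:<{1}}{2:>{3}}".format(name, name_width, value, value_width)
        for name, value in rows
    )


rows = [("Name", "Cost"), ("hammer", "$9.95"), ("saw", "$20.15"), ("shovel", "$35.40")]
table = format_table(rows)
print(table)
```

Every line comes out the same length, so adding an item with a longer name widens the table without touching the formatting code.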

Syntax Errors

Let's take a step back and check out why your code wasn't working. Here's the error message I got when I ran your code:

  File "test_2.py", line 31
    print("{0: <10}".format("Total cost") + "{0: >17}".format("{0:.2f}".format(float(total))))
        ^
SyntaxError: invalid syntax

When reading error messages in Python, there's a lot of information there that can be overwhelming at the start, but once you learn how to extract it, they are really helpful! Let's break the error message down:

  • The error is in test_2.py, at line 31. test_2 is what I called my file in this case; yours might be different.
  • The type of error is a syntax error, meaning that the line of code is unable to be parsed correctly by Python.
  • Note the arrow below the print statement: This indicates exactly which symbol the error was raised at during parsing.

A very common syntax error is unmatched parentheses. What makes this error even trickier is that the parser will tell you the point where it discovered a problem with parenthesis matching, but the problem might be located on an entirely different line. This is exactly what is happening here, so my general rule for syntax errors is to check the line that the error tells you to check, as well as a few lines before and after. Let's take a closer look at the two lines you indicated didn't work:

total = total + (int(float(hammer[1])) + int(float(saw[1])) + int(float(shovel[1]))

print("{0:<10}".format("Total cost") + "{0:>17}".format("{:.2f}".format(float(total))))

In the first line, you open a parenthesis (, and then call int() and float() on hammer, saw, and shovel respectively, but you never put a closing parenthesis down! So the error is earlier than Python says, but from the parser's perspective, there's no issue yet at that point. It's not until the parser reaches line 31 that we get a problem: the parser was expecting a closing parenthesis, but instead it got a print function, so it raised the error at the print function on line 31 instead of on the previous line where the parenthesis was missing.

These types of things take time to get used to, but you will learn the tips and tricks over time.
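
To see this behaviour in isolation, here is a small sketch that compiles a snippet with a missing closing parenthesis and reports where Python notices the problem. The check helper and the snippet contents are made up for illustration. Which line and message you get depends on your Python version: older versions typically point at the following line with "invalid syntax", while newer ones (3.10 and later, as far as I know) point back at the unclosed parenthesis.

```python
# Line 2 of this snippet opens a parenthesis that is never closed.
broken = (
    "total = 0\n"
    "total = total + (int(float('9.95')) + int(float('20.15'))\n"
    "print(total)\n"
)

# Same snippet with the missing ')' added at the end of line 2.
fixed = (
    "total = 0\n"
    "total = total + (int(float('9.95')) + int(float('20.15')))\n"
    "print(total)\n"
)


def check(source):
    """Try to compile `source` and describe the first syntax error, if any."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as err:
        return "line {0}: {1}".format(err.lineno, err.msg)
    return "ok"


print(check(broken))  # reported line and message vary by Python version
print(check(fixed))
```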

Using variables better

In your Python program you shouldn't assume the contents of a text file. At the moment, you are assuming there will only ever be 3 items, and your variable names seem to reflect what you are assuming the items are going to be. What happens if the text file contains thousands of items? It's not feasible to make a new variable for each item in the text file. You should use generic names like tool instead of hammer, saw and shovel, and process each item from the text file in a separate iteration of the loop, instead of all at once.
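
As a minimal sketch of that idea using only what the question already uses (readline, split and a while loop), the loop below handles one item per iteration with a generic tool variable. The summarize function is a hypothetical name, and io.StringIO stands in for open("invoice.txt", "r") so the example runs without a file; it also assumes clean name#cost lines. Note that the loop tests the line that was read, not the file object.

```python
import io


def summarize(invoice):
    """Read 'name#cost' lines one at a time; return (count, total)."""
    count = 0
    total = 0.0

    line = invoice.readline()
    while line != "":              # readline() returns "" at end of file
        line = line.strip()
        if line != "":             # skip blank lines
            tool = line.split("#")
            print('{0: <10}'.format(tool[0]), '{0: >17}'.format('$' + tool[1]), sep='')
            total = total + float(tool[1])
            count = count + 1
        line = invoice.readline()

    return count, total


# Stand-in for: invoice = open("invoice.txt", "r")
invoice = io.StringIO("hammer#9.95\nsaw#20.15\nshovel#35.40\n")
count, total = summarize(invoice)
invoice.close()

print('{0: <10}'.format('Total cost'), '{0: >17}'.format('${0:.2f}'.format(total)), sep='')
print('{0: <10}'.format('Number of tools'), '{0: >12}'.format(count), sep='')
```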

Example Solution

Here's an example solution for you. Like you said, using dictionaries and with would be nice, so the solution below implements a read_products function that returns a dictionary of items, where the keys are the item names and the values are the item costs. The get_invoice function is then used to return a formatted string for the invoice, and the invoice is printed out.

def main():
    try:
        products = read_products("invoice.txt")
    except IOError as e:
        print('An error occurred while opening the file:', e)
        return
    except ValueError as e:
        print('An error occurred while reading the contents of the file:', e)
        return
    invoice = get_invoice(products)
    print(invoice)

def get_invoice(products):
    footer_headings = ('Total cost', 'Number of tools')

    # Get the length of the longest product name or footer heading, to aid formatting.
    l_pad = len(max(products, key=len))
    l_pad = max(l_pad, len(max(footer_headings, key=len)))
    r_pad = 10

    total = sum(products.values())
    count = len(products)

    header = 'Item'.ljust(l_pad) + 'Cost'.rjust(r_pad)
    body = '\n'.join([f'{p[0]}'.ljust(l_pad) + (f'${p[1]:.2f}').rjust(r_pad) for p in products.items()])
    footer = f'{footer_headings[0]}'.ljust(l_pad) + f'${total:.2f}'.rjust(r_pad) + '\n' + f'{footer_headings[1]}'.ljust(l_pad) + f'{count}'.rjust(r_pad)

    return f'{header}\n{body}\n\n{footer}'

def read_products(file_path):
    products = {}

    with open(file_path, "r") as f:
        for line in f:
            # Split the data
            data_split = line.rstrip("\n").split("#")

            # Ensure the data is safe to unpack
            if len(data_split) != 2:
                raise ValueError(f'Expected two fields but got {len(data_split)}')

            # Add the data to a dictionary for later use
            name, cost = data_split
            cost = float(cost)
            products[name] = cost

    return products

if __name__ == "__main__":
    main()
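
Note that read_products expects exactly two fields per line. If your file really contains the leading "# " and the header line shown in the question, each item line would split into three fields and raise the ValueError. One possible way to tolerate that layout is sketched below; parse_line is a hypothetical helper, not part of the solution above, and it assumes that any line without exactly one name and one numeric cost should be skipped rather than reported.

```python
def parse_line(line):
    """Return (name, cost) for a line such as '# hammer#9.95', else None."""
    # Drop empty fields produced by leading or repeated '#' characters.
    fields = [field.strip() for field in line.split("#") if field.strip()]
    if len(fields) != 2:
        return None
    name, cost = fields
    try:
        return name, float(cost)
    except ValueError:
        # The second field was not a number, so this is not an item line.
        return None


sample = [
    "#### CONTENTS OF invoice.txt ###\n",
    "# hammer#9.95\n",
    "# saw#20.15\n",
    "# shovel#35.40\n",
]
parsed = [item for item in map(parse_line, sample) if item is not None]
print(parsed)
```

Inside read_products, the result of such a helper could be used in place of the split-and-check step, at the cost of silently ignoring malformed lines.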