TypeError: list indices must be integers or slices, not str on a Windows 10

Question

3.1k Views Asked by AudioBubble At 29 July 2025 at 05:29

I am trying to find out the inverse document frequency of a list of Sherlock Holmes stories. Have a look at the code:

Inverse document frequency is the measure of how common or rare a word is across multiple documents.

So a word that appears in only a few documents gets a high inverse document frequency (idf for short), while a word that appears in almost every document gets a low one.

The formula for idf is: idf(word) = log(Total_Documents / Number_Of_Documents_Containing(word))
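As a quick illustration of that formula, here is a minimal sketch on a made-up three-document corpus. The names toy_corpus and idf are hypothetical and not part of main.py, and the natural logarithm is assumed because main.py calls math.log with a single argument.

```python
import math

# Hypothetical toy corpus: document name -> set of words it contains
toy_corpus = {
    "a.txt": {"holmes", "watson", "pipe"},
    "b.txt": {"holmes", "violin"},
    "c.txt": {"holmes", "watson"},
}


def idf(word, corpus):
    """Natural log of (total documents / documents containing word)."""
    containing = sum(word in words for words in corpus.values())
    return math.log(len(corpus) / containing)


print(idf("holmes", toy_corpus))  # in every document, so log(3/3) = 0.0
print(idf("pipe", toy_corpus))    # in one of three documents, so log(3)
```

A word found in every document scores 0, and the score grows as the word gets rarer across the corpus.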

main.py

import math
import nltk
import os
import sys


def main():

    if len(sys.argv) != 2:
        sys.exit("Usage: python main.py corpus")
    print("Loading data...")
    corpus = load_data(sys.argv[1])

    words = set()
    for filename in corpus:
        words.update(corpus[filename])

    idfs = list()
    for word in words:
        f = sum(word in corpus[filename] for filename in corpus)
        idf = math.log(len(corpus) / f)
        idfs[word] = idf

    tfidfs = dict()
    for filename in corpus:
        tfidfs[filename] = []
        for word in corpus[filename]:
            tf = corpus[filename][word]
            tfidfs[filename].append((word, tf * idfs[word]))

    for filename in corpus:
        tfidfs[filename].sort(key=lambda tfidf: tfidf[1], reverse=True)
        tfidfs[filename] = tfidfs[filename][:5]

    print()
    for filename in corpus:
        print(filename)
        for term, score in tfidfs[filename]:
            print(f"    {term}: {score:.4f}")


def load_data(directory):
    files = dict()
    for filename in os.listdir(directory):
        with open(os.path.join(directory, filename)) as f:

            contents = [
                word.lower() for word in
                nltk.word_tokenize(f.read())
                if word.isalpha()
            ]

            frequencies = dict()
            for word in contents:
                if word not in frequencies:
                    frequencies[word] = 1
                else:
                    frequencies[word] += 1
            files[filename] = frequencies

    return files


if __name__ == "__main__":
    main()

But when I run python .\main.py .\shelock_holmes\ in PowerShell, I get this confusing error:

Loading data...
Traceback (most recent call last):
  File ".\main.py", line 65, in <module>
    main()
  File ".\main.py", line 22, in main
    idfs[word] = idf
TypeError: list indices must be integers or slices, not str

Can anybody please help me?

There are 2 best solutions below

Wasif On 17 October 2020 at 06:37

idfs is a list, but idfs[word] = idf tries to assign a value by key, as if idfs were a dictionary. Change idfs = list() to idfs = {} so that it is a dictionary. If you really do need a list, use .append() to add items to the end instead.
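A small sketch of the difference, using a placeholder word and value rather than anything from the real corpus (the variable names here are made up for illustration):

```python
idf = 0.5  # placeholder value, for illustration only

# A list only accepts integer (or slice) indices, so a str index fails.
as_list = list()
try:
    as_list["holmes"] = idf
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# A dict maps keys to values, so the same assignment works.
as_dict = {}
as_dict["holmes"] = idf

# If a list is really needed, append (word, value) pairs instead.
as_pairs = []
as_pairs.append(("holmes", idf))

print(as_dict)
print(as_pairs)
```

Since main.py later looks values up with idfs[word], the dictionary version is probably the one that fits here.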

**CryptoFool** · Accepted Answer

You define idfs as a list:

idfs = list()

If idfs is a list, then in this assignment:

idfs[word] = idf

word must be an integer, because it specifies an index or position within the list.

But words is a set of str, and so inside the iteration:

for word in words:

word is a str. Since a str is not an integer, the line

idfs[word] = idf

causes the error you're getting, for exactly the reason that it explains. Maybe idfs should be a dict rather than a list, defined like this:

idfs = dict()

Then the line:

idfs[word] = idf

interprets word as a key in the dictionary, and assigns idf as the value of that key in the dict. Dictionary keys can be any hashable object, and are most often strings, so this makes good sense.
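Putting that together, the idf loop from the question could look roughly like this. The corpus literal is a made-up stand-in for what load_data() returns (filename mapped to a word-count dict), so the snippet runs without NLTK or the story files.

```python
import math

# Stand-in for the result of load_data(): filename -> {word: count}
corpus = {
    "story1.txt": {"holmes": 4, "pipe": 1},
    "story2.txt": {"holmes": 2, "violin": 3},
}

# Collect every distinct word across all files.
words = set()
for filename in corpus:
    words.update(corpus[filename])

idfs = dict()  # was: idfs = list()
for word in words:
    # Number of files whose word-count dict contains this word.
    f = sum(word in corpus[filename] for filename in corpus)
    idfs[word] = math.log(len(corpus) / f)

print(idfs)
```

The later lookup idfs[word] in the tf-idf loop then works unchanged, because idfs is keyed by word.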
