Designing two programs to accomplish a text-processing task on Windows


Task: Given the lyrics of a sufficiently large and unskewed sample of American and U.K. songs, decompose each song's lyrics into its set of unique words (call this lyrics') and merge that set into a growing list of unique words (the lexicon). Any word in lyrics' that is not already in the lexicon is added to it. If every word in lyrics' is already in the lexicon, the lexicon is left unchanged, so adding the same lyrics' twice has the same effect as adding it once (idempotency).
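To make the task concrete, here is a minimal sketch of the two operations, not a finished design. The function names `unique_words` and `merge` are made up for illustration, and the definition of a "word" (a run of letters, optionally with internal apostrophes, compared case-insensitively) is an assumption that real lyrics may well break.

```python
import re

# Assumption: a word is a run of ASCII letters, optionally joined by
# internal apostrophes ("don't"). Accented letters, digits and hyphens
# would need a different pattern.
WORD_PATTERN = re.compile(r"[a-z]+(?:'[a-z]+)*")


def unique_words(lyrics: str) -> set[str]:
    """Return lyrics': the set of distinct lowercase words in the lyrics."""
    return set(WORD_PATTERN.findall(lyrics.lower()))


def merge(lexicon: set[str], words: set[str]) -> set[str]:
    """Return the lexicon with any missing words added (set union)."""
    return lexicon | words
```

Because set union is idempotent, merging the same lyrics' a second time leaves the lexicon unchanged, which is the property described above.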

Objects: one program that turns song lyrics into a list of unique words, one program that adds the output of the first program to the lexicon, and the lexicon itself (probably a simple .txt file, but if there is something better, please recommend it).
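As a rough sketch of what the second program could look like with a plain .txt lexicon, assuming one word per line, UTF-8 encoding, and a lexicon small enough to load into memory (the names `load_lexicon` and `update_lexicon` are hypothetical):

```python
from pathlib import Path


def load_lexicon(path: Path) -> set[str]:
    """Read the lexicon file (one word per line) into a set."""
    if not path.exists():
        return set()
    with path.open(encoding="utf-8") as f:
        return {line.strip() for line in f if line.strip()}


def update_lexicon(path: Path, new_words: set[str]) -> int:
    """Add unseen words to the lexicon file; return how many were added."""
    lexicon = load_lexicon(path)
    added = new_words - lexicon  # words not yet in the lexicon
    if added:
        # Rewrite the whole file in sorted order so it stays easy to diff.
        lines = sorted(lexicon | added)
        path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(added)
```

The file is only rewritten when something new was found, so feeding the same word list in twice leaves it untouched the second time.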

Operating System: Windows 11 Home.

The concern is that the lexicon might grow so large that a linear search through it for each word in lyrics', to check for uniqueness, becomes prohibitively expensive.
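One possible way around the linear scan, sketched here as an assumption rather than a recommendation, is to keep the lexicon in a SQLite database (Python's standard `sqlite3` module) instead of a text file. A `PRIMARY KEY` on the word column gives an indexed lookup, and `INSERT OR IGNORE` skips words that are already present. The table name `lexicon` and the helper `add_words` are invented for this example.

```python
import sqlite3
from typing import Iterable


def add_words(conn: sqlite3.Connection, words: Iterable[str]) -> int:
    """Insert words not yet in the lexicon; return how many were added."""
    # The PRIMARY KEY creates a unique index, so each duplicate check is
    # an index lookup rather than a scan of every stored word.
    conn.execute("CREATE TABLE IF NOT EXISTS lexicon (word TEXT PRIMARY KEY)")
    before = conn.total_changes
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO lexicon (word) VALUES (?)",
            ((w,) for w in words),
        )
    # Ignored duplicates do not count as changes.
    return conn.total_changes - before
```

For example, `add_words(sqlite3.connect("lexicon.db"), {"hello", "world"})` would create the (hypothetical) file `lexicon.db` on first use and add both words. Running the same call again adds nothing.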

What would be the most efficient way to set up and program these three objects so that this task might be successful?
