Python dictionary replacement with space in key

Question

Python dictionary replacement with space in key

3.1k Views Asked by Nhu Trinh At 02 February 2016 at 01:55

I have a string and a dictionary, I have to replace every occurrence of the dict key in that text.

text = 'I have a smartphone and a Smart TV'
dict = {
    'smartphone': 'toy',
    'smart tv': 'junk'
}

If there is no space in keys, I will break the text into word and compare one by one with dict. Look like it took O(n). But now the key have space inside it so thing is more complected. Please suggest me the good way to do this and please notice the key may not match case with the text.

Update

I have think of this solution but it not efficient. O(m*n) or more...

for k,v in dict.iteritems():
    text = text.replace(k,v) #or regex...

Original Q&A

There are 5 best solutions below

mindink On 02 February 2016 at 02:00

If your keys have no spaces:

output = [dct[i] if i in dct else i for i in text.split()]

' '.join(output)

You should use dct instead of dict so it doesn't collide with the built in function dict()

This makes use of a dictionary comprehension, and a ternary operator to filter the data.

If your keys do have spaces, you are correct:

for k,v in dct.iteritems():
    string.replace('d', dct[d])

And yes, this time complexity will be m*n, as you have to iterate through the string every time for each key in dct.

Prune On 02 February 2016 at 02:02

Drop all dictionary keys and the input text to lower case, so the comparisons are easy. Now ...

for entry in my_dict:
    if entry in text:
        # process the match

This assumes that the dictionary is small enough to warrant the match. If, instead, the dictionary is large and the text is small, you'll need to take each word, then each 2-word phrase, and see whether they're in the dictionary.

Is that enough to get you going?

cdlane On 02 February 2016 at 04:42

You need to test all the neighbor permutations from 1 (each individual word) to len(text) (the entire string). You can generate the neighbor permutations this way:

text = 'I have a smartphone and a Smart TV'

array = text.lower().split()

key_permutations = [" ".join(array[j:j + i]) for i in range(1, len(array) + 1) for j in range(0, len(array) - (i - 1))]

>>> key_permutations
['i', 'have', 'a', 'smartphone', 'and', 'a', 'smart', 'tv', 'i have', 'have a', 'a smartphone', 'smartphone and', 'and a', 'a smart', 'smart tv', 'i have a', 'have a smartphone', 'a smartphone and', 'smartphone and a', 'and a smart', 'a smart tv', 'i have a smartphone', 'have a smartphone and', 'a smartphone and a', 'smartphone and a smart', 'and a smart tv', 'i have a smartphone and', 'have a smartphone and a', 'a smartphone and a smart', 'smartphone and a smart tv', 'i have a smartphone and a', 'have a smartphone and a smart', 'a smartphone and a smart tv', 'i have a smartphone and a smart', 'have a smartphone and a smart tv', 'i have a smartphone and a smart tv']

Now we substitute through the dictionary:

import re

for permutation in key_permutations:
    if permutation in dict:
        text = re.sub(re.escape(permutation), dict[permutation], text, flags=re.IGNORECASE)

>>> text
'I have a toy and a junk'

Though you'll likely want to try the permutations in the reverse order, longest first, so more specific phrases have precedence over individual words.

Brendan Abel On 04 February 2016 at 19:25

You can do this pretty easily with regular expressions.

import re

text = 'I have a smartphone and a Smart TV'
dict = {
    'smartphone': 'toy',
    'smart tv': 'junk'
}

for k, v in dict.iteritems():
    regex = re.compile(re.escape(k), flags=re.I)
    text = regex.sub(v, text)

It still suffers from the problem of depending on processing order of the dict keys, if the replacement value for one item is part of the search term for another item.

**Nhu Trinh** · Accepted Answer · 2016-02-04T19:04:00.187000

If the key word in the text is not close to each others (keyword other keyword) we may do this. Took O(n) to me >"<

def dict_replace(dictionary, text, strip_chars=None, replace_func=None):
    """
        Replace word or word phrase in text with keyword in dictionary.

        Arguments:
            dictionary: dict with key:value, key should be in lower case
            text: string to replace
            strip_chars: string contain character to be strip out of each word
            replace_func: function if exist will transform final replacement.
                          Must have 2 params as key and value

        Return:
            string

        Example:
            my_dict = {
                "hello": "hallo",
                "hallo": "hello",    # Only one pass, don't worry
                "smart tv": "http://google.com?q=smart+tv"
            }
            dict_replace(my_dict, "hello google smart tv",
                         replace_func=lambda k,v: '[%s](%s)'%(k,v))
    """

    # First break word phrase in dictionary into single word
    dictionary = dictionary.copy()
    for key in dictionary.keys():
        if ' ' in key:
            key_parts = key.split()
            for part in key_parts:
                # Mark single word with False
                if part not in dictionary:
                    dictionary[part] = False

    # Break text into words and compare one by one
    result = []
    words = text.split()
    words.append('')
    last_match = ''     # Last keyword (lower) match
    original = ''       # Last match in original
    for word in words:
        key_word = word.lower().strip(strip_chars) if \
                   strip_chars is not None else word.lower()
        if key_word in dictionary:
            last_match = last_match + ' ' + key_word if \
                         last_match != '' else key_word
            original = original + ' ' + word if \
                       original != '' else word
        else:
            if last_match != '':
                # If match whole word
                if last_match in dictionary and dictionary[last_match] != False:
                    if replace_func is not None:
                        result.append(replace_func(original, dictionary[last_match]))
                    else:
                        result.append(dictionary[last_match])
                else:
                    # Only match partial of keyword
                    match_parts = last_match.split(' ')
                    match_original = original.split(' ')
                    for i in xrange(0, len(match_parts)):
                        if match_parts[i] in dictionary and \
                           dictionary[match_parts[i]] != False:
                            if replace_func is not None:
                                result.append(replace_func(match_original[i], dictionary[match_parts[i]]))
                            else:
                                result.append(dictionary[match_parts[i]])
            result.append(word)
            last_match = ''
            original = ''

    return ' '.join(result)

Python dictionary replacement with space in key

There are 5 best solutions below

Related Questions in PYTHON

Related Questions in ALGORITHM

Related Questions in DICTIONARY

Related Questions in REPLACE

Related Questions in TEXTMATCHING

Trending Questions

Popular # Hahtags

Popular Questions