Assigning values to object attributes that don't exist


I'm doing a data mining homework in Python (2.7). I created a weight dict for all words that exist in the category, and for words that are not in this dict I want to assign a default value. First I tried calling setdefault for every key before using it. It works perfectly, but it doesn't look very Pythonic to me. So I tried defaultdict instead, which works fine most of the time but sometimes returns an incorrect value. At first I thought the cause was defaultdict or the lambda function, but no errors are raised.

for node in globalTreeRoot.traverse():
    ...irrelevant...
    weight_dict = {.......}
    default_value = 1.0 / (totalwords + dictlen)
    node.default_value = 1.0/ (totalwords + dictlen)
    ......
    node.weight_dict_ori = weight_dict
    node.weight_dict = defaultdict(lambda :default_value,weight_dict)

When I print the value for a key that doesn't exist during the loop, I get the correct value. However, after the code finishes running, when I try:

print node.weight_dict["doesnotexist"],

it gives me an incorrect value, usually one that belongs to some other node. I searched for how Python's naming system works and how to assign values to object attributes dynamically, but couldn't figure it out.

By the way, is defaultdict faster than using setdefault(k,v) each time?


There are 3 best solutions below

2
On BEST ANSWER

This is not a use case of defaultdict.

Instead, simply use get to get values from the dictionary.

val = weight_dict.get("doesnotexist", 1234321)

is perfectly acceptable Python. get takes an optional second parameter: the default value to return if the key is not found.
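A small sketch of one practical difference, using a made-up weights dictionary (the names and numbers here are assumptions for illustration): get leaves the dictionary untouched, while indexing a defaultdict with a missing key stores the default as a side effect.

```python
from collections import defaultdict

# Hypothetical weights, invented for this example.
weights = {"spam": 0.5, "eggs": 0.25}
default_weight = 0.01

# dict.get returns the fallback without modifying the dictionary.
missing = weights.get("doesnotexist", default_weight)
size_after_get = len(weights)          # still 2

# Indexing a defaultdict with a missing key inserts that key.
dd = defaultdict(lambda: default_weight, weights)
also_missing = dd["doesnotexist"]
size_after_lookup = len(dd)            # now 3

print("%s %s %d %d" % (missing, also_missing, size_after_get, size_after_lookup))
```

If the dictionaries are large and probed with many unknown words, that growth on lookup may matter.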

If you only need this for "get", defaultdict is a bit overkill. It is meant to be used like this:

example = defaultdict(list)
example[key].append(1)

without having to initialize the key-list combination explicitly each time. For numerical values, the improvements are marginal:

ex1, ex2 = {}, defaultdict(lambda: 0)
ex1[key] = ex1.get(key, 0) + 1
ex2[key] += 1
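For reference, a runnable version of both patterns. The word list is made up for the example, since the snippets above leave key undefined.

```python
from collections import defaultdict

words = ["a", "b", "a", "c", "a"]   # made-up input

# Counting with a plain dict and get().
counts_plain = {}
for w in words:
    counts_plain[w] = counts_plain.get(w, 0) + 1

# Counting with a defaultdict; int() returns 0, just like lambda: 0.
counts_dd = defaultdict(int)
for w in words:
    counts_dd[w] += 1

# Grouping positions per word: here defaultdict(list) saves the
# explicit "create the list if it is missing" step.
positions = defaultdict(list)
for i, w in enumerate(words):
    positions[w].append(i)

print(dict(counts_dd))
```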

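Your original problem is probably caused by reusing the variable that stores the default weight. Every lambda created in the loop refers to the same variable, default_value, so once the loop has moved on they all return its latest value. Python has no loop-local scope, so the value has to be bound at the moment the lambda is created.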

var = 1
ex3 = defaultdict(lambda: var)
var = 2
print ex3[123]

prints 2, the current value of var. The value is not substituted into the dictionary at initialization; the lambda behaves as if you had defined a function at this position that accesses the "outer" variable var each time it is called.
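The same effect, reproduced in a loop shaped roughly like the one in the question. This is only a sketch: totals is a hypothetical stand-in for the per-node word counts, which the question does not show.

```python
from collections import defaultdict

# Hypothetical stand-ins for the per-node totals in the question.
totals = [10, 20, 40]

dicts = []
for total in totals:
    default_value = 1.0 / total
    # The lambda keeps a reference to the name default_value,
    # not to the value it has right now.
    dicts.append(defaultdict(lambda: default_value))

# After the loop, default_value holds the last computed value (1.0 / 40),
# so every dictionary reports that value for a missing key.
results = [d["doesnotexist"] for d in dicts]
print(results)
```

Inside the loop each lookup would still look correct, because default_value has not been rebound yet. That matches the symptom described in the question.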

A hack is this:

def constfunc(x):
  return lambda: x
ex3 = defaultdict(constfunc(var))

Now constfunc is evaluated at initialization, x is a local variable of the invocation, and the lambda now will return an x which does not change anymore. I guess you can inline this (untested):

ex3 = defaultdict((lambda x: lambda: x)(var))
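A quick check of the two forms above, plus a third common idiom: a default argument, which is evaluated once when the lambda is defined. The default-argument variant is not part of the original answer; treat it as one possible alternative.

```python
from collections import defaultdict

var = 1

def constfunc(x):
    # x is fixed for each call of constfunc, so the returned lambda keeps it.
    return lambda: x

frozen_a = defaultdict(constfunc(var))
frozen_b = defaultdict((lambda x: lambda: x)(var))   # the inlined form
frozen_c = defaultdict(lambda v=var: v)              # default argument, bound here

var = 2  # rebinding var afterwards no longer affects any of them

values = [frozen_a[123], frozen_b[123], frozen_c[123]]
print(values)
```

All three keep returning the value var had when the dictionary was created.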

Behold, the magics of Python, capturing "closures" and the anomalies of imperative languages pretending to do functional programming.

2
On

setdefault is definitely what you should use to set a default value.

for node in globalTreeRoot.traverse():
    node.default_value = 1.0 / (totalwords + dictlen)
    node.weight_dict = {}
    # if you did want to use a defaultdict here for some reason, bind the
    # value when the lambda is created so it does not look up node later:
    #   node.weight_dict = defaultdict(lambda v=node.default_value: v)
    for word in wordlist:
        value = node.weight_dict.setdefault(word, node.default_value)
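
To make the behaviour of setdefault concrete, a minimal sketch with invented weights (the contents of weight_dict and the numbers are assumptions for illustration):

```python
# Hypothetical word weights, invented for this example.
weight_dict = {"known": 0.3}
default_value = 0.05

# Existing key: setdefault returns the stored value and changes nothing.
known = weight_dict.setdefault("known", default_value)

# Missing key: setdefault stores the default and returns it.
unknown = weight_dict.setdefault("unknown", default_value)

# The default was stored as a plain value, so rebinding the variable
# afterwards does not alter what is in the dictionary.
default_value = 0.99
stored = weight_dict["unknown"]

print("%s %s %s" % (known, unknown, stored))
```

Because the value is stored at call time, this approach does not depend on what the variable holds later.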
5
On

Apparently, there is something wrong with defaultdict.

d1 = {"a":10,"b":9,"c":8}
seven = 7
d2 = defaultdict(lambda :seven,d1)
seven = 8
d3 = defaultdict(lambda :seven,d1)

And the result:

>>> d2[4234]
8

I still don't understand why it works this way. As for my work, I'll stick with setdefault.

UPDATE: Thanks for answering. I misunderstood how variable scoping works in Python.