I'm doing a data mining homework with Python (2.7). I created a weight dict for all words (that exist in the category), and for the words that don't exist in this dict, I want to assign a default value. First I tried calling setdefault for every key before using it. It works perfectly, but somehow I think it doesn't look very Pythonic. Therefore I tried using defaultdict, which works just fine most of the time. However, sometimes it returns an incorrect value. At first I thought it could be caused by defaultdict or the lambda function, but apparently there are no errors.
for node in globalTreeRoot.traverse():
    ...irrelevant...
    weight_dict = {.......}
    default_value = 1.0 / (totalwords + dictlen)
    node.default_value = 1.0 / (totalwords + dictlen)
    ......
    node.weight_dict_ori = weight_dict
    node.weight_dict = defaultdict(lambda: default_value, weight_dict)
When I print the value for a key that doesn't exist during the loop, it gives me the correct value. However, after the code finishes running, when I try:
print node.weight_dict["doesnotexist"],
it gives me an incorrect value, and when it is incorrect, it is usually a value related to some other node. I tried searching for how Python's naming system works and how values are assigned to object attributes dynamically, but I couldn't figure it out.
By the way, is defaultdict faster than using setdefault(k,v) each time?
This is not a use case for defaultdict. Instead, simply use get to read values from the dictionary. That is perfectly acceptable Python: get has a second parameter, the default value to return if the key was not found.
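As a minimal sketch of the get approach (the words and numbers here are made up for illustration, they are not from the question):

```python
# Hypothetical weights and default value, for illustration only.
weight_dict = {"apple": 0.4, "pear": 0.1}
default_value = 0.05

# Existing key: the stored weight is returned.
print(weight_dict.get("apple", default_value))

# Missing key: the second argument is returned instead.
print(weight_dict.get("doesnotexist", default_value))

# Unlike setdefault or defaultdict, get does not insert the missing key.
print("doesnotexist" in weight_dict)
```

Because get never modifies the dictionary, lookups of unknown words do not grow it over time.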
If you only need this for "get", defaultdict is a bit overkill. It is meant for cases such as defaultdict(list), where you can append to dd[key] without having to initialize the key-list combination explicitly each time. For numerical values, such as counting with defaultdict(int), the improvements are marginal.
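A small sketch of both situations, assuming some made-up (key, value) pairs; the names pairs, groups, and counts are only illustrative:

```python
from collections import defaultdict

# Hypothetical input data, for illustration only.
pairs = [("fruit", "apple"), ("veg", "leek"), ("fruit", "pear")]

# defaultdict(list): a missing key gets a fresh empty list on first access.
groups = defaultdict(list)
for key, value in pairs:
    groups[key].append(value)

# The same with a plain dict needs setdefault on every access.
plain_groups = {}
for key, value in pairs:
    plain_groups.setdefault(key, []).append(value)

# For numbers the gain is small: defaultdict(int) versus get with a default.
counts = defaultdict(int)
plain_counts = {}
for key, _ in pairs:
    counts[key] += 1
    plain_counts[key] = plain_counts.get(key, 0) + 1

print(dict(groups))
print(dict(counts))
```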
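Your original problem is probably caused by reusing the variable that stores the default weight. A for loop does not create a new scope in Python, so default_value is rebound on every iteration; make sure each lambda gets its own copy of the value! A lambda that reads an outer variable var, created while var=1 and called after var=2 has been assigned, is supposed to return the current value of var=2. The value is not substituted into the dictionary at initialization; the lambda behaves as if you had defined a function at this position, accessing the "outer" variable var. A hack is to build the lambda inside a helper function constfunc that receives the value as an argument x and returns a lambda giving back x.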
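A minimal sketch of both the late-binding behaviour and the hack. It assumes constfunc is a small helper that takes the value as x and returns a lambda; the exact definition is an assumption, as are the names late and fixed:

```python
from collections import defaultdict

# Late binding: the lambda looks up var every time it is called.
var = 1
late = defaultdict(lambda: var)
var = 2
print(late["missing"])   # uses the current var, not the value at creation

# Hypothetical helper: x is bound once, when constfunc is called.
def constfunc(x):
    return lambda: x

var = 1
fixed = defaultdict(constfunc(var))
var = 2
print(fixed["missing"])  # still uses the value var had at creation
```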
Now constfunc is evaluated at initialization, x is a local variable of that invocation, and the lambda returns an x that no longer changes. I guess you can inline this (untested):
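One way the inlined form could look, as a sketch rather than the definitive version: an outer lambda is called immediately with var, so x is bound at that moment. The default-argument variant is a different but common way to get the same effect; the names inlined and by_default_arg are only illustrative.

```python
from collections import defaultdict

var = 1
# The outer lambda is called right away, binding x to the current var.
inlined = defaultdict((lambda x: lambda: x)(var))

# Alternative: a default argument is evaluated when the lambda is defined.
by_default_arg = defaultdict(lambda x=var: x)

var = 2
print(inlined["missing"])
print(by_default_arg["missing"])
```

Applied to the loop in the question, the second variant would presumably read defaultdict(lambda d=default_value: d, weight_dict), so that each node keeps the default computed in its own iteration.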
Behold, the magics of Python, capturing "closures" and the anomalies of imperative languages pretending to do functional programming.