Bucketing values in Python


I want to split my values into buckets based on their hash. For the hash I am using Python's built-in hash() function, whose range I take to be -sys.maxsize to sys.maxsize.

I have written a function that lists the bucket boundaries for a given range. I don't understand why splitting the range does not give me exactly the values I expect. I print the values without scientific notation; could that be the issue? Is there a better way to do this?
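On the notation question, a quick check may help. This is only a sketch, and `survives_float` is a hypothetical helper name, not part of any library. Python integers are exact however they are printed, so the display format alone should not change the values. Precision can be lost only if an integer this large is converted to a float at some point, which is assumed here to be a possibility worth ruling out.

```python
import sys

def survives_float(n):
    """Return True if the integer n is unchanged by a round trip through float."""
    return int(float(n)) == n

# Small integers convert to float and back without change.
print(survives_float(1000))

# On a typical 64-bit build sys.maxsize is 2**63 - 1, which a 64-bit float
# cannot represent exactly, so the round trip changes the value.
print(survives_float(sys.maxsize))
```

If the second check prints False, any step that turns the boundaries into floats would shift them slightly, independent of how they are displayed.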

I am doing all of this so that I can then bucket with pandas, as described here: https://www.datasciencemadesimple.com/binning-or-bucketing-of-column-in-pandas-python-2/

import sys

# Default values (lower and upper are taken as the bounds of hash()).
# They are not named min and max, which would shadow the built-ins.
lower = -sys.maxsize
upper = sys.maxsize
n_buckets = 5

def get_buckets(lower, upper, n_buckets):
    # Total width of the range to split
    hash_length = abs(upper - lower)
    # Width of each bucket; floor division discards any remainder,
    # so the last edge can fall slightly short of upper
    bucket_size = hash_length // n_buckets
    buckets = [lower]
    # Append the end edge of each bucket (also the start of the next one)
    for _ in range(n_buckets):
        buckets.append(buckets[-1] + bucket_size)

    return buckets

print(get_buckets(lower, upper, n_buckets))
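One possible alternative, offered as a sketch and not a definitive fix: compute each edge directly from the total span, multiplying before dividing, so the last edge lands exactly on the upper bound. The names `bucket_edges` and `bucket_index` are hypothetical. The second helper assumes that only a bucket number per value is needed, in which case the edges may not be required at all.

```python
import sys

def bucket_edges(lo, hi, n_buckets):
    """Return n_buckets + 1 integer edges from lo to hi, ending exactly at hi."""
    span = hi - lo
    # Multiply before dividing so rounding is applied once per edge
    # and never accumulates from one edge to the next.
    return [lo + span * i // n_buckets for i in range(n_buckets + 1)]

def bucket_index(value, n_buckets):
    """Map a hashable value to a bucket number in range(n_buckets)."""
    # For a positive modulus, Python's % always returns a non-negative result,
    # even when hash(value) is negative.
    return hash(value) % n_buckets

edges = bucket_edges(-sys.maxsize, sys.maxsize, 5)
print(edges)
print(edges[0] == -sys.maxsize, edges[-1] == sys.maxsize)
print(bucket_index(42, 5))
```

Note that hash() of a str or bytes object is randomized per process unless PYTHONHASHSEED is set, so bucket numbers for strings would not be stable between runs with this approach.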