Is there a way to create bins in python instead of listing all the bin numbers (as seen in code below), and maybe without having to use np.digitize?

Question

1.3k Views Asked by Preston B. At 15 June 2025 at 16:18

In my code, I have created 10 bins (specific ranges of bins are listed below):

4100000-4155304
4155304-4210608
4210608-4321216
4321216-4542432
4542432-4984865
4984865-5327533
5327533-5670201
5670201-5746217
5746217-5873109
5873109-6000000

import numpy as np

bins = [4100000, 4155304, 4210608, 4321216, 4542432, 4984865, 5327533, 5670201, 5746217, 5873109, 6000000]
bin_indices = np.digitize(bins_array, bins)  # bins_array holds the values to bin

Is there a way I can do this without having to list all the bin numbers (bins = [bin numbers]), and maybe also without having to use np.digitize? Thank you very much!

There are 2 best solutions below

**Michael** · Answer 1

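Simply use the numpy.arange function to generate evenly spaced bin edges. The step here, 55304, is the width of your first two bins. Note that the stop value (6000000) is excluded, and since your bins are not all the same width, the result will not match your list exactly: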

bins = np.arange(4100000, 6000000, 55304)
bins

Output

array([4100000, 4155304, 4210608, 4265912, 4321216, 4376520, 4431824,
       4487128, 4542432, 4597736, 4653040, 4708344, 4763648, 4818952,
       4874256, 4929560, 4984864, 5040168, 5095472, 5150776, 5206080,
       5261384, 5316688, 5371992, 5427296, 5482600, 5537904, 5593208,
       5648512, 5703816, 5759120, 5814424, 5869728, 5925032, 5980336])

Cheers

**bbartling** · Answer 2

I can't find the original author of the SO post I adapted this from, but maybe try something like the Pandas approach below, which I threw together quickly as an idea to try. The data frame is just random integers generated with NumPy as fake data in the ranges you are looking for.

import pandas as pd
import numpy as np

#create bins & categories for data ranges
cats = ['4100000_4155303',
        '4155304_4210608',
        '4210608_4321215',
        '4321216_4542431',
        '4542432_4984864',
        '4984865_5327532',
        '5327533_5670200',
        '5670201_5746216',
        '5746217_5873108',
        '5873109_6000000']

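# pd.cut uses right-closed intervals (lo, hi] by default, so each edge is
# the last value of the bin it closes; 11 edges give the 10 bins in cats.
bins = [4099999,
        4155303,
        4210608,
        4321215,
        4542431,
        4984864,
        5327532,
        5670200,
        5746216,
        5873108,
        6000000]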


def binn(df):
    df = (df.groupby([df.index, pd.cut(df['A'], bins, labels=cats)])
                .size()
                .unstack(fill_value=0)
                .reindex(columns=cats, fill_value=0))
    return df


rng = np.random.default_rng()
df = pd.DataFrame(rng.integers(4155304, 6000000, size=(1000, 1)), columns=list('A'))

dfBinned = binn(df)

print('All data binned in column A of the df')
print(dfBinned.sum(axis = 0))

This prints:

All data binned in column A of the df
A
4100000_4155303      0
4155304_4210608     35
4210608_4321215     42
4321216_4542431    130
4542432_4984864    239
4984865_5327532    174
5327533_5670200    205
5670201_5746216     37
5746217_5873108     63
5873109_6000000     75
dtype: int64
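To avoid typing the category labels by hand, they can be derived from the edges themselves. This is a rough sketch rather than a drop-in replacement for the code above: it assumes left-closed bins [lo, hi) via right=False, so the value 6000000 itself would fall outside the last bin, and the names edges, labels and counts are just illustrative. The seed is fixed only to make the run repeatable.

```python
import numpy as np
import pandas as pd

# Edges copied from the question.
edges = [4100000, 4155304, 4210608, 4321216, 4542432, 4984865,
         5327533, 5670201, 5746217, 5873109, 6000000]

# Build one label per bin from consecutive edges instead of typing them.
# With right=False each bin is [lo, hi), so the last integer in it is hi - 1.
labels = [f"{lo}_{hi - 1}" for lo, hi in zip(edges[:-1], edges[1:])]

# Fake data, as in the answer above; a fixed seed keeps the run repeatable.
rng = np.random.default_rng(0)
df = pd.DataFrame({"A": rng.integers(4100000, 6000000, size=1000)})

# right=False makes the bins left-closed, matching np.digitize's default.
df["bin"] = pd.cut(df["A"], bins=edges, labels=labels, right=False)

# sort_index orders the rows by bin rather than by count.
counts = df["bin"].value_counts().sort_index()
print(counts)
```

The edges list is still written out once, but the labels and the per-bin counts follow from it, so changing an edge only needs one edit.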
