I have a dataframe with 19M rows covering ~10K customers and their daily consumption over different date ranges. I resampled this data to weekly consumption, and the resulting dataframe has 2M rows. For each customer, I want to find the ranges of consecutive dates and select the range with the maximum length. Any ideas? Thank you!
How to select a range of consecutive dates of a dataframe with many users in pandas
107 Views Asked by dogo At
There are 1 best solutions below
It would be great if you could post some example code, so the replies will be more specific.
You probably want to do something like

    earliest = df.groupby('Customer_ID')['Consumption_date'].min()

to get the earliest consumption date per customer, and

    latest = df.groupby('Customer_ID')['Consumption_date'].max()

for the latest consumption date, and then take the difference

    time_span = latest - earliest

to get the time span per customer. Knowing the specific df and variable names would be great.
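Note that the first-to-last span above ignores gaps inside a customer's history. If the goal is the longest gap-free run of weeks per customer, one possible sketch is below. It assumes the column names `Customer_ID` and `Consumption_date` used above, a hypothetical `Consumption` column, made-up sample data, and that weeks without consumption are absent from the weekly frame (if the resample produced empty weeks as NaN or 0 rows, drop those first). The helper name `longest_runs` is also just illustrative.

```python
import pandas as pd

# Hypothetical weekly data; the real column names and values are unknown.
df = pd.DataFrame({
    "Customer_ID": ["A"] * 5 + ["B"] * 4,
    "Consumption_date": pd.to_datetime([
        "2021-01-03", "2021-01-10", "2021-01-17", "2021-02-07", "2021-02-14",
        "2021-01-03", "2021-01-17", "2021-01-24", "2021-01-31",
    ]),
    "Consumption": [5, 3, 4, 6, 2, 7, 1, 8, 9],
})


def longest_runs(df, id_col="Customer_ID", date_col="Consumption_date",
                 step=pd.Timedelta(weeks=1)):
    """Return the longest run of consecutive dates for each customer."""
    df = df.sort_values([id_col, date_col])
    # Gap to the previous row of the same customer (NaT on each first row).
    gap = df.groupby(id_col)[date_col].diff()
    # A new run starts on a customer's first row or whenever the gap
    # differs from the expected step.
    run_id = gap.ne(step).cumsum().rename("run")
    # Summarise every run: first date, last date, number of rows.
    runs = (
        df.groupby([id_col, run_id])[date_col]
        .agg(start="min", end="max", n_weeks="count")
        .reset_index()
    )
    # Keep the longest run per customer (the earliest one on ties).
    best = runs.loc[runs.groupby(id_col)["n_weeks"].idxmax()]
    return best.drop(columns="run").reset_index(drop=True)


best = longest_runs(df)
print(best)
```

Everything here is vectorised (a sort, one grouped `diff`, a `cumsum`, and two grouped aggregations), so it should scale to a 2M-row frame without per-customer Python loops. To pull the actual consumption rows for each customer's best run, the `start` and `end` columns can be merged back onto the original frame and used as a date filter.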