Deleting Chained Duplicates

75 Views Asked by Christian DJ At 27 July 2025 at 13:26

Let's say I have a list:

lits = [1, 1, 1, 2, 0, 0, 0, 0, 3, 3, 1, 4, 5, 2, 2, 2, 0, 0, 0]

and I need this to become [1, 1, 2, 0, 0, 3, 3, 1, 4, 5, 2, 2, 0, 0] (delete duplicates, but only within a chain of duplicates, keeping the first and last element of each chain). I am going to do this on a huge HDF5 file with pandas and numpy, and would rather not use a for loop iterating through all elements.

table = table.drop_duplicates(cols='[SPEED OVER GROUND  [kts]]', take_last=True)

Is there a modification I can make to this code?


There is 1 best solution below

JohnE On 24 June 2015 at 11:53

In pandas you can build a boolean mask that selects a row only if it differs from either the preceding or the succeeding value:

>>> import pandas as pd

>>> df = pd.DataFrame({'lits': lits})

>>> df[(df.lits != df.lits.shift(1)) | (df.lits != df.lits.shift(-1))]

    lits
0      1
2      1
3      2
4      0
7      0
8      3
9      3
10     1
11     4
12     5
13     2
15     2
16     0
18     0
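
Since the question also mentions numpy, the same mask can be sketched directly on an array. This is only an illustration of the idea above, not part of the original answer; the helper name `drop_chain_interiors` is made up for this example.

```python
import numpy as np

lits = np.array([1, 1, 1, 2, 0, 0, 0, 0, 3, 3, 1, 4, 5, 2, 2, 2, 0, 0, 0])

def drop_chain_interiors(values):
    """Keep only the first and last element of each run of equal values.

    Hypothetical helper: mirrors the shift-based pandas mask with slicing.
    """
    values = np.asarray(values)
    if values.size == 0:
        return values
    # changed[i] is True where values[i + 1] differs from values[i]
    changed = values[1:] != values[:-1]
    # An element starts a run if it differs from its predecessor
    # (the first element always starts a run).
    starts = np.concatenate(([True], changed))
    # An element ends a run if it differs from its successor
    # (the last element always ends a run).
    ends = np.concatenate((changed, [True]))
    return values[starts | ends]

result = drop_chain_interiors(lits)
print(result.tolist())
```

As with the pandas version, a run of length one or two is kept unchanged, and NaN values compare unequal to everything (including each other), so they are never dropped.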
