Pareto chart showing 10 largest values


I want to display a Pareto chart that shows only the 10 largest values. The code below produces a Pareto chart, but the dataset is so large that the chart is too noisy and some data points are not visible.

library(qcc)
df = TestData$Amount
names(df) = TestData$CarType
pareto.chart(df) 

I already tried subsetting with [1:10] to get the top ten, but the data is not sorted at that point, so it simply returns the first ten values in their original order rather than the ten largest.

CarType Amount
Audi 12.546
Mercedes 6.767
VW 3.556
Skoda 5.768
Bentley 1.657
Ford 2.934
Lexus 15.567
Mitsubishi 532
Hyundai 8.611
BMW 213
Scania 4.450
Volvo 10.123

Any suggestions?


There are 3 best solutions below


We can sort the Amount values in decreasing order with the function order and then select the first 10 elements, as you mentioned in your post:

library(qcc)
df = TestData$Amount
names(df) = TestData$CarType
df = df[order(-df)][1:10]  # sort descending, then keep the first 10
pareto.chart(df) 
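
For reference, here is a self-contained sketch of the same idea using the sample rows from the question. It assumes the amounts are read literally as R numerics (so 532 and 213 are the largest values), and it swaps the [1:10] subset for head(sort(...)), which does not pad the result with NA when fewer than 10 values exist. The names amounts and top10 are illustrative only.

```r
# Sample data copied from the question; amounts are taken literally
# as R numerics (an assumption about how the file was read).
TestData <- data.frame(
  CarType = c("Audi", "Mercedes", "VW", "Skoda", "Bentley", "Ford",
              "Lexus", "Mitsubishi", "Hyundai", "BMW", "Scania", "Volvo"),
  Amount  = c(12.546, 6.767, 3.556, 5.768, 1.657, 2.934,
              15.567, 532, 8.611, 213, 4.450, 10.123)
)

# Named numeric vector: values are amounts, names are car types.
amounts <- setNames(TestData$Amount, TestData$CarType)

# Sort in decreasing order first, then keep at most 10 entries.
top10 <- head(sort(amounts, decreasing = TRUE), 10)

# Plot only if qcc is installed, so the rest of the sketch runs without it.
if (requireNamespace("qcc", quietly = TRUE)) {
  qcc::pareto.chart(top10)
}
```

With the 12 sample rows, the two smallest entries are dropped before plotting.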

You can get the top 10 values with dplyr:

library(dplyr)

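# sort by Amount in descending order, then keep the first 10 rows
df <- TestData %>% arrange(desc(Amount)) %>% slice(1:10)

# alternatively (ties can return more than 10 rows unless with_ties = FALSE)
df <- TestData %>% slice_max(Amount, n = 10)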

Or in base R:

# order() sorts ascending, so tail() picks the row indices of the 10 largest amounts
df <- TestData[with(TestData, tail(order(Amount), 10)), ]
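
Note that these snippets return a data frame, while the pareto.chart call in the question is given a named numeric vector. A minimal base R sketch of the conversion might look like the following; the five-row TestData, the cutoff n = 3, and the names top_rows and top_vec are illustrative assumptions, not part of the original post.

```r
# Small stand-in for TestData using a few rows from the question.
TestData <- data.frame(
  CarType = c("Audi", "Mercedes", "VW", "Skoda", "Lexus"),
  Amount  = c(12.546, 6.767, 3.556, 5.768, 15.567)
)

n <- 3  # use 10 for the real data

# Row indices of the n largest amounts, largest first.
idx <- head(order(TestData$Amount, decreasing = TRUE), n)
top_rows <- TestData[idx, ]

# Convert the data frame into a named vector: amounts named by car type.
top_vec <- setNames(top_rows$Amount, top_rows$CarType)

# Plot only if qcc is installed.
if (requireNamespace("qcc", quietly = TRUE)) {
  qcc::pareto.chart(top_vec)
}
```

The same setNames step should work on the result of the dplyr or data.table variants, since they also keep the CarType and Amount columns.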

Using data.table:

library(data.table)
# order(-Amount) sorts descending, so head() returns the 10 largest rows
df1 <- head(setDT(TestData)[order(-Amount)], 10)