Easy way to map values using intervals stored as strings in R?

I have a data frame of intervals stored as strings:

                interval
1       '(-inf-57142.8]'
2    '(57142.8-94002.6]'
3   '(94002.6-130862.4]'
4  '(130862.4-167722.2]'
5    '(167722.2-204582]'
6    '(204582-241441.8]'
7  '(241441.8-278301.6]'
8  '(278301.6-315161.4]'
9  '(315161.4-352021.2]'
10      '(352021.2-inf)'

I want to map any given number to an interval "bin", using the intervals stored in the data frame above and the row index as the bin number, i.e.

57142.8 would map to 1

57142.9 would map to 2

130862.5 would map to 4

352021.2 would map to 9

352021.3 would map to 10

etc

The intervals are generated dynamically using a discretize function.

Are there any simple R tools for helping to achieve this?

Or anything that deals with intervals stored as strings?

Thanks In Advance

There is 1 best solution below

I resolved this using gsub and findInterval; it may be useful to others.

Get the boundaries (the upper bound of each of the first nine intervals) from the strings described in the original question above:

  # Strip the quotes, brackets and lower bound, leaving each interval's upper bound
  boundaries <- as.numeric(gsub("\\(-inf-|\\(\\d+[.]*\\d+[-]+|'|\\]", "", intervals$interval[1:9]))

Get the interval position:

  findInterval(value_to_test, boundaries[1:9], rightmost.closed = FALSE, all.inside = TRUE)

The end intervals '(-inf-57142.8]' and '(352021.2-inf)' are dealt with separately as special cases. If value_to_test lands exactly on a boundary, its interval position is also a special case and is adjusted by -1.
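
For anyone who wants to avoid the special cases, here is a minimal sketch of one possible alternative. It is not the exact code I used above. It assumes a reasonably recent version of R in which findInterval has the left.open argument, and it rebuilds the example data from the question so that it runs on its own. The helper name map_to_bin is made up for illustration.

```r
# Example data copied from the question (the quotes are part of the strings).
intervals <- data.frame(
  interval = c("'(-inf-57142.8]'", "'(57142.8-94002.6]'",
               "'(94002.6-130862.4]'", "'(130862.4-167722.2]'",
               "'(167722.2-204582]'", "'(204582-241441.8]'",
               "'(241441.8-278301.6]'", "'(278301.6-315161.4]'",
               "'(315161.4-352021.2]'", "'(352021.2-inf)'"),
  stringsAsFactors = FALSE
)

# Upper bound of every interval except the last one, which is open-ended.
n <- nrow(intervals)
boundaries <- as.numeric(
  gsub("\\(-inf-|\\(\\d+[.]*\\d+[-]+|'|\\]", "", intervals$interval[-n])
)

# map_to_bin is a hypothetical helper name.
# left.open = TRUE makes findInterval treat the breaks as (lower, upper]
# intervals. It returns 0 for values at or below the first boundary and
# length(boundaries) for values above the last one, so adding 1 gives the
# row index of the matching interval.
map_to_bin <- function(x) findInterval(x, boundaries, left.open = TRUE) + 1

map_to_bin(c(57142.8, 57142.9, 130862.5, 352021.2, 352021.3))
# bins: 1, 2, 4, 9, 10
```

Because the intervals are closed on the right, a value sitting exactly on a boundary falls into the lower bin without any manual adjustment, and the two open-ended intervals are covered by the 0 and length(boundaries) return values.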