New R-bie,
I am trying to clean 3 columns of data from my dataframe df. The columns consist of numeric values such as 0.19, 687.00, 49405, 107.440, 13764.000, and 1.740. I will create df below for the purpose of this example. The goal is to put this line of code into a mutate call from dplyr so that it cleans a column of a data.frame.
Example:
df <- c(1.560, 1.790, 3456.000, 1.0700, 0.16000, 1.347, 4.200)
I have been trying to remove the 0's at the end of the elements so that they all look like this
df <- c(1.56, 1.79, 3456, 1.07, 0.16, 1.347, 4.20)
I can partially achieve my desired results by running the line of code below:
signif(df[1], 5)
signif(df[2], 5)
signif(df[3], 5)
signif(df[4], 5)
signif(df[5], 5)
signif(df[6], 5)
signif(df[7], 5)
with the df[7] element 4.200 returning 4.2.
However, I have to do this one element at a time; otherwise, if I run signif(df[1:7], 5), I get this vector returned: 1.560 1.790 3456.000 1.070 0.160 1.347 4.200
I have also tried using regex to strip the 0's at the end of each element, but any quantifier or expression I use seems to remove all the trailing zeros.
I was thinking of removing the last digit if it were a 0, to leave numbers like 1.347 as they are, and then cleaning the rest of the column by removing an exact match of ".00" to get whole integers, leaving 3456 and '4.20'. When using "(\\.000)$" to match and remove 0's (e.g. from 4128.000, 13764.000), other elements also have their 0's removed (e.g. 4.2, 0.9) instead of being left as 4.200 and 0.900, from which I'd like to extract 4.20 and 0.90. Using "(0)$" doesn't work either, and I have tried a plethora of regex variations to achieve this... any help would be much appreciated.
It is true that the trailing "000"s disappear with sub or gsub using that pattern, but not because the pattern matches any characters. Rather, it's entirely because of the initial conversion to "character" class.
And if you wanted 2 digits to the right of the decimal point you could do:
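The code for these two steps is not shown here, so the following is only a minimal sketch of what they might look like. It assumes the example vector from the question (named x here for illustration) and uses sprintf for the fixed two-decimal formatting, which is one option among several.

```r
# Example values from the question; `x` is an illustrative name
x <- c(1.560, 1.790, 3456.000, 1.0700, 0.16000, 1.347, 4.200)

# Numeric values carry no trailing zeros; coercing to character shows this
as_chr <- as.character(x)
as_chr

# sub() coerces its input the same way, so the pattern has nothing to match
# and the result is identical to as.character(x)
subbed <- sub("(\\.000)$", "", x)
identical(subbed, as_chr)

# Fixed two digits after the decimal point (the result is character)
two_dp <- sprintf("%.2f", x)
two_dp
```

Note that "%.2f" rounds 1.347 to "1.35" and pads 3456 to "3456.00", so it does not reproduce the mixed output asked for in the question.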
And to get rid of the quotes use print (although the values remain character, so you cannot use arithmetic operators on the result):
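As a hedged illustration of that point, the sketch below prints a formatted character vector without quotes and shows that arithmetic on it fails until it is converted back. The names x, two_dp, res and back are illustrative, and the sprintf call is an assumed way of producing the formatted values.

```r
# Example values from the question, formatted to two decimals (character)
x <- c(1.560, 1.790, 3456.000, 1.0700, 0.16000, 1.347, 4.200)
two_dp <- sprintf("%.2f", x)

# Print without surrounding quotes; the values are still character
print(two_dp, quote = FALSE)

# Arithmetic on character values fails ...
res <- try(two_dp + 1, silent = TRUE)
inherits(res, "try-error")

# ... so convert back to numeric first when a calculation is needed
back <- as.numeric(two_dp)
back + 1
```

Inside dplyr's mutate the same formatting call could be applied to a column, for example mutate(value = sprintf("%.2f", value)), where value is a hypothetical column name.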