Add a decimal point after the x-th digit in R


I have a dataframe with coordinates (SWEREF_99, proj 3006) and realized that a lot of the coordinates have the wrong format.

Northern coordinates always have seven digits before the decimal separator, and eastern coordinates have six, followed by the decimal digits, e.g. 6789654.234 (north). However, my coordinates look like this: 6789654234. This means that I want to insert a decimal point between the digits 4 and 2.

I have tried formatC and format, but they either insert multiple separators, such as 678.965.423.4, or just add zeros after the decimal point.

test <- format(data$North, 7, nsmall = 0)

Here's a data frame with imaginary coordinates, but it describes my data well.

data <- data.frame(North = c(678645345, 675836578, 6573645.786), East = c(354674.47, 37465434, 4263546))

As you can see, some of my coordinates look good: they have seven digits before the decimal point for north coordinates and six for east coordinates. I don't want to change these, only the ones that do not include a decimal point.

Has anyone experienced the same problem?

4

There are 4 best solutions below

4
On BEST ANSWER

You can use a regex to insert a decimal point at a particular position:

transform(data, North = sub('(.{7})\\.?(.*)', '\\1.\\2', North), 
                East = sub('(.{6})\\.?(.*)', '\\1.\\2', East))

#        North      East
#1  6786453.45 354674.47
#2  6758365.78 374654.34
#3 6573645.786  426354.6

We divide each value into two capture groups.

The first capture group, (.{7}), captures the first 7 characters.

\\.? matches an optional decimal point right after them; it sits outside both groups, so it is dropped if present.

The second capture group, (.*), is everything after that.

We return the two capture groups via backreferences, adding a "." between them ('\\1.\\2').

The same logic is applied for East column.
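
To see the substitution on single values, here is a minimal sketch that applies the same pattern to two strings taken from the example data and then converts the result back to numeric. The helper name add_point is hypothetical and only used for illustration; it assumes every value has at least n_int digits before the decimal point.

```r
# Hypothetical helper: insert a decimal point after the first n_int characters,
# dropping an existing decimal point at that position if there is one.
add_point <- function(x, n_int) {
  pattern <- sprintf("(.{%d})\\.?(.*)", n_int)
  sub(pattern, "\\1.\\2", as.character(x))
}

north_fixed <- add_point(c("678645345", "6573645.786"), 7)
north_fixed
# [1] "6786453.45"  "6573645.786"

# sub() returns character strings; convert back if numbers are needed
north_num <- as.numeric(north_fixed)
```

Note that the result of sub() is a character vector, so a final as.numeric() is needed if the columns should stay numeric.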

0
On

It's usually easier to manipulate numbers as strings in such cases. This should solve your problem:

data <- data.frame(North = c(678645345, 675836578, 6573645.786), East = c(354674.47, 37465434, 4263546))

insertComma <- function(data, commaPosition)
{
  # Work on the string representation of the numbers
  data.str <- as.character(data)
  # Insert "." at commaPosition unless a "." is already there
  data.comma <- ifelse(substr(data.str, commaPosition, commaPosition) != ".",
                       paste0(substr(data.str, 1, commaPosition - 1), ".",
                              substr(data.str, commaPosition, nchar(data.str))),
                       data.str)
  return(data.comma)
}
commaPositions <- c(8,7)

sapply(seq_len(ncol(data)),function(col) insertComma(data[,col],commaPositions[col]))

     [,1]          [,2]       
[1,] "6786453.45"  "354674.47"
[2,] "6758365.78"  "374654.34"
[3,] "6573645.786" "426354.6" 
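
sapply() returns a character matrix without column names. If you would rather write the result back into the data frame, one possible sketch uses Map() to pair each column with its position. This is an assumption about the desired output rather than a required step, and the final conversion with as.numeric() is optional.

```r
data <- data.frame(North = c(678645345, 675836578, 6573645.786),
                   East = c(354674.47, 37465434, 4263546))

insertComma <- function(data, commaPosition) {
  data.str <- as.character(data)
  ifelse(substr(data.str, commaPosition, commaPosition) != ".",
         paste0(substr(data.str, 1, commaPosition - 1), ".",
                substr(data.str, commaPosition, nchar(data.str))),
         data.str)
}

commaPositions <- c(8, 7)

# Assigning into data[] keeps the data frame structure and column names
data[] <- Map(insertComma, data, commaPositions)

# The columns are character at this point; convert back to numbers
data[] <- lapply(data, as.numeric)
```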
3
On

Try this:

within(data, {
  North = North / 10 ^ (nchar(trunc(North)) - 7L)
  East = East / 10 ^ (nchar(trunc(East)) - 6L)
})
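
The idea is to count the digits of the integer part and divide by the power of ten that leaves seven (North) or six (East) of them. As a sketch of the same idea, the hypothetical helper rescale_digits below counts digits with floor(log10(x)) + 1 instead of nchar(); this is a swapped-in technique that avoids depending on how numbers are converted to strings. It assumes all values are positive and have at least n_int integer digits.

```r
data <- data.frame(North = c(678645345, 675836578, 6573645.786),
                   East = c(354674.47, 37465434, 4263546))

# Hypothetical helper: scale x so that n_int digits remain
# before the decimal point
rescale_digits <- function(x, n_int) {
  n_digits <- floor(log10(trunc(x))) + 1  # digits in the integer part
  x / 10^(n_digits - n_int)
}

fixed <- within(data, {
  North <- rescale_digits(North, 7)
  East <- rescale_digits(East, 6)
})
```

Unlike the string-based approaches, the columns stay numeric throughout, so no conversion back is needed.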
0
On

This seems to work, using basic functions from the tidyverse (dplyr) and base R (grepl, substr, paste0, mutate, case_when). If you're not comfortable with regex, you can try this:

library(dplyr)

data <- data.frame(North = c(678645345, 675836578, 6573645.786), East = c(354674.47, 37465434, 4263546))

data %>%
  mutate(North = case_when(grepl(".", as.character(North), fixed = TRUE) ~ as.character(North),
                           TRUE ~ paste0(substr(North,1,7),".",substr(North,8,nchar(North)))),
         East = case_when(grepl(".", as.character(East), fixed = TRUE) ~ as.character(East),
                          TRUE ~ paste0(substr(East,1,6),".",substr(East,7,nchar(East)))))

Output:

        North      East
1  6786453.45 354674.47
2  6758365.78 374654.34
3 6573645.786  426354.6