Adding leading zeros to an ID column and assigning the modified values back to the variable in the data frame


The dataset I'm working with is a .csv file with 302 rows and 13 columns, with the following headers: id (1, 2, ..., 302), source_code (AAA, BBB, CCC), date, day, month, year, time, hour, minute, second, latitude, longitude, and inscriptions (NA, 1 or 0).

I want the column of IDs (1, 2, 3, ..., 302) to have 3-digit numbers, with leading zeros when necessary (001, 002, ..., 010, ..., 302).

This is what I have done:

data<-read.csv("C:/TFM/corrected_filtered_coordinates.csv")

str(data)

#'data.frame': 302 obs. of 13 variables:

#$ id : int 1 2 3 4 5 6 7 8 9 10 ...

data$id <- as.numeric(data$id)

sprintf("%03d", data$id)

#sprintf("%03d", data$id)

#[1] "001" "002" "003" "004" "005" "006" "007" "008" "009" "010"

str(data)

#'data.frame': 302 obs. of 13 variables:

#$ id : int 1 2 3 4 5 6 7 8 9 10 ...

So far, I have found 2 functions that work (A list with the transformed IDs appears in the console after I run either of them):

formatC(data$id, width = 3, format = "d", flag = "0") and sprintf("%03d", data$id)

However, when I display the data after running either of those functions, using data or str(data), the IDs still appear without the leading zeros!

I have tried converting the IDs to character with data$id <- as.character(data$id) and to numeric with data$id <- as.numeric(data$id) before running the functions, but nothing appears to work!

How can I make the IDs keep the changes? I am failing to assign the modified values back to the variable in the data frame.

Thank you in advance!!!

Best answer

Guessing at this a bit, but it sounds like you are failing to assign the modified values back to the variable in the data frame.

dd <- data.frame(id=c(1,2,3,10))
dd$id <- sprintf("%03d", dd$id)
str(dd)
## 'data.frame':    4 obs. of  1 variable:
## $ id: chr  "001" "002" "003" "010"
  • I used dd rather than data for the name of the data frame; data is also the name of a built-in function in R, and it's best practice to avoid reusing it as a variable name.
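
The same assignment pattern should work with formatC(), the other function mentioned in the question. The sketch below is only an illustration: the small data frame is a hypothetical stand-in for the real 302-row file, and the temporary CSV file exists just to show one likely follow-up issue, namely that the padded IDs are character strings and read.csv() will turn them back into integers unless told otherwise.

```r
# Hypothetical stand-in for the asker's data (only two of the 13 columns).
dd <- data.frame(id = c(1L, 2L, 10L, 302L),
                 source_code = c("AAA", "BBB", "CCC", "AAA"))

# formatC() returns a new character vector and leaves dd untouched;
# assigning the result back to dd$id is the step that changes the data frame.
dd$id <- formatC(dd$id, width = 3, format = "d", flag = "0")
str(dd$id)

# Round trip through a temporary CSV file (assumed path, for illustration only).
tmp <- tempfile(fileext = ".csv")
write.csv(dd, tmp, row.names = FALSE)

# With default settings, read.csv() converts the id column back to integers,
# so the leading zeros are lost again.
reread_default <- read.csv(tmp)

# Asking for the id column as character keeps the padded values.
reread_padded <- read.csv(tmp, colClasses = c(id = "character"))
str(reread_padded$id)
```

Note that after the assignment the id column is of type character, not numeric, so the zeros are part of a text label rather than a number. If the IDs are needed for arithmetic or numeric sorting later, as.integer(dd$id) converts them back.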