I have a problem with my dataframe. The missing values are marked with # and I can't find a way to automatically replace them with NA.
Here is my dataframe: https://gofile.io/?c=BfpgbC
This is what I have tried:
library(naniar)
df_new= testframe %>% replace_with_na(replace = list(NO2_Königsplatz = "#"))
testframe[testframe== "#"] <- NA
Neither approach works. Replacing each value manually does work, but that is not an option because it takes too long.
After replacing the missing values with NA, I want to convert all columns except the first to numeric so that I can calculate the means.
Any ideas how to solve this?
EDIT WITH CORRECT DATA
Here is a second approach:
We can use the lubridate and dplyr packages. The last step will generate warnings about coerced NA values, which can be ignored.
Note that the timezone is assumed to be UTC unless specified otherwise.
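A minimal sketch of this kind of approach, assuming the timestamps look like "01.01.2019 01:00" and using a small made-up data frame in place of the real one (the column names NO2_Station1 and NO2_Station2 and all values are hypothetical):

```r
library(dplyr)
library(lubridate)

# Hypothetical stand-in for the real data: every column is read as text
# because of the "#" markers.
testframe <- data.frame(
  Zeitpunkt    = c("01.01.2019 01:00", "01.01.2019 02:00", "01.01.2019 03:00"),
  NO2_Station1 = c("22", "#", "25"),
  NO2_Station2 = c("#", "18", "20"),
  stringsAsFactors = FALSE
)

df_new <- testframe %>%
  # Parse day.month.year hour:minute; dmy_hm() uses tz = "UTC" by default
  mutate(Zeitpunkt = dmy_hm(Zeitpunkt)) %>%
  # Coerce every other column to numeric; "#" becomes NA with a warning
  mutate(across(-Zeitpunkt, as.numeric))

# Column means, ignoring the missing values
col_means <- df_new %>%
  summarise(across(-Zeitpunkt, ~ mean(.x, na.rm = TRUE)))
```

If the real timestamps use a different layout, another lubridate parser (for example ymd_hms()) would be needed instead of dmy_hm().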
Result:
EDIT WITH BASE R SOLUTION
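A base R version could look roughly like the sketch below. It uses a small made-up data frame (the column names and values are hypothetical) and assumes the marker may be padded with whitespace. Going through as.character() first matters if the columns were read as factors, because as.numeric() on a factor returns the internal level codes rather than the values.

```r
# Hypothetical stand-in for the real data
testframe <- data.frame(
  Zeitpunkt    = c("01.01.2019 01:00", "01.01.2019 02:00", "01.01.2019 03:00"),
  NO2_Station1 = c("22", " # ", "25"),
  NO2_Station2 = c("#", "18", "20"),
  stringsAsFactors = FALSE
)

# Clean every column except the first
testframe[-1] <- lapply(testframe[-1], function(x) {
  x <- trimws(as.character(x))  # also safe for factor columns
  x[x == "#"] <- NA             # replace the marker before converting
  as.numeric(x)                 # no coercion warning, "#" is already NA
})

# Means of the measurement columns, ignoring missing values
col_means <- colMeans(testframe[-1], na.rm = TRUE)
```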
OLD ANSWER WITH INCORRECT DATA
You can specify which values represent missing values when you import the data into R. In general, if you are unsure about the data, it is best to read it "as is", explore it, work out the quirks of that particular dataset, and then go back and fix it up.
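As an illustration of declaring the marker at import time, here is a small sketch using the na.strings argument of read.table(). The inline text stands in for a real file, and the semicolon separator, column names and values are assumptions rather than details of the linked data:

```r
# Hypothetical sample standing in for the real file
raw <- "Zeitpunkt;NO2_Station1;NO2_Station2
01.01.2019 01:00;22;#
01.01.2019 02:00;#;18
01.01.2019 03:00;25;20"

# na.strings tells R to treat "#" as NA while reading, so the
# measurement columns come in as numeric straight away
testframe <- read.table(text = raw, sep = ";", header = TRUE,
                        na.strings = "#", stringsAsFactors = FALSE)

str(testframe)

# Means of all columns except the first
col_means <- colMeans(testframe[-1], na.rm = TRUE)
```

For a file on disk, the text = raw argument would be replaced by the file path; read.csv() and read.csv2() accept the same na.strings argument.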
For the data linked in your question, this should work:
Result:
You might also consider converting the Zeitpunkt column to a datetime class, depending on what you want to do next.
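One possible way to do that conversion in base R is as.POSIXct() with an explicit format string. The sketch below assumes timestamps written as day.month.year hour:minute; the sample values are made up, and the format would need adjusting to whatever the real column contains:

```r
# Hypothetical timestamps in the assumed "dd.mm.yyyy HH:MM" layout
zeitpunkt <- c("01.01.2019 01:00", "01.01.2019 02:00", "02.01.2019 13:30")

# Parse to POSIXct; the timezone is set explicitly to UTC here
parsed <- as.POSIXct(zeitpunkt, format = "%d.%m.%Y %H:%M", tz = "UTC")

# Once parsed, date parts are easy to extract, e.g. for grouping by day
day  <- as.Date(parsed)
hour <- as.integer(format(parsed, "%H"))
```

Strings that do not match the format come back as NA, which is a quick way to spot malformed rows.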