Desired Output
Need the following output
df2 <-
data.frame(
v1 = c(1100001, 1100002, 1100003, 1100004, 1100005)
, v2 = c("A R", "W R", "A K", "M", "A C")
, v3 = c("P", "G P", "G P", "P", "P")
, v4 = c(110, 161, 129, 132, "Absent")
, v5 = c(55, 80.5, 64.5, 66, "Absent")
)
df2
v1 v2 v3 v4 v5
1 1100001 A R P 110 55
2 1100002 W R G P 161 80.5
3 1100003 A K G P 129 64.5
4 1100004 M P 132 66
5 1100005 A C P Absent Absent
Here is my original data.frame
.
df1 <-
structure(list(value = c(
"1100001 A R P 110 55",
"1100002 W R G P 161 80.5",
"1100003 A K G P 129 64.5",
"1100004 M P 132 66",
"1100005 A C P Absent Absent"
)), row.names = c(NA, -5L), class = c("data.frame")
)
df1
value
1 1100001 A R P 110 55
2 1100002 W R G P 161 80.5
3 1100003 A K G P 129 64.5
4 1100004 M P 132 66
5 1100005 A C P Absent Absent
Used cSplit
function from splitstackshape
, but could not manage to get the required output. Any points.
library(splitstackshape)
cSplit(indt = df1, splitCols = "value", sep = " ")
value_1 value_2 value_3 value_4 value_5 value_6 value_7
1: 1100001 A R P 110 55 NA
2: 1100002 W R G P 161 80.5
3: 1100003 A K G P 129 64.5
4: 1100004 M P 132 66 NA NA
5: 1100005 A P Absent Absent NA NA
1.Assuming the rule for separation are: a) more than one space b) numbers follwing a letter and space are to be split
2.We make sure all soon to be varibles are separated by more than one space (using
gsub
to replace one space between a letter and a number with two spaces)3.Then we use
tidyr::separate
to separate the string into variables using two or more spaces as deliminiatorReturns:
Edit
Regarding the new constraint that was not included in the original example (only one space between a number and a letter):
Suggested solution:
We just repeat the command for adding an extra space with "reversed" regex. So that any single space between a number and a letter will get an extra space and will in then be separated by the
separate
call