Transpose dataframe from wide-to-long with multiple values

83 Views Asked by At

I have wide data with multiple values that I need to transpose into a long data table. I've looked at several sites and StackOverflow and cannot seem to find the answer to this, even though its so simple.

Some example data:

#example data
wide <- data.frame(
  VariableName = c("Var1","Var2","Var3","Var4","Var5"),
 Year1 = c(411,723,325,456,579),
 Year2 = c(123,300,400,500,600),
 Year3 = c(1457,1000,569,896,956)
)

which looks like this

  VariableName Year1 Year2 Year3
1         Var1   411   123  1457
2         Var2   723   300  1000
3         Var3   325   400   569
4         Var4   456   500   896
5         Var5   579   600   956

And I need it to look like this

VariableName    Var1  Var2  Var3  Var4  Var5
Year1           411   723   325   456   579
Year2           123   300   400   500   600
Year3           1457  1000  569   896   956

I've tried several functions, including gather(), pivot_longer(), melt() but I just can't seem to get the function to either work, or give me the results I need.

2

There are 2 best solutions below

0
On BEST ANSWER

Option 1: With tidyr, pivot_longer and then pivot_wider:

library(tidyr)

wide %>%
  pivot_longer(-VariableName, names_to = 'Year') %>%
  pivot_wider(id_cols = Year, names_from = VariableName)

Option 2: You can also use some useful tools provided by tibble to operate row names before and after transposing the data.

library(tibble)

wide %>%
  column_to_rownames("VariableName") %>%
  t() %>% as.data.frame() %>%
  rownames_to_column("Year")
Output
# # A tibble: 3 × 6
#   Year   Var1  Var2  Var3  Var4  Var5
#   <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 Year1   411   723   325   456   579
# 2 Year2   123   300   400   500   600
# 3 Year3  1457  1000   569   896   956
0
On

As you are only looking for a transposed version of your data frame, you can simply transpose it with t(). Directly applying t() to your data frame does not work, though, because you have stored the row names in a column, which means that data coercion will cast all your data to the least common denominator, i.e. to strings:

> t(wide)
             [,1]   [,2]   [,3]   [,4]   [,5]  
VariableName "Var1" "Var2" "Var3" "Var4" "Var5"
Year1        "411"  "723"  "325"  "456"  "579" 
Year2        "123"  "300"  "400"  "500"  "600" 
Year3        "1457" "1000" " 569" " 896" " 956"

Solution: Store the variable names in the row names, not in a column.

> row.names(wide) <- wide$VariableName
> wide <- subset(wide, select=-c(VariableName))
> t(wide)
       Var1 Var2 Var3 Var4 Var5
Year1  411  723  325  456  579
Year2  123  300  400  500  600
Year3 1457 1000  569  896  956

As @darren-tsai pointed out, the result of t() is a matrix, so if you need a data frame, you must convert it with as.data.frame.