Error in write_dta : A provided string value was longer than the available storage size of the specified column


I am trying to export my data table from RStudio to the dta format. I use the write_dta function from the haven package in R and get the following error:

A provided string value was longer than the available storage size of the specified column.

I am quite new to R and Stata and don't understand what it means or what I should do about it.

Thomas Rosa On BEST ANSWER

It sounds like you have a piece of long text in your data.frame. write_dta has known issues handling long strings (https://github.com/tidyverse/haven/issues/437). You can trim the strings in your data.frame like this:

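# Trim character columns only; lapply() keeps every other column's type
# (apply() would first coerce the whole data.frame to a character matrix).
df <- YOUR_DATA
df[] <- lapply(df, function(x) if (is.character(x)) substr(x, 1, 128) else x)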

Then try write_dta(df, "YOUR_FILE.dta") (write_dta needs an output path as its second argument). A maximum length of 128 characters should be safe, but newer versions of Stata can handle a lot more.
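Before trimming, it can help to see which columns actually hold the long strings. The following is a minimal base R sketch, not part of haven: the example data frame, the helper name max_bytes, and the limit of 128 are illustrative assumptions.

```r
# Hypothetical example data; replace `df` with your own data frame.
df <- data.frame(
  id = 1:3,
  short = c("a", "bb", "ccc"),
  long = c("x", strrep("y", 300), NA),
  stringsAsFactors = FALSE
)

# Longest value (in bytes) in one character vector; NA values are ignored.
# max_bytes is a helper defined here, not a haven function.
max_bytes <- function(x) {
  lens <- nchar(x[!is.na(x)], type = "bytes")
  if (length(lens) == 0) 0L else max(lens)
}

# Apply the helper to the character columns only.
is_chr <- vapply(df, is.character, logical(1))
max_len <- vapply(df[is_chr], max_bytes, integer(1))

# Columns whose longest value exceeds the limit you plan to trim to.
limit <- 128
too_long <- names(max_len)[max_len > limit]

print(max_len)
print(too_long)
```

Any column listed in too_long would be shortened by the trimming step, so you can decide whether losing that text is acceptable.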

colonus On

I noticed that with the data.frame solution any variable labels get lost. A tibble keeps them (e.g. an imported *.sav file with labels from a survey collection platform).

Here is a tidyverse solution that uses haven to read and write and keeps the labels. Keep in mind that your initial df also needs to be a tibble.

library(tidyverse)

df <- haven::read_sav("YOUR FILE.sav")   # could also be some other file format that you start with as a tibble

df <- df %>%
  mutate(across(where(is.character), ~ substr(., 1, 2045)))

haven::write_dta(df, "NAME OF NEW FILE.dta")

For me, the maximum string length that write_dta() accepted was 2045.
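One caveat, stated as an assumption rather than something confirmed in this thread: if the 2045 limit is counted in bytes rather than characters, substr() may leave strings with accented or other multi-byte UTF-8 characters over the limit, because substr() counts characters. A rough base R sketch of byte-aware trimming follows; trim_to_bytes is a hypothetical helper, not a haven function.

```r
# Hypothetical helper: shorten each string until it fits in `limit` bytes.
# Assumes the storage limit is measured in bytes (not verified here).
trim_to_bytes <- function(x, limit) {
  # First cut by characters; this is enough for pure ASCII text.
  out <- substr(x, 1, limit)
  repeat {
    # Find non-missing strings that are still too long in bytes.
    over <- !is.na(out) & nchar(out, type = "bytes") > limit
    if (!any(over)) break
    # Drop one trailing character from each of them and check again.
    out[over] <- substr(out[over], 1, nchar(out[over], type = "chars") - 1)
  }
  out
}

# "\u00e9" (e with acute accent) takes two bytes in UTF-8.
x <- c("abc", strrep("\u00e9", 10), NA)
trimmed <- trim_to_bytes(x, 5)
print(nchar(trimmed, type = "bytes"))
```

In the pipeline above it could replace the substr() call, e.g. mutate(across(where(is.character), ~ trim_to_bytes(., 2045))).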