I am attempting to loop-apply a function to a list of filepaths in R, with the goal of appending the output to an out-of-memory SQLite database using dbplyr. The database is currently empty and has a predefined structure.
Here is the database structure (I do not know how to create a proper reprex for a db connection)
database
# Source: table<epi_table> [0 x 14]
# Database: sqlite 3.44.2 [/home/me/repositories/some_path/data/some_path.db]
# ℹ 14 variables: doc_index <int>, content_type <chr>, style_name <chr>, text <chr>, level <dbl>,
# num_id <int>, row_id <int>, is_header <int>, cell_id <dbl>, col_span <dbl>, row_span <int>,
# successful_import <int>, filepath <chr>, doc_vs_docx <chr>
The function I loop-apply to the list of filepaths produces outputs like this for each iteration:
new_row <- structure(list(doc_index = 1L, content_type = "paragraph", style_name = NA_character_,
text = "Some_text", level = NA_real_, num_id = NA_integer_,
row_id = NA_integer_, is_header = NA, cell_id = NA_real_,
col_span = NA_real_, row_span = NA_integer_, successful_import = TRUE,
filepath = "some_file.docx", doc_vs_docx = "docx"), row.names = c(NA,
-1L), class = c("tbl_df", "tbl", "data.frame"))
new_row
# A tibble: 1 × 14
doc_index content_type style_name text level num_id row_id is_header cell_id col_span row_span successful_import filepath doc_vs_docx
<int> <chr> <chr> <chr> <dbl> <int> <int> <lgl> <dbl> <dbl> <int> <lgl> <chr> <chr>
1 1 paragraph NA Some_text NA NA NA NA NA NA NA TRUE some_file.docx docx
I attempted to use the following code to insert rows into the database:
database |> rows_insert(new_rows, conflict = 'ignore')
However, I encountered an error regarding different object classes:
Error in `auto_copy()`:
! `x` and `y` must share the same src.
ℹ `x` is a <tbl_SQLiteConnection/tbl_dbi/tbl_sql/tbl_lazy/tbl> object.
ℹ `y` is a <tbl_df/tbl/data.frame> object.
ℹ Set `copy = TRUE` if `y` can be copied to the same source as `x` (may be slow).
If I intend to add regular tibble rows to a tbl object, do I need to convert it to a tbl object beforehand? Is dbplyr not that useful for this particular use case? I am not much interested in learning proper SQL synthax right now, and always relied on dbplyr.
Related: Is it possible to insert (add) a row to a SQLite db table using dplyr package?
I eventually understood
DBI::dbWriteTable
is so simple and efficient, we should not be looking elsewhere.Although
dbplyr
can be a go-to for anyone averse to SQL, it is usually recommended as a data analysis tool, and there is some strong opinion against using the package to manipulate/edit databases see Hadley's criticism in a comment here and here.I found a lot of comments suggesting we use the
DBI
package instead. I managed to add the rows by simply usingDBI::dbWriteTable()
, as in:And now I can even keep using
dbplyr
to query the updated table withcollect(database)