I am trying to use a parametrized query multiple times with package:ROracle
.
library(DBI)
conn <- dbConnect(ROracle::Oracle(), ...)
statement <- "select * from bigtable where name = :1"
name <- c("Bob", "Alice", "Carol")
I expected the query to be executed for each row of the data frame (i.e. as for to the param
argument in DBI::dbGetQuery()
), but it is not the case:
dbGetQuery(conn, statement, data.frame(name))
# Error in .oci.GetQuery(conn, statement, data = data, prefetch = prefetch, :
# bind data has too many rows
I can loop over the names, but I think it may be less efficient (I am new to RDBMS but I read about optimization/caching):
lapply(name, function(i) dbGetQuery(conn, statement, data.frame(i)))
How can it be done efficiently?
There are some related questions that does not solve the problem:
- In this (old, unanswered) question, OP uses now-depreceated functions in
package:ROracle
. - In this question, the problem was finally "solved" using a loop, which is inefficient (see above).
I think there are a couple of ways to approach this.
Insert (to a temp table), single-query on a join:
This method has the advantage of being a single query. If you need it split by name later, you can always use R's
split
.(Apparently this doesn't work in
ROracle
?) Create a stored-statement and repeatedly bind/fetch:I do not tend to use this method, though that's as much preference as anything else.
Use
SELECT ... NAME IN (...)
The
paste(...)
in the statement produces":1,:2,:3"
(counting along the length ofname
), which should be compatible with your original query mechanism.Having said that ... every server has its own optimization methods, but I suspect that with your
lapply(.., \(z) dbGetQuery(..))
, your server will cache the optimized version since your query does not change between iterations. A DBA more familiar with Oracle might be able to clarify or refute my claim, but I have a hard time believing that it does not cache the query for some time (in a sql statement cache).