R DBI::dbGetQuery - adding multiple values within IN from a vector

library(RJDBC)
library(odbc)
library(tidyverse)
library(writexl)
library(readxl) 

# Sample vector for the IN clause

values_for_in_clause <- c("23545234", "3424566", "11245677")

# Comma-separated string for the IN clause

in_clause_values <- paste0("'", values_for_in_clause, "'", collapse = ",")


data <- DBI::dbGetQuery(Connection, 
                                 "
SELECT numeric_column
FROM data 
WHERE    numeric_column IN (", in_clause_values, ")
")

I get this error:

Error in .jcall(s, "V", "setString", i, as.character(v)) :
java.sql.SQLException: ORA-17003: Invalid column index

The generated SQL should look like this:

SELECT numeric_column
FROM data 
WHERE    numeric_column IN ('23545234', '3424566', '11245677')

jamespryor On BEST ANSWER

Combine sQuote with toString to format a vector for use in a SQL statement:

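library(DBI)
library(odbc)
library(tidyverse)

values <- c("23545234", "3424566", "11245677")

# Wrap each value in plain single quotes, then join them with ", "
formatted_values <- toString(sQuote(values, q = FALSE))

# No trailing semicolon: Oracle typically rejects one sent through JDBC/ODBC
query <- paste0("select numeric_column
from data
where numeric_column in (", formatted_values, ")")

data <- dbGetQuery(conn = connection,
                   statement = query)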
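
To make the two formatting steps easier to follow, here is a minimal base R sketch that runs without a database connection. The intermediate name `quoted` is only illustrative and does not appear in the answer above.

```r
values <- c("23545234", "3424566", "11245677")

# Step 1: wrap each element in plain ASCII single quotes
# (q = FALSE disables the locale-dependent "fancy" quotes)
quoted <- sQuote(values, q = FALSE)

# Step 2: collapse the quoted elements into one string separated by ", "
formatted_values <- toString(quoted)

print(formatted_values)
# [1] "'23545234', '3424566', '11245677'"
```

Note that sQuote does not escape single quotes embedded in the values, so this approach is only reasonable for input you control.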
Marmite Bomber On

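Your problem is the missing paste in the dbGetQuery call. Without it, the SQL fragments are not joined into one string; the extra arguments are most likely passed on as bind parameters for a statement that contains no placeholders, which would explain the "Invalid column index" error raised from setString.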

This adaptation will work (note the added paste):

data <- DBI::dbGetQuery(jdbcConnection, 
paste(
"SELECT numeric_column
FROM data 
WHERE    numeric_column IN (", paste(values_for_in_clause, collapse=','), ")
")
)

Also note that if your column numeric_column is numeric, you do not need to quote the numbers in the list. numeric_column IN (1,2,3) works as well.
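
If you do build an unquoted list this way, it may be worth checking first that every value really is a plain integer. A minimal sketch, assuming the values arrive as character strings as in the question (the name `unquoted_list` is hypothetical):

```r
values_for_in_clause <- c("23545234", "3424566", "11245677")

# Stop early if any value is not a plain run of digits
stopifnot(all(grepl("^[0-9]+$", values_for_in_clause)))

# Join the values without quotes
unquoted_list <- paste(values_for_in_clause, collapse = ",")

query <- paste0(
  "SELECT numeric_column FROM data WHERE numeric_column IN (",
  unquoted_list, ")"
)

print(query)
```

The digit check is a simple guard for integer keys only; decimals or negative numbers would need a different pattern.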

A potentially bigger problem is the string concatenation in the SQL text (look up "SQL injection"). The more secure way, which also performs better when the query is executed frequently, is to pass the list as parameters (bind variables):

data <- DBI::dbGetQuery(jdbcConnection, 
"SELECT numeric_column
FROM data 
WHERE    numeric_column IN (?,?,?)",
list=as.list(values_for_in_clause) 
)

Note that if your parameter vector has a dynamic size, you will need to generate the appropriate number of question marks in the SQL query.
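
One way to generate the question marks is sketched below. The variable name `placeholders` is hypothetical, and the commented-out call assumes an open RJDBC connection named `jdbcConnection`, as in the snippet above.

```r
values_for_in_clause <- c("23545234", "3424566", "11245677")

# One "?" per value, joined with commas
placeholders <- paste(rep("?", length(values_for_in_clause)), collapse = ",")

query <- paste0(
  "SELECT numeric_column FROM data WHERE numeric_column IN (",
  placeholders, ")"
)

print(query)
# [1] "SELECT numeric_column FROM data WHERE numeric_column IN (?,?,?)"

# With an open connection, the values are then bound rather than pasted in:
# data <- DBI::dbGetQuery(jdbcConnection, query,
#                         list = as.list(values_for_in_clause))
```

Only the placeholders are concatenated into the SQL text here; the values themselves never become part of the statement string.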