Query of Oracle database using RJDBC

111 Views Asked by At

Using RJDBC to connect to a server, which hosts an oracle (11.2g) database, I can use DBI::dbGetQuery() to get results from queries. RJDBC::dbGetTables() and DBI::dbListTables() both work.

I cannot, however, use the dplyr::tbl() to work with the db-tables as if they were in-memory objects, using dplyr verbs for example. I was able to do in my former job and liked it very much :-)

tbl(con, in_schema("abc","TABLE_NAME")) 

results in:

Error in `dplyr::db_query_fields()`:
! Can't query fields.
Caused by error in `dbSendQuery()`:
! Unable to retrieve JDBC result set
  JDBC ERROR: ORA-00933: SQL command not properly ended

  Statement: SELECT *
FROM "abc"."TABLE_NAME" AS "q01"
WHERE (0 = 1)

I have already tried the following from posit (https://solutions.posit.co/connections/db/databases/oracle/):

sql_translation.JDBCConnection <- dbplyr:::sql_translation.Oracle
sql_select.JDBCConnection <- dbplyr:::sql_query_select.Oracle
sql_subquery.JDBCConnection <- dbplyr:::sql_query_wrap.Oracle

but now

tbl(con, in_schema("abc","TABLE_NAME")) 

results in:

Error in `collect()`:
! Failed to collect lazy table.
Caused by error in `dbSendQuery()`:
! Unable to retrieve JDBC result set
  JDBC ERROR: ORA-00933: SQL command not properly ended

  Statement: SELECT *
FROM ("abc"."TABLE_NAME") 
FETCH FIRST 11 ROWS ONLY

Since I have the feeling that our oracle version doesn't like "FETCH FIRST", I've changed the query being sent with this:

trace(dbplyr:::sql_query_select.Oracle, edit=TRUE)

replacing "FETCH FIRST ", <limit>, "ONLY" with "WHERE ROWNUM <", <limit>". This returns a result! But: 1) R believes the first 10 rows to be all there is (and doesn't know the query was lazy) and 2) dyplr::filter doesn't work as it returns:

Error in `collect()`:
! Failed to collect lazy table.
Caused by error in `dbSendQuery()`:
! Unable to retrieve JDBC result set
  JDBC ERROR: ORA-00933: SQL command not properly ended

  Statement: SELECT *
FROM ("abc"."TABLE_NAME") 
WHERE ("variable" = x)
WHERE ROWNUM < 11

So how to achieve that dplyr verbs work and R perfoming lazy queries (and knowing about it)?

Thanks for your help!

1

There are 1 best solutions below

2
On
SELECT *
FROM "abc"."TABLE_NAME" AS "q01"
WHERE (0 = 1)

Is invalid as Oracle forbids you using the AS for a table alias; if you get R to generate SQL with the correct dialect so that it does not use AS before table aliases then it should work.


SELECT *
FROM ("abc"."TABLE_NAME") 
WHERE ("variable" = x)
WHERE ROWNUM < 11

Is invalid as you have two WHERE clauses; the second WHERE should be an AND.

R believes the first 10 rows to be all there is

If you have hardcoded a static filter to a query that limits the output to ROWNUM < 11 then you will only ever get 10 rows (and those rows will be the first 10 random rows that match the filter conditions, not 10 rows in a specific order). So, R is correct, that is all there is with a hardcoded filter of 10 rows.

If you want to paginate a query then you want (for example, fetching rows 101-110):

SELECT column1,
       column2,
       column3
FROM   (
  SELECT column1,
         column2,
         column3,
         ROWNUM AS rn
  FROM   (
    SELECT column1,
           column2,
           column3
    FROM   "abc"."TABLE_NAME"
    WHERE  "variable" = x
    ORDER BY something
  )
  WHERE  ROWNUM <= 110
)
WHERE rn > 100

(Order first, then filter by the upper bound and use an inline view to materialise the row numbering and then filter by the lower bound using the materialised row number.)