I've got a Snowflake SQL query I'm trying to execute in R via ODBC connection that looks like this
SET quiet=TRUE;
USE SOMEDATABASE.SOMESCHEMA;
--Select timestamp of last sale per customer
DROP TABLE IF EXISTS sales;
CREATE TEMPORARY TABLE sales(CustomerId VARCHAR(16777216), SaleTS TIMESTAMP_NTZ(9));
INSERT INTO sales
SELECT CustomerId,
SaleTS
FROM SALES
WHERE SaleTS>= '2020-11-19 00:00:00'
AND SaleTS <= '2020-11-19 23:59:59.999'
GROUP BY CustomerId;
--Use temp table to get correct row from sales table
SELECT SUM(SalesDetail.price) as SumPrice
COUNT(*) as SoldVolume
FROM sales
LEFT JOIN SALES as SalesDetail
ON Sales.CustomerId = SalesDetail.CustomerId
AND sales.SaleTS = SalesDetail.SaleTS
Querying Microsoft SQL Server from R I'd normally include set nocount no; at the top of the query to ensure only the last step is returned to R to avoid the error Actual statement count 6 did not match the desired statement count 1. Error makes sense, SQL is returning 6 components when R is expecting 1 (6 one for each step in my SQL query). In Snowflake there doesn't appear to be an option to set nocount on in the same way. My question is how do I avoid the above error. Does anyone have any experience of executing a mutli-step Snowflake SQL query via R? How can I get R to receive just the last statement from the ODBC connection. So far I've tried set nocount=TRUE;, set echo=FALSE;, set message=FALSE;, SET quiet=TRUE etc
Snowflake SQL is expressive enough and proposed code could be structured as single query:
The original query is using the same name for both table and temporary table differing only by case
salesvsSALES, which is error-prone.Second: database and schema could be set up during establishing connection, so there is no need for
USEinside script. Alternatively fully qualified name could be used in the script.I guess the intent of query is as follow:
If it is possible that a single person has two entries for exactly the same SaleTS then
RANK() OVER(...)should be used instead.