This is in an ADF pipeline connected to salesforce.
Using common tsql functions like REPLACE(), TRANSLATE() is causing an error.
Goal: Clean special characters, such as newlines and the delimiter of the file being written, out of a long text field.
Problem:
- Without any of these functions, the pipeline runs successfully, but the file it outputs is corrupt.
- When the file is read in a datafactory notebook or opened in excel, the values are wrong, for example null values in the primary key.
With this query the pipeline runs without error but the file is corrupt:
SELECT
Id,
longtextfield
FROM
exampleTable
I have tried:
REPLACE(longtextfield, '|', '') AS longtextfield
CAST(REPLACE(longtextfield, '|', '') AS varchar(5000)) AS longtextfield
REPLACE(ISNULL(longtextfield, ''), '|', '')
TRANSLATE(longtextfield, '|', '')
And other variants. They all fail with this error message:
ERROR [HY000] [Microsoft][Salesforce] (120) SOQL_FIRST mode prepare failure: SOQL error: [Microsoft][SQLEngine] (31480) syntax error near 'SELECT ISNULL(<<< ??? >>>Subject,'') AS testname FROM exampletable
Of course I have carefully verified that the table name, object name, etc. are correct, I cannot determine any actual syntax error, it seems like the use of tsql statements themselves is causing the error.
I have seen these functions in use (on other objects and fields) successfully in very similar pipelines in our org. These pipelines are very simple, there is just 1 copy data activity querying and writing the file.
I'm not intimately familiar with how the tsql-soql translation works, is there something at that level that could be rejecting the function?
SOQL does not support the REPLACE function or other general-purpose text manipulation functions. [Note: SOQL and t-SQL are different]. The available functions in SOQL are listed in the official documentation.
The workaround is to copy the data to azure database / azure storage and then transform it using ADF dataflow activity and then copy the transformed data to the sink. In dataflow, you can use derived column transformation to redefine the column. In derived column transformation, use replace function
replace(longtextfield,'|','')This way, you can replace the characters from the data.