I have a problem parsing the unstructured text file. I tried to upload everything into one column with data type DT_STR but it throws error since there are special characters like '=' in my file. I tried with DT_TEXT data type, it worked well. But I cannot perform any operations on the column. I need to remove empty rows in that column which can be done with 'conditional split'. But the condition:
[column 0]!=""
does not work and throws the following error:
TITLE: Microsoft Visual Studio
Error at Data Flow Task [Conditional Split [32]]: The data types "DT_TEXT" and "DT_WSTR" are incompatible for binary operator "!=". The operand types could not be implicitly cast into compatible types for the operation. To perform this operation, one or both operands need to be explicitly cast with a cast operator.
Error at Data Flow Task [Conditional Split [32]]: Attempt to set the result type of binary operation "[Column 0] != """ failed with error code 0xC0047080.
Error at Data Flow Task [Conditional Split [32]]: Computing the expression "[Column 0]!=""" failed with error code 0xC0047084. The expression may have errors, such as divide by zero, that cannot be detected at parse time, or there may be an out-of-memory error.
Error at Data Flow Task [Conditional Split [32]]: The expression "[Column 0]!=""" on "Conditional Split.Outputs[Case 1]" is not valid.
Error at Data Flow Task [Conditional Split [32]]: Failed to set property "Expression" on "Conditional Split.Outputs[Case 1]".
ADDITIONAL INFORMATION:
Exception from HRESULT: 0xC0204006 (Microsoft.SqlServer.DTSPipelineWrap)
BUTTONS:
OK
I did the parsing with c# script previously. But now I need to avoid coding. Any ideas on how to parse this file are appreciated. Thankyou
Let's break this down
You have an expression
[column 0]!=""
the empty string""
is DT_WSTR and[column 0]
is DT_TEXT data type. As the error message indicates, you cannot test for a not equals condition between these types.I would look at either converting Column 0 to DT_WSTR or simply checking the length.
LEN([column 0]) > 0
The SSIS way
The bigger issue is that your file is not "unstructured text file". It's very structured. The column headers are in line 3 (1 based counting) and data starts in 5. The data itself looks fixed width from here.
How do you handle that in SSIS? Create a Flat File Connection Manager of type Ragged Right. Indicate there are no header columns and skip the first 4 rows/data starts in row 5. Create the columns by specifying the width and now your data flow is simpler and the package will run faster.