Parse unstructured flat file on SSIS

639 Views Asked by At

I have a problem parsing the unstructured text file. I tried to upload everything into one column with data type DT_STR but it throws error since there are special characters like '=' in my file. I tried with DT_TEXT data type, it worked well. But I cannot perform any operations on the column. I need to remove empty rows in that column which can be done with 'conditional split'. But the condition:

[column 0]!=""

does not work and throws the following error:

TITLE: Microsoft Visual Studio

Error at Data Flow Task [Conditional Split [32]]: The data types "DT_TEXT" and "DT_WSTR" are incompatible for binary operator "!=". The operand types could not be implicitly cast into compatible types for the operation. To perform this operation, one or both operands need to be explicitly cast with a cast operator.

Error at Data Flow Task [Conditional Split [32]]: Attempt to set the result type of binary operation "[Column 0] != """ failed with error code 0xC0047080.

Error at Data Flow Task [Conditional Split [32]]: Computing the expression "[Column 0]!=""" failed with error code 0xC0047084. The expression may have errors, such as divide by zero, that cannot be detected at parse time, or there may be an out-of-memory error.

Error at Data Flow Task [Conditional Split [32]]: The expression "[Column 0]!=""" on "Conditional Split.Outputs[Case 1]" is not valid.

Error at Data Flow Task [Conditional Split [32]]: Failed to set property "Expression" on "Conditional Split.Outputs[Case 1]".


ADDITIONAL INFORMATION:

Exception from HRESULT: 0xC0204006 (Microsoft.SqlServer.DTSPipelineWrap)


BUTTONS:

OK

I did the parsing with c# script previously. But now I need to avoid coding. Any ideas on how to parse this file are appreciated. Thankyou

1

There are 1 best solutions below

5
On

Let's break this down

The data types "DT_TEXT" and "DT_WSTR" are incompatible for binary operator "!="

You have an expression [column 0]!="" the empty string "" is DT_WSTR and [column 0] is DT_TEXT data type. As the error message indicates, you cannot test for a not equals condition between these types.

I would look at either converting Column 0 to DT_WSTR or simply checking the length. LEN([column 0]) > 0

The SSIS way

The bigger issue is that your file is not "unstructured text file". It's very structured. The column headers are in line 3 (1 based counting) and data starts in 5. The data itself looks fixed width from here.

How do you handle that in SSIS? Create a Flat File Connection Manager of type Ragged Right. Indicate there are no header columns and skip the first 4 rows/data starts in row 5. Create the columns by specifying the width and now your data flow is simpler and the package will run faster.