I found these related threads, but they do not answer my question directly:
BizTalk - Flat file with Header multiple records and Footer - Disassemble problem
Removing header from a flat file in BizTalk
I'm dealing with an old system that delivers flat files with a very loose schema. In particular, the header consists of two lines: the first line is a title and the second line is column headers. All subsequent lines are valid records.
The problem is that when there are no records for that day, the column headers are omitted; in that case, we have the document title, and then a summary line (for human consumption) that notifies the reader that there are no records for that day.
Because the same file can have such different formats, I'm having a hard time creating a header schema that I can use in my flat file receive pipeline that will allow me to strip the header information off. Furthermore, since the header is multiple lines, it appears that I can't just use a carriage return delimiter.
I have tried two approaches to this:
- A header schema that contains two carriage-return-delimited field elements, each of which is an opaque string
- A header schema that contains two carriage-return-delimited records, each of which defines a dummy infix delimiter that will never exist in either line (resulting in one opaque string per record)
When I deploy these, BizTalk picks up the files and processes them, but no messages are produced. This leads me to believe that BizTalk is treating the entire file as the header, so it finds no records.
The solution I'm trying to find is how to create a header schema that causes BizTalk to treat the first two lines of a file as the header, regardless of their contents, and discard them. Is this possible?
EDIT: Examples of the different files:
Records exist:
2017-02-27 19:27:03
CustomerName, OrderNumber, Expedite, ItemNumber, Count
CustomerA, O196801, 0, I232, 2
CustomerA, O196801, 0, I255, 1
CustomerB, O196802, 0, I237, 1
CustomerC, O196803, 0, I214, 1
CustomerC, O196803, 0, I232, 2
No records in this file:
2017-02-27 19:30:22
***EOF***
- The first line is always the same, and can be described with a positionally delimited record.
- The second line is either a comma-delimited list of column names, or this EOF line.
- The EOF appears only when there are no records.
Currently I'm able to process files with records only by defining the delimiter between header and document schemas to be the entire column header line, i.e. CustomerName, OrderNumber, Expedite, ItemNumber, Count{CR}{LF}
However, this header schema fails for the empty file when it finds ***EOF*** instead of the column header string.
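To make the desired behaviour concrete, here is a minimal Python sketch of what I would like the header schema to achieve. It is purely illustrative and not BizTalk code; `strip_header` and `HEADER_LINES` are names I made up for this example. It assumes lines end in CR/LF or LF and that blank lines are not records.

```python
from typing import List

# Assumption from the samples above: the header is always exactly two lines
# (the title line, then either the column names or the ***EOF*** line).
HEADER_LINES = 2


def strip_header(text: str) -> List[str]:
    """Discard the first two lines, whatever they contain, and return the rest."""
    lines = text.splitlines()  # handles both CR/LF and LF line endings
    if len(lines) < HEADER_LINES:
        raise ValueError("expected at least a two-line header")
    # Everything after the header is a record; skip blank trailing lines.
    return [line for line in lines[HEADER_LINES:] if line.strip()]
```

Because the header is dropped without looking at its contents, the "no records" file needs no special case: once the title line and the ***EOF*** line are removed, nothing is left, so the result is simply an empty list of records.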
There might be some clever way to handle this with the Flat File schema, but I can't think of it.
I would probably look to write a custom Decode component for the pipeline that would inspect the first few bytes of the message for that ***EOF*** marker:
- if it is there, just null out the stream (or perhaps rewrite it with the expected headers)
- if not, reset the position of the stream back to 0 and pass it along.
e.g. (note: untested, probably-works code):