Question about taking sample and saving data from U2 files with headers

272 Views Asked by At

Relative newbie to using Uniquery. I've found some helpful documentation and answers from prior users posting here, and on other sites. I'm trying to document what we have in our U2 files, as we are exploring options to migrate historical data into a data warehouse running SQL.

I've been able to list out how files are structured with LIST DICT <<FILENAME>>, as well as save those results to a file, which I'm then able to view in with Excel.

Once I've found this basic data, I wanted to take some sample data from each of these files. If I use, LIST <<FILENAME>> ALL TO DELIM "|" /TSTSAMPLE.TXT SAMPLE 300, I am able to get this sample. However I was wondering if there's a way to create a tab delimited file instead of using pipe as the delimiter?

Another question I had was if anyone knew of a way to get the headers that go with the data being saved?

I've seen some suggestions of using XML, LIST <<FILENAME>> ALL TOXML, which works, but it doesn't look like empty elements are placed into the saved file.

Have additionally been using, UDT.OPTIONS 91 ON, to get any dates into a readable format for the saved file.

Thanks to any U2 pros who can offer suggestions.

1

There are 1 best solutions below

2
On BEST ANSWER

You are definitely on the right track. UDT.OPTIONS 91 ON is essential for dates and money fields. For the specific question of exporting as tab-delimited, I haven't seen it documented anywhere but this works for me:

LIST <<FILENAME>> ALL TO DELIM 9 /TSTSAMPLE.TXT SAMPLE 300

Obviously the 9 represents CHAR(9) for tab. I'm not sure if other characters work as well - I always use 9 or "|". I don't use ALL because I have dictionaries that are a mess, but good for you if yours are well maintained.

For the headers, that's a tough thing to do in general. I've tried to solve that one too, and ended up creating a tab-delimited header to use for each file. You can start from the XML dump and do some tweaking in your favorite editor to not have to do the whole thing from scratch.

The other thing that is very challenging is a) identifying MV fields and then b) deciding which are controlling and which dependent. I have a program that does this by counting MV marks in a sample of the data, and attempting to line up those fields that have the same count in all records. If you're looking to do that I can post on github or somewhere. It's complicated, and unless your data is perfectly clean, not 100% correct.