I have two data files in an unusual format that I need to parse into something more usable for later work. After parsing, I end up with two sets of records: one that references an ID, and another that holds the information about that ID.
Example:
From file 1 I get: Name, Position, PropertyId
From file 2 I get: PropertyId, Property1, Property2
Both files have more columns like these.
What is the ideal way to store this information in flat files on a server so that they act as a database? For certain reasons I don't want to use a database server (MySQL, MSSQL).
Initially I thought of using a single comma-separated file, but it ended up with so many columns that updating the data would become a problem.
I'll be using the parsed data in other applications written in Java and Python.
Can anyone suggest a better way to handle this?
Normalize your data around IDs so that a single change doesn't force you to touch many different columns. Take the file 2 you mentioned above: you can reduce it to two columns, the property ID and the property. Rather than one row that associates a property ID with two properties, each row associates one property ID with one property. You then need a third file to correlate your two main data tables. With the data normalized like this, updates stay minimal when something changes.
file1:
owner_id | name | position |
1 | Jack Ma | CEO |
file2:
property_id | property |
101 | Hollywood Mansion |
102 | Miami Beach House |
file3:
owner_id | property_id |
1 | 101 |
1 | 102 |
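
To show how the three files above could be read back together, here is a minimal Python sketch. It assumes each file is stored as plain comma-separated text with a header row, using the column names from file1 and file2; the helper names read_rows and properties_by_owner are made up for this example, and the data is inlined as strings so the snippet runs on its own.

```python
import csv
import io

# Assumed on-disk contents of the three files (plain CSV with a header row).
OWNERS_CSV = """owner_id,name,position
1,Jack Ma,CEO
"""

PROPERTIES_CSV = """property_id,property
101,Hollywood Mansion
102,Miami Beach House
"""

OWNER_PROPERTIES_CSV = """owner_id,property_id
1,101
1,102
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def properties_by_owner(owners_text, properties_text, links_text):
    """Join the three tables and return {owner name: [property, ...]}."""
    # Index the two main tables by their IDs for quick lookup.
    owners = {row["owner_id"]: row for row in read_rows(owners_text)}
    properties = {
        row["property_id"]: row["property"]
        for row in read_rows(properties_text)
    }

    # Start every owner with an empty list, then follow the link rows.
    result = {row["name"]: [] for row in owners.values()}
    for link in read_rows(links_text):
        owner = owners[link["owner_id"]]
        result[owner["name"]].append(properties[link["property_id"]])
    return result


joined = properties_by_owner(OWNERS_CSV, PROPERTIES_CSV, OWNER_PROPERTIES_CSV)
print(joined)
```

With real files you would pass the text read from each file instead of the inline strings. Renaming a property then means editing one row in the properties file, and assigning a property to a different owner means editing one row in the link file.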