Storing data in flat files


I have two data files in an unusual format that I need to parse into something more usable for later work. After parsing, I ended up with two record layouts: one contains an ID, and the other holds the information about that ID.

Example:

From file 1 I get: Name, Position, PropertyID

From file 2 I get: PropertyID, Property1, Property2

I have more columns like that in both files.

What is the ideal way to store this information in a flat file on a server as a database? I don't want to use a database (MySQL, MSSQL) for some reason.

Initially, I thought of using a single comma-separated file, but it ended up with so many columns that updating the data would become a problem.

I'll be using the parsed data in other applications written in Java and Python.

Can anyone suggest a better way to handle this?


There are 2 best solutions below

0
On

Normalize your data around IDs so that a single change does not touch many columns. For the file 2 you mentioned, you can reduce the columns to two: just the propertyId and the property. Rather than one row associating a propertyId with two properties, each row associates one propertyId with one property. You then need a third file to correlate your two main tables. With the data normalized like this, updates stay small when something changes.

file1:

owner_id | name | position
1 | Jack Ma | CEO

file2:

property_id | property
101 | Hollywood Mansion
102 | Miami Beach House

file3:

owner_id | property_id
1 | 101
1 | 102
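
To illustrate how the three files could be read back and joined, here is a minimal Python sketch. It assumes pipe-delimited text with a header row; the in-memory strings stand in for the real files, and the `read_table` and `properties_by_owner` helpers are hypothetical names, not part of any library.

```python
# In-memory stand-ins for the three files (assumed layout: pipe-delimited, header row first).
FILE1 = """owner_id | name | position
1 | Jack Ma | CEO
"""
FILE2 = """property_id | property
101 | Hollywood Mansion
102 | Miami Beach House
"""
FILE3 = """owner_id | property_id
1 | 101
1 | 102
"""


def read_table(text):
    """Parse pipe-delimited text into a list of dicts keyed by the header row."""
    # Skip blank lines, drop any trailing pipe, and trim whitespace around each cell.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    rows = [[cell.strip() for cell in ln.strip("|").split("|")] for ln in lines]
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


def properties_by_owner(owners, properties, links):
    """Join the three tables into a mapping of owner name -> list of property names."""
    owner_names = {row["owner_id"]: row["name"] for row in owners}
    property_names = {row["property_id"]: row["property"] for row in properties}
    result = {name: [] for name in owner_names.values()}
    for link in links:
        owner = owner_names[link["owner_id"]]
        result[owner].append(property_names[link["property_id"]])
    return result


joined = properties_by_owner(read_table(FILE1), read_table(FILE2), read_table(FILE3))
print(joined)
```

Because each fact lives in exactly one row, renaming a property means editing one line of file2, and reassigning it means editing one line of file3.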

2
On

I would use JSON. JSON can be easily converted to and from objects in either Python or Java. In Python, a JSON object maps directly to a dict. Java has various libraries for the conversion, such as Jackson or Gson (JAXB, by contrast, is primarily an XML binding API). Either way it is far less work than writing the parsing yourself.

Something like this.

File 1: Map people to propertyID

[
   {"firstName": "John", "lastName": "Smith", "position": "sales", "propertyId": 123},
   {"firstName": "Jane", "lastName": "Doe", "position": "manager", "propertyId": 456}
]

File 2: Map propertyId to list of properties.

{
    "123": [{"address": "123 street", "city": "LA"}, {"address": "456 street", "city": "SF"}],
    "456": [{"address": "123 ave", "city": "XX"}, {"address": "456 ave", "city": "SF"}]
}
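
A rough Python sketch of reading two such files and following the ID from a person to their properties. It assumes both files are written as valid JSON (object keys must be strings, so the property IDs appear as "123"), that each person record carries a `propertyId` field, and that the strings below stand in for the file contents. The `cities_for` helper is a hypothetical name.

```python
import json

# Stand-ins for the contents of the two files (assumed to be valid JSON).
PEOPLE_JSON = """
[
  {"firstName": "John", "lastName": "Smith", "position": "sales", "propertyId": 123},
  {"firstName": "Jane", "lastName": "Doe", "position": "manager", "propertyId": 456}
]
"""
PROPERTIES_JSON = """
{
  "123": [{"address": "123 street", "city": "LA"}, {"address": "456 street", "city": "SF"}],
  "456": [{"address": "123 ave", "city": "XX"}, {"address": "456 ave", "city": "SF"}]
}
"""

people = json.loads(PEOPLE_JSON)          # JSON array  -> list of dicts
properties = json.loads(PROPERTIES_JSON)  # JSON object -> dict keyed by ID string


def cities_for(person):
    """Return the city of every property linked to one person record."""
    # JSON object keys are always strings, so convert the numeric ID before the lookup.
    return [prop["city"] for prop in properties[str(person["propertyId"])]]


john_cities = cities_for(people[0])
print(john_cities)
```

With real files, `json.load(open_file)` would replace `json.loads(...)`, and `json.dump` writes the structures back after an update.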

P.S. It might make more sense to associate each person with a list of property IDs and give each property its own ID. That makes it easier to move things around and reassign them. Just my $0.02.
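
One way that alternative layout might look, sketched in Python. The person keys (`p1`, `p2`), the `propertyIds` field, and the `reassign` helper are all made up for illustration.

```python
import json

# Hypothetical layout: each person lists property IDs, each property has its own ID.
data = {
    "people": {
        "p1": {"firstName": "John", "lastName": "Smith", "propertyIds": [1, 2]},
        "p2": {"firstName": "Jane", "lastName": "Doe", "propertyIds": [3]},
    },
    "properties": {
        "1": {"address": "123 street", "city": "LA"},
        "2": {"address": "456 street", "city": "SF"},
        "3": {"address": "123 ave", "city": "XX"},
    },
}


def reassign(data, property_id, from_person, to_person):
    """Move one property ID from one person's list to another's."""
    data["people"][from_person]["propertyIds"].remove(property_id)
    data["people"][to_person]["propertyIds"].append(property_id)


# Reassigning touches only the two ID lists; the property record itself is unchanged.
reassign(data, 2, "p1", "p2")
serialized = json.dumps(data, indent=2)
print(serialized)
```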