Parsing name and address from unstructured text

I am working on an application that needs to parse unstructured text. From it I need to extract a name and an address (area, city, country and zip code). The addresses will be Indian.

Sample input: "I am ABC working in XYZ company. I am good at web designing having an experience of 3 years. I live in kothrud,Pune-411038,Maharashtra."

Output:
NAME : ABC
AREA : KOTHRUD
CITY : PUNE
STATE : MAHARASHTRA
ZIP CODE : 411038

I am planning to use Apache ConceptMapper to parse cities and states. I will have to build the dictionary myself, but I guess that can be done. For the zip code, I can use a regex. Where I am stuck is parsing the name and the area. A regex could extract them with a little hacking and lots of patterns, but I am wondering whether a better solution is available.
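To make the regex-plus-dictionary idea concrete, here is a minimal Python sketch. It is not Apache ConceptMapper; it only illustrates the same kind of lookup. The `CITY_TO_STATE` table and the function name `extract_address_fields` are hypothetical, and the area rule (the single word just before a comma and the city) is an assumption based on the sample input, so it will miss multi-word areas and other layouts. Name extraction is not attempted.

```python
import re

# Indian PIN codes are six digits and never start with 0.
PIN_RE = re.compile(r"(?<!\d)[1-9]\d{5}(?!\d)")

# Hypothetical, tiny gazetteer; a real one would be loaded from a file.
CITY_TO_STATE = {
    "pune": "Maharashtra",
    "mumbai": "Maharashtra",
    "bengaluru": "Karnataka",
}


def extract_address_fields(text):
    """Return whichever of zip, city, state and area can be found in text."""
    fields = {}

    pin = PIN_RE.search(text)
    if pin:
        fields["zip"] = pin.group()

    # Dictionary lookup: stop at the first word that is a known city.
    for word in re.findall(r"[A-Za-z]+", text):
        state = CITY_TO_STATE.get(word.lower())
        if state is None:
            continue
        fields["city"] = word.upper()
        # Report the state only if it is actually written in the text.
        if re.search(re.escape(state), text, re.IGNORECASE):
            fields["state"] = state.upper()
        # Assumed heuristic: the area is the word right before ",<city>".
        area = re.search(r"([A-Za-z]+)\s*,\s*" + re.escape(word), text)
        if area:
            fields["area"] = area.group(1).upper()
        break

    return fields


if __name__ == "__main__":
    sample = "I live in kothrud,Pune-411038,Maharashtra."
    print(extract_address_fields(sample))
```

The lookbehind and lookahead in `PIN_RE` keep it from matching six digits inside a longer number such as a phone number.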

Is there a database I can query that would return addresses? I haven't looked into Google Maps/Places yet; can address parsing be done easily with them?

Any input would be highly appreciated.

Thanks.

There is 1 answer below

The Google Geocoding API can help with this. Given an address, it returns the map coordinates, or an appropriate status code if no match is found.
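As a rough illustration, the sketch below calls the Geocoding API's JSON endpoint and reads the coordinates and structured address components from the first result. The helper names (`build_geocode_url`, `parse_geocode_response`, `geocode`) and the mapping from component types to output fields are assumptions made for this example, and a valid API key is required. The API expects an address string rather than free text, so the address part of the input would still have to be isolated first.

```python
import json
import urllib.parse
import urllib.request

GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"

# Assumed mapping from Geocoding component types to the fields wanted here.
TYPE_TO_FIELD = {
    "sublocality": "area",
    "locality": "city",
    "administrative_area_level_1": "state",
    "country": "country",
    "postal_code": "zip",
}


def build_geocode_url(address, api_key):
    """Build the request URL for one address lookup."""
    query = urllib.parse.urlencode({"address": address, "key": api_key})
    return GEOCODE_URL + "?" + query


def parse_geocode_response(payload):
    """Return a dict of fields from the first result, or None if no match."""
    if payload.get("status") != "OK" or not payload.get("results"):
        return None
    top = payload["results"][0]
    fields = {}
    for component in top.get("address_components", []):
        for kind in component.get("types", []):
            field = TYPE_TO_FIELD.get(kind)
            if field and field not in fields:
                fields[field] = component["long_name"]
    location = top.get("geometry", {}).get("location")
    if location:
        fields["lat"] = location["lat"]
        fields["lng"] = location["lng"]
    return fields


def geocode(address, api_key):
    """Query the API over HTTPS; needs network access and a valid key."""
    url = build_geocode_url(address, api_key)
    with urllib.request.urlopen(url, timeout=10) as response:
        return parse_geocode_response(json.load(response))
```

A call such as `geocode("kothrud, Pune 411038", api_key)` returns `None` when the status is anything other than `OK`, for example `ZERO_RESULTS`, so callers can fall back to the regex and dictionary approach.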