Is there anything out there that extracts information from unstructured text (news articles, books, etc.)?


I have been trying to find a program that can extract information from unstructured text (news articles, books, etc.).

My eventual goal is to create a program that can take regular sentences and cache them in a database, much like Google does, but without all the duplicate information.

Let's take the NLTK example: "At eight o'clock on Thursday morning Arthur didn't feel very good."

The things that I would want extracted are:

time: 8:00am

date: Thursday

person: Arthur

action: didn't feel good

Is there a program that can do this?

I have tried using NLTK, but I can't seem to find any good way to extract this information.

Best answer

This problem is called fine-grained entity recognition. No, there are no tools (apart from research work) that can add such semantics.
To start with, you can recognise Person and Time entities with appropriate models using an entity recogniser.
You can recognise the actions from sentence parsing, as suggested by @Junuxx.
Also give Wikify a try.
Thank you.
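
To make the target output concrete, here is a minimal rule-based sketch for the example sentence from the question. It is not a real entity recogniser: the `extract` function, the `WEEKDAYS` and `NUMBER_WORDS` tables, and the heuristics (the first capitalised non-weekday word is the person, everything after it is the action) are all assumptions made up for illustration, and they will break on most real text. In practice a trained recogniser (for example NLTK's `ne_chunk`) and a parser would replace these rules.

```python
import re

# Hypothetical lookup tables, only large enough for simple examples.
WEEKDAYS = {"Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"}
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
                "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
                "eleven": 11, "twelve": 12}


def extract(sentence):
    """Toy extractor: returns a dict with time, date, person and action."""
    record = {"time": None, "date": None, "person": None, "action": None}

    # Time: "<number word> o'clock", with am/pm guessed from the part of day.
    hour_pattern = r"\b(" + "|".join(NUMBER_WORDS) + r")\s+o'clock\b"
    hour_match = re.search(hour_pattern, sentence, re.IGNORECASE)
    if hour_match:
        hour = NUMBER_WORDS[hour_match.group(1).lower()]
        part = re.search(r"\b(morning|afternoon|evening|night)\b",
                         sentence, re.IGNORECASE)
        suffix = ""
        if part:
            suffix = "am" if part.group(1).lower() == "morning" else "pm"
        record["time"] = f"{hour}:00{suffix}"

    words = re.findall(r"[A-Za-z']+", sentence)

    # Date: the first weekday name that appears.
    for word in words:
        if word in WEEKDAYS:
            record["date"] = word
            break

    # Person: first capitalised word that is neither sentence-initial
    # nor a weekday (a crude stand-in for a trained entity recogniser).
    for index, word in enumerate(words):
        if index > 0 and word[0].isupper() and word not in WEEKDAYS:
            record["person"] = word
            # Action: whatever follows the person (a stand-in for parsing).
            rest = sentence.split(word, 1)[1]
            record["action"] = rest.strip(" .!?") or None
            break

    return record


example = "At eight o'clock on Thursday morning Arthur didn't feel very good."
print(extract(example))
```

Returning a flat dict keeps the result easy to store as one database row per sentence, which matches the caching goal described in the question.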