Split complex/compound sentences into simple sentences in Python (NLP)


I am working on a project that only handles short sentences. If a user passes a long or complex/compound sentence, I want to split it into simple sentences before passing them to the software.

I tried the spaCy approach below, but it only works with conjunctions. For example, "I am going to market and I will buy a book." is split into two simple sentences: "I am going to market." and "I will buy a book."

But it does not work on more complex sentences like:

  • I have to save this coupon in case I come back to the store tomorrow. (should split as: I have to save this coupon. I come back to the store tomorrow.)
  • In a way it curbs the number of crime cases happening every day. (should split as: In a way it curbs the number of crime cases. Happening every day.)

The code I have:

import spacy

en = spacy.load('en_core_web_sm')

text = "In a way it curbs the number of crime cases happening every day."

doc = en(text)

seen = set() # keep track of covered words

chunks = []
for sent in doc.sents:
    heads = [cc for cc in sent.root.children if cc.dep_ == 'conj']

    for head in heads:
        # Collect every token in the conjunct's subtree as one chunk
        words = list(head.subtree)
        seen.update(words)
        chunk = ' '.join(ww.text for ww in words)
        chunks.append((head.i, chunk))

    unseen = [ww for ww in sent if ww not in seen]
    chunk = ' '.join([ww.text for ww in unseen])
    chunks.append( (sent.root.i, chunk) )

chunks = sorted(chunks, key=lambda x: x[0])

for ii, chunk in chunks:
    print(chunk)

Is there a library or framework that does this easily? Or can anyone suggest how to generate the sentence tree in spaCy and parse it, so that I can break the sentence at the desired places?
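
One direction that might work is to walk the dependency tree recursively and cut at clause-level dependents, not only at conj children of the root. The sketch below is an illustration under stated assumptions, not a tested solution: Tok, build_tree, ROWS, CLAUSE_DEPS and DROP_DEPS are hypothetical names, and the parse in ROWS is written by hand (it is not actual spaCy output, which may label these tokens differently). The function only reads i, text, dep_ and children, which spaCy's Token also exposes, so it should in principle accept sent.root from a parsed Doc.

```python
from dataclasses import dataclass, field

# Assumption: dependents with these labels start a new clause.
CLAUSE_DEPS = {"conj", "advcl", "acl"}
# Assumption: connectors and punctuation are dropped from the output.
DROP_DEPS = {"cc", "mark", "punct"}


@dataclass
class Tok:
    """Hypothetical stand-in for a parsed token (mirrors a few spaCy Token attributes)."""
    i: int
    text: str
    dep_: str
    children: list = field(default_factory=list)


def build_tree(rows):
    """Build a tree from (text, dep, head_index) rows; the root is its own head."""
    toks = [Tok(i, text, dep) for i, (text, dep, _) in enumerate(rows)]
    root = None
    for tok, (_, _, head) in zip(toks, rows):
        if head == tok.i:
            root = tok
        else:
            toks[head].children.append(tok)
    return root


def _own_tokens(head, pending):
    """Collect the tokens of head's clause; queue clause-level dependents in pending."""
    tokens = [head]
    for child in head.children:
        if child.dep_ in CLAUSE_DEPS:
            pending.append(child)  # becomes its own chunk later
        elif child.dep_ not in DROP_DEPS:
            tokens.extend(_own_tokens(child, pending))
    return tokens


def split_clauses(root):
    """Return clause strings ordered by the position of each clause head."""
    pending, chunks = [root], []
    while pending:
        head = pending.pop()
        tokens = sorted(_own_tokens(head, pending), key=lambda t: t.i)
        chunks.append((head.i, " ".join(t.text for t in tokens)))
    return [text for _, text in sorted(chunks)]


# Hand-annotated parse of the second example sentence (an assumption).
ROWS = [
    ("In", "prep", 4), ("a", "det", 2), ("way", "pobj", 0), ("it", "nsubj", 4),
    ("curbs", "ROOT", 4), ("the", "det", 6), ("number", "dobj", 4),
    ("of", "prep", 6), ("crime", "compound", 9), ("cases", "pobj", 7),
    ("happening", "acl", 9), ("every", "det", 12), ("day", "npadvmod", 10),
    (".", "punct", 4),
]

clauses = split_clauses(build_tree(ROWS))
for clause in clauses:
    print(clause)
```

With this hand-written parse the sentence comes out as two chunks, one per clause head. Cutting on every conj would also separate coordinated nouns ("apples and oranges"), so a real version would probably need an extra check that the conjunct is a verb.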
