Observe changes in a text?


I have the following problem: I changed some parts of an online article.
Afterwards, other people started editing the same article.

Now I'm trying to write a Python program that detects whether the editors after me changed anything in the part I changed, and how much (measured in added/deleted characters).

For example:

The text was:
Hello World! What happened today? Goodbye

I changed the text to:
Hello World! What happened today? Today I wrote an exam. Goodbye

The next editor changed it to:
Hello World! What happened today? Today I wrote a math exam. Goodbye

Now the code should detect that the next editor changed "an" to "a math" and report what percentage of my edit was changed. In this case: about 20 %.

I started with "difflib", but I have now realized that my approach makes no sense. My code did the following: with difflib, I found the places in the text that I changed, e.g. @@ -1,4 +1,4 @@
Afterwards, I found the lines the next editor changed in the same way, e.g. @@ -1,6 +1,6 @@. Then I compared whether the line ranges are the same (+ value equal to - value). But this does not work if the next editor starts editing in the middle of the part I changed.

Does anybody have a clue how to do this?


There are 1 best solutions below


The diff-match-patch library (module diff_match_patch) produces output that is cleaner and simpler to understand. It works on characters rather than lines and returns a list of (operation, text) tuples.

From the official docs:

diff_main("Good dog", "Bad dog") => [(-1, "Goo"), (1, "Ba"), (0, "d dog")]

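In actual code:

# Requires the third-party package: pip install diff-match-patch
from diff_match_patch import diff_match_patch
dmp = diff_match_patch()
diffs = dmp.diff_main("Good dog", "Bad dog")
print(diffs)  # list of (operation, text) tuples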

The first element of each tuple is the operation:

-1 is a deletion

1 is an insertion

0 means no change

See: https://code.google.com/p/google-diff-match-patch/wiki/API
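
To connect this back to the question, here is a minimal sketch of how a character-level diff could be used to measure how much of "my" edit a later editor touched. It is only one possible approach, not a definitive implementation. To keep it runnable without extra packages it uses the standard library's difflib.SequenceMatcher instead of diff_match_patch; the same idea should carry over to the tuples returned by diff_main. The helper names inserted_spans and changes_inside are made up for this illustration, and the metric (deleted plus added characters, divided by the length of my edit) is an assumption about what "percentage changed" should mean.

```python
from difflib import SequenceMatcher


def inserted_spans(old, new):
    """Index ranges (start, end) of `new` that do not come from `old`."""
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    return [(j1, j2) for tag, _, _, j1, j2 in matcher.get_opcodes()
            if tag in ("insert", "replace")]


def changes_inside(spans, mine, later):
    """Count characters `later` deleted from / added inside `spans` of `mine`."""
    deleted = added = 0
    matcher = SequenceMatcher(None, mine, later, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        for start, end in spans:
            if i1 == i2:
                # Pure insertion: counted only when strictly inside my span
                # (an assumption; insertions exactly at the boundary are ambiguous).
                if start < i1 < end:
                    added += j2 - j1
            else:
                overlap = min(i2, end) - max(i1, start)
                if overlap > 0:
                    deleted += overlap
                    # Simplification: the whole replacement text is
                    # attributed to the span it overlaps.
                    added += j2 - j1
    return deleted, added


original = "Hello World! What happened today? Goodbye"
mine = "Hello World! What happened today? Today I wrote an exam. Goodbye"
later = "Hello World! What happened today? Today I wrote a math exam. Goodbye"

spans = inserted_spans(original, mine)              # the part I added
my_length = sum(end - start for start, end in spans)
deleted, added = changes_inside(spans, mine, later)
share = 100 * (deleted + added) / my_length
print(spans, deleted, added, round(share))
```

For the three example texts, my edit is the 23 characters "Today I wrote an exam. ", and the later revision deletes 1 character ("n") and adds 5 (" math") inside it, so this particular metric reports roughly 26 %. Because the comparison is done on character offsets of my revision rather than on line hunks, it still works when the next editor starts editing in the middle of the changed part.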