How to make Google Diff Match Patch prefer changes at the end of a string?

I am using the diff_main method of Google's DiffMatchPatch library to get diffs which I then use in my app. Consider this case:

Old string:

Tracker.Dependency.prototype.changed = function () {
  for (var id in this._dependentsById)
    this._dependentsById[id]._compute();
};

New string:

Tracker.Dependency.prototype.changed = function () {
  for (var id in this._dependentsById)
    this._dependentsById[id]._compute();
};

Tracker.autorun = function (f) {
  constructingComputation = true;
  var c = new Tracker.Computation(f);
  return c;
};

The addition diff I get is:

;
};

Tracker.autorun = function (f) {
  constructingComputation = true;
  var c = new Tracker.Computation(f);
  return c

Whereas it would seem that for human consumption a more reasonable diff would be:

Tracker.autorun = function (f) {
  constructingComputation = true;
  var c = new Tracker.Computation(f);
  return c;
};

Is there any way I can make DiffMatchPatch produce the second result rather than the first?

You can see an example here: https://jsfiddle.net/puje78vL/1/

Answer

I created a JSFiddle based on the library author's example page (assuming you want the JavaScript version, given the question tag).

Using this code gives me the result you expect:

var dmp = new diff_match_patch();

function launch() {
  var text1 = document.getElementById('text1').value;
  var text2 = document.getElementById('text2').value;

  var d = dmp.diff_main(text1, text2);
  var ds = dmp.diff_prettyHtml(d);

  document.getElementById('outputdiv').innerHTML = ds;
}

You can also look at the console to see the raw result (the arrays), which shows that diff_main returns what you expect. Are you doing something different? If so, please share your code.
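For reference, the raw result is an array of [operation, text] pairs, where the operation is -1 for a deletion, 0 for unchanged text and 1 for an insertion. Here is a minimal sketch of pulling out only the added text. It uses a hand-written array in that shape rather than real library output, and the helper name insertedText is hypothetical:

```javascript
// Operation codes used by diff_match_patch in its diff tuples.
var DIFF_DELETE = -1;
var DIFF_INSERT = 1;
var DIFF_EQUAL = 0;

// Hypothetical helper: concatenate the text of every insertion in a diff array.
function insertedText(diffs) {
  return diffs
    .filter(function (d) { return d[0] === DIFF_INSERT; })
    .map(function (d) { return d[1]; })
    .join('');
}

// Hand-written sample in the shape diff_main returns (not actual library output).
var sampleDiffs = [
  [DIFF_EQUAL, 'return a'],
  [DIFF_DELETE, ' + 1'],
  [DIFF_INSERT, ' + 2'],
  [DIFF_EQUAL, ';']
];

console.log(JSON.stringify(insertedText(sampleDiffs)));
```

In a real page you would pass the array returned by dmp.diff_main(text1, text2) instead of sampleDiffs.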

New Info

Now that you have provided the full text I can give you a better answer: the result you are seeing is valid; it is just the way the algorithm works.

I'll try to explain to you what is going on and how to fix this. Let's take a look at the final part of each text:

Text 1

Tracker.Dependency.prototype.changed = function () {
  for (var id in this._dependentsById)
    this._dependentsById[id]._compute();
};

Text 2

Tracker.Dependency.prototype.changed = function () {
  for (var id in this._dependentsById)
    this._dependentsById[id]._compute();
};

Tracker.autorun = function (f) {
  constructingComputation = true;
  var c = new Tracker.Computation(f);
  return c;
};

Notice the following:

  1. The final }; of the changed function in Text 1 has no newline after it.
  2. The final }; of the changed function in Text 2 has a newline after it.
  3. The final }; of the autorun function in Text 2 has no newline after it.

So the algorithm that calculates the diffs matches 1 with 3, leaving 2 as part of the added text. This is why you are getting that output.
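To see how pairing 1 with 3 produces the odd-looking insertion, here is a minimal sketch. It is not the library's actual algorithm: the hypothetical helper insertionWhenSuffixMatchedFirst simply strips the longest common suffix first, then the common prefix, and reports what is left of the new text. The texts are shortened stand-ins for Text 1 and Text 2.

```javascript
// Length of the longest common suffix of two strings.
function commonSuffixLength(a, b) {
  var n = 0;
  while (n < a.length && n < b.length &&
         a[a.length - 1 - n] === b[b.length - 1 - n]) {
    n++;
  }
  return n;
}

// Length of the longest common prefix of two strings.
function commonPrefixLength(a, b) {
  var n = 0;
  while (n < a.length && n < b.length && a[n] === b[n]) {
    n++;
  }
  return n;
}

// Hypothetical helper (an illustration only, not diff_match_patch's algorithm):
// pair the endings first, then the beginnings, and return the unmatched
// middle of the new text.
function insertionWhenSuffixMatchedFirst(oldText, newText) {
  var s = commonSuffixLength(oldText, newText);
  var oldHead = oldText.slice(0, oldText.length - s);
  var newHead = newText.slice(0, newText.length - s);
  var p = commonPrefixLength(oldHead, newHead);
  return newHead.slice(p);
}

// Shortened stand-ins: text1 ends with "};" and no newline, like Text 1.
var text1 = 'changed = function () {\n  compute();\n};';
var text2 = text1 + '\n\nautorun = function (f) {\n  return c;\n};';

console.log(JSON.stringify(insertionWhenSuffixMatchedFirst(text1, text2)));
// → ";\n};\n\nautorun = function (f) {\n  return c"
```

The shared ending ";\n};" is claimed by the last lines of both texts, so the reported insertion starts with the ";\n};" that closes the changed function and stops before the final ";", which is the same shape as the diff in the question.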

To get the desired output you need to match 1 with 2 instead. This means adding a newline at the end of Text 1, as you can see in your updated JSFiddle:

Tracker.Dependency.prototype.changed = function () {
  for (var id in this._dependentsById)
    this._dependentsById[id]._compute();
};[PRESS ENTER HERE TO ADD NEW LINE]

Note that if you use just this text, the algorithm works correctly (as I showed in my original answer). The confusion only starts once you add more text; I am not sure why.
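The effect of the extra newline can be checked without the library. This is a minimal sketch using shortened stand-ins for Text 1 and Text 2 (the variable names are only for illustration). It shows only why the endings can no longer be paired; the actual diff would still come from diff_main.

```javascript
// Shortened stand-ins for Text 1 and Text 2; neither ends with a newline.
var text1 = 'changed = function () {\n  compute();\n};';
var text2 = text1 + '\n\nautorun = function (f) {\n  return c;\n};';

// Apply the fix: Text 1 gets a trailing newline, Text 2 is left unchanged.
var text1Fixed = text1 + '\n';

// Text 1 now ends with "\n" while Text 2 ends with ";",
// so the two endings have nothing in common to pair up.
var endingsMatch =
  text1Fixed.charAt(text1Fixed.length - 1) === text2.charAt(text2.length - 1);

// Text 1 is now a plain prefix of Text 2; the added text is whatever follows it.
var isPrefix = text2.indexOf(text1Fixed) === 0;
var added = isPrefix ? text2.slice(text1Fixed.length) : null;

console.log(endingsMatch);          // false
console.log(JSON.stringify(added)); // "\nautorun = function (f) {\n  return c;\n};"
```

Note that this relies on only Text 1 receiving the newline. If both texts ended with a newline, their endings would be identical again.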