I'm experiencing an issue with two similar insertion sort functions in C++. Both functions are intended to sort a vector of string-character pairs based on the character value in ascending order. The first function insertionSort1 produces the expected sorted output, but the second function insertionSort2, which only differs by the placement of the decrement operation j--, gives an unexpected and incorrect result.
Below is the code snippet that demonstrates both sorting functions and their respective outputs. The only difference between the two functions is that in insertionSort2, j-- is combined with the assignment statement, while in insertionSort1, it is separated onto its own line.
#include <bits/stdc++.h>
using namespace std;
void insertionSort1(vector<pair<string,char>>&v) {
for (int i=1; i<v.size(); i++) {
auto curr=v[i];
int j=i-1;
while(j>=0 && curr.second<v[j].second) {v[j+1]=v[j]; v[j--];}
v[j+1] = curr;
}
}
void insertionSort2(vector<pair<string,char>>&v) {
for (int i=1; i<v.size(); i++) {
auto curr=v[i];
int j=i-1;
while(j>=0 && curr.second<v[j].second) {v[j+1]=v[j--];}
v[j+1] = curr;
}
}
int main() {
vector<pair<string, char>> v1 {
{"Alice" , 'B'},
{"Bob" , 'A'},
{"Charlie", 'B'},
{"David" , 'A'}
};
// Print the original vector
for(auto i : v1) cout << "{" << i.first << ", " << i.second << "} "; cout << endl; // {Alice, B} {Bob, A} {Charlie, B} {David, A}
// Sort the vector
insertionSort1(v1);
// Print the sorted vector
for(auto i : v1) cout << "{" << i.first << ", " << i.second << "} "; cout << endl; // {Bob, A} {David, A} {Alice, B} {Charlie, B}
cout << "///////////////////////////////////////////\n";
vector<pair<string, char>> v2 {
{"Alice" , 'B'},
{"Bob" , 'A'},
{"Charlie", 'B'},
{"David" , 'A'}
};
// Print the original vector
for(auto i : v2) cout << "{" << i.first << ", " << i.second << "} "; cout << endl; // {Alice, B} {Bob, A} {Charlie, B} {David, A}
// Sort the vector
insertionSort2(v2);
// Print the sorted vector
for(auto i : v2) cout << "{" << i.first << ", " << i.second << "} "; cout << endl; // {Bob, A} {Bob, A} {David, A} {David, A} // Unexpected output occurs here!
return 0;
}
The expected sorted output for both vectors should be: {Bob, A} {David, A} {Alice, B} {Charlie, B}
However, the actual output I'm getting after using insertionSort2 is incorrect: {Bob, A} {Bob, A} {David, A} {David, A}
Why does insertionSort2 produce an incorrect result when it seemingly only has a minor syntactic difference from insertionSort1, which works as expected?
@tbxfreeware
If possible, Taher, please post a program that demonstrates that there is no problem with integer data.
#include <bits/stdc++.h>
using namespace std;
void insertionSort1(vector<int>&v) {
for (int i=1; i<v.size(); i++) {
auto curr=v[i];
int j=i-1;
while(j>=0 && curr<v[j]) {v[j+1]=v[j]; v[j--];}
v[j+1] = curr;
}
}
void insertionSort2(vector<int>&v) {
for (int i=1; i<v.size(); i++) {
auto curr=v[i];
int j=i-1;
while(j>=0 && curr<v[j]) {v[j+1]=v[j--];}
v[j+1] = curr;
}
}
int main() {
vector<int> v1 {7, 4, 1, 2, 3, 6, 9, 8, 5};
for(auto i : v1) cout << i << " "; cout << endl; // 7 4 1 2 3 6 9 8 5
insertionSort1(v1);
for(auto i : v1) cout << i << " "; cout << endl; // 1 2 3 4 5 6 7 8 9
cout << "//////////////////\n";
vector<int> v2 {7, 4, 1, 2, 3, 6, 9, 8, 5};
for(auto i : v2) cout << i << " "; cout << endl; // 7 4 1 2 3 6 9 8 5
insertionSort2(v2);
for(auto i : v2) cout << i << " "; cout << endl; // 1 2 3 4 5 6 7 8 9
return 0;
}```
Under C++17 and later, the right side of an assignment expression is guaranteed to be evaluated completely, including all side-effects, prior to any evaluation of the left side. See: Refining Expression Evaluation Order for Idiomatic C++ by Gabriel Dos Reis, Herb Sutter, and Jonathan Caves.
Consider then, the first assignment (i.e., move right), when:
i== 1j== 0j + 1== 1You want
v[0]to be copied intov[1], using the expressionv[j + 1] = v[j--];What happens, however, is that
jis decremented beforej + 1is evaluated.j== 0j== -1j+ 1 == 0So instead of copying
v[0]intov[1], it is copied back tov[0], andv[1]remains unaltered.At this point,
jhas the value -1, which causes the "j-loop" to end. Afterwards, variablecurr, which came fromv[1](and which is still stored there because of the failure to overwritev[1]above), is stored atv[j + 1], which isv[0].So,
curris stored in two places:v[0]andv[1]. The data originally stored atv[0]was overwritten, and is lost. You can see this in the output below, in the line markedis2, after j-loop: i: 1, j: -1:If the decrement is moved to the left side of the assignment, as in
insertionSort3, below, then the sort works correctly.Expanded Output
This data was generated using the program from the next section. It verifies the overwrite that was predicted in the foregoing analysis.
Output operators / debug routine
This program has been adapted from the OP. It adds output operators and a debug function that simplify the task of watching the sorting steps.
What about integers?
@Taher Anaya, the OP, reports that the problem only occurs when sorting pairs, and specifically, that it does not occur when sorting integers. I have no explanation for this. The analysis above indicates it should affect any objects that are sorted using
insertionSort2.The mystery to me is not that
insertionSort2fails to sort pairs, but rather that it succeeds in sorting anything.If possible, Taher, please post a program that demonstrates that there is no problem with integer data.
[Edit:] Thank you, Taher, for posting a program. Unfortunately, I am not able to coroborate your results. When I ran your program under MSVC (/std:c++latest), I got output that shows overwritten data, similar to what was seen (and expected) with pairs.
This is the same result that @molbdnilo recorded from his run on coliru.stacked-crooked.com.