R - Using the output of compare_df() to update original data frame


I have 3 questions relating to the compare_df() function within the compareDF CRAN package.

I have two data frames with identical structures but different contents (this_week and last_week):

this_week
  Week   A   B   C
1    1   0   0   0
2    2   0   1   0
3    3   0   1   0
4    4   2   1   0
5    5   2   0   0       

last_week
  Week   A   B   C
1    1   0   0   0
2    2   0   0   0
3    3   0   0   1
4    4   3   0   0
5    5   0   0   0

I am using compare_df(this_week, last_week, group_col = "Week") to compare these two data frames. Specifically, I am interested in the second element of the compare_df() output, $comparison_table_diff, which gives cell-level comparisons.

The output shows which cells have increased from one week to the next:

library(compareDF)

weeks_compared <- compare_df(this_week, last_week, group_col = "Week")
weeks_compared

$comparison_df
  Week chng_type   A   B   C
1    2         +   0   1   0
2    2         -   0   0   0
3    3         +   0   1   0
4    3         -   0   0   1
5    4         +   2   1   0
6    4         -   3   0   0
7    5         +   2   0   0
8    5         -   0   0   0

$comparison_table_diff
  Week chng_type   A   B   C
1    =         +   =   +   =
2    =         -   =   -   =
3    =         +   =   +   +
4    =         -   =   -   -
5    =         +   +   +   =
6    =         -   -   -   =
7    =         +   +   =   =
8    =         -   -   =   =

Rows 5 and 6 (Week 4) do not give the comparison results that I would expect. Column "A" fell from 3 in last_week to 2 in this_week, so I would expect:
row 5, column 3 ("A") of the second data frame ($comparison_table_diff) to be "-"
row 6, column 3 ("A") to be "+".
However, it is actually the other way around:

$comparison_df
  Week chng_type A B C
5    4         + 2 1 0
6    4         - 3 0 0

$comparison_table_diff
  Week chng_type A B C
5    =         + + + =
6    =         - - - =

1) Does anyone know why this happens?
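
To check the direction of each change independently of compare_df(), the two data frames can be compared directly in base R. This is only a sketch: the data frames are rebuilt from the tables above, and value_cols, idx, delta and direction are names made up for illustration. My guess, judging only from the output above, is that the symbols in $comparison_table_diff mark which data frame a changed cell came from rather than whether the value rose or fell, but I have not confirmed this against the package documentation.

```r
# Data copied from the tables in the question.
this_week <- data.frame(Week = 1:5,
                        A = c(0, 0, 0, 2, 2),
                        B = c(0, 1, 1, 1, 0),
                        C = c(0, 0, 0, 0, 0))
last_week <- data.frame(Week = 1:5,
                        A = c(0, 0, 0, 3, 0),
                        B = c(0, 0, 0, 0, 0),
                        C = c(0, 0, 1, 0, 0))

# Columns to compare (illustrative name, not part of compareDF).
value_cols <- c("A", "B", "C")

# Align last_week's rows to this_week's rows by Week.
idx <- match(this_week$Week, last_week$Week)

# Signed difference: this_week minus last_week, cell by cell.
delta <- as.matrix(this_week[value_cols]) -
  as.matrix(last_week[idx, value_cols])

# 1 = increased, -1 = decreased, 0 = unchanged.
direction <- sign(delta)
rownames(direction) <- this_week$Week
direction
```

For Week 4, column "A", this gives -1 (2 this week against 3 last week), which is the direction I expected.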

In addition, I do not know how to use this output further. My aims are:
2) To update the values in last_week that have increased in this_week
3) To add an asterisk to the last_week values that have increased (in columns "B" and "C" only)
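
To make aims 2 and 3 concrete, here is a rough base R sketch that does not use the compare_df() output at all. It assumes that "update" means overwriting a last_week value with the this_week value whenever the latter is larger, and that the asterisk should be attached to the updated value in a separate display copy. The names updated, flagged, increased, value_cols and star_cols are all hypothetical.

```r
# Data copied from the tables in the question.
this_week <- data.frame(Week = 1:5,
                        A = c(0, 0, 0, 2, 2),
                        B = c(0, 1, 1, 1, 0),
                        C = c(0, 0, 0, 0, 0))
last_week <- data.frame(Week = 1:5,
                        A = c(0, 0, 0, 3, 0),
                        B = c(0, 0, 0, 0, 0),
                        C = c(0, 0, 1, 0, 0))

value_cols <- c("A", "B", "C")  # columns to update (aim 2)
star_cols  <- c("B", "C")       # columns to mark with "*" (aim 3)

# Align this_week's rows to last_week's rows by Week.
idx <- match(last_week$Week, this_week$Week)

updated   <- last_week  # numeric result for aim 2
increased <- list()     # per-column logical flags, reused for aim 3

for (col in value_cols) {
  old_col <- last_week[[col]]
  new_col <- this_week[[col]][idx]
  # TRUE where the value went up; missing values count as "not increased".
  increased[[col]] <- !is.na(old_col) & !is.na(new_col) & new_col > old_col
  # Assumption: "update" = take the larger this_week value.
  updated[[col]][increased[[col]]] <- new_col[increased[[col]]]
}

# Display copy: B and C become character columns carrying the marker.
flagged <- updated
for (col in star_cols) {
  flagged[[col]] <- paste0(updated[[col]],
                           ifelse(increased[[col]], "*", ""))
}

updated
flagged
```

Keeping updated numeric and putting the asterisks only in flagged means the updated data can still be used in calculations, while flagged is meant for printing or reporting.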

I have not found anything related to actually using the compare_df() outputs on Stack Overflow other than to simply paste these tables, which isn't sufficient for my task.

I wondered if anyone has done anything similar and/or could share some ideas on how I might go about reaching these two aims. I would also be interested to know whether there is a better package or workaround for this task. Please let me know if any further information is required.

Thanks in advance for any help you can provide!
