I have two rows in a Cassandra ColumnFamily and want to compare the values of columns with the same column name, e.g.:
CF: User
Key: Columns:
......................................................
K1: {Col1: "Andy" V1: "100"} {Col2: "Tom" V2: "100"}
K2: {Col1: "Andy" V1: "120"} {Col2: "Tom" V2: "90"}
Now I want to compute the difference between K2's columns and K1's columns, to get this result in Cassandra:
Key: Columns:
.........................................................................
K1: {Col1: "Andy" V1: "100"} {Col2: "Tom" V2: "100"}
K2: {Col1: "Andy" V1: "120" Diff: 20} {Col2: "Tom" V2: "90" Diff: -10}
At first I wanted to code this with Hadoop, but I see a problem: I can't define two keys for a map process?
Hadoop was the choice because it must be a scalable solution.
I hope someone has a tip for me.
BG, Danny
I don't understand which row is the base of the subtraction: K1[V1] - K2[V1], or vice versa?
OK, let's say the row with the most recent timestamp is the base.
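Your map step should emit key/value pairs (K => V) keyed by column name, so that the values of same-named columns from both rows arrive at the same reduce call.
The reduce step will then receive, for each key, an array of pairs with the values sorted by timestamp.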
Now, in the reduce step, you can easily perform the subtraction and write the necessary columns, such as "diff", to the database.
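To make the flow concrete, here is a minimal sketch in plain Python that only simulates the map, group, and reduce phases in memory; it is not Hadoop or Cassandra API code. The timestamps, the `rows` layout, and the function names `map_row`, `reduce_column`, and `run` are all made up for illustration. It also assumes the diff is "newer value minus older value", which is what produces the 20 and -10 from the question.

```python
from collections import defaultdict

# Hypothetical input: row key -> (row timestamp, {column name: value}).
# The timestamps 1 and 2 are invented; K2 is assumed to be the newer row.
rows = {
    "K1": (1, {"Andy": 100, "Tom": 100}),
    "K2": (2, {"Andy": 120, "Tom": 90}),
}


def map_row(row_key, timestamp, columns):
    """Map: emit column name => (timestamp, row key, value) per column."""
    for name, value in columns.items():
        yield name, (timestamp, row_key, value)


def reduce_column(name, pairs):
    """Reduce: order one column's values by timestamp and diff neighbours."""
    ordered = sorted(pairs)  # tuples sort by timestamp first
    for (_, _, older), (_, row_key, newer) in zip(ordered, ordered[1:]):
        # Assumption: the diff belongs to the newer row (newer - older).
        yield row_key, name, newer - older


def run(rows):
    # Stand-in for the shuffle phase: group mapper output by key.
    grouped = defaultdict(list)
    for row_key, (timestamp, columns) in rows.items():
        for key, value in map_row(row_key, timestamp, columns):
            grouped[key].append(value)

    diffs = {}
    for name, pairs in grouped.items():
        for row_key, column, diff in reduce_column(name, pairs):
            diffs[(row_key, column)] = diff
    return diffs


diffs = run(rows)
for (row_key, column), diff in sorted(diffs.items()):
    print(row_key, column, diff)
```

Because the map output is keyed by column name rather than by row key, the mapper never needs two row keys at once; each row is mapped independently and the pairing happens in the grouping step. In a real job, the reducer would write the "diff" column back to Cassandra instead of collecting it in a dictionary.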