Using excel to verify data sources

582 Views Asked by At

I want to know if there is a function or formula I can use to verify sample data source 1 (multi-lines) to data source 2 (multi-lines).

Example: Source 1 Cell A1 (doc) - 123 Cell A2 (div) - SED Cell A3 (amount)- $100 Cell A4 (line) - 01

Source 2 will be a data dump of different combination of doc, div, amount, and line. I want to know if there is a formula or function that I can do to see if Source 1 matches to any record in Source 2. It has to be a perfect match.

1

There are 1 best solutions below

2
On

There are multiple ways to do this. The simplest is MATCH. MATCH takes a unique value, looks at a list of values, and tells you how many cells in that value was first found. Like so:

=MATCH(A1,C5:C10,0)

In the above case, assume A1 = "Hello" and C6 = "Hello". Since the match was found in the 2nd cell of the search range, MATCH returns 2. The 0 indicates that only exact matches are desired; -1 / 1 in that space indicates that you have a sorted list and want to take the closest value above/below the unique value.

MATCH can be combined with INDEX to take that information and pull in an appropriate value for the related ID. ie: Assume as above, but also that B5:B10 contains, say, a vendor name for the unique ID's found in C5:C10. The vendor name for the 2nd line item of that group can be pulled from column B as follows:

=INDEX(B5:B10,2)

But this can be combined with MATCH to pull any vendor name where the match existed in column C, like so:

=INDEX(B5:B10,MATCH(A1,C5:C10,0))

A function often used for these purposes is VLOOKUP. VLOOKUP looks at a 2D datablock (say, C5:D10), and finds a specific unique value in the leftmost column. It then returns the value x amount of columns to the right from the leftmost column. Like so [assume that C5:C10 still means vendor names, and you are matching a vendor name in column C to pull the total billings from column E]:

=VLOOKUP("ABC Co.", C5:E10, 3,0)

This is incredibly similar to the INDEX / MATCH above, with 3 main differences: (1) it is a little simpler if you have a straightforward block of data; (2) Your unique ID key needs to be to the left of what you're searching. So in the above example, if Sales was in column C and vendor name was in column E, you couldn't do a "right-left" search. You would need to instead re-order your data or use a different method; (3) If you insert a new column between C & E, your formula will still say "3rd column to the right", which will no longer point at your invoice number in column E.

If you have specific questions about how to implement this, we need more specific information about how your data is laid out and how you want your results laid out.