The platform I am working on collects data from several sources: an e-commerce platform, payment merchants (through some bank tie-ups), an in-house loyalty program, and customer demographics.
We are trying to solve the merge/purge problem, but to do that we first need to integrate the data in one place.
The columns have different names across sources, and there is no unique ID to merge on, so we need to build custom keys such as Name + DOB + some other criteria.
We want to automate the process of finding the best keys to merge the data on.
I have looked for research papers targeting this problem but couldn't find much. The only relevant thing I have found so far is Paxata.
Could someone point me in the right direction on where to start?
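
To make the idea of "finding the best keys" concrete, here is a minimal sketch of one way it could be framed: enumerate combinations of columns and score each one by how complete it is, how unique it is within each source, and how many keys the two sources share. Everything in it is an assumption made for illustration: the column names (`name`, `dob`, `city`), the sample rows, the helper names, and the scoring formula are hypothetical, and it assumes the differently named columns have already been mapped to shared names. It uses exact matching after light normalization, not fuzzy matching.

```python
from itertools import combinations


def normalize(value):
    """Lowercase and trim a value so trivially different spellings compare equal."""
    if value is None:
        return None
    text = str(value).strip().lower()
    return text or None


def key_values(rows, columns):
    """Build the composite key tuple for each row, skipping rows with a missing part."""
    keys = []
    for row in rows:
        parts = tuple(normalize(row.get(col)) for col in columns)
        if all(part is not None for part in parts):
            keys.append(parts)
    return keys


def score_key(left, right, columns):
    """Score a candidate key as completeness * uniqueness * cross-source overlap."""
    left_keys = key_values(left, columns)
    right_keys = key_values(right, columns)
    if not left_keys or not right_keys:
        return 0.0
    left_set, right_set = set(left_keys), set(right_keys)
    completeness = (len(left_keys) / len(left)) * (len(right_keys) / len(right))
    uniqueness = (len(left_set) / len(left_keys)) * (len(right_set) / len(right_keys))
    overlap = len(left_set & right_set) / min(len(left_set), len(right_set))
    return completeness * uniqueness * overlap


def rank_candidate_keys(left, right, shared_columns, max_size=3):
    """Score every column combination up to max_size; best score first, shorter keys win ties."""
    ranked = []
    for size in range(1, max_size + 1):
        for columns in combinations(shared_columns, size):
            ranked.append((score_key(left, right, columns), columns))
    ranked.sort(key=lambda item: (-item[0], len(item[1])))
    return ranked


# Hypothetical sample rows standing in for two of the sources.
ecommerce = [
    {"name": "Asha Rao", "dob": "1990-01-05", "city": "Pune"},
    {"name": "Asha Rao", "dob": "1985-07-21", "city": "Delhi"},
    {"name": "Vikram Shah", "dob": "1978-03-14", "city": "Pune"},
    {"name": "Meera Iyer", "dob": "1990-01-05", "city": "Chennai"},
]
loyalty = [
    {"name": "asha rao ", "dob": "1990-01-05", "city": "Pune"},
    {"name": "Asha Rao", "dob": "1985-07-21", "city": "Mumbai"},
    {"name": "Vikram Shah", "dob": "1978-03-14", "city": "Pune"},
    {"name": "Meera Iyer", "dob": "1990-01-05", "city": "Chennai"},
]

ranked = rank_candidate_keys(ecommerce, loyalty, ["name", "dob", "city"])
for score, columns in ranked:
    print(f"{score:.3f}  {' + '.join(columns)}")
```

On this toy data, neither `name` nor `dob` is unique on its own, and adding `city` lowers the overlap because one customer's city differs between sources, so `name + dob` ranks first. With real data, exact matching will likely miss typos and formatting differences, which is where the record linkage literature (blocking, fuzzy string comparison, probabilistic matching) becomes relevant.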
Finding the best indices to merge two (or more) Excel sheets on
72 Views Asked by Purushottam Kumar At