Channel: Machine Learning

In need of a suitable class of algorithms for linking two sets of records.

I have a bipartite graph (two tables of records, U and V) that needs matching.

Unfortunately, in my case a record may connect to more than one record in the other table, and all columns are continuous, which seems to rule out the normal ways of record linking.

Essentially, there should exist an edge Eij between two records when part of Ui is part of Vj. The information I have is that the edges connected to Ui should sum to the values in Ui, and the edges connected to Vj should sum to the values in Vj. (They don't always, because the data can be bad.)
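To make the constraint concrete, here is a minimal sketch of a consistency check. It assumes each record carries a single continuous value (the real tables have several columns), and the name `sum_residuals` and the dict-of-edges representation are purely illustrative. It reports how far each record's incident edges are from summing to the record's value, which is where bad data would show up.

```python
from collections import defaultdict


def sum_residuals(u, v, edges):
    """Return (u_res, v_res): record value minus the sum of its incident edges.

    u, v  -- lists of record values (one scalar per record; an assumption)
    edges -- dict mapping (i, j) to the amount of U[i] that is part of V[j]
    """
    row_sum = defaultdict(float)
    col_sum = defaultdict(float)
    for (i, j), amount in edges.items():
        row_sum[i] += amount
        col_sum[j] += amount
    u_res = [u[i] - row_sum[i] for i in range(len(u))]
    v_res = [v[j] - col_sum[j] for j in range(len(v))]
    return u_res, v_res


# Hypothetical data: U[0] is split across V[0] and V[1]; V[1] is slightly off.
u = [5.0, 3.0]
v = [2.0, 6.5]
edges = {(0, 0): 2.0, (0, 1): 3.0, (1, 1): 3.0}
u_res, v_res = sum_residuals(u, v, edges)
```

A residual of zero means the record is fully explained by its edges; a nonzero residual (here the last entry of `v_res`) marks a record whose data does not add up.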

Help is appreciated. I've been looking at the Subset Sum problem, Constraint Propagation, Maximum Flow, and Bayesian Record Linking. All of them seem like close fits, but I'm not sure any of them maps to the problem very easily.
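As a rough illustration of the flow-style view, the sketch below treats each Ui as a supply and each Vj as a demand and fills edges greedily in table order (a northwest-corner style pass, as used to seed transportation problems). This is only one way to frame it, not a solution to the question: it assumes one non-negative scalar per record, the name `greedy_allocation` is made up for this example, and the result is just one of many edge sets that satisfy the sums, so it says nothing about which edges are the true ones.

```python
def greedy_allocation(u, v, tol=1e-9):
    """Greedily split supplies u across demands v, in table order.

    Assumes non-negative scalar values. Returns (edges, u_left, v_left),
    where edges maps (i, j) to the amount of U[i] assigned to V[j] and the
    leftovers show whatever could not be matched (e.g. because of bad data).
    """
    u_left = list(u)
    v_left = list(v)
    edges = {}
    i = j = 0
    while i < len(u_left) and j < len(v_left):
        amount = min(u_left[i], v_left[j])
        if amount > tol:
            edges[(i, j)] = amount
        u_left[i] -= amount
        v_left[j] -= amount
        # At least one side is now exhausted; advance past it.
        if u_left[i] <= tol:
            i += 1
        elif v_left[j] <= tol:
            j += 1
    return edges, u_left, v_left


# Hypothetical data whose totals agree (5 + 3 == 2 + 6).
edges, u_left, v_left = greedy_allocation([5.0, 3.0], [2.0, 6.0])
```

When the two tables have equal totals the leftovers are all zero; when the data is bad, the unmatched remainder is left in `u_left` or `v_left` rather than being forced onto an edge.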

submitted by SCombinator
