Hello all - I'm new to machine learning, although I have lots of experience doing prediction with OLS and logistic regression (i.e. cross-validation, out-of-sample prediction, etc.) as well as a bit of clustering (k-means mostly).
Ideally I'm looking for some advice on methods (maybe even an R package) for detecting boundaries across a spatial point process using a continuous outcome. The data are correlated in space but very noisy. There are two ways I'd like to ask the question, the second likely more difficult than the first.
Imagine I had the income of every household (and the xy coordinates of their house) in a city and wanted to cluster these households without an a priori notion of how many clusters exist. Again, the data are noisy. What algorithms should I look at? Are any specifically well-suited to spatial data?
Imagine I took this same data, where each house knew which street it was closest to. I would like to ask which streets exhibit the greatest statistically significant income differences on either side. What algorithms might be useful for this approach? I realize such a hypothesis test is likely not the best way to approach this problem, but it seems intuitive enough for a lay person.
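To make the second question concrete, here is a rough sketch of the naive per-street test I have in mind. It is in Python rather than R, and everything in it is made up for illustration: the data are synthetic, the street names are hypothetical, and it assumes each household already carries a label for which side of its nearest street it sits on (my real data only has the nearest street). It runs a simple permutation test on the difference in mean income between the two sides and ranks streets by p-value. It ignores spatial autocorrelation and multiple comparisons, which is part of why I suspect it is not the right tool.

```python
import random
from statistics import mean


def side_difference_test(left, right, n_perm=2000, seed=0):
    """Permutation test for a difference in mean income between two sides."""
    rng = random.Random(seed)
    observed = mean(left) - mean(right)
    pooled = list(left) + list(right)
    n_left = len(left)
    extreme = 0
    for _ in range(n_perm):
        # Shuffle incomes across sides to simulate "side does not matter".
        rng.shuffle(pooled)
        diff = mean(pooled[:n_left]) - mean(pooled[n_left:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (extreme + 1) / (n_perm + 1)
    return observed, p_value


def rank_streets(households, n_perm=2000, seed=0):
    """households: iterable of (street, side, income), side is "L" or "R"."""
    by_street = {}
    for street, side, income in households:
        by_street.setdefault(street, {"L": [], "R": []})[side].append(income)
    results = []
    for street, sides in by_street.items():
        # Skip streets with too few households on either side to compare.
        if len(sides["L"]) < 2 or len(sides["R"]) < 2:
            continue
        diff, p = side_difference_test(sides["L"], sides["R"], n_perm, seed)
        results.append((street, diff, p))
    # Smallest p-value first.
    return sorted(results, key=lambda r: r[2])


# Synthetic example: "Main St" has a real gap between sides, "Oak Ave" does not.
data_rng = random.Random(42)
households = []
for _ in range(40):
    households.append(("Main St", "L", data_rng.gauss(80000, 15000)))
    households.append(("Main St", "R", data_rng.gauss(50000, 15000)))
    households.append(("Oak Ave", "L", data_rng.gauss(60000, 15000)))
    households.append(("Oak Ave", "R", data_rng.gauss(60000, 15000)))

ranking = rank_streets(households)
for street, diff, p in ranking:
    print(f"{street}: mean difference = {diff:,.0f}, p = {p:.4f}")
```

With many streets, the p-values would need some correction for multiple testing, and the shuffle assumes households are exchangeable across sides, which spatially correlated incomes probably violate.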
Thanks for the advice.