Chapter 14 Designing Spatial Coverage Samples Using the k-means Clustering Algorithm
Cluster analysis
Computer science
Algorithm
Data mining
Artificial intelligence
Authors
D.J. Brus, J.J. de Gruijter, J.W. van Groenigen
Source
Journal: Developments in Soil Science; Date: 2006-01-01; Pages: 183-192; Cited by: 26
Identifier
DOI:10.1016/s0166-2481(06)31014-8
Abstract
In situations where we do not want to use available environmental data at the sampling stage of a soil survey because we are uncertain about their correlation with the soil attribute, the best thing we can do is to disperse the points in geographical space. This chapter describes a simple method for selecting such spatial coverage samples. The study area is partitioned into geographically compact subregions, and one point is selected purposively from each subregion. The partitioning is done by clustering the cells of a fine raster with the k-means clustering algorithm, using the x- and y-coordinates of the midpoints of the cells as classification variables. The centroids of the clusters are used as sample points. The method was tested in two case studies. In the first study, the locations of 23 points were optimised within a square area, and in the second study, 32 points were added to 6 prior points in an irregularly shaped area. In both studies, the average kriging variance (AKV) and maximum kriging variance (MaxKV) for the spatial coverage samples were compared with the AKV and MaxKV for the geostatistical samples obtained by directly minimising these criteria with the spatial simulated annealing (SSA) algorithm. The AKV values for the spatial coverage samples were comparable to those for the geostatistical samples obtained with AKV as the minimisation criterion. If the aim is to minimise MaxKV, then the SSA procedure is preferable.
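The partitioning step described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function names `raster_midpoints` and `spatial_coverage_sample`, the unit-square extent, the 0.05 cell size, the random initialisation, and the iteration cap are all assumptions made for illustration. The sketch runs a basic Lloyd-style k-means on the midpoints of raster cells and returns the cluster centroids as sample points. It does not handle prior points or irregularly shaped areas.

```python
import random


def raster_midpoints(xmin, ymin, xmax, ymax, cell):
    """Return the (x, y) midpoints of a regular raster over a rectangle."""
    nx = int(round((xmax - xmin) / cell))
    ny = int(round((ymax - ymin) / cell))
    return [(xmin + (i + 0.5) * cell, ymin + (j + 0.5) * cell)
            for i in range(nx) for j in range(ny)]


def spatial_coverage_sample(cells, k, n_iter=100, seed=0):
    """Cluster raster cells on their coordinates; return the k centroids."""
    rng = random.Random(seed)
    # Assumption: start from k distinct, randomly chosen cell midpoints.
    centroids = rng.sample(cells, k)
    for _ in range(n_iter):
        # Assignment step: accumulate each cell into its nearest centroid.
        sums = [[0.0, 0.0, 0] for _ in range(k)]
        for x, y in cells:
            nearest = min(
                range(k),
                key=lambda c: (x - centroids[c][0]) ** 2
                + (y - centroids[c][1]) ** 2,
            )
            sums[nearest][0] += x
            sums[nearest][1] += y
            sums[nearest][2] += 1
        # Update step: move each centroid to the mean of its cells.
        # An empty cluster keeps its previous centroid.
        updated = [
            (sx / n, sy / n) if n else centroids[c]
            for c, (sx, sy, n) in enumerate(sums)
        ]
        if updated == centroids:  # no centroid moved: converged
            break
        centroids = updated
    return centroids


# Hypothetical usage: 23 points in a unit square, echoing the first case study.
cells = raster_midpoints(0.0, 0.0, 1.0, 1.0, 0.05)
sample = spatial_coverage_sample(cells, 23)
print(len(sample))
```

Because k-means converges to a local optimum that depends on the starting centroids, a practical run would likely repeat the procedure with several seeds and keep the partition with the smallest mean squared distance between cells and their centroids.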