Friday, 6 August 2010

Explaining clustering to non-mathematicians and non-geographers 3

In the last post I explained how the expenditure patterns of forces can be mapped using 28 variables to give each force an exact location in a 28-dimensional space. In this post I provide maps and diagrams, which I hope are self-explanatory, of the results of one of the clustering methods. The dot diagrams are, of course, purely illustrative, as neither I nor anyone else can depict 28 dimensions in two. This is a hierarchical method of clustering based on centroids. I will explain the method in the next post.

What I want you to note now is that hierarchical clustering starts with one cluster, which is divided into two; the third cluster is made by dividing one of those two, the fourth by dividing one of the three, and so on. For instance, the two clusters shown above consist of Cumbria (in the northwest) and City of London Police (in the centre of London) as one cluster and the other 41 forces as the other. The third cluster is made by dividing the 41 forces. The fourth is made by dividing Cumbria and the City of London, though if the data were different it could have been one of the offspring clusters of the 41 that was further subdivided at this stage. This family-tree structure of parents and offspring (hence "hierarchical") is rigid: it only allows the division of existing clusters. This means that after the first division Cumbria and the City of London can never be clustered with another force. So if the "best fit" (I will try to explain this later) at the seven-cluster stage involved Cumbria, say, being clustered with other forces, this method could never achieve it.
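For readers who like to see an idea run, here is a minimal Python sketch of the "divide an existing cluster" rule. It is not the method used for the maps in this post (that comes in the next post), and everything in it is an assumption made for illustration: the seven made-up "forces" A to G, their two-dimensional positions, the choice to split whichever cluster is most spread out, and the way a split is seeded from the two members that are furthest apart.

```python
from itertools import combinations
from math import dist

# Hypothetical data: seven made-up forces placed in 2 dimensions, not 28.
data = {
    "A": (0.0, 0.0), "B": (1.0, 0.0), "C": (0.0, 1.0),
    "D": (4.0, 4.0), "E": (5.0, 4.0),
    "F": (20.0, 0.0), "G": (21.0, 1.0),
}

def centroid(points):
    """Average position of a list of points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))

def spread(cluster):
    """Sum of squared distances from each member to the cluster centroid."""
    points = [data[name] for name in cluster]
    centre = centroid(points)
    return sum(dist(p, centre) ** 2 for p in points)

def split(cluster):
    """Divide one cluster in two (an illustrative rule, not the post's method):
    take the two members furthest apart as seeds, then give every
    member to whichever seed is nearer."""
    a, b = max(combinations(sorted(cluster), 2),
               key=lambda pair: dist(data[pair[0]], data[pair[1]]))
    left = frozenset(n for n in cluster
                     if dist(data[n], data[a]) <= dist(data[n], data[b]))
    return left, cluster - left

def divisive(k):
    """Return the groupings at 1, 2, ..., k clusters."""
    k = min(k, len(data))
    clusters = [frozenset(data)]          # start with everything in one cluster
    history = [list(clusters)]
    while len(clusters) < k:
        # Only an existing cluster can be divided; members are never moved
        # between clusters, which is the rigidity described above.
        target = max((c for c in clusters if len(c) > 1), key=spread)
        clusters.remove(target)
        clusters.extend(split(target))
        history.append(list(clusters))
    return history

history = divisive(4)
for stage in history:
    print(len(stage), sorted(sorted(c) for c in stage))
```

Whatever data you feed it, every cluster at one stage sits wholly inside a single cluster from the stage before. That is the family-tree property: once two forces have been separated by a division, nothing later in the process can bring them back together.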

