Saturday, 12 February 2011

CAD incidents - validation of the k means classification method

London police violent and acquisitive crime incident numbers for each day of 2009
This post is really quite interesting if you have been following my blog.

I have carried out a simple analysis to test three important aspects of my research:
  1. How good is the SPSS 17 k means classification method?
  2. Does the Metropolitan Police Service CAD data contain rhythms and cycles that reflect the lives of the people of London and the way it is policed?
  3. What is the best CAD data to use to detect these rhythms and cycles?
From previous multiple bivariate correlations of the daily numbers of the different classes or types of CAD incident, I know that CAD incident class 1, "Violence against person", correlates negatively with the acquisitive crime incidents. I also know that there were subtle differences between the patterns of the total number of these incidents and the patterns of those that had been graded "I" for immediate (emergency) response. The graph above shows lines for the number of incidents that fall into those four categories.

I carried out a k means classification using those four variables with the 365 days of the year as cases (using raw unstandardised data). I asked for seven groups to see if the k means classification could split the days into the right weekdays.
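
For readers without SPSS, the setup can be sketched roughly in Python. This is only an illustration under stated assumptions: the daily counts are invented random numbers (not the CAD data), the four columns are my reading of "those four variables" (all and I-graded incidents for violence and for acquisitive crime), and the plain k means loop with randomly chosen starting centres is not the procedure SPSS 17 uses to pick its initial centres. The names `kmeans`, `fake_counts` and `crosstab` are hypothetical.

```python
import random
from collections import Counter
from datetime import date, timedelta

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def squared_distance(p, q):
    """Squared Euclidean distance between two cases (raw, unstandardised values)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, max_iter=100, seed=0):
    """Plain k means: assign each case to its nearest centre, then recompute centres."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)  # assumption: random starting centres
    labels = []
    for _ in range(max_iter):
        labels = [min(range(k), key=lambda c: squared_distance(p, centres[c]))
                  for p in points]
        new_centres = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centres.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centres.append(centres[c])  # keep the centre of an empty group
        if new_centres == centres:  # no centre moved, so the groups are stable
            break
        centres = new_centres
    return labels, centres

# One case per day of 2009.
days = [date(2009, 1, 1) + timedelta(days=i) for i in range(365)]

# INVENTED counts, for illustration only:
# (violence all, violence I-graded, acquisitive all, acquisitive I-graded)
noise = random.Random(2009)

def fake_counts(d):
    weekend = d.weekday() >= 5
    base = (520, 260, 380, 90) if weekend else (400, 180, 520, 130)
    return tuple(b + noise.gauss(0, 15) for b in base)

points = [fake_counts(d) for d in days]
labels, centres = kmeans(points, k=7)

# Cross-tabulate group membership against day of the week.
crosstab = Counter((lab, WEEKDAYS[d.weekday()]) for lab, d in zip(labels, days))
for lab in range(7):
    print(lab, [crosstab[(lab, w)] for w in WEEKDAYS])
```

The cross-tabulation at the end is the useful part: each row is a group and each column a weekday, so any day that lands in an unexpected group stands out and can be checked against the calendar.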

I find the result shown in the graph above quite exciting. I hope you can see why. The classification recognises a clear difference between Saturdays, Sundays and week days. The week days have been subdivided into two groups. I have been through all the dates which appear to have been allocated to the wrong group and there is a very good explanation for each. For instance, every Monday that was allocated to the typically Sunday group (group 1) was a Bank Holiday Monday. Every other "misallocation" had a weather or holiday related explanation.

I have tried the classification with standardised data, and with more and with fewer variables, but these four variables in their raw form appear to me to produce the best results.
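
By "standardised" I mean converting each variable to z scores before clustering, so that a variable with large counts does not dominate the distance calculation simply because of its scale. A minimal sketch of that step follows; the function name `standardise` and the example rows are made up for illustration, and it uses the sample standard deviation, which is an assumption about how the z scores are computed.

```python
from statistics import mean, stdev

def standardise(rows):
    """Convert each column to z scores: (value - column mean) / column SD."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    sds = [stdev(col) for col in columns]  # sample SD (n - 1); an assumption
    return [
        tuple((v - m) / s if s else 0.0 for v, m, s in zip(row, means, sds))
        for row in rows
    ]

# Invented daily counts for four variables, for illustration only.
raw = [
    (410.0, 185.0, 530.0, 128.0),
    (395.0, 176.0, 512.0, 133.0),
    (525.0, 262.0, 377.0, 91.0),
    (515.0, 255.0, 384.0, 88.0),
]
z = standardise(raw)
for row in z:
    print([round(v, 2) for v in row])
```

After this step every variable has mean 0 and standard deviation 1, so all four contribute on an equal footing. With raw counts the larger variables carry more weight, which may be part of why the two versions of the classification differ.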

So I now have even more confidence in the k means classification method, I am convinced I am using a good set of data, and I am sure that analysing the variations in violent and acquisitive crime incidents in the context of the police response is worthwhile.

