Wednesday, 29 September 2010

CAD grid square analysis of violence and robbery in London

These two maps use the same data as my other maps at Ward level, but this time I have calculated the number of incidents in each grid square of 250 metres by 250 metres. This is the maximum accuracy of the Metropolitan Police Service Computer Aided Dispatch (CAD) system, which records these incidents. There are 26116 such squares in Greater London, compared with 624 Wards.
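As a rough illustration of counting incidents per grid square, here is a minimal Python sketch. It assumes each incident has an easting and northing in metres on a projected grid; the coordinates and the function names are hypothetical and are not taken from the CAD system or from the ArcMap workflow I actually used.

```python
from collections import Counter

CELL_SIZE_M = 250  # edge length of a grid square in metres


def grid_cell(easting, northing, cell_size=CELL_SIZE_M):
    """Return the (column, row) index of the square containing a point."""
    return (int(easting // cell_size), int(northing // cell_size))


def count_by_cell(points, cell_size=CELL_SIZE_M):
    """Count how many (easting, northing) points fall in each grid square."""
    return Counter(grid_cell(e, n, cell_size) for e, n in points)


# Hypothetical incident coordinates in metres (not real police data).
incidents = [
    (530010.0, 180020.0),
    (530100.0, 180240.0),
    (530260.0, 180020.0),
    (530499.0, 180249.0),
]
counts = count_by_cell(incidents)
for cell, n in sorted(counts.items()):
    print(cell, n)
```

Because every square is the same size, the counts can be compared directly without adjusting for area, which is not true of Wards.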

This is where scale comes into the analysis. Will analysis using units that are on average 42 times smaller and of exactly equal size (except those clipped by the London boundary) make a difference to the correlation and predictability between robbery and violence against person incidents?

The graph above shows the answer. There is a predictability of almost 0.6 and a correlation of 0.77. These are high figures.
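For a straight-line fit with a single predictor, the R squared value is the square of the correlation, which is consistent with the figures above (0.77 squared is about 0.59). A minimal Python sketch of the calculation follows; the two arrays of counts are made up for illustration and are not my grid square data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical incident counts per grid square (not real data).
robbery = [0, 1, 3, 2, 8, 5]
violence = [1, 2, 4, 6, 9, 7]

r = pearson_r(robbery, violence)
r_squared = r ** 2  # "predictability" for a one-variable linear fit
print(round(r, 2), round(r_squared, 2))
```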

Tuesday, 28 September 2010

Robbery incidents can predict violence incidents and vice versa

This is the graph and map following the methodology of the previous post. The correlation is almost exactly the same as for the robbery data arrays, at 0.55, and 0.67 when the two West End outlier wards are removed. This raises the intriguing possibility that robbery incidents can predict violence incidents and vice versa.

Sure enough, the graph above shows a high level of predictability between the two incident categories. The correlation is 0.86 for all the Wards and actually goes down to 0.82 if the West End wards are removed. There are interesting aspects to this graph. Even though there is very high predictability/correlation overall, the seven Wards that have violence z scores between 6 and 8 appear to have a very low level of robbery predictability.

Is it possible that good policing may be an influence here?

This is a map of London showing the levels of deprivation in each Ward, based on the average of the Lower Super Output Area scores calculated for the English Indices of Deprivation 2007. As I have discussed in previous posts, criminology theories suggest that the higher the deprivation the higher the crime. So as I have carried out my violence analysis at Ward level, I can investigate whether there is a correlation between my deprivation values for each Ward and my various incident occurrence values. First I am analysing robbery incidents.

The correlation between the two arrays of scores is 0.58 but goes up to 0.66 if the West End outliers are removed. More interesting is to plot the two sets of values against each other to determine a linear regression line. This line has an equation that describes its slope and where it intersects the y axis. It also has an R squared value or coefficient of determination. This shows the degree of influence that deprivation has on the occurrence of robbery incidents (to put it simply). See here for more detail. In this case it is 0.34 or just over a third.

Now each Ward has a new value that can be measured above or below the line, by using the equation to determine the score the Ward would have if it were on the line and subtracting that score from its actual score. This new score can be mapped to show those Wards that have a higher or lower score than predicted by the line.
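The "above or below the line" score is what statisticians call a residual. Here is a minimal Python sketch of the calculation using an ordinary least squares line; the deprivation and robbery numbers are invented for illustration, and the function names are my own.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(xs, ys):
    """Actual score minus the score predicted by the fitted line."""
    slope, intercept = fit_line(xs, ys)
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


# Hypothetical Ward values (not real data).
deprivation = [10.0, 20.0, 30.0, 40.0]
robbery = [12.0, 18.0, 35.0, 39.0]

slope, intercept = fit_line(deprivation, robbery)
scores = residuals(deprivation, robbery)
print(slope, intercept, scores)
```

A positive residual means more robbery than the deprivation score alone would predict, and a negative residual means less.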

We know from previous posts that entertainment venues have a high influence on the occurrence of violence, including robbery, which explains some of the high scores. The Wards that do not have that "excuse" and have high scores are therefore interesting. So are the Wards with lower than predicted scores, especially those shown in green. Is it possible that good policing may be an influence here?

Sunday, 26 September 2010

Google Earth London robbery map

The information you get on this blog is my research, literally as I am doing it, so there is quite a bit of experimental stuff. This is another example.

This is my latest development and an idea that I have not seen on the web before, and certainly not in the crime mapping environment. Instead of providing an interactive map I am going to provide a .kmz file which you can load into Google Earth and zoom in, zoom out, add your own features to etc.

You can get the file here at my new website -

The top picture should be what you get when you load up the file. The legend for the colours can be found here in a recent post. You can zoom in and click on the wards to discover the numbers of robbery incidents that occurred in 2009. There are other numbers shown that you will understand if you have been reading my posts.

If you turn off the robbery layer by removing the tick you can see the layer of dots underneath. These are the CAD 250 metre by 250 metre grid squares (see previous posts, search on "square" or CAD in the search bar above). I have created a layer by doing a spatial join in ArcMap and counting the number of incidents in each grid square. The resulting layer creates a .kml file that is too large to display in Google Maps, so I have only provided those grid squares with 26 or more robbery incidents, that is an average of one every two weeks or more. This gives a simpler and more focused display in my view. Yellow are the lower numbers, blue the higher and red in between. You can click on them to find the actual number, which is shown next to "count".

By moving the transparency bar it is possible to see the layers together to find the robbery hotspot within the Wards. Then you can zoom in and see the area in which these robberies occur.

Friday, 24 September 2010

Violence classification including robbery mapped

This is my classification of violence at Ward level from police incident data in 2009. I have now included robbery data. Even though some incidents have codes that mean they could be counted twice in this classification, I have decided to avoid this by using the following hierarchy.
  • 1 and 29 - all domestic violence
  • 1 but not 29 - violence but not domestic violence
  • 5 but not 1 and 29 - robbery but not violence or domestic violence
The last two could possibly be changed so that robberies with the code 1 could be counted, but it does not make a lot of difference as these are about 400 out of over 36,000 incidents (the combination of 5 and 29 is less than 100). My thinking is that putting 5 and 1 together is a tautology unless the incident had an additional violent element. This brings me onto an additional important point: we are not dealing with seriousness here, we are measuring numbers; a murder incident has the same weight as a common assault.
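The hierarchy above can be sketched in Python as a first-match rule, so that each incident is counted once. This is an illustration only: the category labels and the function name are my own, and I am assuming that an incident carrying 5 and 29 but not 1 falls into the robbery group, which the list above does not spell out.

```python
VIOLENCE = 1   # "violence against person"
ROBBERY = 5
DOMESTIC = 29  # "Domestic Incident"


def classify(codes):
    """Assign an incident to exactly one category, highest priority first.

    `codes` is the collection of result codes recorded for one incident.
    Returns None when the incident falls outside the classification.
    """
    codes = set(codes)
    if VIOLENCE in codes and DOMESTIC in codes:
        return "domestic violence"
    if VIOLENCE in codes:
        return "violence (not domestic)"
    if ROBBERY in codes:
        # Assumption: 5 with 29 but without 1 is still counted as robbery.
        return "robbery"
    return None


# Hypothetical incidents, each with up to three result codes.
examples = [(1, 29, 5), (1, 5), (5,), (29,)]
for codes in examples:
    print(codes, "->", classify(codes))
```

Every incident counts as one regardless of seriousness, which matches the point made above about measuring numbers rather than severity.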

The above is the table I based my classification on. I think this will be my last violence classification, for now.

Robbery map of London

Back to violence. This is a map of robbery incidents. Let me explain in simple terms what a robbery is. It is a theft that involves the application of violence, or the threat of violence, to an individual or individuals to achieve its purpose. Now because reducing robbery, particularly street robbery, is and has been for decades a police priority, only those where the final letter of the law is complied with find their way into the official crime statistics. Therefore you get pickpocket crimes that are simple thefts, and bag snatches which most people would think of as street robbery but which in law are classed as theft because the violence has been applied to the bag, not the person carrying it. You can then get an assault followed by a theft and a theft followed by an assault which are not robberies. The use of a weapon to threaten potential violence is an armed robbery, though there is no separate legal category for this in England and Wales.

But my data set of incidents is not quite so exacting as that. It is normally based on the information given by the police officer initially investigating the case, who would not have got through training school without learning the definition of robbery verbatim, so it is likely to be broadly accurate.

Thursday, 23 September 2010

Stop the Rot O'Connor

I have been torn away from my analysis to study the headline news in the UK about the police and anti-social behaviour (ASB). Her Majesty's Inspectorate of Constabulary (HMIC) has published a number of linked reports about policing ASB in England and Wales which can be found here. They are important and interesting documents that, unusually for the Home Office, are of varying quality. That aside, there is so much I could comment on but I will be brief.

The authors behind this report are Sir Denis O'Connor and Prof. Martin Innes, the architects of Reassurance Policing (RP). This makes everything a bit surreal because to a large extent they are saying that RP and its offspring Neighbourhood Policing (NP) have failed to properly address ASB, probably their primary raison d'etre. In O'Connor's case he seems to be acting as if this radical change in policing style, which he championed and the Labour Government supported with vast amounts of money, had not taken place in the last 10 years.

I argue that there are two styles of visible (uniform) policing - Community or Neighbourhood Policing and Response Policing. The two have to be properly balanced. Prior to O'Connor's NP initiative, which dates back to 2001 in Surrey and London and was adopted throughout England and Wales over the next four years to the extent that it is now embedded everywhere, the balance was out of kilter in favour of Response Policing. Unfortunately the balance is now out of kilter the other way. Response times have been lengthened and the types of incidents police will attend promptly have been reduced to accommodate NP. What has the research published today found? ASB is best dealt with promptly and decisively as it is happening - the role of 24 hour Response Policing. I am actually in favour of NP as it provides a balanced approach to policing, but not at the expense of Response Policing. For NP to be effective the officers must command respect and be proactive using the full range of policing powers; and I would argue be on duty when the problems on their patch occur. Unfortunately at least two out of the three criteria do not apply to Community Support Officers, who are the backbone of NP.

Lastly, there is politics behind this. O'Connor, I am sure, pushed for the Police Confidence overarching performance target and the Policing Pledge under Labour as these support NP. The present Home Secretary stamped on them with her (in)famous shoes, and kicked them into the dustbin. This I think explains O'Connor's amnesia: he has to act like he is starting the fight for NP all over again with the looming government spending cuts. It also probably explains why the reports have an unfinished, unloved feel about them.

Wednesday, 22 September 2010

What is the difference between clustering and classification?

This is my understanding of the answer to the question I posed. I have been doing a lot of clustering but in this post I presented my first classification. This is not my final version; I have more potential variables up my sleeve. The classification is a description of the variables based on their membership of a cluster. I now know why creators of geodemographics almost exclusively use the K means method of clustering: it gives an output showing where the final centre of each cluster is for each variable. This gives a pretty good idea of the nature of, in our case, the Wards in each cluster. This is the table for my classification:

The variables were adjusted around the mean, with 1 representing the mean value. This ensured that each variable was given equal weight in the clustering process. The description I gave to each cluster was based on the information in this table. For instance, cluster 1 is high in all three variables, so these are areas where violence is common, thus my description; cluster 4 seemed to me to be pub and club venues, from the very high value of the second variable.
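To show how the final cluster centres come out of K means, here is a minimal Python sketch of the standard algorithm (Lloyd's iteration) run on variables that have first been divided by their mean. It is not the SPSS implementation I used, the Ward values are invented, and picking the starting centres by hand is an assumption made only to keep the example repeatable.

```python
import math


def scale_to_mean(column):
    """Divide each value by the column mean, so the mean becomes 1."""
    mean = sum(column) / len(column)
    return [value / mean for value in column]


def nearest(point, centres):
    """Index of the centre closest to the point (Euclidean distance)."""
    distances = [math.dist(point, centre) for centre in centres]
    return distances.index(min(distances))


def k_means(points, centres, max_iter=100):
    """Lloyd's algorithm; returns (final centres, cluster label per point)."""
    centres = [tuple(c) for c in centres]
    labels = []
    for _ in range(max_iter):
        labels = [nearest(p, centres) for p in points]
        new_centres = []
        for k, old in enumerate(centres):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                mean = tuple(sum(dim) / len(members) for dim in zip(*members))
                new_centres.append(mean)
            else:
                new_centres.append(old)  # keep an empty cluster where it was
        if new_centres == centres:
            break  # assignments have stopped changing
        centres = new_centres
    return centres, labels


# Hypothetical values for six Wards and two variables (not real data).
scaled_a = scale_to_mean([2, 3, 2, 20, 22, 21])
scaled_b = scale_to_mean([5, 6, 4, 30, 28, 32])
points = list(zip(scaled_a, scaled_b))

centres, labels = k_means(points, centres=[points[0], points[3]])
print(labels)
print(centres)
```

Reading the final centres against the mean of 1 is what lets a cluster be described: a centre value well above 1 means the cluster is high on that variable.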

I am not saying my descriptions are perfect; I am explaining the process. Of course the members of the clusters are varying distances from the cluster centre (this information is also given) and therefore will not necessarily exactly match the mean characteristics of the cluster. It should also be borne in mind that it is the dominant relevant features of the Ward that are being described. It is highly unlikely that all parts of the Ward will comply with that description. We then get into a discussion about the scale of analysis, which I have discussed before on this blog.

So what is the answer to the question? Classification is the next stage in the process after clusters have been created; classification involves the description of clusters based on the characteristics of the cluster members.

Domestic Violence incidents mapped

The map shows the distribution of police domestic violence incidents in London in 2009. I have included these data in my classification of violence in London. This was an interesting decision. I include the data for a number of reasons. The first is shown in the table below:

In my first post about the analysis of violence I explained that I would be using the three result code fields for each incident; "1" is "violence against person". This is the basis of all this analysis. I am now looking at the combination of "1" with "29" - "Domestic Incident". The table shows the total number of incidents that contain these codes and the number of incidents that contain both. The combination of 1 with 29 is more than 4 times higher than 1 being recorded with any other incident code; it is therefore an important combination in the data set.

Second, domestic incidents are a high priority for the Metropolitan Police, with incidents of domestic violence a particularly high priority. Third, the data set for my clustering was biased towards non-residential locations, as I used late night weekend incidents mostly associated with entertainment areas. Domestic violence by its nature mainly occurs in residential locations and thus counterbalances the previous bias.

There are arguments why I should not include DV incidents. The main one is that the data may not properly record the occurrence of domestic violence, as the more police treat it as a priority the more demand there will be for police to deal with it. So to a certain extent the demand is associated with positive police activity. This may be the argument of police in the boroughs of Lewisham and Greenwich, where domestic violence appears high. The fact is all demand is associated with police activity creating or maintaining confidence that police will deal with the incident appropriately. The point is the data are about the nature of the demand on police, but the above concerns are valid, as domestic violence tends to happen in private and depends to a large extent on the victim reporting it to police, whereas other types of violence tend to happen in public places and are therefore reported by witnesses and/or discovered by police.

Geopolisographics - an explanation

In the last post I invented the word "geopolisographics"; whether it will stick remains to be seen. It attempts to mimic the ancient Greek etymology of "geodemographics", which if broken down has the following meaning - "geo" meaning earth or world literally, but used to mean location, place, space; "demo" people; and "graphic" description. As the data I am using do not describe people directly, such as age, gender, income, interests, but instead describe demands on police and police activity related to place and time, I feel that replacing demo with poliso still makes the word pronounceable and better describes what I am doing. (My data of course indirectly describe people.) "Poliso" is derived from the ancient Greek word "polis" meaning city and "polissoos" meaning guardian of the city.

Tuesday, 21 September 2010

"Geopolisographics" - the analysis of police incidents by where and when they occur

I am quite proud of the map above. It is my first classification map. It is in the style of geodemographic classifications, but it is not geodemographics; it is "geopolisographics" - the analysis of police incidents by where and when they occur. This is again based on "violence against person" police incidents in 2009. I have added domestic violence incidents into the mix. I will explain more later.

Sunday, 19 September 2010

Mapping violent crime patterns

This post is a continuation of the analysis presented in the last three posts.

In this post I am going to outline how I have clustered the Council Wards in London based on the "violence against person" incidents in 2009. I am clustering the demand profile that these incidents present to the police. The idea is to find similar Wards in London based on the variables I use.

The choice of variables is critical. I have hopefully shown that Wards can be grouped by number of relevant incidents in the year and whether these occur between midnight and 4am on Saturday and Sunday mornings or whether they do not occur at these times. I am therefore only using two variables at this stage.

This means they can be plotted in a two dimensional graph based on the incidents that occur at those two separate times. This is shown above.

I then load the simple three column spreadsheet into SPSS17 and perform two different types of clustering calculation. I have outlined in some detail how these calculations work in six posts starting here. I used two different methods - K means and Ward's Hierarchical (do not be confused: Ward is the person who devised the method and has nothing to do with Council Wards; it's just a burden I have to bear when writing about my clustering analysis).

So this is what the map of London looks like when produced by ArcMap, with the accompanying graphs produced in MS Excel.

I have specified six clusters and I have sized the graphs so the x and y axes have approximately the same scale. I have tried to keep the same colours in the maps and the graphs. If you look at the graphs closely you will see that the two different methods split the clusters up in similar but slightly different ways. What becomes obvious is that, because the y axis has many more incidents in most cases, the clustering does not really take any notice of the x axis. Therefore the maps are just clusters based on incidents that happen outside the Saturday and Sunday midnight to 4am period. This is not good enough. I have decided that both variables are equally important, so I will have to make adjustments to ensure this is reflected in the clustering.

Mathematically this is quite simple (I hope I have got this correct having made that bold statement). First I calculate the x axis value as a percentage of the y axis value in each Ward, that is (x/y)*100. These percentages are then all added together and divided by the number of Wards to give an average percentage, which in this case is 10.70% (to two decimal places). So to give the x axis variable the same scale as the y axis, 100%/10.70% gives a figure of 9.34 (to two decimal places). This 9.34 is then used to multiply the x variable incidents for each Ward.
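The adjustment can be sketched in a few lines of Python. The counts below are invented and the function name is my own; the sketch also assumes every Ward has at least one incident on the y axis, otherwise the division fails.

```python
def scale_factor(x_values, y_values):
    """Multiplier that brings x onto roughly the same scale as y.

    Takes x as a percentage of y in each Ward, averages those
    percentages across Wards, then divides 100 by that average.
    """
    percentages = [(x / y) * 100 for x, y in zip(x_values, y_values)]
    average_percentage = sum(percentages) / len(percentages)
    return 100 / average_percentage


# Hypothetical incident counts for four Wards (not real data).
weekend_late = [10, 5, 30, 8]        # x: Sat/Sun midnight to 4am
rest_of_week = [100, 40, 300, 100]   # y: all other times

factor = scale_factor(weekend_late, rest_of_week)
scaled_x = [x * factor for x in weekend_late]
print(round(factor, 2), [round(v, 1) for v in scaled_x])
```

Note that dividing 100 by the rounded 10.70 gives about 9.35, so the 9.34 quoted above presumably comes from working with the unrounded average.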

The resulting three column table is loaded into SPSS17 etc. and the following maps and graphs are produced.

The clustering now takes both variables equally into account but the two methods split the clusters differently especially the brown and orange. More in subsequent posts.

Thursday, 16 September 2010

Identifying different London Ward violence profiles

I have been analysing violence in London based on the demand profile for each council ward, using police incidents that have been resulted as including "violence against person". I have discovered by this analysis that the period after midnight to 4am on Friday night/Saturday morning, and the same period on Saturday night/Sunday morning, can be used to group wards. I am developing that idea further in this post.

London 2009 police "violence against person" incidents on Saturday and Sunday midnight to 4am showing council wards
The dark blue/purple areas show the most violent wards in London after midnight on Friday and Saturday nights. I have not checked each one but I think the blue/purple and red/purple cover most of London's major night drinking and entertainment venues.

London 2009 police "violence against person" incidents NOT on Saturday and Sunday midnight to 4am showing council wards
This map shows the violent areas of London for the rest of the week. Some wards appear purple on both charts but others change quite markedly.

London 2009 showing the percentage of  police "violence against person" incidents on Saturday and Sunday midnight to 4am showing council wards
This map is slightly different. It is not based on the number of incidents; it is based on the proportion of incidents that occur in the early hours of Saturday and Sunday mornings. This allows the profile of less violent areas to be judged. It also justifies my decision to carry out MPS wide analysis to help interpret my Camden Borough data, because it shows that Gospel Oak is one of only three wards (very light green) that have the lowest percentage of incidents in the weekend period under analysis.
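The percentage behind this map can be sketched in Python as follows. The timestamps are invented, the function names are my own, and I am assuming the window runs from midnight up to but not including 4am.

```python
from datetime import datetime


def in_weekend_window(ts):
    """True if ts is between midnight and 4am on a Saturday or Sunday."""
    # datetime.weekday(): Monday is 0, so Saturday is 5 and Sunday is 6.
    return ts.weekday() in (5, 6) and ts.hour < 4


def weekend_percentage(timestamps):
    """Percentage of incidents that fall inside the weekend window."""
    if not timestamps:
        return 0.0
    hits = sum(1 for ts in timestamps if in_weekend_window(ts))
    return 100 * hits / len(timestamps)


# Hypothetical incident times for one Ward (not real data).
ward_incidents = [
    datetime(2009, 9, 26, 1, 30),   # Saturday 01:30, inside the window
    datetime(2009, 9, 27, 3, 59),   # Sunday 03:59, inside the window
    datetime(2009, 9, 26, 4, 0),    # Saturday 04:00, outside
    datetime(2009, 9, 25, 23, 30),  # Friday 23:30, outside
]
print(weekend_percentage(ward_incidents))
```

Using a percentage rather than a count is what allows quiet Wards to be compared with busy ones on the same map.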

Next stage is to see if this data provides good clustering and classification material.

Wednesday, 15 September 2010

Temporal Profiles of Violence in the top Ten Wards in London

These are again the top 10 violence wards in London in 2009 that I mapped in the previous post. This post shows the temporal profile of the occurrence of these incidents. To me there are two distinct groups, with one or two wards being in both groups. See if you agree with me.

Group 1 are those where most incidents happen in the early hours of the morning at weekends - Marylebone, St James, Fairfield and Romford Town. These are dominated by town centre entertainment venues. The other group is the rest, with Coldharbour being probably the best example; here the violence is spread more evenly over the week, so I suspect it is less dependent on pubs and clubs. Stratford and New Town and Shepherd's Bush show a combination of both. It must be remembered that wards are relatively large units which can contain diversity of all types. Smaller geographical units of analysis may help with this.

Tuesday, 14 September 2010

Violence mapped by Ward

London 2009 police incidents that included "violence against person"
This map shows the violent Wards in London according to police incident data. The Wards highlighted in dark blue/purple are shown in the table below.

Monday, 13 September 2010

Resilience Index and the Index of Multiple Deprivation

In this post I am in danger of giving you more information than you will ever want or need about the Resilience Index (RI) and the Index of Multiple Deprivation (IMD), but here goes. The post is prompted by Matt Ashby's comment on the previous posting about the RI. My reply is relevant to this post.

First a little table that shows the correlation between three indices: the RI, the IMD 2007 rank summary of Local Authorities (which can be downloaded here) and the rank index of IMD which is used as a data element within the RI - RIIMD. I have correlated all Local Authorities in England and the London Local Authorities (called boroughs) separately.

The first row shows a far higher internal correlation with the RI for England than for just London. The second row shows there is virtually no correlation between the RI and 2007 IMD for England but a negative (in practical terms positive as the indices are orientated inversely) correlation for London. The third column shows that the two ways in which the IMD has been interpreted at LA/borough level have no correlation for England but a strong correlation for London. What this shows is that London is not typical of the rest of England.

The next table shows the rankings for London. The equivalent table for England is too large to include.

The charts below show the above information as scatter graphs with regression lines.