I am aware that I have not posted for some time. This is because I have been trying to get my thoughts together to start writing my PhD thesis. I thought I would post part of my introductory section discussing Science and Geographic Information Science (GISc). Without GISc there would be no crime maps, so it is quite important. Everything I know about the subject, and a lot more, is contained in the excellent book that is the world's bestseller on the subject and has recently been published in its 3rd edition: "Geographic Information Systems and Science", whose lead author just happens to be one of my supervisors, Professor Paul Longley.
What I am trying to do is explain why GISc is undoubtedly a science (something I needed to be convinced about when I first approached the subject) and to start laying the groundwork to justify creating and comparing classifications between areas using crime and policing data.
This section discusses the scientific context of the research. The discussion adopts a simplified approach, guided by others who have studied and written about the nature of science, its definitions and its academic divisions. These processes of definition, simplification, clustering and classification, often in an inexact and overlapping way, are of course central to the scientific method and central to the research carried out in this thesis.
The word ‘science’ derives from the Latin, via Old French and Middle English, and denotes ‘knowledge’ (Oxford online Dictionary). The purpose of all science is to transform raw data, by analysis and/or experimentation, into knowledge (Oxford online Dictionary, Longley et al. 2010). Knowledge in this sense is an understanding of the system being studied, such that it can be described by relationships, laws or theories that model its key dynamics. This goes further than describing a system, because it provides an insight into how the system works. Part and parcel of understanding a system is being able to identify uncertainties and, where possible, quantify them; the statistical use of confidence levels for results is an example of this. This quantification of variables and relationships allows research to be reviewed by peers, with a view to validation and the generation of new knowledge through further scientific research.
Geographic Information Science (GISc)
The over-arching methodology used by this thesis is Geographic Information Science (GISc). Geography is defined by the Oxford online Dictionary as:
“the study of the physical features of the earth and its atmosphere, and of human activity as it affects and is affected by these, including the distribution of populations and resources and political and economic activities.”
The use of the word ‘Information’ refers to the organised collection of data, for instance in a database. GISc relies on Geographic Information Systems (GIS) to store geographic data and organise it so that it can be used for scientific purposes. The data in GIS come from a myriad of different sources. Some are collected specifically for mapping the surface of the earth, such as satellite imagery; others are by-products of other data gathering activities, such as census data and land-ownership records. The data from these disparate sources are linked by the fact that each row of data in every data set in the system is geo-referenced. This allows the different data sets to be joined by location, to be kept separate as layers that share the same locations, or a combination of both. This leads to the special nature of spatial data, which is summed up by Tobler’s first law of geography:
“Everything is related to everything else, but near things are more related than distant things” (Tobler 1970).
Tobler’s law refers to spatial autocorrelation, which affects the independence of variables that are analysed spatially. This spatial autocorrelation is two-dimensional (unlike temporal autocorrelation, which is one-dimensional) because it radiates out from a fixed point. Both these facts necessitate the use of spatial statistics if the spatial nature of variables is being assessed. The complex nature of the application of Tobler’s Law to crime and policing data is discussed further in this thesis.
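One widely used measure of spatial autocorrelation is Moran's I, which compares each area's deviation from the mean with that of its neighbours. The sketch below is illustrative only: the four areas, their values and the neighbour weights are invented for the example and are not taken from the thesis data, and the function name is my own.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weights matrix.

    weights[i][j] is non-zero when areas i and j are treated as neighbours.
    """
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Cross-products of deviations, counted only for neighbouring pairs.
    numerator = sum(
        weights[i][j] * deviations[i] * deviations[j]
        for i in range(n)
        for j in range(n)
    )
    denominator = sum(d * d for d in deviations)
    return (n / total_weight) * numerator / denominator


# Hypothetical example: four areas in a row, each adjacent to the next.
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = [1, 2, 8, 9]    # similar values sit next to each other
alternating = [1, 9, 1, 9]  # neighbours are as different as possible

print(morans_i(clustered, weights))    # positive: near things are alike
print(morans_i(alternating, weights))  # negative: near things differ
```

A value well above zero suggests that near things are indeed more alike than distant things, which is the pattern Tobler's law leads us to expect.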
The creation of maps is a scientific process. Representing the real world is a process of scale selection, data choice and emphasis based on purpose, generalisation and geovisualisation. These methods of representation are by their nature incomplete and imperfect, as the only way to represent the real world perfectly is at a 1:1 scale, which is pointless and totally impractical. This means that irregular lines on maps, such as coastlines and roads, are simplified by excluding the smaller twists and turns; polygon zones that generalise or amalgamate the entities being represented are created to show features pertinent to the purpose of the map. This can be completely objective, based on measurement and statistical criteria, or purely subjective, based on the eye of the mapmaker; often it is a combination, and this is where science meets art. Maps are designed to be intuitive (once the spatial language is understood) by duplicating how human beings perceive and make sense of the real world, both visually and intellectually. Our perception of things goes from the general to the more and more specific as the resolution increases; in a geographical sense, without the aid of maps, this involves travel, for instance moving from seeing a wood in the distance to entering the wood and only being able to see the trees. This raises the question, “Is it better to see the wood or the trees?” The answer is of course “It depends on your purpose.” The conclusion is that, in line with the saying “You cannot see the wood for the trees”, in many circumstances the general can be more important and informative than the specific.
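The simplification of irregular lines described above can be made objective with a distance tolerance. The sketch below uses the Douglas-Peucker algorithm, one common line-simplification method; it is offered as an illustration and not as the method used in this thesis, and the "coastline" coordinates and tolerance are invented.

```python
import math


def perpendicular_distance(point, start, end):
    """Distance from point to the straight line through start and end."""
    (x, y), (x1, y1), (x2, y2) = point, start, end
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0:
        # start and end coincide, so fall back to point-to-point distance
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / length


def simplify(points, tolerance):
    """Drop vertices that deviate from the overall line by <= tolerance."""
    if len(points) < 3:
        return list(points)
    start, end = points[0], points[-1]
    distances = [perpendicular_distance(p, start, end) for p in points[1:-1]]
    max_distance = max(distances)
    if max_distance <= tolerance:
        # Every intermediate twist is small enough to ignore.
        return [start, end]
    # Keep the most prominent vertex and simplify each side separately.
    split = distances.index(max_distance) + 1
    left = simplify(points[: split + 1], tolerance)
    right = simplify(points[split:], tolerance)
    return left[:-1] + right


# Hypothetical coastline: small wiggles followed by one large headland.
coastline = [(0, 0), (1, 0.1), (2, -0.1), (3, 0), (4, 3), (5, 0)]
print(simplify(coastline, tolerance=0.5))
```

Raising the tolerance is the equivalent of moving to a smaller map scale: more of the twists and turns disappear, until only the end points of the line remain.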
Imperfect clustering and classification are fundamental to GISc. They can be applied to label areas on a map as forest even though there are areas of agricultural land within them, or to label the residents of an area as “small town seniors” even though youth offenders are known to live there. The importance of this methodology to this thesis is that it allows hypotheses of relationships between different layers at the same locations to be proposed, and comparisons between different locations to be investigated on a single or multi-layer basis. To continue with the tree example, it would be logical for a polygon denoting timber production on one layer to coincide spatially with a polygon denoting forest on a separate layer, but not necessarily for every forest polygon to coincide with timber production. If a polygon denoting timber production is nowhere near a forest polygon, this is worth further investigation and explanation.
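The timber and forest comparison can be sketched as a simple overlay check between two layers. To keep the example self-contained, each polygon is reduced to a bounding rectangle (min_x, min_y, max_x, max_y), which is a strong simplification; the layer contents, names and coordinates are all hypothetical.

```python
def overlaps(a, b):
    """True if two rectangles (min_x, min_y, max_x, max_y) share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def timber_without_forest(timber_layer, forest_layer):
    """Names of timber polygons that coincide with no forest polygon."""
    return sorted(
        name
        for name, box in timber_layer.items()
        if not any(overlaps(box, forest) for forest in forest_layer.values())
    )


# Hypothetical layers, geo-referenced to the same coordinate system.
forest_layer = {"F1": (0, 0, 10, 10), "F2": (20, 0, 30, 10)}
timber_layer = {"T1": (2, 2, 5, 5), "T2": (50, 50, 55, 55)}

# T1 lies inside forest F1; T2 is nowhere near a forest and is flagged.
print(timber_without_forest(timber_layer, forest_layer))
```

Forest F2 has no timber production on it, which is unremarkable; it is only the timber polygon with no forest that prompts a question.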
GISc is a pure science because it is used to measure and represent geographic characteristics empirically and attempts to understand how they interact with each other. It is also an applied science because much of this knowledge is put to practical use, for example in Internet crime maps for the public.