Data mining, text mining and network association are all statistical tools that have come into their own as the sheer quantity of available computational power has increased. True, you do not need a strong grounding in math to use these programs, but math can help determine where they may be applied.
Text data mining takes standard keyword-based search techniques and makes them more effective by mapping associations between words and by creating visual representations of the data. This lets an investigator drill down into previously undetected associations and analyze immense amounts of data. One of the problems in the past has been how to represent this data.
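As a rough illustration of what "mapping associations with other words" can mean, the sketch below counts how often pairs of words appear in the same message and then lists the words most often seen alongside a chosen keyword. It is a minimal sketch only: the function names `cooccurrences` and `associates` and the sample messages are invented for this example, and real text mining tools use considerably more refined measures than raw counts.

```python
from collections import Counter
from itertools import combinations
import re


def cooccurrences(documents):
    """Count how often each pair of distinct words shares a document."""
    pairs = Counter()
    for text in documents:
        # Lower-case, split into words, and de-duplicate within the document.
        words = sorted(set(re.findall(r"[a-z']+", text.lower())))
        pairs.update(combinations(words, 2))
    return pairs


def associates(pairs, keyword):
    """Return (word, count) tuples for words seen with `keyword`, most frequent first."""
    related = Counter()
    for (a, b), n in pairs.items():
        if a == keyword:
            related[b] += n
        elif b == keyword:
            related[a] += n
    return related.most_common()


# Hypothetical messages, for illustration only.
messages = [
    "transfer the invoice tonight",
    "invoice sent, transfer pending",
    "lunch tonight?",
]

pairs = cooccurrences(messages)
print(associates(pairs, "invoice"))
```

In this toy data "transfer" is the strongest associate of "invoice" because the two words share two messages. The resulting word pairs are exactly the kind of relationship that can then be drawn as a network.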
This is where visualisation technologies come into play. They allow the investigator to uncover previously hidden relationships in the data. More importantly, the visualisation techniques available today make reporting to a lay jury simpler.
In the visualisation network:
A dot represents a person and is also called a node.
A line connecting two dots represents an existing conversation and is also called an edge.
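The node and edge model above can be sketched in a few lines. This is a minimal illustration, not how GEOMI stores its data: the `build_graph` function and the names in `chat_pairs` are hypothetical, and each pair simply records that two people had a conversation.

```python
from collections import defaultdict


def build_graph(conversations):
    """Build an undirected graph: people are nodes, conversations are edges."""
    graph = defaultdict(set)
    for a, b in conversations:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


# Hypothetical conversation pairs, for illustration only.
chat_pairs = [
    ("alice", "bob"),
    ("bob", "carol"),
    ("alice", "carol"),
    ("carol", "dave"),
]

graph = build_graph(chat_pairs)

# Nodes are the people; each edge is listed once regardless of direction.
nodes = sorted(graph)
edges = sorted({tuple(sorted((a, b))) for a in graph for b in graph[a]})

print(nodes)
print(edges)
```

Storing each person's neighbours as a set keeps the structure small and makes it easy to hand the same data to a drawing tool later.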
The GEOMI software developed at NICTA is an open-source project that consists of a set of Java scripts developed at the Systems Biology Initiative, University of New South Wales (Ho et al. (2008) J Proteome Res., 7:104-12).
The benefit of this type of visualisation is that it simplifies complex datasets (such as social networks, chats and logs) into an easily comprehensible 3-D map that a user can rotate, zoom and otherwise interact with.
My team has used this type of program to model chat logs. In the image above (the names have been altered to remove the details related to a case), tightly connected groups are packed together, while the "outsiders" to the conversations sit further apart in the network display. The program has allowed us to display the social relationships between chat users. We have also used it to model changes to logs and to detect tampering with evidence.
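Two of the ideas in this paragraph can be approximated without any visualisation software. The sketch below is an assumption-laden illustration, not the method GEOMI or my team uses: it treats anyone with a single connection as an "outsider", and it flags possible tampering by comparing the set of conversations in two copies of a log. The function names and the sample data are hypothetical.

```python
def edge_set(conversations):
    """Normalise conversation pairs into undirected edges, dropping self-pairs."""
    return {frozenset(pair) for pair in conversations if len(set(pair)) == 2}


def degrees(edges):
    """Count how many distinct people each person converses with."""
    counts = {}
    for edge in edges:
        for person in edge:
            counts[person] = counts.get(person, 0) + 1
    return counts


def compare_logs(original, suspect):
    """Report conversations that exist in only one of the two log copies."""
    before, after = edge_set(original), edge_set(suspect)
    return {"removed": before - after, "added": after - before}


# Hypothetical data: the suspect copy is missing one conversation.
original = [("alice", "bob"), ("bob", "carol"), ("alice", "carol"), ("carol", "dave")]
suspect = [("alice", "bob"), ("bob", "carol"), ("alice", "carol")]

deg = degrees(edge_set(original))
outsiders = sorted(p for p, d in deg.items() if d == 1)
diff = compare_logs(original, suspect)

print(outsiders)
print(diff)
```

A difference between two copies is only a prompt for further examination. It shows that the copies disagree, not which one was altered or why.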
The GEOMI program is developed by the Systems Biology Initiative, UNSW [Prof. Marc Wilkins, Director, firstname.lastname@example.org, Simone Li & Edwin Ho]. With their help, Ignatius and I will be publishing a paper on the use of this and other visualisation programs in the coming months.
Craig Wright, GFCA Gold #0265, is an author, auditor and forensic analyst. He has nearly 30 GIAC certifications, several post-graduate degrees and is one of a very small number of people who have successfully completed the GSE exam.