1219th Ordinary General Meeting - Royal Society of NSW News & Events - The Royal Society of NSW

1219th Ordinary General Meeting

Wednesday, 5 March 2014

"Big data knowledge discovery: machine learning meets natural science"

Professor Hugh Durrant-Whyte FRS, CEO, National ICT Australia

Hugh Durrant-Whyte is an internationally recognised expert on the analysis of "big data": the mass of information generated by current information and communication technologies. Much of this is "metadata", data captured as a by-product of some activity (for example, the camera settings and capture date recorded when a digital photograph is taken, or the records kept by telecommunication companies every time a mobile phone call is made).

2.5 × 10^18 bytes of data are generated every day. There is immense value in mining this data, but doing so requires sophisticated analytical techniques. "Data analytics" is the term coined for technologies that analyse this data in areas as varied as finance, health, infrastructure planning, failure analysis in mechanical and electronic equipment, and environmental analysis, to name but a few. Data analytics uses Bayesian probability theory (named after Rev. Thomas Bayes, an 18th-century mathematician) to build quantitative models of existing data, gather new data to address the remaining problems, and then update the model to incorporate both the old and the new data.
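
The model-then-update cycle described above can be illustrated with a minimal sketch. It assumes the simplest possible case, a Beta-Binomial model of an unknown event rate; the class name `BetaBelief`, the uniform prior and the counts are hypothetical illustrations and are not taken from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaBelief:
    """Beta(alpha, beta) belief about an unknown probability (hypothetical helper)."""
    alpha: float
    beta: float

    def mean(self) -> float:
        # Expected value of a Beta(alpha, beta) distribution.
        return self.alpha / (self.alpha + self.beta)

    def update(self, successes: int, failures: int) -> "BetaBelief":
        # Conjugate Bayesian update: observed counts are added to the prior parameters.
        return BetaBelief(self.alpha + successes, self.beta + failures)


# Assumed starting point: a uniform prior, i.e. no knowledge about the rate.
prior = BetaBelief(1.0, 1.0)

# Model the existing ("old") data: 3 events in 10 observations (made-up counts).
after_old = prior.update(successes=3, failures=7)

# Gather new data and update again: 1 event in 10 further observations.
after_new = after_old.update(successes=1, failures=9)

print(f"prior mean:          {prior.mean():.3f}")
print(f"after old data:      {after_old.mean():.3f}")
print(f"after old + new data: {after_new.mean():.3f}")
```

Updating in two steps gives the same belief as processing all twenty observations at once, which is the sense in which the updated model incorporates both the old and the new data.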

Data analytics can be modelled using three types of mathematical functions: discrete functions that describe, for example, events or people's actions; finite probability functions, such as signals or locations; and infinite probability functions, such as spatial or temporal fields. As the mass of available data increases, the analysis can converge on knowledge. For example, payment patterns exhibited by individuals can be aggregated into the behaviours of a bank branch's customers, giving an understanding of consumer behaviour. On the other side of the table, customers can use masses of data to find the best deals available or to customise the internet-based content that they may wish to buy.

Where masses of historical data are available (for example, in managing water assets), readily available historical parameters can be analysed for applications such as predicting equipment failures. In water asset management, pipe age, soil type and similar parameters can be analysed to give a probabilistic estimate of when a water main might fail.
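
As a rough illustration of what a probabilistic failure estimate from pipe age and soil type might look like, the sketch below assumes a Weibull survival model whose characteristic life is shortened in more aggressive soils. The talk summary does not say which model was used; the Weibull form, the soil multipliers and every numeric parameter here are hypothetical placeholders rather than fitted values.

```python
import math

# Hypothetical multipliers on characteristic pipe life; not real measurements.
SOIL_SCALE_FACTOR = {"sand": 1.0, "clay": 0.7}


def survival(age_years: float, scale_years: float, shape: float) -> float:
    """Weibull survival function: probability a pipe lasts beyond age_years."""
    return math.exp(-((age_years / scale_years) ** shape))


def failure_probability(
    age_years: float,
    horizon_years: float,
    soil: str,
    base_scale_years: float = 80.0,  # assumed characteristic life
    shape: float = 2.5,              # assumed; shape > 1 means risk grows with age
) -> float:
    """Probability of failure within horizon_years, given survival to age_years."""
    scale = base_scale_years * SOIL_SCALE_FACTOR[soil]
    s_now = survival(age_years, scale, shape)
    s_later = survival(age_years + horizon_years, scale, shape)
    # Conditional probability: 1 - P(survive to age + horizon | survived to age).
    return 1.0 - s_later / s_now


for age, soil in [(20, "sand"), (60, "sand"), (60, "clay")]:
    p = failure_probability(age, horizon_years=10, soil=soil)
    print(f"age {age:>2} yr, {soil:<4}: P(fail within 10 yr) = {p:.2f}")
```

In practice the parameters would be estimated from historical failure records, ideally with the Bayesian updating described earlier, rather than fixed by hand as they are here.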

The mining industry has invested large amounts of money in developing systems that use masses of existing information to automate mine operation. Such a system can take all available data on the mine (its surface, the subsurface, mapping and drilling) and integrate it into a single, real-time representation of the mine.

The purpose of National ICT Australia (NICTA) is to use these data analytics approaches to produce leading-edge technologies and models for applications as varied as financial modelling and the creation of large-scale, fully integrated data maps of regions (perhaps even as large as continental Australia). There is also a particular focus on data-driven discovery in the natural sciences, in applications ranging from modelling ancient plate tectonics to predict mineralisation (on a timeframe of as much as 1.5 billion years) to ecological modelling, for example, predicting the growth of trees. Ultimately, these may be able to be integrated into one massive model of the Australian continent.

All rights reserved; copyright © The Royal Society of NSW.