The Collective of Transform Ensembles (COTE) for Time Series Classification
Description
Time series classification is the problem of predicting an outcome from a series of ordered data. For example, given a series of electronic readings from a sample of meat, the classification problem could be to determine whether that sample is pure beef or has been adulterated with some other meat. Alternatively, given a series of electricity usage readings, the problem could be to determine which type of device generated them. Time series classification problems arise in all areas of science, and we have worked on problems involving ECG and EEG data, chemical concentration readings, astronomical measurements, otolith outlines, electricity usage, food spectrographs, hand and bone radiograph data and mutant worm motion. The algorithm we have developed for this task, the Collective of Transform Ensembles (COTE), is significantly better than any other technique proposed in the literature when assessed on 80 data sets used in that literature. This project looks to improve COTE further and to apply it to three problem domains of genuine importance to society.
In collaboration with Imperial, we will look at classifying Caenorhabditis elegans via motion traces. C. elegans is a nematode worm commonly used as a model organism in the study of genetics. We will help develop an automated classifier for C. elegans mutant types based on their motion, with the objective of identifying genes that regulate appetite. This classifier will automate a task previously done manually at great cost and will uncover conserved regulators of appetite in a model organism in which functional dissection is possible at the level of behaviour, neural circuitry, and fat storage. In the long term, this may give insights into the genetic component of human obesity.
Working closely with the Institute of Food Research (IFR), we will attempt to solve two problems involving classifying food types by their molecular spectra (infrared, IR, and nuclear magnetic resonance, NMR). The first problem involves classifying meat type. The horse meat scandal of 2012/3 has shown that there is an urgent need to strengthen current authenticity testing regimes for meat. IFR have been working closely with a company called Oxford Instruments to develop a new low-cost, bench-top spectrometer called the Pulsar for rapid screening of meat. We will collaborate with IFR to find the best algorithms for performing this classification. The second problem is to find non-destructive ways of testing whether the contents of intact spirits bottles are genuine or fake. Forged alcohol is commonplace, and in recent years there has been an increasing number of serious injuries and even deaths from the consumption of illegally produced spirits. The development of sensor technology to detect this type of fraud would thus have great societal value, and the collaboration with Oxford Instruments offers the potential for the development of portable scanners for product verification.
Our third case study involves classifying electric devices from smart meter data. Domestic energy consumption, such as heating, lighting and appliance use, currently accounts for 25% of the United Kingdom's greenhouse gas emissions. The government has committed to an 80% reduction of CO2 emissions by 2050 and, to meet this target, is requiring the installation of smart energy meters in every household to promote energy saving. The primary output of this investment of billions of pounds in technology will be enormous quantities of data relating to electricity usage. Understanding and intelligently using this data will be crucial if we are to meet the emissions target. We will focus on one part of the analysis: determining whether we can automatically classify the nature of the device(s) consuming electricity at any point in time. This is a necessary first step in better understanding household practices, which is essential for reducing usage.
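As a rough illustration of what a time series classification task such as device identification looks like in code, the sketch below labels a new series of power readings with the label of its nearest training series under Euclidean distance. This is a simple nearest-neighbour baseline, not COTE; the function names, device labels and readings are all made up for the example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    if len(a) != len(b):
        raise ValueError("series must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_1nn(train, query):
    """Return the label of the training series closest to `query`.

    `train` is a list of (series, label) pairs.
    """
    _, best_label = min(train, key=lambda pair: euclidean(pair[0], query))
    return best_label


# Hypothetical power readings (watts) over six time steps; illustrative only.
train = [
    ([0, 2000, 2000, 2000, 0, 0], "kettle"),
    ([100, 100, 100, 100, 100, 100], "fridge"),
    ([0, 500, 1500, 500, 1500, 0], "washing machine"),
]

query = [0, 1900, 2100, 1950, 50, 0]
print(classify_1nn(train, query))  # prints "kettle"
```

Real smart meter data would be far longer, noisier and often a mixture of several devices, which is part of why more sophisticated classifiers are needed.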
More Information
Potential Impact:
We have chosen our case studies to demonstrate the breadth of domains in which time series classification arises and we hope these will act as a catalyst for other biological, food and climate scientists to work with us and/or our code. The investigators on this project have a strong track record of working with industry, and we aim to exploit our research to have a direct impact.
The work with the Institute of Food Research has perhaps the greatest potential for immediate impact on society and the economy. The horsemeat scandal shook public confidence in the sector, and the complexity of the international market for meat makes it hard to guard against further occurrences. Devices like Oxford Instruments' Pulsar offer a cost-effective mechanism for screening against contamination. If we can find a better algorithm for classification, there is a simple and direct path to implementation within Pulsar. Forged alcohol is commonplace, and cases vary from simple economic crimes through to fraud with serious health implications. In recent years there has been an increasing number of serious injuries and even deaths from the consumption of poor-quality, illegally produced spirits. The development of sensor technology to detect this type of fraud would thus have great societal value, and the collaboration with Oxford Instruments offers the potential for the development of commercial hardware to facilitate the use of the algorithms our research produces. Improving Pulsar and developing a new product will both have a positive economic and societal impact. Devices like Pulsar also help public engagement with science, as demonstrated by its appearance on the BBC1 programme Rip Off Britain http://youtu.be/t8zWLat8NQ0.
The collaborative research with Imperial is part of the important drive to understand the genetic components of obesity. Model species are useful in this respect, as it is possible to directly connect behaviour to genetics in a reproducible way. Hence, if we can automatically detect worms that are exhibiting aberrant behaviour, we can then determine what mutations caused it. Conversely, we can cause mutations in the worm and then observe its behaviour. Both of these tasks currently require laborious, manual identification of mutants. This project will not be involved in performing the experiments. We will instead help find the best ways of automating this time-consuming task.
Smart meters will soon be in all of our homes collecting detailed data on our electricity usage. This massive investment in technology must yield a significant reduction in our carbon footprint to justify the cost. The key to altering patterns of consumer behaviour is providing useful and relevant information. This in turn requires the ability to extract knowledge from the raw data. We will concentrate on the problem of identifying the nature of devices being used in a household. This offers the potential for constructing more complex models of behaviour based on combined device usage which in turn may lead to more informative advice on how to modify behaviour.
University of East Anglia | LEAD_ORG |
University of Rennes 1 | COLLAB_ORG |
University of California Riverside | COLLAB_ORG |
Alan Turing Institute | COLLAB_ORG |
Medical Research Council | COLLAB_ORG |
Scotch Whisky Research Institute | COLLAB_ORG |
Loughborough University | PP_ORG |
Oxford Instruments plc | PP_ORG |
Vermont Energy Investment Corporation | PP_ORG |
The Whisky Tasting Club | PP_ORG |
University of Bath | PP_ORG |
MRC London Institute of Medical Sciences | PP_ORG |
University of California Riverside | PP_ORG |
Anthony Bagnall | PI_PER |
Jason Lines | COI_PER |
Stephen Cox | COI_PER |
Subjects by relevance
- Consumer behaviour
- Time series
Extracted key phrases
- Time series classification problem
- Time series classification
- Transform Ensembles
- Problem domain
- Collective
- Second problem
- Smart meter data
- C. elegans mutant type
- Meat type
- Device usage
- Electricity usage
- Bone radiograph data
- Data set
- EEG data
- Raw data