Machine learning and data analytics for nurturing talent and performance profiling in cricket

Project reference: MN76

Application deadline: 10th April 2023

Cricket is widely considered the world’s second most popular sport. The abundance of available cricket data and the development of AI technologies have created a massive demand for cricket data analytics, and applications of AI in cricket have increased dramatically over the last two decades.

This PhD project is about developing and validating customized machine learning and data analytics methods for mining cricket-related data to support decision-making in talent nurturing and performance profiling.

The project will be carried out in collaboration with the England and Wales Cricket Board (ECB), the national governing body of cricket in England and Wales. The ECB will provide access to several years of detailed national and international cricket match data, longitudinal performance data on cricket players, and scouting data. Because the volume of available data is very large, a further question for the successful student to answer is which of the many datasets to use, from which periods, and at what granularity.

Detailed project description:

This project will focus on two main questions/research gaps:

1. What are the driving factors for the development of England’s cricket players?

Addressing this question will involve using statistical analysis and machine learning (e.g. feature importance, dimensionality reduction, clustering) to explore the demographic and developmental sporting histories of players in different roles (e.g. pace bowlers, batters and wicket keepers) and at differing expertise levels, in order to assess the impact of these factors on the development of England’s players. An improved, data-backed understanding of the driving factors will be useful for nurturing talent and customizing training regimes.

2. What data and performance metrics can be used to predict success for different formats of cricket (e.g. Test, ODI, T20I)?

Addressing this question will involve using feature engineering and prediction methods to identify performance metrics in the First Class (county) game that predict success at International level (Test, One Day International and T20I formats) for English players, with the potential to extend the analysis to overseas players. Ultimately, one may want to influence the driving factors for development (Q1) in light of the performance metrics identified in Q2, so as to provide cricket players with a personalized career path.
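
As a purely illustrative sketch of the kind of feature engineering this question involves, the following Python snippet derives a few standard batting metrics from per-innings records. The `Innings` record, the `batting_features` function, the choice of features and the sample values are all hypothetical and are not drawn from ECB data. Batting average (runs per dismissal) and strike rate (runs per 100 balls faced) follow their usual cricket definitions.

```python
from dataclasses import dataclass


@dataclass
class Innings:
    """One batting innings (hypothetical record layout, for illustration only)."""
    runs: int
    balls: int
    dismissed: bool


def batting_features(innings):
    """Aggregate a list of innings into a small set of candidate features."""
    total_runs = sum(i.runs for i in innings)
    total_balls = sum(i.balls for i in innings)
    dismissals = sum(1 for i in innings if i.dismissed)
    n = len(innings)
    return {
        "innings": n,
        "runs": total_runs,
        # Batting average: runs per dismissal (undefined if never dismissed).
        "average": total_runs / dismissals if dismissals else None,
        # Strike rate: runs scored per 100 balls faced.
        "strike_rate": 100 * total_runs / total_balls if total_balls else None,
        # Share of innings with a score of 50 or more (an assumed, illustrative feature).
        "fifty_plus_rate": sum(1 for i in innings if i.runs >= 50) / n if n else None,
    }


# Made-up sample data, not real match records.
sample = [
    Innings(runs=45, balls=60, dismissed=True),
    Innings(runs=102, balls=150, dismissed=False),
    Innings(runs=3, balls=10, dismissed=True),
    Innings(runs=50, balls=80, dismissed=True),
]
features = batting_features(sample)
print(features)
```

Features of this kind, computed per player and per season, could then serve as inputs to a prediction model. Which metrics are actually informative for each format is part of what the project would investigate.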

The successful student will be able to develop the research programme within the scope of these two broad topic areas.

The ECB oversees all levels of cricket in England and Wales, including the national teams: England Men (Test, One Day International and T20I), England Women, England Lions (Men’s second tier), Physical Disability, Learning Disability, Visually Impaired, and Deaf.
