Introducing our fourth student cohort

Hello! We are the fourth cohort of the CDT Data Analytics & Society programme. Navigating our first semester on this integrated masters and PhD course has been a very different experience, but we have adapted successfully to online teaching. We have found it challenging to connect as a cohort, considering many of us have never even met in person!

Our first module was Programming for Social Scientists with The University of Leeds. This two-week intensive module was taught by Andy Turner, who brought our group together and equipped us with the Python programming skills that our PhD projects will require. Even though the course tested our limits, we can all agree we gained valuable skills, including an introduction to agent-based modelling, creating our own website and learning how to use GitHub. We then began our social research modules, taught by our home universities – Leeds, Sheffield, Liverpool or Manchester. Since our undergraduate backgrounds vary from Mathematics to Psychology, these modules introduced us to new ways of thinking, preparing us for undertaking our own research in the coming years.

We are excited to continue our journey of learning with the upcoming Data Science Studio module delivered by Dr. Daniel Arribas-Bel at The University of Liverpool. Although, sadly, we cannot live and learn in Liverpool as many of us were expecting, we are still looking forward to supporting each other online.

We would also like to give a huge thank you to all the team members who have supported us and helped us to integrate into the Data Analytics & Society programme. We are all looking forward to starting semester 2 in January!

Article by Cameron Ward (University of Liverpool) & Shivani Sickotra (University of Sheffield)

 

PhDing in a pandemic: a guide on surviving

There is no escaping the fact that 2020 has been an unprecedented year. One way or another we have all been affected, some of us more than others. It is important for us to recognise that things are not business as usual and to try to adjust working arrangements to fit into the ‘new normal’.

PhDing has always been known to be a lonely venture. For us in the Data Analytics and Society CDT, however, it came with the advantage of having a cohort of people going through a similar journey. This alleviates some of the loneliness that can come with working on a solo project. Some of the advantages of being in a cohort – grabbing coffee breaks together, asking for help with a coding error, or just breathing the same air – have been lost with the pandemic and all of us working remotely: some on ironing boards, others on dining tables and a few with a dedicated home office. Regardless of our working conditions, there are two key changes that we have faced: we are mostly doing our reading, writing and analysis in a confined space, and many of us are now doing this alone (possibly surrounded by family, but this still does not equate to being surrounded by colleagues).

Figure 1: PhDing in a Pandemic – powered by candles

With this in mind, I asked fellow CDT colleagues to share some of the things they found made working from home easier, and thanks to all of those who responded, I have definitely picked up a few tips. I hope you find them as helpful as I did.

  • Create a suitable working pattern – one of us says they are maintaining a proper working week, working 9-5 with no work in the evenings or over the weekend. I think this is very useful, as routine and consistency are a sure way of getting things done. For me, a 9-5 workday isn’t possible, so I typically work from 10pm until about midnight, when my kids are all in bed and I have few to no distractions.
  • Adopt the Pomodoro technique – this has been found to increase productivity. In summary, you work in four 25-minute slots separated by 5-minute breaks. After the fourth pomodoro, take a longer 20-minute break and then start the cycle again. Another colleague recommended Kanban Flow, an app with an inbuilt pomodoro clock and the ability to itemise and track a to-do list. With this, you can track which tasks have been completed (this gives a feeling of accomplishment) and document interruptions (this may give you insights that help you identify a better working pattern). I have just started using Kanban Flow myself (see the screenshot below) and it has been very helpful in keeping me on track.

Figure 2: Kanban Flow Task list and Pomodoro Record

Pro tip – shut all socials down during work sessions (there are apps that help with this too).

  • If you can, get a proper workspace which includes a desk, chair, monitor and/or laptop stand, keyboard, mouse, back support, basically the whole nine yards if possible. AND KEEP IT OUT OF YOUR BEDROOM (if you can).
  • Change your work environment. This might mean working outside instead of at your dining table, or using an ironing board in another room. With universities gradually opening up and some letting us book days to work in offices and libraries, these could act as a good second location. Otherwise, you might want to consider visiting a local café.
  • Don’t feel guilty when you have less productive days, or you engage in things you love that are outside your PhD. Make it a habit to take weeklong breaks – we are entitled to annual leave.
  • Join a shut up and write session with other PhDers – I want to try this; I probably won’t shut up though😜
  • Identify things (PhD and non-PhD related) to look forward to, both short term – end of the day – and long term – a month or more.
  • Another common tip is to exercise. This helps to break up your day and keeps you re-energised; it can be Zumba, yoga, walking or running. One of us said, “Going for a run at lunchtime to distract myself has broken many a (code-related, writing) wall”. It works!
  • Create some dope playlists to keep yourself company.
  • Be kind to yourself!

 

I hope some of these make PhDing easier for you.

Article by Noelyn Onah 

Project available with Global Law firm Taylor Wessing

We have a project available at the University of Liverpool working with a global law firm looking at using data science in the area of legal decision making.

The deadline for applications is the 21st August and full details can be found here – https://datacdt.org/projects/legal-decision-making/

Please note that this opportunity is only available to UK and EU applicants.

For details on how to apply please see here – https://datacdt.org/entry-criteria-applying/

If you have already submitted an application to the Centre, please email datacdt@leeds.ac.uk to express interest in this project; please do not submit a new application.

New project advertised at Leeds with Procter & Gamble – deadline 21st August

A further PhD opportunity has been made available at Leeds working with one of the largest consumer goods companies worldwide, Procter & Gamble. The deadline for applications is the 21st August and full details can be found here – https://datacdt.org/projects/rasch-theory-and-application-to-consumer-goods/

Please note that this opportunity is only available to UK and EU applicants.

For details on how to apply please see here – https://datacdt.org/entry-criteria-applying/

If you have already submitted an application to the Centre, please email datacdt@leeds.ac.uk to express interest in this project; please do not submit a new application.

Contribution to the UK2070 final report documenting the extent of spatial inequalities & proposing actionable strategies

On the 27th February, the UK2070 Commission launched its final report documenting the extent of spatial inequalities & proposing actionable strategies. One of our Liverpool students, Nikos Patias, contributed to this report along with his supervisors, Francisco Rowe and Dani Arribas-Bel. The policy brief can be found here – http://uk2070.org.uk/wp-content/uploads/2020/02/07-Neighbourhood-Inequality.pdf

 

 

Partner engagement – working with Improbable

One of the first things I learnt on this PhD programme was that I was in a partnership – not only with my supervisors and the university, but also with my sponsor. In the first year of our programme, we all undertook an internship module with the objective of working alongside our sponsors on a small project. And so, for two weeks I had the opportunity to work with Improbable, out of their London office.

Before I continue, I should properly introduce Improbable to those who may not know who or what they are. Improbable is a British multinational technology company that was founded in 2012. They created SpatialOS, a platform on which you can perform large-scale simulations and create virtual worlds and environments, for uses including, but not limited to, video games and corporate simulations. They have a history of partnerships with Google, SoftBank and Epic Games.

And so, learning all this about them and what they do, I knew I had to maximise my experience with them by gaining as much knowledge and skill from them as possible.

Whilst with them in London, I got to learn about the projects in development, their areas of interest and how I fit into the grand scheme. Another thing I became conscious of was that, when having discussions about research, regardless of who with, transparency and respect are integral to the betterment of the relationship. In the context of my relationship with Improbable, I was glad to have a clear, open and professional channel of communication with them throughout the internship and afterwards.

A few months later, following a successful two weeks with Improbable, they sent Katie, whose official job title is Applied Scientist, to work with us out of LIDA for two weeks. At the start of her two weeks here, we introduced her to the other PhD students and to some of the research going on within our CDT. Then, over the following days, she worked closely on a project with my supervisors, which she was kind enough to summarise as follows:

“I was investigating dynamic fidelity – which can be described as the ability to switch between an ABM and a higher-level, less computationally expensive model – in simulations of civilian movement around a city. The high-level model would be learnt from simulating using the ABM for an initial period before switching to the high-level model. The simulation would continue to run using the high-level model until there was some material change in the environment, causing civilian patterns of movement to significantly change in some way and resulting in a switch back to the ABM.”
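
For readers who like to see an idea in code, the switching logic Katie describes can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions, not Katie's or Improbable's implementation: the one-dimensional "city", the function names (`abm_step`, `fit_high_level`, `simulate`), the fixed warm-up length and the fact that the moment of environmental change is known in advance are all hypothetical choices made to keep the example small.

```python
import random


def abm_step(positions, drift, rng):
    """Toy ABM step: each agent moves by the environment's drift plus its own noise."""
    return [p + drift + rng.uniform(-1.0, 1.0) for p in positions]


def fit_high_level(mean_history):
    """Learn the cheap model: the average per-step change in the crowd's mean position."""
    changes = [b - a for a, b in zip(mean_history, mean_history[1:])]
    return sum(changes) / len(changes)


def simulate(total_steps=30, warmup=10, change_at=20, n_agents=50, seed=0):
    rng = random.Random(seed)
    positions = [0.0] * n_agents
    mean_history = [0.0]  # aggregate state tracked by both models
    mode, learnt_drift, modes = "abm", None, []

    for step in range(total_steps):
        # Hypothetical environment: movement reverses direction at `change_at`.
        drift = 1.0 if step < change_at else -1.0

        if mode == "high" and step == change_at:
            # Material change in the environment: switch back to the ABM,
            # restarting every agent from the current aggregate position.
            mode = "abm"
            positions = [mean_history[-1]] * n_agents

        if mode == "abm":
            modes.append("abm")
            positions = abm_step(positions, drift, rng)
            mean_history.append(sum(positions) / n_agents)
            if step + 1 == warmup:
                # Warm-up finished: learn the high-level model and switch to it.
                learnt_drift = fit_high_level(mean_history)
                mode = "high"
        else:
            modes.append("high")
            # High-level model: advance the aggregate only, no individual agents.
            mean_history.append(mean_history[-1] + learnt_drift)

    return modes, mean_history, learnt_drift


if __name__ == "__main__":
    modes, means, learnt = simulate()
    print(modes)
```

With the default arguments, the ABM runs for steps 0 to 9, the learnt high-level model takes over for steps 10 to 19, and the ABM resumes from step 20 onwards. In a real simulation, detecting the "material change" would be a problem in its own right rather than a known step number.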

There are quite a few PhD students, including myself, who are using agent-based models (ABMs) within our projects, so what was helpful about Katie’s work was that it broadened my understanding of ABMs as a tool.

Reflecting on my first year on this programme, some of my most interesting experiences have occurred whilst collaborating with my sponsor. Though working in partnerships and in teams can be challenging at times, from brainstorming ideas to intense debates, the knowledge and the friendships gained are invaluable. Over the next academic year, Improbable are happy to give me another opportunity to intern with them, this time for much longer than two weeks. I am truly excited to see what we get up to next.

 

This article was written by Deborah Olukan, one of our second year students based in Leeds. You can view her full profile here.

Centre for Data Analytics and Society welcomes its third cohort of students

Hi! We are the newest University of Liverpool based cohort, part of the third year intake for the CDT in Data Analytics and Society. In our group we have James, Cillian, Hope and Peter. James is working with Marlan Maritime Technologies to design resilient coastal cities using big data. Cillian is working with Ordnance Survey on improving the Geolocation of emergency service response using big data. Hope is working with the Liverpool City Region researching human dynamics within an urban and regional context. Peter works with Data Fusion Ltd looking at the Geodemographics of British streets. We are part of the Liverpool University Geographic Data Science Lab: a research and teaching centre focussing on making sense of the world by using spatial data to make intelligent decisions.

We have recently hosted our fellow Year 1 students from Leeds, Manchester and Sheffield for a short course on using Python for data wrangling and machine learning, taught by Dr Dani Arribas-Bel from the GDSL. The course was designed as an introduction to data cleansing, manipulation and visualisation through the use of the Python programming language, as well as both supervised and unsupervised machine learning techniques that will no doubt form a fundamental part of our research. Following this week, we have been sent away with an assignment to choose our own datasets and conduct some in-depth data analysis using the techniques we have learned from the course.

Over the last couple of months we have been studying for the Masters component of our CDT programme, as well as developing the ideas and thesis structures of our CDT projects with data partners and supervisors to stand us in good stead for the coming years. We are currently gearing up for a two-week internship with our data partners which will provide us with invaluable industry experience, as well as the opportunity to work with data that we will likely be working on for the next couple of years.

The Liverpool course was a great way to start the New Year and we are looking forward to the Manchester and Sheffield courses coming up in the next couple of months!

Turing data study group – April 2018

Five months into our PhDs, we (Keiran and Noelyn) applied and were accepted to attend the April Data Study Group at the Alan Turing Institute in London. Data Study Groups are intensive five-day collaborative hackathons, where data scientists of all levels are brought together to solve interesting real-world data problems submitted by Challenge Owners. Challenge Owners typically come from diverse backgrounds, e.g. industry, government, academia and the third sector, providing participants with the opportunity to work on a wide range of problems that they wouldn’t encounter in their day-to-day work. The event takes place at the Alan Turing Institute, located in the iconic British Library in London.

 

Unlike more traditional application processes that focus on CVs and cover letters, the application process for the Data Study Group focuses more on participants showing off their technical skills (as well as their ability to collaborate and communicate) by sharing a portfolio of work that illustrates their strengths. Dr Kirstie Whitaker, a Turing Research Fellow, shared her thoughts (https://www.turing.ac.uk/blog/how-write-great-data-study-group-application) on what she looks out for when assessing an application. Noelyn shared a Google Drive link containing her MSc footballer value prediction script and report, and a script for a time series prediction model, both written in Python. Without prior work experience as a data scientist, her application highlighted her recently gained coding skills and zeal to apply them, her ability to be a team player, and her desire to learn during the process. Keiran, on the other hand, sought to demonstrate his coding and teamwork skills by drawing on the experience of working in industry as part of a development team.

 

Once accepted, Noelyn’s greatest hindrance to attending was childcare provision; however, the organisers were very accommodating, suggesting she bring her kids along and offering to provide accommodation that would fit. Although she ultimately made other arrangements, this alone cemented her desire to be there and highlighted their agenda of inclusivity.

 

The five days

Keiran was fortunate enough to be provided with accommodation in university halls just five minutes’ walk from the Turing Institute, making for an easy commute to the venue. This was particularly valuable given that the programme really is what it says on the tin (‘intensive five-day collaborative hackathon’), starting at 9am on the Monday (most participants arrived on the Sunday) and finishing at 4pm on the Friday. In between, participants work up until 9pm, sometimes 10pm. This is made much more tolerable by the breakfasts, lunches and dinners provided, as well as an array of snacks, iPad-powered coffee and a fridge full of fizzy drinks.

 

The first day included registration, a briefing from the organisers, an introduction to the challenges by their owners, an icebreaker and group assignment, with group work beginning after lunch. Starting group work on the first day gives participants an opportunity to meet other group members and scope working solutions. It is also an important opportunity to rethink group membership (your suitability), which is what Noelyn did: after speaking with the organisers, she joined another group the next day.

 

The second, third and fourth days threw us straight in at the deep end. The end result is not meant to be a fully functioning solution; instead, it is a collation of several ways to tackle the problem, which the company can take forward and improve on. This meant that we, the participants, were not restricted and could use our own expertise while working with team members to ensure that typical data exploration and pre-processing steps were undertaken. To ensure cohesive working and avoid duplication of work, each team had a facilitator who acted as the ‘project manager’. Here (https://www.turing.ac.uk/blog/data-study-group-researchers-perspective), Chanuki Seresinhe, a visiting researcher, talks about her role as a facilitator.

 

We ended up in the same group, working on a large dataset of training and user records provided by eGym (a company that develops and manufactures advanced products for the fitness market) along with other researchers from a range of backgrounds as well as the project owner. Given the nature of the Data Study Group, we were allowed free rein over the direction in which we took our investigation. This culminated in members of the team splitting off into smaller groups to work on subproblems. The two of us ended up working together, focusing on clustering and segmenting gym users based on their characteristics. This work could then be used to specialise later modelling processes, which aimed to estimate the performance of gym-goers based on their information and previous performances. Working collaboratively on this project was made possible by the Turing Institute’s cloud virtual machine system, Slack for communicating within and across teams, Overleaf for report writing and Git for the code repository.

 

Although the days may have been long, time was made for socialising in the evening, with a trip to the Namco Funscape arcade allowing the groups to bond as teams.

 

Whilst each of us became progressively more fixated on our respective corners of the group project, regular catch-up sessions were organised throughout each of the days by our facilitator ensuring that we were all aware of each other’s work and how it might relate to our own, and keeping spirits up when things got tough. Beyond this, he ensured that we each documented our contributions such that by the end of Thursday, we had a cohesive report and presentation which we proudly presented to the other participants, challenge owners and academics on Friday morning.

 

Final presentations were followed by lunch and a well-earned trip to the pub, where we were free to let our hair down and pat ourselves on the back after a frantic (but fun) week of work.

Kia Ora from New Zealand: CDAS students present their work at the International Medical Geography Symposium, Queenstown

LIDA and GeoHealth Lab researchers at the University of Canterbury, Christchurch

 

This summer, CDAS students Francesca Pontin and Vicki Jenneson from the Leeds Institute for Data Analytics (LIDA) took their research to the other side of the world at the International Medical Geography Symposium (IMGS) in Queenstown, New Zealand (1–4 July). Here they summarise their experiences.

“We feel so privileged to have been given the opportunity to present our work as part of a diverse conference programme, which brought together geography, epidemiology and policy. It was an added bonus to have the opportunity to explore New Zealand’s natural beauty, before, during and after the conference.”

Vicki’s journey started in Auckland with a trip to the University of Auckland to meet Professor Cliona Ni Mhurchu, a key figure in healthy retail interventions. This new connection could lead to exciting future collaboration prospects and add value to the existing relationship with Vicki’s UK retail data partner.

Fran and Vicki then met for two days of workshops at the University of Canterbury in Christchurch. They were joined by fellow Leeds students Charlotte Sturley and Rachel Oldroyd as well as supervisor Michelle Morris and LIDA and CDAS director, Mark Birkin.

“It was a pleasure to meet contacts from New Zealand’s Ministry of Health and the Canterbury GeoHealth Lab and to learn about their unique collaborative model. Along with Rachel and Charlotte, we’ll continue to work closely with the New Zealand team to write an upcoming commentary paper about the health research-policy landscape in New Zealand. We hope that this relationship will continue to grow in the future and look towards the potential for overseas exchanges between LIDA and the GeoHealth Lab students and staff.”

After the workshop in Christchurch, it was on to Queenstown for the IMGS conference, but not before an impressive pre-flight Parkrun effort by the team.

Charlotte, Michelle, Rachel and Fran endured -5 degrees at the Hagley Park Parkrun in Christchurch

The Queenstown conference took place in a spectacular setting framed by mountains and Lake Wakatipu. Although it was a daunting prospect for some of the students, for whom it was their first international conference, they were soon encouraged by its friendly and supportive atmosphere. The students embraced the unique opportunity to engage with a diverse global community of multi-disciplinary researchers across health, geography, policy and more. The conference provided a great platform for them to develop their networking skills and, along with their supervisors, Fran and Vicki fostered new connections and cemented existing ones with researchers both in the UK and worldwide.

The IMGS celebrated cultural diversity, with traditional Māori singing and dancing, while speakers addressed the serious issues of health inequalities affecting indigenous Māori and Pacific populations in New Zealand, providing great context to local problems.

View of Queenstown from the conference venue at dusk

Fran enjoyed presenting her work on the use of smartphone data for monitoring physical activity, and said “It was great to be able to present my research to such a specialist and knowledgeable audience. The ensuing conversation around using commercial smartphone data to monitor activity highlighted the potential such data provides in extending the current sphere of knowledge. IMGS has allowed me to make great connections in the UK and further abroad, with potential collaborations on the horizon.”

Fran Pontin presenting at the IMGS, Queenstown

 

Of her talk about spatial and demographic patterns in fruit and vegetable purchasing in Leeds, Vicki said: “It was a really encouraging experience to see the level of discussion and interest that my talk generated. It motivates me to know that I’m doing something truly new and valuable to the wider research community. The dataset that I’m working with is novel and there was lots of excitement about it; I feel that presenting at IMGS helped to put myself and LIDA on the international scene for healthy food retail and big data research.”

Vicki Jenneson presenting at the IMGS, Queenstown

The students also found plenty of downtime to explore the breath-taking surroundings. Day trips were invaluable team-building opportunities which strengthened relationships between students and their supervisors in informal settings, including boat cruises, a very muddy bike ride, a winery tour, climbing mountains, skiing and the conference dinner!

Mark Birkin, Fran Pontin, Michelle Morris and Charlotte Sturley pre-muddy cycle ride

Fran Pontin and Michelle Morris on the summit of Ben Lomond

Students Charlotte Sturley, Rachel Oldroyd, Vicki Jenneson & Fran Pontin with supervisor Michelle Morris

Scenic walk around Lake Wakatipu for LIDA students and staff

The students would like to thank the Leeds for Life Conference Award scheme, the Centre for Spatial Analysis and Policy at the School of Geography and their supervisors for their funding and support which enabled them to embark on this exciting experience.

The next IMGS meeting will take place in Edinburgh in 2021. Both Fran and Vicki hope to return to this meeting to present further findings as they approach the end of their PhD research projects. The IMGS conference is highly recommended to PhD students with a focus on epidemiology and spatial applications to health research.

 

Introducing Dr Henri Kauhanen, Postdoctoral Fellow

Dr Henri Kauhanen is ESRC Postdoctoral Fellow with the Data Analytics & Society CDT from October 2018 to September 2019. Affiliated with the division of Linguistics and English Language at the University of Manchester, Henri works on mathematical and computational models of the population dynamics of language, looking for explanations of universal factors of linguistic variation and change that recur from one language to another. Originally trained as a cognitive scientist, Henri received his PhD in linguistics from the University of Manchester in April 2018, supervised by linguists Ricardo Bermúdez-Otero and George Walkden (now at the University of Konstanz) and theoretical physicist Tobias Galla.

 

For his one-year ESRC Postdoctoral Fellowship, Henri is taking a data-driven approach, concentrating on a focal topic in language dynamics but one that has so far received surprisingly little attention. This is the question of what basic rates different features of human language evolve at, and what is to be made of cases where several changes are governed by identical rates of change. To this end, he is writing computer software that aids in fitting the predictions of different models of linguistic change to empirical data, as well as conducting Monte Carlo power analyses of existing methods for model selection, utilising Manchester’s HTCondor high-throughput computing framework. The software will be released as open source packages for the R statistical computing environment by the end of the fellowship.

 

In addition to work on implementing computer code and writing research articles, the project includes a significant networking and skills development component. Henri attended the 2019 Complex Systems Summer School run by the Santa Fe Institute [link: http://www.santafe.edu] in Santa Fe, New Mexico in June and July, attending lectures on complexity, chaos, networks, nonlinear dynamics and related topics, but also working on group projects with physicists, biologists and social scientists. In August, Henri is organising a symposium on language change at Manchester [link: http://rusesymposium.org.uk]; alongside regular talks, the two-day long event will feature three keynotes by eminent scholars in the field, focusing on resolving some of the often considerable tension between computational modelling and empirical, data-oriented work.

 

In the future, Henri is planning to continue working on computational models of language change, aiming in particular to increase the realism of currently available models. In October, he is moving to Germany to take up a second postdoc at the Zukunftskolleg [link: http://www.uni-konstanz.de/zukunftskolleg], an Institute for Advanced Study for Junior Researchers at the University of Konstanz.

 

To find out more about Henri’s research, visit his website at http://henr.in.