Unstructured data from health forums is an ‘untapped’ resource for policymakers
Pilot study suggests qualitative data from online forums could improve healthcare professionals’ understanding of patient needs
The wealth of unstructured, qualitative data available in online forums is an untapped resource that healthcare professionals and policymakers should use to their advantage, a report has said.
The think tank Demos and health charity The King’s Fund have published a joint report on how algorithms could be used to help analyse posts about mental health made on publicly available websites or forums.
Data from these forums offers a different perspective on health, the report said, as it gives an insight into the lived experience of those with mental health problems, as well as the advice and support given by their peers and assessments of interactions with health services.
The report said that the data could be analysed at scale using natural language processing algorithms, an approach that involves training computers to better process and manipulate human languages.
The pilot study used a modified web scraper on more than 1 million posts made between June 2004 and May 2016 on six online forums. The data was pseudonymised and then used to train natural language processing algorithms to understand how people discuss mental health online.
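The report's own pseudonymisation method is not described here. As a rough illustration only, the sketch below shows one common approach: replacing each author name with a keyed hash so that posts by the same person stay linked without exposing the username. The key, the `pseudonymise` function and the example posts are all invented for illustration.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would be stored separately from the data.
SECRET_KEY = b"replace-with-a-real-secret"


def pseudonymise(username: str, key: bytes = SECRET_KEY) -> str:
    """Map a username to a stable pseudonym using a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(key, username.encode("utf-8"), hashlib.sha256).hexdigest()
    return "user_" + digest[:12]


# Invented example posts, not taken from the study.
posts = [
    {"author": "alice_82", "text": "Has anyone tried CBT?"},
    {"author": "bob", "text": "I need some advice."},
    {"author": "alice_82", "text": "Thanks, that helped."},
]

# Replace only the author field; the post text is left untouched.
pseudonymised_posts = [
    {"author": pseudonymise(p["author"]), "text": p["text"]} for p in posts
]
```

A sketch like this only replaces the author field; any names or other identifying details written in the post text itself would remain.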
The software was tested on three questions, asking whether it could accurately identify: cries for help, where people wanted guidance from other users; discussions about cognitive behavioural therapy; and cases of co-morbidity, where a mental health problem coincided with long-term physical conditions.
The report said that the software had accuracy rates of around 65% both for identifying cries for help and for identifying posts about CBT, rising to 72% for identifying posts where the person had had CBT. The team also claimed 98% accuracy for the 50 posts they assessed for co-morbidity.
According to the authors, there is huge potential for analysis of publicly available data to be used to inform policymaking, for instance by offering health regulators more insight into the performance of providers and giving service providers themselves a better understanding of their users.
Josh Smith, a co-author on the report and researcher at Demos, said the study “highlights the potential for new technology and methodologies to provide a whole new perspective on mental health”.
However, the report also acknowledged the “significant technical, methodological and ethical challenges still to overcome”, including concerns that free text entered in online forums might include identifiable data, which would make it difficult to fully anonymise data.
The report stressed that the approach “is not and never will be a silver bullet”, saying that the data should only be seen as a complementary source of information.
The work received ethical approval from the University of Sussex Ethics Review Panel, but was not considered a clinical study as it did not recruit patients from the NHS and didn’t gather clinical data or make interventions that would affect anyone’s care.
The Department of Health recently revealed that it was looking into the use of algorithms to make sense of unstructured data, with digital strategy manager Laurence Erikson saying that it was using machine learning to help analyse responses to public consultations.
“The findings so far are intriguing,” Erikson said in a blogpost about the work, published in January. “We found that the machine learning approach reinforced some of the findings of the manual approach, but also identified new insights from the consultation responses.”
In February, MPs announced plans to investigate the use of algorithms in decision-making, to look at whether, and how, they can be used in a transparent or accountable way.