Privacy of Unstructured Data in a Public Health Authority
Maintaining the privacy of patient data through anonymisation of unstructured data
Butterfly Data’s client, a public health authority, collects and stores large amounts of patient data, both structured and unstructured. When used in research and analytics, this data provides valuable insights that support evidence-based decisions and improve healthcare efficiency. However, protecting patient privacy remains a challenge.
The Butterfly Challenge
The health authority faced challenges in using unstructured data (e.g., free-form text fields) for analysis because of privacy concerns. It therefore needed a way to assess the considerations and issues involved in de-identifying this data while retaining its analytical value.
They contracted Butterfly Data to help them understand the basic principles and the next steps for their Analytics Unit in developing a robust, holistic approach to maintaining the privacy of patient data.
Developing Best Practices for Anonymising Data
To address the challenge, Butterfly Data conducted research and analysis through interviews, written insights, and a workshop with experts in anonymisation, de-anonymisation, and health data.
This provided a collective perspective on current best practices, from which a set of principles and guidelines was developed.
These principles focused on the technical aspects of anonymising unstructured data and included considerations for potential technical attacks.
The project also addressed consent and data protection issues, but only where they could be articulated in technical terms.
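To make the idea of de-identifying free text concrete, the following is a minimal sketch of pattern-based redaction. It is not the approach described in the project report. The patterns, the labels, the `redact` function, and the sample note are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical patterns for illustration only; any real tool would need
# patterns (or models) validated against the authority's own data.
PATTERNS = [
    ("EMAIL", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
    ("DATE", re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")),
    ("ID", re.compile(r"\b\d{8,}\b")),  # any run of 8 or more digits
]


def redact(text: str) -> str:
    """Replace every match of each pattern with a bracketed label."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text


# Made-up free-text note, not real patient data.
note = "Seen on 03/02/2021, ref 1234567890, contact jane.doe@example.org."
print(redact(note))
```

Simple patterns like these leave names and other contextual details untouched, so text such as "Mr Smith called" passes through unchanged. Residual identifiers of this kind are one reason a sketch like this falls well short of a robust anonymisation process.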
The project delivered a report giving guidance on the process, principles, and qualities that could serve as minimum requirements for any tool or process used to anonymise unstructured data.
This report laid the foundation for understanding privacy requirements in free-text data and informed the development of future tools and processes to manage privacy risks effectively.