The quality of your data directly determines the quality of your decisions. Discover the framework that separates trusted data from noise.
The challenge
Every organisation holds vast amounts of data, used to draw insights and guide decisions. Yet the value of those insights is only as strong as the quality of the data behind them.
The key challenge is identifying which data is essential to your operations — understanding the decisions that need to be made and the data required to support them, then defining minimum quality standards.
Poor data inevitably leads to poor decision-making. But assessing data quality is itself valuable: it reveals where gaps exist, which formats work best, and what users truly need.
Butterfly Data · What Does Good Data Look Like? (2025)
The DAMA Framework
DAMA — the UK government's recommended best practice — defines six dimensions for assessing data quality, together providing a comprehensive picture of what good data looks like.
Completeness
No fields are illegitimately missing. Missing data can appear as empty fields or placeholders like null, N/A, or 0. Some blanks are legitimate — a conditionally mandatory field only needs a value when another field triggers it.
If 'UK Born?' is 'N', then 'Country of Birth' must be populated. If 'UK Born?' is 'Y', a blank country field is legitimate — not a quality failure.
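A conditionally mandatory rule like this can be sketched as a simple check. This is a minimal illustration: the field names (`uk_born`, `country_of_birth`) and the set of placeholder strings are assumptions, not part of any standard.

```python
# Placeholder strings that commonly stand in for missing data (illustrative set).
MISSING = {"", "null", "n/a", "na", "none"}

def is_missing(value):
    """True if a field is empty or holds a missing-data placeholder."""
    return value is None or str(value).strip().lower() in MISSING

def check_completeness(record):
    """Return completeness failures for one record.

    'country_of_birth' is conditionally mandatory: it only needs a value
    when 'uk_born' is 'N'. A blank country alongside 'Y' is legitimate.
    """
    failures = []
    if is_missing(record.get("uk_born")):
        failures.append("uk_born is missing")
    elif record["uk_born"] == "N" and is_missing(record.get("country_of_birth")):
        failures.append("country_of_birth required when uk_born is 'N'")
    return failures
```

A record such as `{"uk_born": "Y"}` passes, while `{"uk_born": "N"}` with no country fails — encoding the rule from the example above.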
Validity
Values conform to defined rules — covering length, format, permitted characters, and allowed values. Rules may be externally defined (e.g. UK postcodes) or set internally by your organisation.
A UK phone number must be exactly 11 digits, start with 0, and contain only numeric characters. Similarly, multiple date formats (21/12/1995, May 12 2005, 12-4-01) in one column signal a validity failure.
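Both rules are mechanical, so they lend themselves to regular expressions. A minimal sketch — the pattern names and the three recognised date shapes mirror the examples above and are not exhaustive:

```python
import re

# Validity rule from the text: exactly 11 digits, starting with 0.
UK_PHONE = re.compile(r"^0\d{10}$")

def valid_uk_phone(number):
    return bool(UK_PHONE.fullmatch(number))

# Illustrative date shapes; a real profiler would cover many more.
DATE_PATTERNS = {
    "dd/mm/yyyy":   re.compile(r"^\d{2}/\d{2}/\d{4}$"),
    "Month d yyyy": re.compile(r"^[A-Za-z]+ \d{1,2} \d{4}$"),
    "d-m-yy":       re.compile(r"^\d{1,2}-\d{1,2}-\d{2}$"),
}

def date_formats_used(column):
    """Which known date formats appear in a column.

    More than one format (or any 'unknown') signals a validity failure.
    """
    found = set()
    for value in column:
        found.add(next((name for name, pat in DATE_PATTERNS.items()
                        if pat.fullmatch(value)), "unknown"))
    return found
```

Running `date_formats_used` over the mixed column in the example returns three different formats, flagging the failure.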
Consistency
Data aligns with other records within the same dataset or across datasets. An address must match its postcode area. A person's birthplace should not contradict their country of origin elsewhere.
A record showing 'UK Born?: Y' and 'Country of Birth: Germany' fails on consistency — Germany is not part of the UK, so the fields contradict each other.
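A cross-field contradiction like this can be caught with a rule that compares the two fields. The field names and the accepted country value are illustrative assumptions:

```python
def check_consistency(record):
    """Flag contradictions between related fields within one record."""
    failures = []
    country = record.get("country_of_birth")
    # If 'uk_born' is 'Y', any non-UK country of birth is a contradiction.
    if record.get("uk_born") == "Y" and country not in (None, "", "United Kingdom"):
        failures.append(
            f"uk_born 'Y' contradicts country_of_birth {country!r}")
    return failures
```

The failing record from the example — born in the UK yet with Germany as country of birth — produces exactly one consistency failure.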
Accuracy
The data reflects reality — the most important and hardest dimension to assess. Accuracy can be verified through common-sense checks or comparison against authoritative sources such as Companies House.
An adult patient's weight recorded as 50g is clearly inaccurate. Likewise, a company name that doesn't match Companies House records may cause issues in tax administration.
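A common-sense check is often just a plausibility range. The bounds below are illustrative assumptions, not clinical thresholds:

```python
def plausible_adult_weight_kg(weight_kg):
    """Common-sense range check for an adult patient's recorded weight.

    Bounds are illustrative; a real system would source them from
    domain experts or an authoritative reference.
    """
    return 30.0 <= weight_kg <= 300.0
```

A weight recorded as 50g (0.05 kg) falls well outside the range and is flagged, while 72.5 kg passes.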
Timeliness
Information is available when needed. Critical datasets may require real-time feeds; others only annual refreshes. Data quality deteriorates as circumstances change — stale data can be as damaging as inaccurate data.
Relying on ten-year-old census data to estimate current populations leads to poor planning decisions.
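Staleness is easy to monitor once each dataset has an agreed refresh window. A minimal sketch, assuming refresh metadata is recorded as a date:

```python
from datetime import date

def is_stale(last_refreshed, max_age_days, today=None):
    """True if the data is older than its agreed refresh window.

    'today' can be injected for testing; it defaults to the current date.
    """
    today = today or date.today()
    return (today - last_refreshed).days > max_age_days
```

Census data last refreshed ten years ago fails any annual-refresh policy, matching the planning example above.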
Uniqueness
No record is duplicated in a way that introduces conflicting information. Duplication creates unreliable records. Unique identifiers are invaluable for detection; composite keys handle legitimate historical duplicates.
A National Insurance number appearing twice with conflicting data likely means one record is outdated. An analyst must use date metadata to determine which entry is reliable.
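Using date metadata to resolve duplicates can be sketched as keeping the most recently updated record per identifier. Field names (`ni_number`, `updated`) are illustrative assumptions; ISO-format date strings compare correctly as text:

```python
def deduplicate(records, key="ni_number", date_field="updated"):
    """Keep only the most recently updated record per unique identifier.

    Where an identifier appears twice with conflicting data, the date
    metadata decides which entry is reliable.
    """
    latest = {}
    for rec in records:
        k = rec[key]
        if k not in latest or rec[date_field] > latest[k][date_field]:
            latest[k] = rec
    return list(latest.values())
```

Given two records sharing one National Insurance number, the function discards the older entry and keeps the 2023 update.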
Who's responsible?
Just as everyone is responsible for data security, everyone is responsible for data quality. If you work with data, you must assess whether it is fit for purpose — and processes must exist to address issues when they arise.
In practice, a tiered approach works best.
Good vs poor quality
Practical steps
Replace free-text address fields with dropdown or postcode-lookup inputs to eliminate format inconsistencies at entry.
A calendar selector removes free-text date ambiguity — no more dd/mm/yyyy vs mm/dd/yyyy confusion.
Disable form submission until all required fields are complete, with clear inline guidance when criteria aren't met.
Agree an organisation-wide standard for empty values — differentiating numeric and text fields — so missing data is always obvious.
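Once a standard is agreed, ad-hoc markers can be mapped onto a single sentinel per field type. The placeholder set and the chosen sentinels (`None` for numeric fields, `"UNKNOWN"` for text) are illustrative assumptions — your organisation's standard may differ:

```python
# Ad-hoc markers seen in the wild; tune this set to your data.
PLACEHOLDERS = {"", "null", "n/a", "na", "none", "-"}

def standardise_missing(value, numeric):
    """Map ad-hoc empty-value markers onto one agreed sentinel.

    Numeric and text fields get different sentinels so that missing
    data is always obvious and never mistaken for a real value.
    """
    if value is None or str(value).strip().lower() in PLACEHOLDERS:
        return None if numeric else "UNKNOWN"
    return value
```

Applied at load time, this makes every downstream completeness check a single comparison rather than a guess about which marker a given source used.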
Large datasets require automated scripts or commercial off-the-shelf (COTS) tools. Identify poor quality first, then apply comprehensive remediation to bring data up to standard.
The MoSCoW method divides data into Must haves, Should haves, Could haves, and Won't haves — allowing critical datasets to be prioritised while less-essential data is queued.
The hardest part is agreeing shared standards. Define organisation-wide baselines for dates, addresses, country codes, and identifiers, while allowing teams to use more granular data where needed.
The Data Management Association provides the UK government's recommended framework for data quality assessment and management.
COTS tools reduce development time; bespoke solutions offer greater flexibility. Implementation varies by organisational size and capability. Butterfly Data can help you evaluate both.
Free download
Our 14-page guide covers the complete DAMA framework, worked examples, enrichment strategies, and a practical remediation roadmap. Written by Dr Kimberley Green, Data Analyst at Butterfly Data.
Download now — it's free
Fill in your details below and we'll give you immediate access to the whitepaper.
Ready to act?
Book a free call with our experts. We'll assess where your data stands and build a practical improvement roadmap.