Opinion

Avoiding sour data

Good quality metadata, much like good quality data, results in better quality decisions

Picture it: you are about to make a quick morning coffee ahead of a busy day. You open the milk, and it fails the sniff test. The milk is sour, and a quick glance at the use-by date shows that you are a day late.

The metadata on the carton of milk was correct: it stated the amount of milk, the nutritional value and the all-important use-by date that ruined your morning coffee. Metadata is important because it gives humans and machines the ability to see what is in the box, file or dataset, and to judge whether to use the data and what for.

Definitions of metadata vary between organisations. For some, it’s simply the column headers, datatypes, and table names. For others it’s a little more complex, with quality metrics, data sources, significance, location, and data ownership. We can categorise these into descriptive, administrative, or structural metadata attributes.
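As a rough illustration, the attributes above could be gathered into a single record per dataset. This is only a sketch: the class name, the field names and the way each attribute is assigned to a category are assumptions made for the example, not a standard or a description of any particular organisation's model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DatasetMetadata:
    """Hypothetical metadata record; the grouping of fields is an assumption."""

    # Descriptive: what the data is and why it matters
    table_name: str
    significance: str = ""
    # Structural: how the data is laid out (column header -> datatype)
    columns: Dict[str, str] = field(default_factory=dict)
    # Administrative: where the data came from and who looks after it
    source: str = ""
    location: str = ""
    owner: str = ""
    quality_score: Optional[float] = None  # illustrative metric, 0.0 to 1.0

    def missing_fields(self) -> List[str]:
        """Return the names of attributes that have not been captured yet."""
        return [name for name, value in vars(self).items() if value in ("", None, {})]
```

Calling `missing_fields()` on a partly filled record gives a quick view of which attributes nobody has captured yet.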

But not all organisations are paying attention to their metadata. Many are not capturing or tracking it at all. Others are capturing metadata but have not considered the full extent of the use cases it can enable.

Metadata has power, although what you use it for depends on the needs of your organisation. To date, it is the most powerful way to bring order to chaos. In a data lake, for example, metadata lets you navigate to the golden nuggets needed for your analysis, or for subsequent transformation into your data warehouse. Metadata is the high-level map for navigating the data estate. One often overlooked capability of good metadata is maintaining historical records and versions of data whose characteristics change through time.

With an increasing focus on data management, metadata is now considered for many other use cases. At WoodMac, much of our data is unstructured, and we must apply context before processing it. Some metadata is curated manually, such as the terms and conditions for data use. In other scenarios, metadata population is automated, such as the website date and source for data captured using a web scraper. At Woodmac we use metadata to drive decisions in our data pipelines and to flag records that require further attention.
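A minimal sketch of what metadata-driven flagging can look like is shown here. It is not a description of any real pipeline: the `RecordMetadata` fields, the `review_reasons` function and the 30-day freshness window are all hypothetical, chosen only to mirror the manually curated and automatically populated metadata described above.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class RecordMetadata:
    """Hypothetical metadata attached to one scraped record."""

    source_url: str             # populated automatically by the scraper
    captured_on: date           # populated automatically by the scraper
    usage_terms: Optional[str]  # curated manually; None until reviewed


def review_reasons(meta: RecordMetadata, today: date, max_age_days: int = 30) -> List[str]:
    """Return the reasons a record needs attention; an empty list means it can proceed."""
    reasons = []
    if not meta.usage_terms:
        reasons.append("usage terms not yet curated")
    # The freshness window is an assumed rule, much like a use-by date
    if (today - meta.captured_on).days > max_age_days:
        reasons.append("capture date is past the freshness window")
    return reasons
```

A record with curated terms and a recent capture date returns an empty list and flows straight through, while anything else is set aside with its reasons attached, much as the use-by date tells you to put the carton down.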

Good quality metadata, much like good quality data, results in better quality decisions. Metadata should sit amongst the most important facets of the modern data organisation. If we ignore its capabilities, we risk mixing gourmet coffee with sour milk.

As a senior data modeller, Connor Boyle helps the data organisation develop an enterprise-wide data model to govern and strengthen Woodmac’s data capabilities. He leads work on complex data problems, collaborating with highly technical and highly commercially focused teams.

By guiding and helping to govern data, Connor helps to improve the usability, quality, and availability of Woodmac’s data. This in turn enables a stronger analytics capability, which supports better quality decisions and an excellent client experience.