Redefining data products: calming the noise
The term data product has become so wide-ranging that it offers little value to strategy conversations, but there is a way forward.
The great irony of the data profession is that whilst we preach the importance of consistent definitions and taxonomies, our field desperately lacks them.
‘Data product’ is a great example of this issue.
When I first came across the term, I assumed it referred to things like Amazon’s or Netflix’s recommendation algorithms. Data work that redefined entire industries!
Data products, I thought, could answer many of the challenges of a data industry that had become bogged down in technical activities with unclear business merit, or in promising proofs of concept that never quite scale to deliver their full potential.
I had high hopes of using data products to reset data strategy and focus the multitude of data teams and initiatives around a handful of data products that added value to customers. The aim was a data strategy that warranted the same level of board discussion and investment as a well-thought-through product strategy.
However, I soon realised that data products would not unite data teams as everyone had, like myself, defined the term around their own biases and beliefs of what data should be for.
Calling a derived or processed data source a data product helped those working on enterprise data warehouses to make the case for more dedicated resources (data product managers for those data sources).
Calling dashboards a data product helped those responsible for sprawling self-serve analytics platforms to argue the need for accountable dashboard owners.
And the list goes on until the point we are at today, where everything from a raw data source to an automated decision-making tool has been called a data product.
So what?
A term this broad lacks credibility. It doesn’t pass any reasonable sniff test.
As Chief Data Officer of a major car company, I did not feel that I could credibly tell the board that we had hundreds, if not thousands, of data products. A £23 billion turnover car company has around a dozen products, so how can a CDO credibly claim to have more?
Depending on how you count, Apple, a $383 billion turnover company, has only 8 products, with a number of variants for some of them.
This reveals an interesting difference between the data world and everyone else.
Whilst a car comprises thousands of engineered parts and numerous customisable features, none of the engineers responsible for those parts or features feel the need to call them products. They understand that, in the context of a car company, a product is a model, like the Range Rover.
However, in data, we’ve taken to calling everything a product.
Why the mix-up?
The mix-up has come from the rise of product management and the desire to manage [x] as a product.
Product management is a powerful tool and I’m as proud of using the tool as anyone else. Next week I will have an article on managing data culture as a product. But I would never claim that data culture is a product. It isn’t, for much the same reason that a raw data source, whilst it can be managed as a product, is not an actual data product.
Actual products deliver value by fulfilling a need or solving a problem for customers, and crucially, they can be marketed, sold, and used independently.
This definition is an impossibly high bar for most, but not all, data products. Credit scoring agencies, geospatial data services, market and economic forecast providers, LLM providers and many others all develop, market and sell actual data products.
Benefits of actual product management and strategy
Organisations have to manage many things, and the concept of managing something as a product is a great one that should stay and be widely applied.
However, actual products are the heart of what an organisation does. They’re where the organisation’s value, cost, and complexity come from.
A good product strategy has as much to do with understanding target markets, unique value propositions, and financials as it does with the art of portfolio management. This involves balancing the need for diversification with the reuse of existing assets and capabilities to minimise complexity and waste.
Understanding target markets and value propositions. Reusing assets and minimising complexity and waste. Data has A LOT to learn from actual product management and strategy.
In a well-managed engineering company, you can’t design a new part and certainly can’t incur the costs and overhead of putting a part into production unless it meets the needs of a genuine product. If only that were true of data. If it were impossible to create and share new derived data sources unless they were needed for genuine products, we’d have a lot less complexity to manage and add a lot more value.
An intentional data product taxonomy
However, there are challenges. As mentioned above, only some data products meet the bar to be called genuine products. We need a data product taxonomy based on each product’s intention and its proximity to the customer.
If such a taxonomy existed, we’d be able to distinguish between data products where the goal is to minimise costs whilst delivering requirements and data products that can justify continued investment because additional iterations deliver additional value to the customer and organisation.
This taxonomy wouldn’t take away anything from existing frameworks. Instead it would add an organisation strategy lens onto data products and better connect data with corporate strategy.
I’ve been working on a taxonomy to do just this and over a series of four articles I’d like to complete the proposition, with your feedback.
Next week’s article will take a long view on product management, starting with McElroy’s 1931 brand memo, to identify further requirements that an intentional data product taxonomy should fulfil.
The following week’s article will take a similar journey but through the history of the term data product, to ensure a new taxonomy adds to and does not compete with existing approaches.
Then, at the end of the month, I’ll propose a taxonomy and approach to data products based on the intent of the product.
For now I’d love your comments and thoughts on the problem I am describing.
Sorry, I couldn't make my comment earlier. I had very similar feelings about the definition of data products to those you discussed, and started to look into the relationships between data products and other datasets in the data space. I've recently been working on a study on complexity analysis and design considerations for data products and Data-as-a-Service (DaaS). The initial study shows we may need to do more work before we can effectively handle the complexity of the current, fast-changing data operation and management environments in the data space. That complexity cannot be easily reduced; it can only be controlled and managed if we can find the right enablers and methods.
We know not every dataset or data entity in the data space needs to be, or should be, a data product. But each data product must have its associated dataset(s), produced through design effort and processes that move, transform, package and present it (or them) using various tools, platforms and activities. This new and different form of data has a special value or use case, such that we need not only to name it differently but also to treat it differently in catalogs, metadata management, applications and management. We also need to make a distinction between internal data products and external data products.
Looking forward to reading your follow-up articles on data product taxonomy and approaches, if they have not been shared here yet.
In recent years I have been looking for a clear definition of a data product. To date I have found a number of definitions and, while they made sense, they always seemed to be somewhat open-ended. As data professionals we seem to like order and governance, but openness at the same time. It feels a little like a get-out-of-jail-free card. I look forward to seeing where you take this.