Originally Posted On: How to Build an Actionable Data Strategy Framework | phData
This guide is intended to help your organization develop a more practical and focused data strategy framework that delivers value.
The eight steps covered in this guide will help ensure you’re building a foundation for your organization when starting to utilize data as a strategic asset.
In the not-so-distant past, leveraging data to make more informed decisions was a strategic advantage available only to large businesses with access to massive amounts of capital.
Today, advances in cloud computing allow companies of all sizes to build fit-for-purpose data platforms. Whether you work for a small company with a limited data platform or you’re not getting results from your existing platform, wielding data as a strategic advantage has never been more accessible.
Building a trusted (and effective) data platform starts with a proper data strategy that provides a prescriptive architecture, design, and implementation plan.
phData has had the pleasure of building actionable data strategies across many different businesses, and what we uncovered is that any business can benefit tremendously from data as long as it has the right data framework in place.
With a combined 300 years of data strategy experience, we’ve made several mistakes, unlocked eye-opening results, and acquired an expert-level understanding of how to build a data strategy framework that delivers actionable results.
This guide is not meant to be an all-encompassing library of information on building a complete data strategy. Rather, it’s a hyper-focused plan to help your business create an actionable data strategy framework that delivers value.
Our approach is based on a six-week data strategy timeline that consistently delivers the best results for businesses of all sizes.
Throughout this guide, we will walk you through the exact steps we recommend our customers take along with examples and templates you can use to help build a foundation for an actionable data strategy framework.
Step 1: Identify Key Stakeholders
The first step of formulating a proper data strategy is to identify the key players. The key stakeholders ideally should have a vested interest in the data platform, a healthy dose of excitement, and a genuine passion to make more data-driven decisions across your organization.
Who Leads the Data Strategy Engagements?
The best organizations create a cross-functional team that is typically led by someone in a Data/Analytics role or a leader within the IT organization. In some cases, this team is led by a leader within a business unit or a central business team. This individual serves as the “point person” who is responsible for driving success.
Pro tip: It’s important to ensure the point person has a clear line of sight and understanding of your current data platform architecture. They should be comfortable making technology decisions for the organization and ideally not develop the data strategy in a vacuum.
Who are the Key Data Strategy Stakeholders?
This team is very collaborative, working cross-functionally to gather input and support from numerous stakeholders across your organization. Here are a few examples of primary stakeholders:
What to Expect as a Stakeholder?
As a valued stakeholder, you will be asked to participate in many activities throughout the data strategy project. It’s important to note that not every stakeholder takes action in every aspect of the project. Rather, each adds value within their span of control and helps influence the overall success of the project.
Step 2: Discovery
The initial discovery sessions are intended to catalog the current state of data assets, data platform technologies, and any current data use cases. Once all of this information is captured, the next step is to identify any gaps or challenges and then create a prioritized list of potential future use cases.
Discovery is performed through a series of interviews and documentation reviews. Interviews are performed with each stakeholder group, and detailed notes are taken to document relevant findings. Additional interviews may be performed when new information is uncovered in a related discovery session. Interviews are complemented with documentation such as:
- Process flows
- Business Requirements
- Business Plans
- Org charts
Common Example Questions to Ask During Interviews
- How do you get value out of the data you use?
- What tools do you use to answer difficult questions that come from your BU leadership?
- Describe the architecture of the current data platform.
- What source systems are in scope for the platform? Describe the data domains available, their size, and any transformations that happen within each source system.
Near the end of Discovery, it is important to catalog and summarize use cases, gaps, and priorities. This summarization will allow for the identification of a Primary Use Case that can drive the development of the platform.
Looking for Help Defining a New Data Product?
This handy checklist will help ensure that the basic use case components are thought through. Additionally, it provides the context for any initial prioritization that may be required.
Identify a Primary Use Case That Drives Action and Decision-Making
Your primary use case is the focal point of your data strategy; it’s what’s going to drive the value home. The ideal primary use case should align with your business’s top priorities and goals while also having the potential to be completely supercharged by data. The primary use case should achieve two main objectives:
- Exercise enough of the platform’s capabilities that building it accelerates subsequent use cases.
- Be impactful enough that, when shared with executive leadership, it demonstrates the value of the platform and builds support for approving subsequent use cases.
Questions for Consideration as You Identify Your Primary Use Case:
- How long would it take to solve?
- How likely are you to deliver?
- Can you tackle it within your existing tech stack?
- Is this solvable with your existing team or do you need to hire more people?
- Are there any regulatory risks involved?
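One way to compare candidates against the questions above is to record an answer to each question per use case and sort on the answers. The sketch below is illustrative only: the field names, the "fewest blockers first" ordering, and the example values are assumptions, not a phData scoring model.

```python
from dataclasses import dataclass


@dataclass
class CandidateUseCase:
    name: str
    weeks_to_solve: int          # How long would it take to solve?
    delivery_likelihood: float   # 0.0-1.0: how likely are you to deliver?
    fits_existing_stack: bool    # Can you tackle it within your existing tech stack?
    needs_new_hires: bool        # Do you need to hire more people?
    regulatory_risk: bool        # Are there any regulatory risks involved?

    def blockers(self) -> int:
        """Count the yes/no questions that came back with an unfavorable answer."""
        return sum([not self.fits_existing_stack, self.needs_new_hires, self.regulatory_risk])


def rank_candidates(candidates):
    """Fewest blockers first, then most likely to deliver, then quickest to solve."""
    return sorted(
        candidates,
        key=lambda c: (c.blockers(), -c.delivery_likelihood, c.weeks_to_solve),
    )


# Placeholder values for illustration; replace with your own discovery findings.
candidates = [
    CandidateUseCase("Real-time equipment monitoring", 16, 0.5, False, True, False),
    CandidateUseCase("Inventory forecasting", 10, 0.7, True, True, False),
    CandidateUseCase("Basket analysis", 4, 0.9, True, False, False),
]

ranked = rank_candidates(candidates)
for c in ranked:
    print(f"{c.name}: blockers={c.blockers()}, likelihood={c.delivery_likelihood}")
```

A ranking like this is a conversation starter for stakeholders rather than a decision in itself; alignment with business priorities still has to be judged separately.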
Analytical vs. Operational Use Cases
Use cases fall along a spectrum: at one end are Operational Data Products, and at the other are Analytical Data Products. While these use cases rely on the same underlying data, they have very different requirements. The primary dimension that differentiates them is the impact they have on generating revenue, producing products, or interacting with clients.
What Are Analytical Data Products?
Analytical data products are most commonly used to inform decision-making and analyze certain business functions. When they are not functioning, there is little impact on customers, revenue, or production. These are the use cases we typically recommend prioritizing. They often have a significant impact on the overall business but do not require a large upfront investment or extensive support to manage.
What Are Operational Data Products?
Operational data products are used to run day-to-day business operations. Typically, when they go down, there is a large impact on customers, revenue, or production.
Example Use Cases
Analytical – BI
- Basket Analysis – Ad hoc analysis that determines which products customers typically purchase together.
- Product Drill Downs – Help determine product or feature sales by region/customer/distributor.
- Inventory Forecasting – Estimates inventory levels required in a specified period.
- Real-Time Equipment & Process Monitoring – Monitors the health of manufacturing equipment in real or near real-time to increase efficiency.
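To make the basket analysis example above concrete, here is a minimal sketch that counts how often pairs of products appear in the same order. It uses only the Python standard library; the function name and the sample baskets are made up for illustration, and a production analysis would typically run against the data warehouse instead of in-memory lists.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(baskets):
    """Count how often each unordered pair of products appears in the same basket."""
    pair_counts = Counter()
    for basket in baskets:
        # Deduplicate and sort so ("bread", "milk") and ("milk", "bread") share one key.
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    return pair_counts


# Made-up sample orders for illustration only.
baskets = [
    ["bread", "milk", "eggs"],
    ["bread", "milk"],
    ["milk", "eggs"],
    ["bread", "butter"],
]

counts = co_purchase_counts(baskets)
for pair, n in counts.most_common():
    print(pair, n)
```

Because this kind of ad hoc query can fail or run late without affecting customers, it sits firmly at the analytical end of the spectrum.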
Step 3: Data Platform Architecture
Drafting an initial architecture of the solution based on information gathered in the discovery phase gives life to the platform. Even if it is incomplete, it will still bring visual representation to subsequent discussions. It also helps frame people’s thinking and tells the story of the platform. Without a visual representation, conversations often end up repeating, causing confusion and ultimately slowing down progress.
The architecture is organized by capabilities. Capabilities represent the logical components of the platform necessary to deliver on a requirement. For example, most data platforms require data warehousing as a capability. The data warehouse allows for efficient storage and querying of data for business intelligence, advanced analytics, and machine learning.
Architecture Diagram #1
The first architecture diagram focuses on the capabilities of the platform. The capabilities are laid out in the order in which data will be processed. Like a good story, this architecture diagram tells the audience how data gets into the data platform, is processed, and is consumed. The architecture highlights specific capabilities and data requirements. The goal of this diagram is to get buy-in on a capability view of the data platform.
Capabilities are composed of technologies whose features align with the specific business requirements encapsulated in the capability. For example, the ingestion capability might use different technologies for real-time data ingestion vs. batch.
Architecture Diagram #2
The second architecture diagram incorporates another level of detail, specifying the technologies that support each capability. The technology assessment should provide reasoning and justification for technology choices (see Technology Assessment). Justification is derived from discovery phase interviews.
Technologies can be deployed and configured in a number of different ways. Everyone must understand how the technology will be operated and leveraged, which often requires a deeper technical representation. For example, Airflow can run on a standalone server, be deployed as a service on Kubernetes, or be consumed “as a service” through Amazon Managed Workflows for Apache Airflow (MWAA). Having a detailed representation will make it clear how the technology will actually be deployed in the data platform.
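The layering described so far (capabilities first, then technologies, then deployment detail) can also be captured as a small data model that is easy to keep current alongside the diagrams. The sketch below is one possible shape under stated assumptions: the class names, the `render` helper, and every technology and deployment entry other than Airflow are hypothetical placeholders, not a recommended stack.

```python
from dataclasses import dataclass, field


@dataclass
class Technology:
    name: str
    deployment: str  # how the technology is actually run


@dataclass
class Capability:
    name: str
    technologies: list = field(default_factory=list)


# Capabilities are listed in the order data flows through the platform.
platform = [
    Capability("Ingestion", [
        Technology("batch loader (placeholder)", "scheduled jobs"),
        Technology("streaming loader (placeholder)", "always-on service"),
    ]),
    Capability("Orchestration", [Technology("Airflow", "Amazon MWAA")]),
    Capability("Data warehousing", [Technology("warehouse (placeholder)", "SaaS")]),
    Capability("Consumption", [Technology("BI tool (placeholder)", "SaaS")]),
]


def render(platform, detail):
    """detail=1: capabilities only; 2: add technologies; 3: add deployment."""
    steps = []
    for cap in platform:
        if detail == 1:
            steps.append(cap.name)
            continue
        labels = []
        for tech in cap.technologies:
            if detail == 2:
                labels.append(tech.name)
            else:
                labels.append(tech.name + ": " + tech.deployment)
        steps.append(cap.name + " [" + ", ".join(labels) + "]")
    return " -> ".join(steps)


for level in (1, 2, 3):
    print(render(platform, level))
```

Keeping all three levels in one structure means a change in a technology choice shows up in every view at once.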
Architecture Diagram #3
The final architecture diagram will be a deep technical document that will be a reference for how the actual platform is used. This version will be most useful to technology domain owners and it should offer clear guidance on the scope and magnitude of deployment. It will also allow for more accurate cost estimation of the platform.
The architecture is meant to be a living document. Each architecture diagram should be kept up to date as the discussion with business and technology stakeholders progresses. The refined architecture iterates until there is general acceptance. We’ve found that high-level diagrams give the illusion that the platform is simple to set up, whereas the detailed versions let the reader understand how much work it takes to get all the configurations just right.
Step 4: Technology Assessment
As the capability architecture diagram comes into focus, technology assessments will be conducted to determine the technology stack of the platform. This will involve the creation of a document detailing the technology options, selection criteria, and applicable business factors that determine the selected technology.
For example, a company might be deciding whether to continue leveraging Hadoop for its data warehouse or move over to Snowflake.
The assessment may reveal the need for a proper Proof of Concept (POC) to build a better understanding of how the proposed technologies stack up against the selection criteria. This is usually noted in the proposal and would represent a phase 0 implementation scope to solidify technology selection.
At phData, we’ve been fortunate enough to have a strong background in implementing data platforms using a variety of technologies. The selection criteria list below is what we use to help our customers maximize their technology investments.
- Create a pros and cons list
- Create a cost profile
- Create a migration calculator
- Create a feature comparison
- Identify alternatives
- Assess internal comfort and experience with the tool
Technology decisions are critical to a successful data platform. The decision drives everything from costs to recruiting for roles on the platform team.
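A common way to pull selection criteria like those above into a single comparison is a weighted decision matrix. The sketch below assumes 1-5 scores per criterion and made-up weights; the criterion names, weights, options, and scores are all placeholders for illustration and do not reflect an evaluation of any real product.

```python
# Hypothetical weights; adjust to reflect what matters most to your business.
CRITERIA_WEIGHTS = {
    "cost_profile": 0.3,
    "migration_effort": 0.2,
    "feature_fit": 0.3,
    "internal_experience": 0.2,
}


def score_options(options, weights):
    """Return the weighted average score (1-5 scale) for each option."""
    total_weight = sum(weights.values())
    return {
        name: sum(weights[c] * scores[c] for c in weights) / total_weight
        for name, scores in options.items()
    }


# Placeholder scores: 1 = poor, 5 = strong on that criterion.
options = {
    "Stay on current platform": {
        "cost_profile": 2, "migration_effort": 5, "feature_fit": 2, "internal_experience": 5,
    },
    "Move to new platform": {
        "cost_profile": 4, "migration_effort": 2, "feature_fit": 5, "internal_experience": 2,
    },
}

results = score_options(options, CRITERIA_WEIGHTS)
best = max(results, key=results.get)
for name, score in results.items():
    print(f"{name}: {score:.2f}")
print("Highest score:", best)
```

When two options score closely, that is usually a signal that a POC is needed before committing.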
Step 5: Data Governance
As data platforms mature, the impact of missing specific components of a Data Governance program becomes more and more significant. On the other hand, over-engineering a Data Governance program can slow down progress and limit business value. The key is balance.
Pay special attention to certain business units that have non-negotiable Data Governance requirements. Compliance reasons in certain industries require specific governance of the platform. Identify aspects of Data Governance that are non-negotiable and those that can be developed later in the data platform life cycle. Core Data Governance capabilities are:
Step 6: Organizational Structure
Organizations evolve through growth, contraction, acquisitions, mergers, and a whole host of other factors. The teams that support the platform will need to ebb and flow with the organization and platform.
Typically, organizations have an executive leader who is responsible for data and analytics broadly. As organizations grow, sub-teams can form to support specific business units or functions within the organization, and a central IT team can manage fundamental aspects of the platform infrastructure. Below is a simple view of an organizational structure vs. a more complex one.
Step 7: Implementation Plan
All of this work culminates in a clear plan for how to get from where you are to where you want to be. Platforms typically go through a similar implementation process: platform build, migration, use case development, testing/validation, deployment to production, and management. Each of these phases needs clear timelines and costs. The right balance of quality, speed, and cost will vary by organization.
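To show what "clear timelines and costs" per phase can look like, the sketch below rolls up a sequential plan into start and end weeks plus a total cost. The durations and costs are placeholder numbers chosen only to make the arithmetic visible, the phases are assumed to run strictly one after another, and the ongoing management phase is left out because it has no fixed end.

```python
from dataclasses import dataclass
from itertools import accumulate


@dataclass
class Phase:
    name: str
    weeks: int
    cost: int  # placeholder currency units


# Placeholder durations and costs; substitute your own estimates.
plan = [
    Phase("Platform build", 6, 60_000),
    Phase("Migration", 8, 80_000),
    Phase("Use case development", 6, 50_000),
    Phase("Testing/validation", 3, 20_000),
    Phase("Deployment to production", 2, 15_000),
]


def schedule(plan):
    """Return (phase name, start week, end week), assuming phases run back to back."""
    end_weeks = accumulate(p.weeks for p in plan)
    return [(p.name, end - p.weeks + 1, end) for p, end in zip(plan, end_weeks)]


timeline = schedule(plan)
total_weeks = sum(p.weeks for p in plan)
total_cost = sum(p.cost for p in plan)

for name, start, end in timeline:
    print(f"{name}: weeks {start}-{end}")
print(f"Total: {total_weeks} weeks, cost {total_cost}")
```

In practice some phases overlap, so a real plan would relax the back-to-back assumption.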
Phased Implementation Plan
Listed below is an example of a phased implementation plan that phData uses for most customers:
If your business is ready for the migration step, we have a vast library of common migration approaches including Hadoop to Snowflake, Oracle to Snowflake, Hadoop to AWS, and much more!
Step 8: Recommendations
The last step brings everything together. The recommendation process includes the target future state and justification for what data asset will be developed and how the business will create a strategic advantage through the use of data. Here are the five key components that should be included in the recommendations:
- Summary of Discovery
  - Data platform current state
  - Platform delivery model & team structure
  - Identification of use cases
- Recommended Data Platform Architecture
  - Key technologies
  - Technology justification
  - Initial platform roadmap
- Recommended Delivery Organization
  - Roles and responsibilities
  - Team size
  - Required technology skills
  - BU partnership model
  - Execution method (Agile vs. Waterfall vs. Hybrid)
- Recommended Data Governance Model
  - Initial data governance requirements
  - Recommendation for how data governance should progress
- Implementation Plan
The intention of the recommendation is to convey the direction stakeholders should be heading and convince them to take it. It will be the basis for any needed investment in technology, people, process changes, or potential changes in organizational structure to support this new direction.
When developing the recommendation, it is important to bring stakeholders along in the process. They should not be caught off guard by the recommendation. Not all stakeholders will agree with the recommendation but getting their input, understanding their concerns, and addressing them is key to managing change within your organization.
Even though not all stakeholders need to agree with the recommendations, key decision-makers do. Identifying these decision-makers and including them early in the recommendation process will increase your chances of success.
We hope this guide is helpful and serves as a resource for you as you evolve your organization. The general principles and best practices in this guide truly apply to organizations of all sizes, industries, and data/analytic maturities.
At phData, we know this work is foundational to your success and are here to help. Drawing from years of experience, learning, iterating, and doing this for customers at every stage of their life cycle, we’ve built a set of tools, processes, reference architectures, and a team that is ready to help you get on the right path toward better utilizing data and analytics within your organization.