I am gradually working through the Kimball Model through various blog postings. Reading the postings in order would make more sense, and I plan on collating and editing these entries into something more holistic. In order, previous blogs are:
In this blog entry, I delve into the Kimball Model as a Design Pattern or a standard solution that can help you avoid the pain of reinventing the wheel.
If we think of the Kimball Model as a design pattern, here are the goals (from The-Data-Warehouse-Toolkit-3rd-Edition):
Provides the analysts with access to answer business questions easily
Presents the data in a consistent model
Handles changes in incoming data and downstream data products
Explains the data to the analysts on time
Secures the data so that analysts see only what they should see
Offers trustworthy data by providing a means to earn that trust
Overcoming the Curse of Technical Minutia to Build Analytics
The curse of a Data Engineer is the generality of the profession and the technical minutia the job entails. As a Data Engineer, I write SQL transformations and analytics queries daily. Sometimes, I program applications and scripts in Python and Java. I have used Kubernetes and Airflow to run my data pipelines. On a blue moon, I use tools like Cloud Formation or Terraform to build the infrastructure to load data. However, only SQL is a critical tool I need to deliver real business value. The other tools are only a means to an end. Because analytical value is the crucial metric, I spend considerable mental and emotional energy to avoid getting lost in technical minutia and focus on building data products that answer business questions.
The challenge of building an analytical system recognizes the scope expected by the target audience: Data Analysts. The customers on an analytical project are interested in business insights. Therefore, the analytical database's operational data needs transformation to align with the bullet list above to meet their needs.
First and foremost, an analytical data project is recognized as not an operational project but instead gathers the operational data and transforms it into a product that analysts will *want* to use.
Easy Access to Data that I can use to make Business Decisions
The market has been active in providing quality tools that provide access to Data. My favorite BI application is an open-source offering from Metabase. I also appreciate the power of Tableau, and I maintain an analytics engine to discover new music on the Tableau Public website. Even the latest Data Warehousing Gartner group leader, Snowflake, provides the Snowsight BI tool directly from the data warehouse.
As the first item on the list of needs for a data warehouse, the access point is too often thought of as an afternoon. A "build it, and they will come" mentality introduces the risk in a data initiative because usually, the team will know very little about the goals of the end-user and will be instead composed of developers and project managers.
Know your destination before you begin your journey.
BI Tools are primarily interchangeable, each having different strengths or weaknesses, so it doesn't matter what BI tool you start with, but it does matter what questions you compose as a goal.
Business Questions from Selected Industries
In line with the Kimball book, Here are some common questions one might ask in various industries before starting any other development. Then, drive the story by focusing on what you will need to give the customer the ability to answer these questions using the BI application.
Retail Sales
Here are some common questions one might want to ask in the Retail Sales domain, along with related properties and metrics you will need to gather in your data warehouse.
What are the top-selling products in our stores and online, and what trends can we notice?
How does each of my product categories compare to last year's performance?
What are the top geographical regions for my product sales?
Financial Servies
Similarly, we also have a set of base questions in financial services. To name a few:
What are the profits and revenue of my services and products, services, and markets?
How are my profits and revenues trending over time, and what patterns can I observe?
How do my metrics compare to industry benchmarks?
Healthcare
Also, for healthcare, here are some common questions to ask.
What impacts can we see on our business regarding demographic trends in our patient population?
What prevalent diagnoses, procedures, and treatments do we see in our patient population?
How do my clinical outcomes compare to healthcare industry benchmarks?
Here are some example questions to ask before starting a data warehousing initiative. The questions themselves will drive the design of the data pipelines and architecture of the system.
Comments