Ryan discusses the critical phase of any project which is to prepare a data source for use with Tableau. Learn the four steps Playfair Data follows and hear some thoughts on how long this aspect of the project should take.
Hi, this is Ryan with Playfair Data TV. And in this video, we’re going to be continuing to discuss the Decision-Ready Dashboard framework.
As I mentioned in the first video related to this framework, this is called the Decision-Ready Dashboard framework, because whenever somebody gets done looking at one of my visualizations, I want them to be able to make a decision from it. And this is truly the bread and butter of how we do everything at Playfair Data on the consulting side of our business. This is the actual checklist that we personally follow to get to an efficient and effective deliverable for our clients.
This framework has four stages: Discovery, Data, Dashboards, and Distribution. And currently we’re in the Data phase. Once we have talked to our stakeholders, identified the audience, defined our objectives in the Discovery phase, and determined which metrics we need to answer whether or not we are meeting those objectives, we move into the Data stage. There, we list where those individual fields reside in our data sources.
This can be a very interesting exercise. Let’s think about a specific objective. If we are trying to increase customer satisfaction by 25% by the end of 2020, for example, one of the metrics we need is how satisfied our customers are now and how satisfied they were in the past. We also need a date field that tells us how we are tracking and improving against that objective over time.
Well, if we find out that we’re not tracking customer satisfaction, we have to go back to the drawing board and begin to track it. We may need to add a survey to our website, conduct focus groups, or do some other type of outreach to start capturing that metric. But this is a very practical step: once we know what fields we need, I’m just trying to find out where those fields live in our databases.
Once we find out where those fields reside and confirm that we’re tracking everything we need to answer whether or not we’re meeting those objectives, I identify the keys, or the fields the data sources have in common. The data sources have to have something in common if we want to combine them. That’s not to say we can’t use disparate data sources that don’t share a key, especially within Tableau; parameters, for example, work across data sources. But if we want to do any type of combination, such as a join or a union, or we want to do cross-data-source filtering, which Tableau also allows, those data sources have to have something in common.
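To make the idea of a key concrete, here is a minimal sketch in pandas, standing in for whatever database or prep tool you actually use. The tables, column names, and values are all hypothetical; the point is just that the shared `customer_id` field is what makes combining the two sources possible.

```python
import pandas as pd

# Hypothetical data sources: survey results in one system,
# customer attributes in another, sharing "customer_id" as the key.
surveys = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "satisfaction": [4, 5, 3],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "region": ["East", "West", "East"],
})

# Because both sources carry the key, a join is possible.
combined = surveys.merge(customers, on="customer_id", how="left")
print(combined)
```

Without that common field, the two tables could still sit side by side in a workbook, but there would be no way to line up a satisfaction score with its region.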
So I’m essentially just making a schema. I might even sketch this out on a piece of paper: if I need five metrics and five dimensions, I will sketch that out. These three fields live in database 1, this one is over here in database 2, and maybe there’s something in database 3.
If they have a key, we can consolidate those data sources through a join or a union, which makes them much easier to work with. That’s not always possible, but when it is, that’s what I’m trying to get to.
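The union case can be sketched the same way. This pandas snippet is again a hypothetical illustration, assuming the same survey has been exported from two regional databases with an identical schema; a union simply stacks the rows into one consolidated source.

```python
import pandas as pd

# Hypothetical: the same satisfaction survey exported from two
# regional databases with identical columns.
east = pd.DataFrame({"customer_id": [1, 2], "satisfaction": [4, 5]})
west = pd.DataFrame({"customer_id": [3, 4], "satisfaction": [3, 4]})

# A union stacks rows from sources that share the same structure,
# leaving one table to connect to instead of two.
consolidated = pd.concat([east, west], ignore_index=True)
print(consolidated)
```

The design choice mirrors the join-versus-union distinction in Tableau itself: joins combine sources side by side on a key, while unions stack sources that share the same columns.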
The most important part of the Data stage in the Decision-Ready Dashboard framework is shaping the data. This can look very different depending on your use case. It can be as simple as transposing the data source.
I often mention what I view as the single biggest barrier to Tableau adoption: people take an existing Excel report that is laid out in a human-friendly way for reading. It might be horizontal, with dates running left to right so you can scan trends. But more often than not, when you’re dealing with that same dataset in Tableau, you want to transpose it into a more vertical orientation. If at all possible, I try to get one field name per column header in the underlying data source.
So the shape data stage could be as simple as that. That’s such a common scenario, by the way, that I have a related blog post, which I’ll link to in the related content below this video. I also have a shortened URL for it: if you go to playfairdata.com/pivot, it will show you how to transpose a data source like I’m describing.
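The transpose described above can be sketched with a hypothetical report. In Tableau you would do this with the pivot feature; here pandas' `melt` plays the same role, turning a wide, human-friendly layout with months running left to right into a tall layout with one field per column header.

```python
import pandas as pd

# Hypothetical "human-friendly" Excel-style report:
# dates run left to right across the columns.
wide = pd.DataFrame({
    "Metric": ["Sales", "Profit"],
    "Jan": [100, 20],
    "Feb": [110, 25],
    "Mar": [120, 30],
})

# Transpose into the tall orientation analytics tools prefer:
# one field name per column header (Metric, Month, Value).
tall = wide.melt(id_vars="Metric", var_name="Month", value_name="Value")
print(tall)
```

The tall shape is what lets a tool treat "Month" as a single date-like dimension instead of a dozen unrelated column headers.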
This could also involve a team of data engineers consolidating sources; for example, I’ve got a visualization that consolidates 50 different data sources. So I definitely do not mean to belittle this step. A lot of times it may end up being half of your project or more.
But once I have shaped the data and prepared it so that it is ready to use with Tableau, I do the first of two quality assurance steps. The reason this first QA step is so important is that if I find something wrong with the data at this point, I know the problem is on the database side or in the data shaping. I might have run into an aggregation issue when I was doing a join or a union.
But any error I find at this stage I know is within the data. I’d rather know that now and build trust in the data source before I start using it in Tableau, versus finding an issue on the Tableau side down the road. Then I’d have to look in two places, doing extra troubleshooting on both the Tableau side and the data side, because I wouldn’t know exactly where the issue is. That’s why it’s important to do this first QA step before we even open Tableau.
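One common check in this first QA step is the aggregation issue mentioned above: a join against a table with duplicated keys silently multiplies rows and inflates totals. This sketch is a hypothetical illustration of catching that, assuming made-up order and status tables; the same row-count and grand-total comparison works whatever tool did the join.

```python
import pandas as pd

orders = pd.DataFrame({
    "order_id": [1, 2, 3],
    "amount": [100.0, 200.0, 300.0],
})
# A lookup table with a duplicated key -- a common cause of
# inflated totals after a join.
status = pd.DataFrame({
    "order_id": [1, 1, 2, 3],
    "status": ["open", "open", "closed", "closed"],
})

joined = orders.merge(status, on="order_id", how="left")

# QA checks: a one-to-one join should preserve both the row
# count and the grand total of the source table.
row_count_ok = len(joined) == len(orders)
total_ok = joined["amount"].sum() == orders["amount"].sum()
print("row count preserved:", row_count_ok)  # False: key 1 matched twice
print("total preserved:", total_ok)          # False: amount 100 counted twice
```

Catching this before the data ever reaches a dashboard is exactly the point of doing QA on the data side first: the failure clearly lives in the join, not in anything Tableau did.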
And on that note, notice here we’ve got four phases of the Decision-Ready Dashboard framework. And we’ve gone through half of them. We’ve gone through two out of the four before we’ve even opened Tableau.
This is one of those things in business and analytics that follows the 80-20 rule. I often hear people complain that data prep and planning take 80% of a project, leaving only 20% for what I believe is at least as valuable, if not more valuable: the human analysis that finds insights and causes some type of action in an organization.
My goal going into any project is to flip that ratio. With good planning, I can often keep the planning and data steps down to 20%, which leaves 80% for the more valuable analysis that will actually cause positive actions within my organization. In reality, it often plays out closer to even, where half the time is spent on discovery and data prep, and the other half on dashboard development, analysis, and distributing those insights.
But this has been the Data phase of the Decision-Ready Dashboard framework. In a future video, we’ll cover Dashboards and Distribution to finalize the strategic framework.
This has been Ryan with Playfair Data TV – thanks for watching!