Ryan Sleeper
Meet the instructor, Ryan Sleeper, and learn the fundamentals of Tableau Desktop including how to make scatter plots, line graphs, text tables, dual-axis combination charts, how to use the “Show Me” function, how to use preattentive attributes, and the advantages and disadvantages of pie and area charts.
All right. As for introduction, my name is Ryan. Ryan Sleeper. So we built one chart type. We’re now ready to build a second worksheet. There’s a couple of ways to do that. First of all, you can click Worksheet in the top navigation and then New Worksheet. That’s option one. But what I typically do is I click the first icon at the bottom of the authoring interface after the sheets that I’ve already built. So we’ve made one. The very first icon. We’ll create a new worksheet. So here’s worksheet two.
And we’re going to jump ahead to my third favorite chart. Bar chart’s my favorite. This next one is my third favorite. But we’re going to go a little bit out of order to share another important concept in Tableau, which is called level of detail. And to illustrate, we’re going to make our first scatter plot together. To make a scatter plot, you need zero or more dimensions and two to four measures. The reason you need at least two measures is because that’s what forms the X and the Y-axis. So you need at least two. Then you have the option to layer in a third and optional fourth.
For this first example, I’m just going to double click Profit, which will add it to the Rows shelf and double click Sales, which will add it to the Columns shelf. Nope. I had those backwards. Luckily, I showed you a very easy fix. Click the Swap button. And we’ve got our first scatter plot. Sales is on Rows. Profit is on Columns. I’m also going to change the Mark type. I didn’t show you this yet. Because this is a scatter plot, it chose a Mark type of Shape for me. But you have the option to change that in this dropdown.
So if instead of a Shape, I chose Circle, you’ll get a slightly different look. It’s just a closed in circle. And I’ll make that a little bit bigger by clicking on the Size property, dragging that to the right just so we can see that circle a little bit better. All right. Now for the important concept. Every single visualization that we build in Tableau has what’s called a visualization level of detail. I think of that as the most granular level of the analysis. And the reason it’s so important to understand what the level of detail is in the View is because that’s how all the numbers are being aggregated. That’s kind of the level of the numerical breakdown in your analysis.
Right now on the screen we haven’t specified anything more granular than the entire data set. So we’re just seeing a single circle. And that single circle represents the sum of all of our profit values across all 9,994 records by the sum of all of our sales values across those same roughly 10,000 records. Doesn’t really tell us much because we haven’t changed the level of the analysis at all. Any time you add fields to either the Marks card, the Columns shelf, or the Rows shelf, you’re changing the visualization level of detail, and you’re adding context to an analysis.
Like I said, there’s several places we could change that level of the analysis. But there’s also a fifth property that’s already on the Marks card. So if I add a field directly to that property, called Detail, it will change the level of detail for me. So for example, if I drag Category to Detail, now instead of one circle I see three. There’s one circle for each of the three categories. This Sum of Profit and Sum of Sales is now being aggregated at the category level instead of the entire data set level.
If I add Customer Name to Detail, this will make this very granular. We now have 2,182 circles instead of three because it’s taking these Sum of Sales and Sum of Profit by every single customer name in our data set. By the way, how I knew the number on the View so quickly is Tableau in the bottom left corner has a feature much like Excel where it gives you a very high level summary of what’s on the View. One of those things it’s showing me is there’s 2,182 circles. So that’s a scatter plot. That’s a little bit about visualization level of detail. This will be an important concept that we continue to come back to. But just wanted to give you an introduction to that for now.
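To make the level-of-detail idea concrete, here is a minimal sketch in Python rather than Tableau. It assumes a handful of made-up rows that borrow the Sample Superstore field names, and the `aggregate` helper is hypothetical. It only imitates the idea that each distinct combination of the fields on Detail becomes one mark, with Sales and Profit summed inside it.

```python
from collections import defaultdict

# Hypothetical rows standing in for a few Sample Superstore records.
# The field names mirror the transcript; the values are made up.
rows = [
    {"Category": "Furniture", "Customer Name": "A. Lee", "Sales": 200.0, "Profit": 20.0},
    {"Category": "Furniture", "Customer Name": "B. Cho", "Sales": 100.0, "Profit": -5.0},
    {"Category": "Technology", "Customer Name": "A. Lee", "Sales": 400.0, "Profit": 80.0},
    {"Category": "Office Supplies", "Customer Name": "B. Cho", "Sales": 50.0, "Profit": 10.0},
]


def aggregate(rows, detail_fields):
    """Return one (SUM(Sales), SUM(Profit)) pair per combination of detail fields."""
    marks = defaultdict(lambda: [0.0, 0.0])
    for row in rows:
        key = tuple(row[field] for field in detail_fields)
        marks[key][0] += row["Sales"]
        marks[key][1] += row["Profit"]
    return {key: tuple(totals) for key, totals in marks.items()}


no_detail = aggregate(rows, [])                            # one mark for the whole data set
by_category = aggregate(rows, ["Category"])                # one mark per category
by_both = aggregate(rows, ["Category", "Customer Name"])   # one mark per category/customer pair

print(len(no_detail), len(by_category), len(by_both))
```

With nothing on Detail there is a single mark holding the grand totals. Each field added to the list can only keep the number of marks the same or make it larger, which is the sense in which the analysis gets more granular.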
To try to be as thorough as possible, I’ll also change this Mark type from Circle back to Shape to show you that a sixth Marks card or sixth Marks property appears. So those first five show up no matter what chart type you’re making. If you choose a few of these Mark types, such as Shape is what we’re looking at now. The same is also true for Pie. The same is also true for Line. You will see a sixth property appear. For Shape, it’s called Shape. And it works just like the others. Any field I drop here will encode the shapes on the View.
So if I drag Category to Shape, we should see three different shapes. And we do in this Shape legend that has appeared. All right. I do like scatter plots a lot. Forgot to point out the benefits of these. I’ll leave this on the screen for a moment to explain them. First of all, it’s one of the only charts where you can look at a very dense amount of data in a single view. Like I mentioned this chart has over 2,000 marks on it. It’s one of the only charts where we can see everything in a single snapshot. No, we’re not going to be able to analyze all 2,000 of them that are kind of bunched up here.
But we’re able to see outliers extremely quickly. So they’re very good at that. We’re also able to see correlations between two different measures. So that’s helpful. And then you also have the ability to create kind of a natural four quadrant segmentation. Just imagine that you had a line going across the chart at the average of the Y-axis, Sales, and a line going up the chart at the average of the X-axis, Profit. Well, that would create this criss-cross with four different squares in it. And each of those four squares you could treat as a segment that’s behaving in a different way.
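The four-quadrant idea can be sketched outside Tableau as well. This is a rough illustration in Python, assuming made-up Profit and Sales values per customer. The `quadrant` function and its labels are hypothetical, and treating a value that sits exactly on an average line as "high" is an arbitrary choice.

```python
from statistics import mean

# Hypothetical (Profit, Sales) pairs, one per mark; the values are made up.
marks = {
    "Customer A": (50.0, 900.0),
    "Customer B": (-20.0, 800.0),
    "Customer C": (40.0, 100.0),
    "Customer D": (-10.0, 200.0),
}

# The two reference lines: average of the X-axis and average of the Y-axis.
avg_profit = mean(profit for profit, _ in marks.values())
avg_sales = mean(sales for _, sales in marks.values())


def quadrant(profit, sales):
    """Label a mark by which side of each average line it falls on."""
    horizontal = "high profit" if profit >= avg_profit else "low profit"
    vertical = "high sales" if sales >= avg_sales else "low sales"
    return f"{vertical}, {horizontal}"


segments = {name: quadrant(*values) for name, values in marks.items()}
for name, segment in segments.items():
    print(name, "->", segment)
```

Each of the four labels corresponds to one of the squares formed by the criss-cross, so a segment such as "high sales, low profit" can be pulled out and treated differently.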
So those are just a few of the benefits of scatter plots. Again, my third favorite. I think it’s very effective. So I’d like for you to get some practice with it. And that will be in our next exercise. Most of these you can reverse engineer just by thinking about what’s on the View. I’ll kind of talk you through it since this is only our second exercise so far. But Sales is drawing a row. So Sales will go on Rows. Profit is drawing a column. So Profit should go on Columns.
I’ve got this Color legend telling me that the circles are being colored by Region. There’s only one way to color marks on the View. It’s to put something on the Color property of the Marks card. So you’ll put Region on Color. And then the only thing that you couldn’t really derive from just the screenshot alone, I put a hint up here that Sub-Category is also on the Detail property. That’s going to make this chart go from 4 circles up to 68 circles. Hopefully, because I talked you through most of it, this is another two-minute variety. But we’ll break for two minutes. I’ll come back and show you how to go about creating this one.
All right. So we built a lot of this. But I will start over just in case you weren’t able to follow the whole thing. First of all, I’m going to start a new worksheet just to clear the slate again. Looking at the View, I’ve got Sales on Rows and Profit on Columns. This is another one of those charts that’s so pervasive, by the way, that Tableau kind of automatically defaults to this chart type. When you just double click Sales, it puts that on the Rows shelf. And if I double click Profit, it puts that on the Columns shelf. So there’s your foundation. One gotcha that I forgot to point out when I was talking you through it, I also changed it to Circle, the Mark type to circle. So hopefully, you figured that out. I’ll make that just a little bigger so we can see it.
And then from here we’re just encoding the mark and changing the viz level of detail, as it’s referred to. First I saw that Color legend for Region. I know there’s only one way to color marks, and it’s to put something on to the Color property of the Marks card. So I’ll drag Region to Color. That will get me four circles, one discrete color for each of the discrete dimension members. Then the hint at the top of the screen was to put Sub-Category onto Detail. That’s going to make the analysis more granular. So I’ll put Sub-Category onto Detail. And we now see 68 circles. I’ll double click on the tab at the bottom and give this a name so that we can find it later on. That’s our scatter plot.
But for now, we’re going to keep moving. And we’re going to go back to my second favorite chart, which is the line graph. This is another one that was invented by William Playfair. As I already mentioned, it was the same year, same book, in 1786. It’s worth spelling out the difference between a bar chart and a line graph because their technical criteria, as we’re about to see, are actually very similar. But a line graph is one of the best options for visualizing trends over time. So the technical criteria to make this chart, it’s actually the same as a bar chart, except for one giant caveat.
So first of all, the part that’s the same. Zero or more dimensions and one or more measures. That’s the exact same technical criteria as a bar chart. But line graphs, because we’re looking at something over time, we’ll also typically include one date. It is possible to make a line graph or use a Mark type of line when you’re not looking at dates. But this is their primary use. So that’s going to be the first example that I share with you. This also happens to be one of the very few times that my opinion on how to make a certain chart differs pretty dramatically from Tableau’s defaults. So what I’m going to do in this section is show you my very highly preferred way to make a line graph, but then I’m going to take a step back and show you the defaults and a couple of alternatives.
So first, let’s say that we wanted to make this line graph that looks at sales by continuous month to look at a trend over time. Well, over in Tableau Desktop I’ll start a new sheet. Because of rule of thumb number one, I always start with my measure. I’m just going to double click on Sales to add it to the View. So there’s my measure. I now want to slice and dice that by a dimension. My highly preferred way to make a line graph is to right-click on my element of time. In the case of the Sample Superstore data set, my element of time, the field is called Order Date. So I’m going to right-click Order Date while I drag it to the Columns shelf.
And much like when I right clicked on a measure and dragged it to the Columns or Rows shelf, I’m given some options before Tableau draws anything. What these options allow me to do is choose whether that date is going to be treated as discrete or continuous and at what granularity. The granularity of dates is referred to as a datepart. That’s things like day, week, month, quarter, year. Those are all considered dateparts. This list probably looks intimidating if it’s new to you. But I’m going to show you how you can very quickly narrow these options down to the exact one that you want to get the exact result that you’re looking for.
So first, I’m going to talk you through these top to bottom. The first two are going to be at the most granular level of that field. So for Order Date, I happen to know in the Sample Superstore dataset that its most granular level of detail is Day. Well, if we’re wanting to make a line graph that looks at sales by month, both these options are out because I don’t want to look at it by day. The next set of options between this line and this line have a blue icon next to them indicating that those dateparts are going to be discrete. Again, I’m trying to draw a continuous trend over time. I don’t want discrete headers that can be sorted. So I’m skipping this entire section. So I’m already about halfway down this list of options.
The next four options are actual aggregations like you would do with a number. Count, Count Distinct, Min, Max. Those are not relevant for my analysis at the moment. So I’m very quickly into this single section. It contains five options, and they all have a green icon next to them, telling me that it will draw a continuous axis. And at this point, I’m simply choosing the granularity of the date. So if I want month, I’ll choose Month with the green icon. We have our first line graph. That’s how easy it is. That’s my highly preferred way. It gets you there in two clicks. And it wasn’t confusing, hopefully.
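One way to picture the two Month choices in that menu is the sketch below, written in Python rather than Tableau. It assumes the blue Month behaves like a month number shared across years and the green Month behaves like each date truncated to the first of its month. That is my reading of the two sections, not something the menu spells out. The order rows and helper names are made up.

```python
from collections import defaultdict
from datetime import date

# Hypothetical order rows; dates and amounts are made up.
orders = [
    (date(2019, 11, 3), 100.0),
    (date(2019, 11, 20), 50.0),
    (date(2020, 1, 15), 70.0),
    (date(2020, 11, 8), 200.0),
]


def discrete_month(d):
    """Month number only, so every November shares one header."""
    return d.month


def continuous_month(d):
    """Date truncated to the first of its month, so each November stays separate."""
    return d.replace(day=1)


def sum_sales(orders, key_fn):
    """Sum sales per key produced by key_fn."""
    totals = defaultdict(float)
    for order_date, sales in orders:
        totals[key_fn(order_date)] += sales
    return dict(totals)


by_discrete = sum_sales(orders, discrete_month)
by_continuous = sum_sales(orders, continuous_month)

print(by_discrete)
print(by_continuous)
```

Under that assumption the discrete version folds November 2019 and November 2020 into one bucket, while the truncated version keeps them apart along a timeline, which is what a continuous trend over time needs.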
That differs from the default. So let me undo that and show you what happens if you just click on Order Date. I’ll just double click Order Date. There’s two defaults that are off from what we just made. First of all, the default discrete versus continuous is discrete. The default classification, I should say. It’s blue instead of green, indicating that these are discrete headers. It’s a coincidence that these are in chronological order. So the Mark type of line, it’s not throwing us off yet. This actually is fine so far. But the issue with this being blue or discrete is I could sort those headers. Those two options are not available to me when this pill is green.
And here’s the problem. If I go sort this in descending order, this is no longer chronological. It’s sorting it by the sales amounts. So we have 2020, 2019, 2017, 2018. Tableau recognized that it wasn’t in order anymore and actually helped us by changing the Mark type automatically. But I’m going to undo that. And so that’s why I don’t like this default. It’s easy to fall into this trap and end up with the result that you don’t want.
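The sorting trap is easy to reproduce in a few lines. This sketch assumes made-up yearly totals, chosen only so that a descending sort by sales gives the same out-of-order years read out above.

```python
# Hypothetical yearly sales totals; the values are made up.
yearly_sales = {2017: 480.0, 2018: 470.0, 2019: 610.0, 2020: 730.0}

# Chronological order, which is what a line over time relies on.
chronological = sorted(yearly_sales)

# Discrete headers sorted by the measure in descending order.
by_sales_descending = sorted(yearly_sales, key=yearly_sales.get, reverse=True)

print(chronological)
print(by_sales_descending)
```

Once the headers follow the second ordering, connecting them with a line no longer describes a trend, because neighboring points are no longer neighboring years.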
To argue the other side, I’ll show you one thing that I do like about these defaults. And then we’ll go back to the way that I recommend. But first, what I like about the defaults, it has this plus sign. And by the way, I forgot to mention the second thing that was different, obviously. It’s Year instead of Month. So we’re a little bit off from that use case that we were trying to recreate. But one of the things I like is this blue pill has a plus sign on it indicating that I can click and drill down to the next level in this hierarchy. Dates come with a natural hierarchy that goes from Year to Quarter to Month to Day.
So if I click the plus sign on Year, it gave me Quarters. If I click the plus sign again, it gives me Months. If I click the plus sign again, it gives me Days. Those plus signs have now been replaced with minus signs, which means I can roll these dates back up the other direction. So if days is too granular for me, I can click the minus sign. That rolls the days into the months. Here’s what I really like about this. These pills are all independent. So there’s two things I can do. I can remove certain aspects of the date that I don’t want. So maybe quarter isn’t relevant for our business. I can simply remove it from the View. And the Year and Month are still intact.
And even more importantly, because these are independent, I can re-order them. These pills are processed in order. So because Year’s on Columns first, it’s drawing us four columns for our years first. Month is on Columns second, so then it draws 12 months for each of our four columns second. This is your kind of traditional seasonal analysis that’s going in chronological order. There is some value in that. I can see things like 2020 looks like it was our best year. But definitely had our peak over here in November 2020. So there is some value in that. But if I put Month in front of Year, I would get a very different analysis because now it’s drawing columns for my 12 months first, followed by columns for my four years second.
So now each of those columns is a four-year trend by month. And now it’s a lot easier to see things like October and November had these giant spikes in the last couple of years. That would have been a lot harder to see if they were spread out across the View in this traditional way. So there’s the benefits of the defaults. But to get back to my use case, again, right-clicking is a shortcut. But if you forget to do that, you can change the datepart as well as if it’s being used as discrete or continuous by clicking into that Date pill.
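The effect of reordering the Year and Month pills can be imitated with nested loops. This is only a sketch of the ordering idea, assuming four years and twelve months. It says nothing about how Tableau lays out the columns internally.

```python
from itertools import product

years = [2017, 2018, 2019, 2020]
months = list(range(1, 13))

# Year first, then Month: four groups of twelve, in chronological order.
year_then_month = list(product(years, months))

# Month first, then Year: twelve groups of four, one small trend per month.
month_then_year = list(product(months, years))

print(year_then_month[:3])
print(month_then_year[:4])
```

Both orderings contain the same 48 year and month pairs. Only the grouping changes, which is why putting Month in front makes all four Novembers sit side by side.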
The only gotcha with this method is for whatever reason this is the one place in Tableau where discrete versus continuous is not color-coded. It’s a little bit confusing because you just have to remember that all the choices between this line and this line are discrete. That’s why I kind of wish they were color-coded blue. Every option between this line and this line is continuous. So if I wanted to get back to that continuous line graph by month, I’d have to choose the second occurrence of the word Month. And that’ll get us back to that first line graph.
This is a nice chart again, but I’ll give you one tip on line graphs, just like I did with bar charts. If you click the Color property, there’s an effect at the bottom called Markers. And if you choose the option in the middle, this will put a very small circle on every data point. And this has two purposes. One is just aesthetics. I think it looks a little bit nicer, adds a little bit of professional polish. But its more practical purpose is showing you where there is a data point. If you’ve got any kind of steep decline or steep jump, so maybe from here to here, if I didn’t have these markers, I wouldn’t know whether or not there was a data point in between. See if I can find a good one. From here to here is similar.
There might be a tiny little hitch that would tell you there is a data point. But that marker is telling you that there is, in fact, a data point between there and there. So that’s the practical purpose. Just like with bar charts, I can now layer in extra context. So if I wanted to look at this line graph by category, I could drag Category to Columns. And now I’ve got that monthly trend by category. And my categories are drawing three columns. If I drag Category down to Rows instead, then we’ve got the same analysis, but we have a row for each category instead of a column for each category.
This is definitely going to be a useful chart type for you. So it is our next exercise. Take a shot at this one. I’ll come back, show you how it was built, and then we’ll keep moving. But it’s sales by continuous quarter. I’m never trying to trick you with these exercises. But I do mix up some of the fields we’re using and maybe some of the granularity just to get you to kind of practice getting to this expected result. So here’s a case that’s like that. Very similar to what we just built. Sales is on my Row shelf. But this time we’re looking at sales by continuous quarter instead of continuous month. You can then look at the color legend to see how to color the chart and then practice with putting your markers on the View. Hopefully another two minute variety.
All right. So back to our exercise for today. How I would go about this. I’m going to start a new sheet. And I always start with my measure. So I’ll just double click Sales. My highly preferred option for creating line graphs. I’ll right-click Order Date, drag it to Columns, choose Quarter Continuous. And there’s the foundation of the line graph. It really is that easy if you follow my recommended approach. From there, we’re just encoding and adding polish to this line graph. I see Region is on Color because I have a Color legend for Region. So I’ll drag Region to Color, get my four lines, and then I added those markers by clicking the Color property. Under Effects, there’s one called Markers. I’ll click the option in the middle. And that is the entire exercise.
All right. To this point in the training I have very purposely built my three favorite chart types manually. I think it’s important for you to know how Tableau ticks, and how things are being encoded, and what’s controlling the orientation of the charts. So I wanted to build those three manually. But I’m going to introduce now a feature called Show Me which allows you to lay the foundation for 24 popular chart types with just a couple of clicks.
But to illustrate how it works, back on the screen here I’ve zoomed in to just the top left corner from that last exercise, and I’ve whited out the rest of the chart. And I did this to point out that over time you’ll realize that you’ll know what Tableau is going to draw even without seeing the visual. There’s lots of clues just by looking at the location of the pills, whether they’re blue or green, what’s the Mark type, how are things being encoded.
So for example, if I looked at this chart or this portion of the interface, I could see that Sales is drawing a row. So that’s creating a Y-axis. Continuous Quarter is on Columns drawing an X-axis. So I’m kind of thinking this is a line graph. It has an element of time. It’s a continuous axis. So I’m thinking it’s a line graph. Sales by quarter. I could confirm that by looking at the Mark type dropdown, which is showing me it is a line. So now I’m feeling even more like this is a line graph. And then I’ve got Region on Color, indicating that those lines are going to be encoded by color, providing a color per region.
I point this out– and I’m not going to read you this whole paragraph. But I do have a rule of thumb for this. Just keep in mind that everything we just described is how Tableau draws every single chart no matter how fancy we end up getting during this course. But starting with bar charts, line graph, scatter plot, all three of those and along with every other chart in Tableau is controlled by what fields are on the View, whether they’re on the Columns or Rows shelf controls their orientation, whether they’re green or blue controls whether they’re making a discrete header or a continuous axis. Whatever’s on the Marks card is controlling how those marks are being encoded. That’s every single chart. So if you realize that, you can kind of make small tweaks to get to the exact result that you’re looking for.
To give you a shortcut. I’m going to show you how to make what we just made with Show Me this time instead. To use Show Me, I’m going to start by preselecting the three fields on this View. So Sales, Order Date, and Region. I go to a new sheet. I’m going to click Sales. Much like a lot of software programs, if you hold down the Control key on Windows– hopefully, you know what that is on a Mac. If you’ve got Mac, I believe it’s Command. But if you hold down the Control key, you can do a multi-select in Tableau. So I’ll hold the Control key while I click Order Date. Hands are off the keyboard. But I’ve selected two items now.
If I hold Control again and click Region, the third field, hands are off the keyboard again. But I’ve preselected three fields. To use Show Me, you click this button in the top right corner called Show Me, and that will open a pane that shows you thumbnails of 24 different chart types. Couple of things about this interface. First, Tableau is looking at the fields you’ve selected or the fields that are on the View already. So there’s two ways to use Show Me. We’re using Show Me from scratch. I’ve preselected the fields before anything’s on the View.
You also can use Show Me when stuff is already on the View. You would just click the Show Me button, and it’s going to consider the fields that are already on the View. But with this version from scratch, I’ve got three things selected. And Tableau is drawing an orange box around its recommendation. Another feature of this interface is it’s coloring the thumbnails that I am able to create with this combination of fields. So some of those are grayed out. Let’s see. I’ll hover over this first grayed out thumbnail.
If something is grayed out and you don’t know why– so maybe you think you have the right combination– you can just hover over it and Tableau kind of tells you at the bottom what is needed to create that. So this is called a symbol map. It says I need one Geographic dimension. Well, that’s where I got off on the wrong foot. I don’t have that, looking at my field on the left. That’s why it’s grayed out. Let me try to find a better example.
Here’s one. Dual Combination. It says try one date. I’ve got Order Date. So so far so good. Zero or more dimensions. I’ve got one dimension, Region. And plus, I didn’t need any dimensions. So we’re good there. And then two measures. There’s where I went wrong. I’ve only got one measure, Sales, not a second measure. So it’s grayed out. All right. Now back to the recommendation. Let’s see what it wants us to do. It says Lines Discrete. That’s the default line graph. That was what I just pointed out is one of the few defaults that I’m not a big fan of. So just because it makes a recommendation doesn’t mean you need to take it. You can make any of these other charts that are in color.
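The hover hints boil down to comparing what a chart needs against what is selected. Below is a hypothetical sketch of that comparison in Python. The `REQUIREMENTS` table only records the minimums read out above, and neither it nor `missing_fields` reflects Tableau’s actual internal logic.

```python
# Minimum field counts as read out in the hover hints. The dictionary layout
# and the function below are hypothetical, not Tableau's actual logic.
REQUIREMENTS = {
    "symbol map": {"geographic dimension": 1},
    "dual combination": {"date": 1, "dimension": 0, "measure": 2},
}


def missing_fields(chart, selected):
    """Return how many more fields of each kind the chart still needs."""
    needs = REQUIREMENTS[chart]
    return {
        kind: count - selected.get(kind, 0)
        for kind, count in needs.items()
        if selected.get(kind, 0) < count
    }


# Sales (measure), Order Date (date), Region (dimension).
selected = {"measure": 1, "date": 1, "dimension": 1}

symbol_map_gap = missing_fields("symbol map", selected)
dual_combination_gap = missing_fields("dual combination", selected)

print(symbol_map_gap)
print(dual_combination_gap)
```

An empty result would mean the thumbnail is available. Here both charts come back with a gap, which lines up with the two grayed out thumbnails: no geographic dimension for the symbol map, and one measure short for the dual combination.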
The one that we actually want to create if we’re wanting to match the last exercise is this one right next to the recommendation. This is called Lines Continuous. If I click on that one, we see a continuous line graph appear. This is very close to what we made in the last exercise. But there are two changes that we need. So here’s what Show Me created. Here’s what we wanted to create. The first thing that needs to be changed is the datepart. The default datepart is Year. We want it Month. And then by default, Tableau doesn’t add the markers. So there’s no markers on this version. There are markers on this version.
So if we were wanting to match this exactly, we would have to go back in here, click on this Year of Order Date pill, and change the date to month. And then if we wanted the markers– actually, it was quarter in the exercise. So there’s quarter. And then click the Color property and add those markers. So those are the two changes we made. And the reason I point that out– I’ve got a rule of thumb to cover this idea. But I highly recommend that you don’t rely on Show Me. I see a lot of people getting started with Tableau that just kind of mash on buttons until they kind of get a result that they wanted. But we just covered the three most effective types, in my opinion, from scratch.
Now you know how Tableau ticks. You can make those adjustments manually. So don’t rely too heavily on Show Me. If it helps you lay the foundation of a certain chart very efficiently, go for it. You can use it. But you still need to know what is causing the orientation and the encoding in order to get the precise result that you’re looking for. All right. Onto the next one. So those are my three favorites. I’m now going to kind of take a step back and show you some more charts, starting with the least sophisticated. And then we’ll build up throughout the course and get more and more sophisticated.
So probably the least sophisticated, if you even count this as a data visualization, is a text table, also known as a crosstab in Tableau. Essentially, if you’ve got one or more dimensions and one or more measures, you can make a text table. So you can make it out of just about anything. I wanted to point out a couple of things I think you’ll find value in with text tables. First of all, any visualization can be duplicated as a crosstab by simply right-clicking on its tab. And about halfway down, it says Duplicate as Crosstab. If I click that button, a new sheet appears. It has taken the data in that chart and converted it to a table. That’s what we’re looking at here. I did that with a line graph. But you can do that on any chart.
You can also export the data from that chart directly to Excel as a crosstab. So this is valuable if you ever want to do your own math or maybe you’re more familiar with Excel and how to pivot things. Just know that you can always get the data out of Tableau into an Excel file by clicking Worksheet in the top navigation. Hover over Export. And then there’s the option Crosstab to Excel. Depending on the number of rows in the dataset, this should take just a few seconds to load. There is Excel. And it’s taken that same data, and now it’s ready to go for me here in Excel.
There are also several ways that I attempt to use these text tables as a visualization, or sometimes it’s related to the user experience that I’m trying to provide. The first and probably main way that I use text tables is as what I call a callout number. If you just drag a field to the Text Marks card, it just puts that measure with whatever aggregation you’ve got as text. And then once it’s on the Text Marks card, you can click Text, resize it, reformat it, and just make one big number. I call those a callout number.
You can also use text tables as control sheets. I’m going to show you a dashboard feature which allows you to click in one portion of the dashboard and have it influence or filter something else on a dashboard. So one way that you can use text tables is you can kind of set up a menu of what to click on so the rest of the dashboard gets filtered. And then the last way that I like to use text tables is for quality checking or providing raw detail. So if I build something in Tableau, I might just make one big crosstab so I can spot check individual numbers, make sure they match my database. And then once I have trust in the numbers, I’m confident to just use Tableau only. I’ll no longer rely on those text tables.
I’ve got an exercise here. But I’ll go ahead and walk you through it. So I’ll build it with you. Start a new sheet in Tableau. Let’s double click and go ahead and call this Sales Callout Number. And here’s what I was describing earlier with callout numbers, if I just left-click Sales and drag it to Text, I will get the default aggregation of Sum. And I just get one number. That’s my Sum of Sales. Now that it is on text, I can click Text, and I’ve activated this ellipsis because there is at least one thing on that property. If I click that ellipsis, it opens this little word processor where I can do things like make the font larger. Maybe I’ll pick 24 point font. I’ll click Apply just to see how that’s looking. I’ll call that good for now. That text is dynamic.
But I can also come in here, maybe add a break, it’s called, before my dynamic number. And I’ll just type ‘Sales’. Click Apply. See how that’s looking. It looks OK. The alignment doesn’t match. This is a weird nuance of Tableau that I’ve actually never figured out. In this word processor, the text is centered. But even when I click Apply, this seems to be right-aligned. If it doesn’t work here, the way you can fix that alignment is first to click OK to close that box. Then there’s another place here to update the alignment. Right now, it’s automatic. If I click Center, now we’re back to centered.
And then again, I don’t like to spend too much time on formatting. But if it’s driving me nuts on this one, if you want the dollar sign– why don’t we actually just change the default property this time. I’m going to right-click Sales, hover over Default Properties, and click Number Format. And I’ll choose Currency Custom with no decimals. And now if I click OK, we should see a dollar sign. And this time that format is going to roll across whenever I use Sales for the first time. So it is not sheet-specific. From now on in the course, if I use Sales, we should see a dollar sign.
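For comparison, the same look, a label line above a whole-dollar currency figure, can be sketched in Python. The `callout` helper and the sales total are made up, and this only imitates the result of the number format. It is not how Tableau applies it.

```python
def callout(label, value):
    """Format a label line followed by a whole-dollar currency figure."""
    # ",.0f" adds thousands separators and rounds to no decimals.
    return f"{label}\n${value:,.0f}"


sales_total = 1234567.89  # hypothetical Sum of Sales
text = callout("Sales", sales_total)
print(text)
```

This sketch places the dollar sign in front of any minus sign, so negative values would need extra handling. A real currency format takes care of that for you.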
All right. That is the sales callout number. I do want to argue both sides throughout this course and kind of play devil’s advocate a little bit. I know that tables are pervasive in analytics. And they’ve stuck with us. A lot of C-level executives, they use text tables exclusively. This is a course about Tableau, which is pretty much best in breed for visual analytics software. A text table is not visual analytics. This is also, of course, not data visualization. Text tables are not data visualization.
So to kind of transition into visual analytics, I’m going to share my favorite exercise for kind of getting people to evolve past the Excel mentality. And to do this– and I encourage you to copy this. I think it does make a difference when people see this side by side. What I’m going to do is show you a large– I call them wall of text, which looks very similar to a lot of corporate reporting that I usually see. And we’re going to try to just answer the most basic question that I can possibly think of, which is, what is the highest number in this data set? So on the screen here. This is one I typically love to do in person. I wish we could hang out and do this. But I’ll just have to talk you through at this time.
So if we were trying to find the highest number in this data set, we would have to make a decision to either– so first of all, we have to look at every single number just to be sure. So we’d have to make a decision to either go left to right or top to bottom. So I could browse through either the columns or the rows. Let’s try rows. So I’ll skim this row. The largest number in that row is 2,132. But I got to keep going. So it looks like I got a bunch of negatives on the second row. Maybe I’ll try to be faster by kind of snaking from left to right. And I’m just looking for a number that’s higher than 2,132. So I’m snaking, snaking. Down here. There’s one. 6,810. But I got to keep going. I will not bore you with that.
On the next screen what I’m going to do is do one of the most basic encodings, which is to encode these numbers by color. The higher the number, you’ll get a darker blue. The lower the number, we’ll get a darker red. And you’ll see the punch line right away. You’re pretty much instantly drawn to 9,004. You may have looked at three or four other numbers, such as 7,474, 6,810. But you found that answer in a fraction of the time as when we had that raw text table. And the point of this– it’s almost comical, but it’s not that bad of an example. I really do see reports that are like this. So I hope that that point is driven home.
If we struggle to answer the most basic question possible, how can we really be doing anything that is more important than what’s the highest number or what’s the lowest number? It’s just almost impossible to do with the raw text table. We need to visualize that data to translate it, help us find answers that we can take action on. Put a lot of thought into this. This really is what my whole career and company is based on. And it’s kind of become a mission statement for me where I’ve boiled my observations into what I call the three benefits of data visualization. And I have one more example before I get to that. And this is a fun one. So before I get there, let me show you a couple more things.
This is another example that’s a little more interactive. Sorry to jump back and forth here. But what we just did, it kind of borrows from a concept from a kind of modern data visualization pioneer named Stephen Few. And he has this exercise where he asks you to count the nines. And I’ve made this workbook to do that. And how it works– let me see if I can get this working on the screen share. All right. Looking good. So how it works is he shows a wall of numbers and asks you to count them and will time you on how long it takes to find the nines when there’s no encoding.
So what you would have to do is– I’ll start the timer and we’ll go through this. And I’m trying to be fair here. So I’m waiting to start the timer because I really want to show you and provide some quantitative evidence that we should be visualizing data versus looking at text. So what I’m going to do as soon as I hit that Start button is count the number of nines as quickly as I can. So start. And again, I’m going to have to go either left to right or top to bottom. I click Start. And here we go. First row. Skimming, skimming, skimming, skimming. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12. 12 is what I got. And I’m actually not that confident. I had to squint once. I heard a phone ring. But we’ll say it’s 12.
I came up with 32 seconds. Let me even see if I was right. I’ll click Answer. All right. I got 12 nines. I’ll type in a 32. That was round number one. Let me turn the answer off, reset my clock. This time what I’m going to do is encode those numbers by color. So I’ll click Color. That’s going to color them red. I will choose a different number before I start the timer, and we’ll go through this same thing. So I’m going to click Start and 8. Here we go. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12. Actually, I kind of cheated. Let me try that again. I’ll do four.
All right. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17. Make sure I got that right. 17. Came in at, rounding up, 11 seconds. There were even five more numbers to count. And I did it in 1/3 of the time. So hopefully that gives you some quantitative evidence of how powerful this is. Color is one of several what are called preattentive attributes. That’s pretty much what data visualization is based on. And this next rule of thumb is to leverage these preattentive attributes to find and communicate insights or stories in your data. To show you a few other examples of preattentive attributes, I’ve got our first of several animated videos that we’ll share throughout the course because there are a few dozen of these. But this video covers some of the most common ones that you can use to improve your data analytics.
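The timing comparison from the two rounds can be worked through as simple arithmetic. This is only a sketch using the numbers quoted in the exercise (12 items in 32 seconds without encoding, 17 items in about 11 seconds with color). The function name is made up for illustration.

```python
def seconds_per_item(total_seconds: float, items_found: int) -> float:
    """Average time spent per counted item."""
    return total_seconds / items_found


# Round one: no encoding, 12 nines counted in 32 seconds
plain = seconds_per_item(32, 12)
# Round two: color encoding, 17 fours counted in about 11 seconds
colored = seconds_per_item(11, 17)

print(f"No encoding:    {plain:.2f} seconds per item")
print(f"Color encoding: {colored:.2f} seconds per item")
print(f"Total time ratio: {11 / 32:.2f}")  # roughly one third
print(f"Per-item speedup: {plain / colored:.1f}x")
```

Because the second round had more items to count, the per-item speedup comes out larger than the roughly threefold difference in total time.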
So I’ll hit the Play button. I might pause this a couple of times because it moves pretty quick. But the first one that we just saw is color. You can have both a different hue and a different shade or intensity. So those are two ways that it’ll help us. Height or length. This is, by the way, the second category of preattentive attribute. We’re now into the form section. Height or length is the next preattentive attribute. This is why we’re so good at comparing the heights or lengths of bars and why that chart type is so effective.
Width. Orientation is another form. All of these aspects of the View our eyes are immediately drawn to. These are called preattentive attributes because we process them before we even slow down to pay attention to what we’re looking at. We are so good at processing them that it happens almost instantaneously, almost subconsciously. So keep that in mind as we look at the others. Your eye is immediately drawn to what’s different. Size. This is one of the reasons we might encode scatter plots by size because we’re immediately drawn to the larger circles. Same with position. Same with grouping. They’re all related to scatter plots.
And then the third preattentive attribute category is motion. So we’ll be able to very quickly see the differences with direction of motion as well as intensity or speed of motion. So there’s direction. There’s speed. We’re immediately drawn to what’s different. Fortunately, Tableau in recent years has made their animations much better– or they didn’t even really exist, actually. But that kind of helps us with that third category. But all of these are kind of what make data visualization go. And now I’ve made it to my formal benefits of data visualization, which I’ll point out next. The first is reduced time to insight.
I don’t think that this first benefit is arguable. We just saw this with the count the nines Tableau example. We were able to analyze a trickier data set with more to count in 1/3 of the time. That one I don’t think is arguable. The second, and probably the most important, benefit of data visualization is that it increases the accuracy of the insight. Think back to that large wall of numbers that wasn’t encoded and we were going line by line. Even if we eventually found that 9,000 roughly number, I would not be that confident. I probably would want to count that a couple of times. And I just wouldn’t be that sure that I had the right answer. When it was encoded by color, I’m positive. There’s only three or four numbers even in the ballpark of being the highest number. And I would know for sure that that 9,000 value was the highest.
The reason that’s important is when we’re dealing with analytics, first of all, it can be dangerous literally if we’re taking action on the wrong insight because we didn’t do a good enough job of analyzing the data. Think of health care or engineers that are trying to build buildings. It’s critical, of course, that we have the right number. Even when we’re not physically in danger, I argue that it can be dangerous for you in your career regardless of what you’re doing to not use data visualization to improve the accuracy of your insights because if your organization takes any action based on an inaccurate insight, you’re wasting resources.
If we did an analysis and we find out that we do particularly well in a certain city and we start to invest marketing resources in that city only to find out later that we should have dug further or double checked that number, that’s actually not the case, well, now we’ve got to start over. So it’s dangerous for you and your career and wasting resources potentially. And the third benefit is kind of a bonus. But I argue that data visualization is more engaging. It gives people something to look at. It’s more interesting than just looking at yet another Excel spreadsheet. And when you have improved engagement and people are willing to stop and think about what you’re communicating to them in the data, it just naturally lends itself to improved adoption as well, which then naturally lends itself to action, which is really what we’re trying to do when we’re doing visual analytics.
I personally think this is fun. I feel very fortunate that I’ve fallen into this career of doing data visualization and using Tableau. But we’re not getting paid to have fun. None of us are. At the end of the day, we’re trying to find some insight that leads to a positive action in an organization. And I think data visualization will help you do that because it’s more engaging. More people will use it. When you get the right people using it that can actually make decisions and take action, it’s going to improve the chances that something actually happens as a result of your work. So that’s why we’re doing what we’re doing.
This is the first time that I kind of lose a couple of friends. But on a lighter note, I do have some thoughts around avoiding certain chart types. So we’re going to talk a little bit about pie charts. I’ll try to keep this soapbox moment to a minimum. But I do have another example of preattentive attributes. And the point of this video is to show you that they’re not all created equal. What this example is going to show you is that the height of a bar is easier for us to process than the area within a slice of pie. So to show you the example.
All right. So here’s a pie chart. And if I was taking an honest effort to evaluate this pie chart– so try to glean a couple insights from this. And again, I’m trying to honestly do this and show you some benefits potentially of a pie chart. I would probably glean that turquoise area that’s classified as A, that looks like the largest area to me. And then this pie chart is sorted in rank order, which is helpful. Most pie charts that I come across are not sorted. So they’re even worse than this. But they’re sorted in rank order. So I could theoretically kind of look around the circle and figure out, OK, turquoise is ranked one, red was second, black was third, and so on.
But where I start to get in trouble I’d say is around that fifth and sixth slice, that yellow versus green. Those almost look to be the exact same size to me. If I am assuming this is in rank order, that’s telling me that these are the same. But it is getting hard for me to tell. And then by the time I get into these– I call these longtail slices where they– again, this could be a lot worse. I’ve seen pies with a lot more slices. And then those long tail slices get so small that they’re almost adding no value. They’re impossible to evaluate. I’m going to hit play here and convert this pie chart into a bar chart. Same exact data. Hit pause here and just point out a couple of the differences.
Let’s go back to the E versus F. So E versus F I can now see that it’s eight versus seven. I honestly thought these were about the same size. And it turns out E is one full unit higher than F. So this reduced my accuracy. It also made it slower for me to process because I kind of slowed down and was thinking about it. Another benefit of bars versus pie is I can avoid that double encoding. On the bar chart, I’ve already got a column for each dimension member. I don’t also need a color legend so my user has to look back and forth.
And I admit, yes, we have real estate to show the dimension members– or I’m sorry. The values or the labels. But even without those, I could tell at a glance that E was definitely higher than F. And the ability to even show those labels is yet another benefit of bar charts. We’ve got the real estate to actually show those labels where we might not otherwise on a pie chart. I know that this is not going to sell everyone on the line. So I do have some tips whether you’re stuck on pie charts or maybe you have a stakeholder that uses them all the time, I’ve got a few tips for you.
First of all, I recommend sticking to five slices or fewer. This is related to that idea of those long tail slices providing less and less value as they get smaller and smaller. What I suggest you do instead is force your pie chart to be a total of five slices. I would use the first four as the actual dimension members. And then I would just lump everything else into a category called Other. The second tip I’ve got for you– I’m a little more firm on this one. I would absolutely never use a pie chart in a time series analysis. The reason being, we’re already bad– relatively bad compared to a bar chart– at analyzing the area in a slice of pie.
When you break that out over time– let’s say we have four years of data, and thus four pie charts side by side. Now you’re asking your user to not only compare the area which they’re already bad at, but you’re trying to get them to evaluate how is that relative size of the area changing over time. Not going to happen. And again, this all goes back to that mission statement of, is it reducing my time to insight? Is it increasing the accuracy of insights? And is it making it more engaging? I would say that the pie chart is failing at those first two. It’s not reducing the time to insight. I have to slow down and think about it. And it’s definitely not increasing the accuracy, as we saw when we were trying to look at the yellow versus the green.
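The first tip above, keeping the four largest dimension members and lumping the rest into Other, can be sketched in Python. This is a minimal illustration outside Tableau. The function name and the slice values are hypothetical and are not taken from the chart in the video.

```python
def collapse_to_top_n(values: dict[str, float], n: int = 4,
                      other_label: str = "Other") -> dict[str, float]:
    """Keep the n largest members and sum everything else into one extra slice."""
    ranked = sorted(values.items(), key=lambda kv: kv[1], reverse=True)
    result = dict(ranked[:n])
    tail = ranked[n:]
    if tail:  # only add the extra slice when something was actually lumped
        result[other_label] = sum(v for _, v in tail)
    return result


# Hypothetical slice values, for illustration only
slices = {"A": 30, "B": 20, "C": 15, "D": 10, "E": 8, "F": 7, "G": 6, "H": 4}
five_slices = collapse_to_top_n(slices)
print(five_slices)
```

With eight members going in, the result has five slices: the top four plus a single Other slice holding the combined long tail.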
And then my third and final tip for you is to use a bar chart instead. Some comic relief there. I do promise– I try to in good faith kind of argue both sides. And I will point out that the pie chart was also invented– I’m kind of cringing here– by my friend William Playfair. He did take 15 more years to introduce the pie chart. But the same person that invented the line graph, and the bar chart, and the area graph also invented the pie chart. I do like to say that he should have quit while he was ahead because it did take 15 years. These were introduced in 1801.
But on the bright side, I guess I should say if it was good enough for William Playfair, there has to be some kind of value. But hopefully, some of the examples I just shared kind of explained why people make the argument to use something that is probably more effective, scientifically. All right. While we’re at it, please do not make this chart on the screen. This is called packed bubbles. The reason I point this out specifically is I think every single person the first day they use Tableau, I believe that they made this chart. It’s very engaging. But very similar to the pie chart, it’s got very few insights in it, or at least insights that I would feel comfortable were accurate.
Once I get past these marks on the border that are very large– so one insight I might be able to legitimately see is New York is the largest value. But once I get past that, it gets extremely difficult to glean any insight from this because, again, I’m not as good at comparing the areas in these circles as I am with some of the other preattentive attributes like grouping, or color, or size– not size– but length and height. All those would be better options for my analysis. This one I might print out, put it on my refrigerator. But it fails at those first two out of the three objectives that I always have with data visualization.
And I kind of forgot to point this out. So let me clarify. When I think of that mission statement whenever I’m designing anything in Tableau– and it has to be unanimous. So in the case of packed bubbles, it does pretty good at the third. I would take that argument. It’s very engaging, perhaps. It looks like a piece of art. Might print it out. But it fails pretty miserably at the first two. Does not reduce the time to insight. Does not increase the accuracy of insight. So I would try not to use it. There’s probably something better out there. So when I’m making these decisions, I think through those three things. And all three of them have to be true for me to want to build it out in Tableau.
Does it reduce the time to insight? Does it increase the accuracy of the insight? And does it make it more engaging, which I think will help drive adoption and help drive action. Got one more video for you today. This one’s pretty fun. This is our newest one. And again, I’ll kind of talk you through it. But one more piece of evidence on why you might want to avoid certain preattentive attributes. Hit the Play button. And we’re going to try to answer this question. How much blue area does the red circle cover in percentage terms? So what percent of that blue circle is being covered up by the red circle?
Normally when I do this and I’m able to hear some responses– and if I’m just being honest that I had never seen this before, my honest answer at this point would probably be somewhere in the neighborhood of between 70% and 80%. That would be my honest answer if I had never seen this exercise before. But what I’m going to do now is convert these exact areas– so the values of these areas– into bars using the preattentive attribute of height. And you’ll be able to see how much easier it is to make this estimate. So I’ll click the Play button. I’ll try to be quick on my pause button here. All right. Perfect almost.
All right. So at this point, my answer would be a lot different. In the current View, I would probably be guessing at this point somewhere between 40% and 50% instead of 70% and 80%. I’m going to make this even better by now putting these bars side by side so they’re closer together. Then we’ll add some tick marks which will make us even more confident in our answer. So there’s side by side. Before the tick marks show up, I probably am still– I don’t think it’s 40%. I’m probably now closer to 45% to 50%. I keep it going. Obviously, the labels are starting to show up there. But now that I’ve got the tick marks I’m already very confident it’s 50%. That’s the answer.
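One possible explanation for the gap between the first guess and the real answer, offered here as an assumption rather than something the video states: area grows with the square of the diameter. If the red circle covers 50% of the blue circle’s area, its diameter is the square root of 0.5 times the blue circle’s diameter. A short Python sketch of that arithmetic, with a hypothetical function name:

```python
import math


def diameter_ratio_for_area_share(area_share: float) -> float:
    """Diameter of the smaller circle relative to the larger one, given its share of the area."""
    # Area scales with diameter squared, so the diameter ratio is the square root of the area ratio
    return math.sqrt(area_share)


ratio = diameter_ratio_for_area_share(0.5)
print(f"A circle with 50% of the area has {ratio:.1%} of the diameter")
```

The diameter ratio comes out at about 71%, which lands in the same 70% to 80% range as the initial estimate. That would be consistent with judging the circles by width rather than by area.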
So I went from 70% to 80% to at least 40% to 50% if I didn’t have those labels on there. So much more efficiently I was able to glean the insight faster. Didn’t have to slow down to think about it. And I’m much more confident that my answer is going to be accurate. 10th and final rule of thumb for you is to pick the right visualizations for the job at hand. Tableau is extremely flexible. We’re going to build probably a couple more dozen charts from here.
But I typically think that there’s one chart– and I don’t have any science to back this up. Probably should do some kind of thesis or something about this. But just in my gut I feel like there’s one chart for each business situation that will get you somewhere around 50% to 60% of the value. Let me explain what I mean. If I’m comparing categorical data, I’m going to start with a bar chart. I know there’s other stuff out there. We need to have some diversity in charts. But that’s going to be my starting point if I’m just trying to do a quick analysis on my own. For that business situation, I’m going to start with a bar chart and go from there.
If I’m trying to analyze a trend over time, first chart I’m going to do is a line graph. That’s going to get me at least half the value that I’m looking for. There are alternative ways to visualize time. But every single time, I’m going to start with the line graph because that’s going to get me the majority of the value to answer that particular business question. If I’m wanting to look at correlations, or look at a lot of things on one chart, or make a four quadrant segmentation, I’m going to start with a scatter plot. Doesn’t mean these are the only charts out there. But I do think that there is one specific chart that helps get you about half the value for each business situation that you might come across.
All right. Last chart for today. And we are right on time. I might be five minutes over if you want to kind of plan out your schedule. But we’ve made a bar chart, which is my favorite chart type. We’ve made a line graph, which is my second favorite. We’re now going to combine those together and create what’s called a Dual Axis Combination Chart, which combines a bar chart with a line graph. So back over here in Tableau Desktop– I’m sorry, PowerPoint first. This will be our final exercise, by the way. So if you are interested in this, I encourage you to follow along. And then we’ll finish today with recreating this one on our own.
This is also the first time that I’ll point out kind of how I learned Tableau. And I use that in the present tense because I very much am still learning Tableau. I’ve always said it takes a day to start using but a lifetime to master. I’m learning new stuff about Tableau every day. I hope that doesn’t scare you because I’m in over 10 years. But to me, I really like it. It keeps me really engaged. It’s fun to use the software, solve problems. And I think the reason there are still problems to solve is its flexibility is almost infinite. You can never really figure out every single aspect of Tableau.
So I’m still learning to this day. And the way that I still learn is I try to break things into smaller pieces that I know. Because if I can be really good at two individual pieces, I kind of get two building blocks that I can then combine to make bigger and bigger building blocks. And that’s kind of how I learn. And this is the first example of that. So on the screen right now we have a bar chart on one side and a line graph on the other. We haven’t combined this yet. And if you’re new to Tableau today, you haven’t ever built this yet. But we have built a bar chart. We have built a line graph. So just start with one or the other.
So what I would do is I’d start with the bar chart. Sales is on the Rows shelf. So I’ll just double click Sales. And that is being broken down by year of order date. So I will just double click on Order Date because I know the default datepart is Year. Hopefully, this doesn’t confuse you too much. But this is the one time that I can think of anyway off the top of my head– if I remember another use case I’ll bring it up later this week. But this is the one time that I would– even though this is currently a line graph, and we will eventually have a line graph on the other axis, I would stick with this being discrete. And reason being is another nuance of this pill being blue or green is it will influence the formatting of the marks.
Watch what happens when I change this to bar. So first of all, that is the left side of the chart. That’s how easy it was. We broke this into a smaller component that we knew was very easy to make that left side of the chart. But back to why I want to leave this as blue. If I chose Year Continuous instead, my ability to format these marks becomes very different. The options are different. There’s what Fixed looks like. It’s very chunky. I definitely want some breathing room between those bars. If I click Manual, I’ve got some options. But none of these look good. Even if I drag this all the way to the right, those bars are way too skinny. So that is another side effect of this being green or blue.
So I’m going to change that back to discrete Year. That gives me better flexibility with how I format those bars. So there’s the left side. On the right side, we’ve got Profit by Year as a line. For the first time today to create that we’re going to put a second pill onto one of these shelves. I’m going to drag Profit to Rows. Oops. There we go. I’m going to drag Profit to Rows and let go. And now we’ve got two pills for the first time– or two measures for the first time, I should say, on one of the shelves. What is special about that is when you’ve got more than one pill on the Rows shelf– and I keep saying pill. It’s measures specifically.
If you’ve got more than one measure on either the Rows shelf or the Columns shelf, each of them gets its own Marks card. And what is useful about that is you can treat the marks differently depending on what axis they’re on. So for Sales, I can leave it with a Mark type of Bar. But if I navigate to the Marks card for Profit only, I can change the Mark type to Line. And I can edit those independently of each other. So now I’ve got a bar chart on the top and a line graph on the bottom.
Lastly, we need to combine these so that they’re using a dual axis on the same layer. And by the way, the name of this chart, dual axis combination chart, comes from the fact that they’re dual axis. We have an axis here, an axis here. And the combination piece of that comes from the fact that we’re using a combination of Mark types. On the left, we have a Mark type of Bar. On the right, we have a Mark type of Line.
But to combine them, there’s actually several ways to do this. I’ll show you two of them now. You can either hover over the bottom axis and a green triangle will appear. You can click on that green triangle and drag it to the opposite axis. That dashed line is going to show you where it’s going to draw that axis. And let go. I’m going to undo. Can show you a second way. This is how most people learn it, by the way. If you click on the second pill, the third option from the bottom is called Dual Axis. That will do the same thing. It throws the profit axis on the right side.
I purposely picked these two measures to point out a very big pitfall with this type of chart. There are times where it doesn’t make sense for the axis to be on the same layer. And this is one of those where if I’m just glancing at this chart, remember, that the bars are sales and the line is profit. So without looking at the values on the axis, if I were just to glance at this, we actually have a couple of examples, like 2018 and 2019, where we have more profit than sales. That doesn’t even make sense. How could we be making more profit than we’re selling? There has to be some kind of cost.
And then for 2020, it looks like we’re almost exactly even. The reason that’s happening is our scales have very different ranges, or I should say our axes have very different scales. The sales axis is going from 0 to 750 while our profit axis is going from 0 to 90. It’s much less. As an option you can make sure that those axes are in sync by right-clicking on either one and choosing Synchronize Axis. You don’t have to do this every time, but only when it would be misleading otherwise I would suggest you synchronize those. Now this makes a lot more sense because profit should be a subset of sales. And it looks like it’s somewhere in the range of 15% or 20% of sales, which seems to make sense.
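The pitfall described above can be sketched numerically. The axis ranges (0 to 750 for sales, 0 to 90 for profit) come from the walkthrough. The sales and profit values for the single year below are hypothetical, chosen so that profit is about 17% of sales. The sketch computes how tall each mark is drawn as a fraction of the chart height, first with independent axes and then with a shared one.

```python
def drawn_height(value: float, axis_max: float) -> float:
    """Fraction of the chart height a mark reaches on an axis running from 0 to axis_max."""
    return value / axis_max


# Hypothetical values for one year (same units as the axes), for illustration only
sales, profit = 500.0, 85.0
sales_axis_max, profit_axis_max = 750.0, 90.0  # ranges quoted in the walkthrough

# Independent axes: each measure is scaled against its own range
sales_independent = drawn_height(sales, sales_axis_max)
profit_independent = drawn_height(profit, profit_axis_max)

# Synchronized axes: both measures share the larger range
shared_max = max(sales_axis_max, profit_axis_max)
sales_synced = drawn_height(sales, shared_max)
profit_synced = drawn_height(profit, shared_max)

print(f"Independent: sales {sales_independent:.0%}, profit {profit_independent:.0%}")
print(f"Synchronized: sales {sales_synced:.0%}, profit {profit_synced:.0%}")
```

With independent axes the profit mark is drawn higher than the sales bar even though profit is a small share of sales. With a shared range the profit mark drops well below the bar, which matches the corrected picture.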
All right. We’ve made it to the last exercise. I’m going to let you rebuild this on your own. This might be more like a four or five minute variety exercise. All right. Let’s build this one. And I also realized I didn’t give you quite as many hints. And it’s probably the hardest exercise of the day. So if you didn’t quite get there, I will build this out for you. First of all, again, probably my biggest tip just from a how to learn perspective, just think back to the smaller pieces that you know. Even if you’ve never built this chart before, you know how to build a bar chart. So let’s just start with that. I’m going to double click Profit, double click Order Date, which will create the start of the left side.
So on a new sheet, double click Profit. That adds it to the Rows shelf by default, which creates a Y-axis. Now if I double click Order Date, we’ve got profit by discrete year of order date. And then from there, it’s just a matter of changing the Mark type. The order in which you do the rest of the steps doesn’t matter at all. But what I would probably do is try to get a foundation of one dual axis combination chart before I layer on extra information. So what I mean by that is before I throw in Category on the Columns, I probably would try to get just one dual axis combination chart that looks at profit on one side by average quantity on the other.
So I’m moving on to the next measure that I need, which is average Quantity. Because I see that the aggregation is something other than the default, I will right click on Quantity while I drag it onto Rows shelf. And that allows me to choose Average before it even draws anything. If you forget that shortcut– so I’ll undo that one step and just double click Quantity. It’s getting the default aggregation, which is sum. If I want to change that to average, I’d have to remember to click into the pill, hover over Measure Sum, and choose Average instead.
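The difference between the default aggregation (sum) and the one chosen here (average) can be illustrated with a small Python sketch. The rows below are hypothetical order lines, not data from the course workbook, and the function is only an analogy for what changing the aggregation on a pill does.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_year(rows, how):
    """Group (year, quantity) rows by year and aggregate with 'sum' or 'avg'."""
    groups = defaultdict(list)
    for year, quantity in rows:
        groups[year].append(quantity)
    func = {"sum": sum, "avg": mean}[how]
    return {year: func(values) for year, values in sorted(groups.items())}


# Hypothetical order lines: (year of order date, quantity)
rows = [(2019, 2), (2019, 4), (2019, 6), (2020, 3), (2020, 5)]

print(aggregate_by_year(rows, "sum"))  # total quantity per year
print(aggregate_by_year(rows, "avg"))  # average quantity per order line per year
```

The sum grows with the number of rows in each year while the average stays on the scale of a single order line. That is why average quantity ends up on a much smaller axis range than a summed measure.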
Now that I’ve got two measures on the Rows shelf, they each get their own Marks card. And I can control those independently. There’s also a Marks card called All. If you choose that one, it controls the marks across both rows in this case. So we actually have three Marks cards at the moment: All, Sum of Profit, and Average Quantity. Sum of Profit controls the first axis. Average Quantity controls the second axis. I’ll change the Mark type of the second row only to line. I will then make this a dual axis by clicking on the second pill, clicking Dual Axis. So far so good. This is a case where it does not make sense to synchronize the axes. These are, by their nature, very different scales.
This is sum of profit going from 0 to 90,000. This is average quantity going from 0 to 4. If I were to synchronize those axes, that red line would provide no value. It would just be squeezed at the bottom. It would just look like a flat line. So it’d be kind of pointless to include that in the analysis. So this time I am not going to synchronize the axes. The last step was to add this context of Category.
I can see that Category is drawing columns on the View, which tells me I need to put the Category dimension onto the Columns shelf. This is probably the biggest pitfall in this exercise. If you were to just drag Category and put it after Year, remember these pills are processed in order. So it’s going to draw a column for each of our four years first followed by a column for each of our three categories second. Looking at the screenshot, that’s not quite what I wanted. I wanted Category to be processed first followed by Year second. I can easily update that by just dragging Category in front of Year. And there we have our final exercise.
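The effect of pill order on the Columns shelf can be sketched as nested iteration: the first dimension forms the outer grouping and the second repeats inside each of its members. This is only an analogy in Python. The specific year and category values are assumptions for illustration, since the walkthrough only states that there are four years and three categories.

```python
from itertools import product


def column_headers(*dimensions):
    """Return column header tuples with the first dimension as the outermost grouping."""
    return list(product(*dimensions))


# Assumed members: four years and three categories, values chosen for illustration
years = [2017, 2018, 2019, 2020]
categories = ["Furniture", "Office Supplies", "Technology"]

year_first = column_headers(years, categories)      # Year pill before Category
category_first = column_headers(categories, years)  # Category pill before Year

print(year_first[:3])      # each year is split into its categories
print(category_first[:4])  # each category is split into its years
```

Both orders produce the same twelve columns, only grouped differently. Placing Category first gives three panes with four years each, which is the layout the exercise calls for.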