Ryan Sleeper
This video will show you how to make connected scatter plots in Tableau and how to leverage a dual-axis to (1) make the visualization more engaging and (2) display the point order of each mark.
Hi, this is Ryan with Playfair Data TV. And in this video, I’m going to show you how to make connected scatter plots in Tableau. I’m a big fan of scatter plots. They’re actually already my third favorite chart type. But by connecting the points in a scatter plot, at times, that makes them even better because it implies to the audience what order they should read the dimension members in, or which order they should read those points in.
You can make these with any fields that you want, of course. But just to illustrate, I’m going to start with a scatter plot that looks at Profit Ratio by Sales. If I double-click Profit Ratio, by default, that goes to the Rows Shelf and creates a y-axis. If I double-click on the Sales measure, by default because it’s the second measure, it goes to the Columns Shelf and creates a scatter plot for me.
I’m going to change the mark type to Circle and make that circle a little bit larger so it comes across better in the video. And let’s also change the level of detail for the scatter plot to the Category dimension. And I will put the Category dimension onto the Color Marks Card.
So scatter plot, we’ve got three dimension members in the Category dimension. They’re each colored respectively by Furniture, Office Supplies, and Technology. Not bad. There’s some interesting insight in this already. Particularly with Furniture, I can see that it’s right up there in sales with my Office Supplies and Technology categories. But it is way down on profit ratio. So not bad, but let’s add even more context to this and connect these dots so that the end user has additional information and knows how to read this chart.
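For readers who want to see what the level of detail is doing here, the following is a minimal Python sketch, not anything Tableau runs internally. It assumes Profit Ratio is defined as total profit divided by total sales, which the video does not state, and the rows and the `scatter_points` helper are hypothetical, made up for illustration.

```python
from collections import defaultdict

# Hypothetical order rows: (category, sales, profit). Values are made up.
rows = [
    ("Furniture", 200.0, 10.0),
    ("Furniture", 300.0, 15.0),
    ("Office Supplies", 100.0, 20.0),
    ("Office Supplies", 150.0, 30.0),
    ("Technology", 400.0, 80.0),
]

def scatter_points(rows):
    """Return {category: (total_sales, profit_ratio)}, one mark per category."""
    totals = defaultdict(lambda: [0.0, 0.0])  # [sum of sales, sum of profit]
    for category, sales, profit in rows:
        totals[category][0] += sales
        totals[category][1] += profit
    # Assumption: profit ratio = total profit / total sales at this level of detail.
    return {c: (s, p / s) for c, (s, p) in totals.items()}

points = scatter_points(rows)
for category, (sales, ratio) in points.items():
    print(category, sales, round(ratio, 3))
```

Putting Category on the level of detail is what collapses the rows into three marks; each mark's x position is the summed sales and its y position is the ratio computed from the summed values.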
The first trick to creating a connected scatter plot is to simply change the mark type from Circle to Line. That will connect those points. And also, when you use the Line mark type, you’ll see a special sixth Marks Card appear called Path. By default, this Line mark is being connected by these three Category dimension members.
That’s not quite what we want. That doesn’t make a lot of sense. Usually, the best time to use a connected scatter plot is when you’re wanting to connect those points over time. So for example, we could use Year of Order Date in the Sample – Superstore dataset.
But that Path Marks Card allows you to change the way that Tableau connects the dots. If I put a dimension onto the Path Marks Card, it will override how the dots are currently being connected and connect them by that dimension instead. So like I mentioned, I will use Year of Order Date. I’m going to right-click on Order Date and drag it to the Path Marks Card. And the reason I right-click is so I can choose both the date part and whether that data is being used as discrete or continuous.
By default, the years go in order. So I don’t think I have to do this. But just to be sure, I’m going to click Year with the green icon next to it to ensure that these are going in chronological, or continuous, order. I’m going to click OK. We now see four marks per category. And those dots are being connected by the year that they’re in.
So four years, each year gets a dot on the scatter plot. And because I put that on the Path Marks Card, that’s how these dots on the scatter plot are being connected. This is looking much nicer. I can now kind of see how each dimension member moved over time from 2016 to 2019. As of version 2019.3, that’s the range of years the Sample – Superstore dataset covers.
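The effect of the Path Marks Card can be sketched in a few lines of Python. This is only an illustration of the idea, under the assumption that a path amounts to ordering each category's marks by year and joining neighbors; the data values and the `build_paths` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical marks: (category, year, sales, profit_ratio), deliberately out of order.
marks = [
    ("Furniture", 2018, 199.0, 0.03),
    ("Technology", 2016, 175.0, 0.12),
    ("Furniture", 2016, 157.0, 0.04),
    ("Technology", 2017, 163.0, 0.20),
    ("Furniture", 2017, 170.0, 0.02),
]

def build_paths(marks):
    """Group marks by category, then order each group's points by year."""
    grouped = defaultdict(list)
    for category, year, x, y in marks:
        grouped[category].append((year, x, y))
    return {
        category: [(x, y) for _, x, y in sorted(pts, key=lambda p: p[0])]
        for category, pts in grouped.items()
    }

paths = build_paths(marks)

# Each line segment joins one year's point to the next year's point.
segments = {c: list(zip(p, p[1:])) for c, p in paths.items()}
print(paths["Furniture"])
```

With three years for Furniture, the path has three points and two segments, and the order of the points follows the year rather than the order the rows arrived in.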
But if you were approaching this for the first time or your audience was, it’s still a little bit confusing. You don’t really know how to orient yourself, or what the lines mean, or which order they should be read. So we’re going to add more value to this connected scatter plot by converting it to a dual-axis combination chart. And on the second axis, we’re going to tell the end user which order these points should be read in.
There’s a video here at Playfair Data TV explaining how to make a dual-axis combination chart. My preferred method is to hold the Control key while I click and drag a measure on the Rows Shelf, dropping the copy right next to the original. That creates a copy of the chart on a second row. But what’s important about this is we now have two measures on the Rows Shelf.
And they each get their own set of Marks Cards. And those Marks Cards can be edited independently of each other. So the first row, I will leave as is. The second row, I will change the mark type from Line to Circle. And let me make sure we still need all this. We do.
And to convert it to a dual-axis combination chart, you can click on the second pill and click Dual Axis. That’s now laying a scatter plot on top of the connected scatter plot. But they’re not quite lined up. To ensure these are synchronized, you can right-click on either axis and click Synchronize Axis.
So that’s already served an aesthetic purpose. It’s already looking a little bit nicer. But let’s get back to the practical purpose of helping the end user understand which order these dots should be read in. To do that, I’m going to use a special table calculation called Index. I think of Index as being synonymous with Row Order. And there’s a couple of ways we could do this. We could type this formula in the flow of the analysis. But I use this Index table calculation so often that I typically just go ahead and create a calculated field for it.
So I’ll click Create Calculated Field. In this case, Index represents my mark order. So that’s what I will name the calculated field: Mark Order. But the entire formula is just the function INDEX with an open and a close parenthesis: INDEX(). That is the entire formula. I’m going to click OK. And I’m going to add this to the Label Marks Card of the circles. So I need to make sure I’m on the correct set of Marks Cards. And I’m going to drag Mark Order to the Label Marks Card.
And they all say 1. It’s not quite what I want. I’d rather it say 1 through 4 so that my end user knows which order to read these points. But notice there’s a delta symbol on that Mark Order pill. That delta symbol is telling me that there is a table calculation taking place. We have another video here that explains table calculations and how you can change the addressing and the partitioning to get these to calculate the answer that you want across the table of data being used.
In this case, I want to change the addressing, which you can do by clicking into that pill with the delta symbol and hovering over Compute Using. Instead of it computing by the default Table (across), I want it to be computed by Order Date. So if I click Order Date, we should see those numbers change. And we do. They now go from 1 through 4 for each of my three Category dimension members.
And just to ensure that those are giving me the correct mark order in chronological order, I will hover over each one just to make sure it starts at 2016 for the number one. Number two is 2017. That’s the second year in my dataset. So that’s correct. 2018 is third. 2019 is fourth. So perfect, that’s working.
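To make the Compute Using change more concrete, here is a rough Python sketch of a row-order counter. It is not Tableau's implementation; it assumes the default setting leaves every Category and year combination in its own partition (which would explain every label reading 1), while computing by Order Date counts along the years within each Category. The `table_index` helper and the sample rows are hypothetical.

```python
# Hypothetical marks: one row per category and year, deliberately out of order.
rows = [
    {"category": "Furniture", "year": 2018},
    {"category": "Technology", "year": 2016},
    {"category": "Furniture", "year": 2016},
    {"category": "Technology", "year": 2019},
    {"category": "Furniture", "year": 2019},
    {"category": "Technology", "year": 2017},
    {"category": "Furniture", "year": 2017},
    {"category": "Technology", "year": 2018},
]

def table_index(rows, partition_by, order_by=None):
    """Number rows 1..n within each partition, optionally ordered by one field."""
    ordered = sorted(rows, key=lambda r: r[order_by]) if order_by else list(rows)
    counters = {}
    result = []
    for row in ordered:
        key = tuple(row[field] for field in partition_by)
        counters[key] = counters.get(key, 0) + 1
        result.append({**row, "mark_order": counters[key]})
    return result

# Every (category, year) pair is its own partition: each label is 1.
default_labels = table_index(rows, partition_by=("category", "year"))

# Partition by category and count along the years: labels run 1 through 4.
by_order_date = table_index(rows, partition_by=("category",), order_by="year")
for r in by_order_date:
    print(r["category"], r["year"], r["mark_order"])
```

The second call mirrors the hover check described above: within each category, the earliest year gets 1 and the latest year gets 4.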
I might clean this up a little bit by centering these. You could make the font bold, white, whatever you want to do. I also no longer need this second axis. It’s very repetitive because these are synchronized. So they’re on the exact same scale. So I’ll right-click on that axis and deselect Show Header, not Synchronize Axis.
And we end up with this nice, connected scatter plot showing me not only how the dimension members are performing but showing me how they have moved across four years of time. We then leveraged a dual-axis to help our end user orient themselves with the chart and understand how to read this in chronological order.
This has been Ryan with Playfair Data TV – thanks for watching!