The larger a dataset becomes, the more challenging it can be to mange fields and their respective metadata. This video covers several ways to make your life easier including hiding unused fields, creating hierarchies, organizing fields into folders, and more.
Hi. This is Ryan with Playfair Data TV. In this video, I’m going to give you a few tips on helping you organize your data set and work with different fields. As your data sets become larger and larger, it can become kind of tricky to manage a data set. So all these tips are just going to help you get organized and just help you make your life easier, help you manage the data source.
If you’re trying to follow along, I want you to note that I am using the Sample-Superstore data set. But I’ve connected to it in its rawest format possible. What you can do, by the way, by just connecting to a new data set and clicking Microsoft Excel and going and finding this data source in your Tableau repository. It’s a slightly different version than if you’re on this screen and you just click Sample-Superstore.
So the first trick I’m going to show you, first couple of things, is on the Data Source tab. This is the screen that you will see when you connect to a data source for the very first time. The first thing I want to point out is you don’t have to bring in all of these columns. The Sample-Superstore data set’s been kind of perfectly set up. It’s not necessarily the most real world example.
When you connect to a database table at work, you could see 200 columns in that data set. Well, you probably don’t need all 200 columns. One of my biggest efficiency tips whenever I’m talking strategy in Tableau is to only bring in the data that you need. So if you ever are dealing with a circumstance where you’ve got a lot of extra columns, you can just hide them. There’s actually two places to do it. Well, there’s three, two on this screen.
But for each of these columns, if you were to hover over it and click this down arrow, you can hide that column. The second place is if you’re in List view, if you really do have that many columns, this might be a slightly easier way to navigate them. And same thing, if you click on this down arrow that appears when you hover over a field, you can hide it. And then it will not be seen within your data set. And it will be easier to search for individual fields.
Another handy thing to do on this screen is to change the data type. Each of these columns has a blue header at the top of it. I shouldn’t say they’re all blue. There’s some orange over here– or green over here. But those icons are telling me how Tableau interpreted each of the data types for each of these columns.
It doesn’t always get that right. It gets it right. But it may not be how you want to classify it. For example, if you’ve got dates with timestamps on them, Tableau will interpret that data type as Date and Time. If you don’t need to use the timestamps, you could change it back to be just Date.
To change a data type, all you have to do is click on that blue icon and choose a different data type. These are your six data types– Number with decimals, Whole Numbers or integers, Date and Time, Date, String, those are text values, and then Boolean is when it’s either true or false. So very easy to change there.
I’m going to jump into the Authoring interface and show you one more place and way to change how Tableau is classifying these fields. First of all, you can change the data type of a field the same way as you did on the last screen, by just clicking on this icon. You could change it there. So if that was Date and Time, I could switch that to be just Date.
Another thing you can do from within the Authoring interface is to change a measure to a dimension. The reason that this is handy is, by default, Tableau will classify anything that is quantitative as a measure and anything that is qualitative as a dimension. That’s true most of the time but not 100% of the time.
The best example I have to share is with any type of order ID, any type of numeric ID, I should say. Order ID is what it’s called in the sample data set. But any numeric ID, by default, Tableau will classify that as a measure because it is quantitative.
Well, an order ID is probably a dimension because instead of adding up those IDs and slicing and dicing them by another dimension, I’d probably want to use a measure like Sales, and then slice and dice it by ID to know how many sales I had per ID. So in that case, it is a dimension.
If you’re ever in a similar situation, there’s two ways to change how Tableau classified those. First, let’s just pretend that Discount is my ID that I want to change. You can right-click on the measure. And about halfway down, there’s the option to convert it to a dimension. That’s option one.
This is also just drag and drop. So I could drag Discount up here to the Dimensions area. And it’s now a dimension instead of a measure.
There are also times where you may want to use a field in both contexts. You might want to use Discount as a measure sometimes and as a dimension at other times. Here’s a little trick for that. You can just duplicate a measure by right-clicking on it and clicking Duplicate, and move one of the versions up to dimensions. So now I’ve got this Discount field both ways. I can use it as a continuous measure or as a discrete dimension.
Another way to help you get organized out of the gate here from within the Authoring interface is to change the date properties. And you can do this on a data source level by clicking Data in the top navigation, hovering over the connection, and then clicking Date Properties. And there’s several things you can change here.
At a minimum, I typically like to change my week start. Note that it’s starting on a Sunday. That is the default in Tableau. I’ve always thought of my weeks as starting on Monday going through Sunday. Tableau goes from Sunday to Saturday. So if your business looks at things Monday to Sunday, you could just change that there.
Note you can also change the fiscal year start. So if your business doesn’t start in January, you can change that to whatever month you would like. And then this is a handy one that is one of those things that no one ever really showed me. And this may be new to you. But it’s really handy. You can change the default date format.
So a lot of times, I like to use abbreviations. So I’ll go to Custom here and type just the month and the day. Of course, this is a case by case basis. If you’ve got years and years of data, that may not be appropriate for you. But I’m just using an abbreviation so that it saves real estate once I go to use my date field.
And note what happens now if I add Order Date to the Columns Shelf. All my dates will have that abbreviated format. And of course, you can always change that. But this is just a handy way to get it right and make it more useful to you out of the gate. I’m going to drag that away.
And speaking of dates, dates are one of the very few fields that come with a hierarchy. So they inherently have a hierarchy. So if I just double click on Order Date, there’s a plus sign on its pill. And that’s telling me that I can drill down to different levels in the hierarchy. So I click the plus sign once, it goes from Year to Quarter. Click it again, it goes from Quarter to Month, and so on. That comes out of the box like that.
Well, you can also create hierarchies out of any fields that you want. So a good example might be with my geographic data here. I’ve got State, Zip Code, Country, and City. If I were to use any of those individually, so just double click on Country, by default notice that that Country pill does not have a plus sign on it. There is no inherent hierarchy automatically.
But I could create one. To create a hierarchy, select the fields that you want in that hierarchy. So I’m just doing a control-click to do a multi-select, right-click on any of the selections, hover over Hierarchy, and click Create Hierarchy. And I’ll call this my Region Hierarchy. Click OK.
Note these were grouped together in the Dimensions area of the Data pane. They have a little icon here telling me that this is in a hierarchy now. There’s also a plus sign on the pill for the top level of the hierarchy.
However, let’s take a look of what happens when I click that plus sign. I click the plus sign. It does drill down to Zip Code. But I probably wanted it to drill down to State before it got all the way down to that Zip Code level. What’s happening is, by default, Tableau put those fields into alphabetical order within the hierarchy, which isn’t quite what I wanted.
Fortunately, it is extremely easy to change that order just by dragging and dropping these fields above and below each other. So I’ll put Country at the top. I had them all selected which is why it moved them all in unison. So if I have one selected, move it to the top above City, that black line is showing me where it’s going to place that field in the hierarchy. So Country first, followed by State, followed by City, followed by Zip Code.
If I click this minus sign on the pill, it will start over. So starting with Country, these now should go in this order. Click the plus sign once, it goes down to State level. Click the plus sign again, it goes down to the City level, and so on down to Zip Code. Those plus signs are now changed to minus signs telling me that I can drill back up the other direction. So if I wanted to roll my cities into states, click the minus sign. Cities now go back up to the State level.
And that can be done with any combination of dimensions that you want. Region was a very intuitive one so you might be thinking that might have been automatic. I just created a custom hierarchy that you can do with any combination of dimensions that you would like.
One last tip, which is to put your fields into different folders. By default, Tableau is grouping these fields by the data source that they come from. Well, we’re only connecting to one data source at the moment. So they’re all thrown in here to the same bucket. There’s one group of dimensions, one group of measures. So that type of organization is not really helpful. The reason that is the default is if you ever join or union in another data source, it will group the fields to tell you which data source each field came from.
But there’s one other choice which you can find by clicking on this down arrow in the top-right corner of the Dimensions area. And like I said, here’s the default, Group by Data Source Table. But if I click Group by Folder, these will be organized in a different way. You didn’t see anything change yet. But now if I wanted to put these fields into their own folders to organize them in a different way, I can.
So let’s say, just throwing out a scenario, I want to put my fields related to my customers into their own folder. I could click those fields, so Customer ID, Customer Name, and maybe Segment telling me what segment those customers are in. Now if I right-click on any one of those and hover over Folders, I can create a folder.
This is my very first folder in this data set. So there’s only one option. So I’ll go ahead and click that. And these are my Customer fields. Click OK. Let’s do one more for things related to products, so Product ID, Product Name, probably Sub-Category and Category. We’ll call that good for now.
Right-click on any of them, hover over Folders. And now note that there are two options. I can create a second folder or I can add to an existing folder since I had already created one. I’m going to create a second folder and call this Product.
And this is just another way to make it easier for me to find the fields. They’re now organized in a nicer way. I can close these folders to kind of clean up the Dimensions area of the Data pane there. If you don’t want to go through the effort of adding these to folders, there is a search functionality.
And this will autofill. So if I’m looking for a certain field, let’s say Sales, as I start to type it, it will go find it, find whatever fields start with those letters. However, you have to remember exactly what everything is called. So if you’ve got 200 columns, that’s not necessarily realistic.
Another reason I propose the folder organization is I’m always promoting using comparisons in Tableau. And so a lot of times I’ll have raw metrics as one set of KPIs. But then I’ll create calculated versions of them to isolate the performance for the current period. And then I’ll have another calculated version that isolates the performance for a prior period.
So just in that very simple scenario to get one period over period comparison, I’ve just tripled my number of metrics. I’ve got my raw set of measures that came with the data set. Then I’ve got the current set. Then I’ve got the prior set. So as you can imagine, that can cause my data set to get kind of unwieldy and hard to manage. So what I’ll do is make a folder, one for my raw KPIs, one for current, one for prior.
But hopefully, all these things helped you organize your data set a little bit more and will make your life easier as you move forward using Tableau Desktop.
This has been Ryan with Playfair Data TV – thanks for watching!