Six Ways to be Terrible at PowerPivotGator
Six Ways to be Terrible at PowerPivot
PowerPivot adds a lot of functionality to your spreadsheets (via an add-in in Excel 2010 and enabling it in Excel 2013). PowerPivot (and friends PowerQuery, PowerMap and PowerView) allow you to automate repetitive data clean-ups, crunch hundreds of millions of rows of data and build stunning visualizations.
In 2012 I had heard of a new Excel add-in that promised “never write a VLOOKUP ever again.” And at first using PowerPivot was like driving a Ferrari straight into a brick wall – corrupt files, non-stop Excel crashes, glacially slow calculation speeds and 80mb file sizes. Now three years later I look back on those early days and shake my head. But as a wise dog says, “Sucking at something is the first step to becoming sort of good at something.” Hopefully you can learn from my many mistakes, the top six are outlined below.
Import Large Excel Tables
In PowerPivot you can pull data in from places like the internet, a database, another excel file, a CSV file, Facebook or a table right in your spreadsheet. Excel table imports appear to be the most convenient – what better way to keep all your information in one place and make fast edits? However several tables of 300,000+ rows sitting in duplicate in your Excel file and PowePivot will bog down your file. For larger data sets using any other import method is the way to go.
PowerQuery is a bit like Excel’s Macro Recorder for importing data. Trim and clean your data the first time, and the next time you hit Refresh everything is performed in the background. Neat! But why not just import everything? You might need that Esparanto_Product_Name column later and you can always clean your data with by tons of new calculated columns. You guessed it – larger file size and poor performance. So use PowerQuery to remove unnecessary columns, handle errors, change data types and sort. Sort is a funny one – for some reason sorting columns directly in PowerPivot can bloat your files by up to 25%.
Use RELATED and Calculated Columns everywhere
Excel power users constantly join two+ tables with VLOOKUPs. Or they add helper columns to perform common calculations like unit price, profit or net days. And then they create a pivot table off of the resulting mega table. Raw data, VLOOKUP, helper columns, pivot. Rinse and repeat multiple times per day.
So of course a new PowerPivot user’s first instinct will be to construct a mega table. The RELATED DAX function is equivalent to VLOOKUP and pulls in matching data from other tables. However the beauty of PowerPivot is that pivot tables can be built from multiple data tables. As long as a relationships exists between two data tables you can use the Product Details table items as row fields and Sales Transaction table items as value fields.
Besides RELATED other DAX functions can create those helper columns and widen your data table. Because of the way PowerPivot’s calculation engine works wide and fat tables quickly lead to performance issues. A better option is to create DAX measures or calculated fields to do your calculations. As long as you don’t need to put your DAX measure in the row or column of a pivot table it’s always better to use measures instead of calculated columns.
Use 32bit Office
There are two versions of Microsoft Office – 32bit and 64bit. Using 32bit Excel with PowerPivot is not fun – suddenly Excel crashes every ten minutes and refreshing workbooks takes a long time. This happens because 32bit Office is limited to 2GB of memory. To get around this switch to 64bit Office or trick your computer into using 4GB with this patch.
To see which version of Office you have in Excel 2013 click File > Account > About Excel in the lower right.
Rely On Implicit DAX Measures
In a normal pivot table it’s easy to see the Sum of Grizzly Bear Attacks – you pull in the Grizzly Bear Attacks column to the pivot table’s value field. Doing the same thing in PowerPivot creates an Implicit Measure. The problem with implicit measures is sometimes they start to throw out incorrect values when you filter your pivot table or add information from other tables.
The better approach is to create a DAX Measure called Grizzly Bear Attack Total:=SUM(BearFactsTable[GrizzlyBearAttacks])
Check for Implicit Measures by clicking the Implicit Measure button in the Advanced ribbon of your data model.
Go It Alone
PowerPivot is not intuitive for people with years of experience in excel or databases. It’s easy to get overwhelmed by the new DAX functions, PowerQuery’s M language or those mysterious error messages. Luckily there are now a ton of resources online to help.