Friday, 5 May 2017

Compress your database tables

SQL Server has offered data compression since SQL Server 2008. It is a very good feature but, as with almost everything in SQL Server, it needs to be deployed judiciously. Data compression has been written about and reviewed by many SQL experts who are far more qualified and knowledgeable than I am, so I won't go into it in much depth.

First, a quick summary of what the feature does, along with its benefits and considerations:
As the feature's name suggests, when it is enabled on a table and/or its index(es), SQL Server compresses (or shrinks) the data before it is written to the database. There are two types: ROW and PAGE.
  • Benefits
    • Disk space savings
      • Higher space saving for PAGE compression
      • Lower space saving for ROW compression
    • Memory usage efficiency gains
    • Logical and physical read performance: more data fits in fewer pages
  • Considerations
    • Increased CPU cost
      • Higher CPU cost for PAGE compression
      • Lower CPU cost for ROW compression
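To make the two types concrete, here is a minimal sketch of how each is estimated and applied. `sp_estimate_data_compression_savings` is a built-in procedure; the table and index names below are illustrative only:

```sql
-- Estimate the space savings before committing to a rebuild:
EXEC sp_estimate_data_compression_savings
    @schema_name      = 'dbo',
    @object_name      = 'SalesOrderDetail',  -- illustrative table name
    @index_id         = NULL,                -- NULL = all indexes
    @partition_number = NULL,
    @data_compression = 'PAGE';              -- or 'ROW'

-- Apply PAGE compression to the table (its heap or clustered index):
ALTER TABLE dbo.SalesOrderDetail
    REBUILD WITH (DATA_COMPRESSION = PAGE);

-- Apply ROW compression to a single nonclustered index:
ALTER INDEX IX_SalesOrderDetail_ProductID
    ON dbo.SalesOrderDetail
    REBUILD WITH (DATA_COMPRESSION = ROW);
```

Note that the rebuilds are full index/heap rebuilds, so schedule them accordingly on large tables.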

Personally, I feel (and so does Brent Ozar) that the benefits outweigh the costs. Most of the SQL Server instances I have managed were far more disk-bound than CPU-bound. So if your server is running at an average of 20%-30% CPU and your I/O wait stats are high, it's a no-brainer in my mind.

Additional resources:

Now that we know what data compression is, let's talk about how to implement it. The second link above describes how to compress tables and indexes. But what if you'd like to evaluate and compress your whole database? Well, I stumbled across a stored procedure by C. Meuleman, [dbo].[_Compress_Database]. Glenn Berry has one as well, but it doesn't quite do what I want: Estimating Data Compression Savings in SQL Server 2012.

It is a very nice script which loops through all the tables and indexes in your database and, based on the read/write thresholds you set (or just the defaults), recommends which compression type to use for each object. I've used this stored proc many times. It is a real time saver and a good way of quickly getting an idea of which tables will get the best storage benefit based on their usage profile. It prints out the ALTER TABLE/ALTER INDEX scripts, and there is even a parameter to state whether you want to run the compression immediately.
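The read/write profile the script bases its recommendations on can be approximated with a query like the one below. This is a sketch only; the exact columns and thresholds the original proc uses are assumptions on my part, but the general idea is that heavily updated objects favour cheaper ROW compression while mostly-read objects favour PAGE:

```sql
-- Rough update percentage per index, from sys.dm_db_index_operational_stats.
-- High update_pct suggests ROW compression (lower CPU cost on writes);
-- low update_pct suggests PAGE compression (better space savings).
SELECT
    OBJECT_NAME(ios.object_id) AS table_name,
    i.name                     AS index_name,
    ios.leaf_update_count * 100.0
      / NULLIF(ios.range_scan_count + ios.singleton_lookup_count
             + ios.leaf_insert_count + ios.leaf_delete_count
             + ios.leaf_update_count + ios.leaf_page_merge_count, 0)
                               AS update_pct
FROM sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) AS ios
JOIN sys.indexes AS i
  ON i.object_id = ios.object_id
 AND i.index_id  = ios.index_id
ORDER BY update_pct DESC;
```

Because these DMV counters reset on instance restart, run this against a server that has been up long enough to reflect a representative workload.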

The one thing I didn't like about the stored proc was that it needed to be created in the database being evaluated. It also created tables in that database. I prefer to leave application databases alone. Even if you carry out the evaluation on a non-production server (which would be the preferred method), you are adding objects to a database that may have ongoing development being carried out against it, and you then need to be careful about cleaning up after yourself.

My database team has its own DBA databases on both production and non-production servers. We prefer to put tools like this stored procedure in that database. That way, if we make a mess, it doesn't affect anyone else.

With that in mind, I altered the above-mentioned stored proc so that you can create and run it in a maintenance database and specify which database you want to evaluate. You can download the script from Google Drive.
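The heart of that kind of change is referencing the target database's catalog views with three-part names via dynamic SQL, so nothing gets created in the application database. A minimal sketch of the pattern (the `@TargetDatabase` value is illustrative, and this is not the actual script):

```sql
-- Run from a maintenance/DBA database; enumerate another database's
-- tables without creating any objects in it.
DECLARE @TargetDatabase sysname = N'MyAppDb';   -- illustrative name
DECLARE @sql nvarchar(max) =
    N'SELECT s.name AS schema_name, t.name AS table_name
      FROM ' + QUOTENAME(@TargetDatabase) + N'.sys.tables AS t
      JOIN ' + QUOTENAME(@TargetDatabase) + N'.sys.schemas AS s
        ON s.schema_id = t.schema_id;';

EXEC sys.sp_executesql @sql;
```

`QUOTENAME` keeps the database name safe to concatenate, and the same three-part-name trick works for `sys.indexes`, `sys.partitions`, and the usage DMVs the evaluation needs.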

I hope you find this useful.
