Let us have a look at SQL Compression, with this I mean Data compression not the backup compression which is also available in SQL Server nowadays.
This feature can be implemented in SQL Server without affecting the programs that use the database, because it is only used inside of SQL Server and the way it stores its data. Which means that SQL Server will compress the data as it is entered in the database. SQL Server will use the data compressed for all it’s internal operations, and YES the data is still compressed when it is put into memory resulting in more rows in the same memory space which increases performance. Then if the data is send to the application which requested the data, SQL Server will decompress it. Compressing and decompressing data is a CPU intensive task, which leads to an increase of 15% on CPU usage. The save on storage and the increase of the number of rows on a page and the increase of the number of rows in the memory banks will certainly outweigh the increase in CPU usage.
SQL Server versions with Compression
Data compression became available in SQL 2008 and was further improved in SQL 2008 R2 after which it is continued in all version including the latest SQL 2016. Because it is an advanced feature it is only available in Enterprise Editions of SQL. (And Developer Edition which isn’t to be used in a production environment).
Types of compression
There are 2 types of compression available in SQL Server. Row compression and Page compression the latter implements the first and more, but let me explain both of them to you.
Row compression
This type of compression does what it says it compresses the row of data in an index/table. And will do this in several ways, which I’ll discuss below.
Metadata compression
Metadata is information about the columns for some data types the meta data is larger then the actual data type (for example Bit).
Delete not used space in fixed size datatypes
Row compression will delete the space that isn’t used in fixed sized strings, which means that a CHAR(10) can become a CHAR(1) if only 1 character is in it. Remember it is ROW compression so it will determine the actual used space per value not per column over the entire table/index. Unfortunately Dynamics Ax doesn’t use fixed sized string (it uses NVARCHAR instead), so we cannot gain anything by this.
Fortunately SQL Server also sees numeric values as (a form of) fixed sized, and it will delete leading and/or trailing zero’s if there are any. In Dynamics Ax all amounts are stored as NUMERIC(32,16) but if the actual amount is $1,95 if compression is applied it will only store the 1,95 removing the space it would normally store additionally to make it (32,16). The same is applied to integers and BIGINT, and Date and Datetime are converted to integers which also saves on storage.
Not effected by row compression are memo fields (NVARCHAR(MAX)), Images (VARBINARY), GUID (UNIQUEIDENTIFIER).
Standard Compression Scheme for Unicode
SQL Server also uses an implementation of the Standard Compression Scheme for Unicode (SCSU) algorithm to compress Unicode values. Because SQL Server stores data as 2 bytes regardless of the language of the data. Depending on the language of your data you can save up to 50% on the storage of all nvarchar() and nchar() datatype columns. Languages with the highest compression rate are: Western European languages as English, Dutch, German, French, Spanish etc. but also Hindi. On the lower end we find languages with more characters like Japanese which has a compression rate of 15%.
Dynamics Ax uses Unicode for all strings stored in its database.. I’ll repeat that ALL strings. So if you’re language is for example English this means that ALL your stored text can be stared in half the space it is using right now. Great news I would say.
Page compression
Page compression uses Row compression as mentioned before, but it also uses the following to techniques.
Prefix compression
SQL Server will determine per column a prefix value that is used (fully or partially) by the most rows. After determining this SQL places this value in the header and references to it (or part of it) in the row value. Which is best visualized by these images of Microsoft MSDN:
Yes you can do all that… Or use below script to do all the hard work for you and even deliver you the scripts to update the indexes:
--COLLECT INDEX DATA IF object_id('index_estimates') IS NOT NULL DROP TABLE index_estimates GO CREATE TABLE index_estimates ( database_name SYSNAME NOT NULL ,[schema_name] SYSNAME NOT NULL ,table_name SYSNAME NOT NULL ,index_id INT NOT NULL ,update_pct DECIMAL(5, 2) NOT NULL ,select_pct DECIMAL(5, 2) NOT NULL ,CONSTRAINT pk_index_estimates PRIMARY KEY ( database_name ,[schema_name] ,table_name ,index_id ) ); GO INSERT INTO index_estimates SELECT db_name() AS database_name ,schema_name(t.schema_id) AS [schema_name] ,t.NAME ,i.index_id ,i.leaf_update_count * 100.0 / (i.leaf_delete_count + i.leaf_insert_count + i.leaf_update_count + i.range_scan_count + i.singleton_lookup_count + i.leaf_page_merge_count) AS UpdatePct ,i.range_scan_count * 100.0 / (i.leaf_delete_count + i.leaf_insert_count + i.leaf_update_count + i.range_scan_count + i.singleton_lookup_count + i.leaf_page_merge_count) AS SelectPct FROM sys.dm_db_index_operational_stats(db_id(), NULL, NULL, NULL) i INNER JOIN sys.tables t ON i.object_id = t.object_id INNER JOIN sys.dm_db_partition_stats p ON t.object_id = p.object_id WHERE i.leaf_delete_count + i.leaf_insert_count + i.leaf_update_count + i.range_scan_count + i.singleton_lookup_count + i.leaf_page_merge_count > 0 AND p.index_id < 2 AND i.range_scan_count / (i.leaf_delete_count + i.leaf_insert_count + i.leaf_update_count + i.range_scan_count + i.singleton_lookup_count + i.leaf_page_merge_count) > .75 -- only consider tables with 75% or greater select percentage ORDER BY t.NAME ,i.index_id GO --PREPARE ROW AND PAGE COMPRESSION IF OBJECT_ID('page_compression_estimates') IS NOT NULL DROP TABLE page_compression_estimates; GO CREATE TABLE page_compression_estimates ( [object_name] SYSNAME NOT NULL ,[schema_name] SYSNAME NOT NULL ,index_id INT NOT NULL ,partition_number INT NOT NULL ,[size_with_current_compression_setting(KB)] BIGINT NOT NULL ,[size_with_requested_compression_setting(KB)] BIGINT NOT NULL ,[sample_size_with_current_compression_setting(KB)] BIGINT NOT NULL ,[sample_size_with_requested_compression_setting(KB)] BIGINT NOT NULL ,CONSTRAINT pk_page_compression_estimates PRIMARY KEY ( [object_name] ,[schema_name] ,index_id ) ); GO IF OBJECT_ID('row_compression_estimates') IS NOT NULL DROP TABLE row_compression_estimates; GO CREATE TABLE row_compression_estimates ( [object_name] SYSNAME NOT NULL ,[schema_name] SYSNAME NOT NULL ,index_id INT NOT NULL ,partition_number INT NOT NULL ,[size_with_current_compression_setting(KB)] BIGINT NOT NULL ,[size_with_requested_compression_setting(KB)] BIGINT NOT NULL ,[sample_size_with_current_compression_setting(KB)] BIGINT NOT NULL ,[sample_size_with_requested_compression_setting(KB)] BIGINT NOT NULL ,CONSTRAINT pk_row_compression_estimates PRIMARY KEY ( [object_name] ,[schema_name] ,index_id ) ); GO --DYNAMICALLY GENERATE OUTCOME DECLARE @script_template NVARCHAR(max) = 'insert ##compression_mode##_compression_estimates exec sp_estimate_data_compression_savings ''##schema_name##'',''##table_name##'',NULL,NULL,''##compression_mode##'''; DECLARE @executable_script NVARCHAR(max); DECLARE @schema SYSNAME ,@table SYSNAME ,@compression_mode NVARCHAR(20); DECLARE cur CURSOR FAST_FORWARD FOR SELECT i.[schema_name] ,i.[table_name] ,em.estimate_mode FROM index_estimates i CROSS JOIN ( VALUES ('row') ,('page') ) AS em(estimate_mode) GROUP BY i.[schema_name] ,i.[table_name] ,em.estimate_mode; OPEN cur; FETCH NEXT FROM cur INTO @schema ,@table ,@compression_mode; WHILE (@@FETCH_STATUS = 0) BEGIN SET @executable_script = REPLACE(REPLACE(REPLACE(@script_template, '##schema_name##', @schema), '##table_name##', @table), '##compression_mode##', @compression_mode); PRINT @executable_script; EXEC (@executable_script); FETCH NEXT FROM cur INTO @schema ,@table ,@compression_mode; END CLOSE cur; DEALLOCATE cur; --SHOW RESULTS WITH all_estimates AS ( SELECT '[' + i.schema_name + '].[' + i.table_name + ']' AS table_name ,CASE WHEN i.index_id > 0 THEN '[' + idx.NAME + ']' ELSE NULL END AS index_name ,i.select_pct ,i.update_pct ,CASE WHEN r.[sample_size_with_current_compression_setting(KB)] > 0 THEN 100 - r.[sample_size_with_requested_compression_setting(KB)] * 100.0 / r.[sample_size_with_current_compression_setting(KB)] ELSE 0.0 END AS row_compression_saving_pct ,CASE WHEN p.[sample_size_with_current_compression_setting(KB)] > 0 THEN 100 - p.[sample_size_with_requested_compression_setting(KB)] * 100.0 / p.[sample_size_with_current_compression_setting(KB)] ELSE 0.0 END AS page_compression_saving_pct FROM index_estimates i INNER JOIN row_compression_estimates r ON i.schema_name = r.schema_name AND i.table_name = r.object_name AND i.index_id = r.index_id INNER JOIN page_compression_estimates p ON i.schema_name = p.schema_name AND i.table_name = p.object_name AND i.index_id = p.index_id INNER JOIN sys.indexes idx ON i.index_id = idx.index_id AND object_name(idx.object_id) = i.table_name ) ,recommend_compression AS ( SELECT table_name ,index_name ,select_pct ,update_pct ,row_compression_saving_pct ,page_compression_saving_pct ,CASE WHEN update_pct = 0 THEN 'Page' WHEN update_pct >= 20 THEN 'Row' WHEN update_pct > 0 AND update_pct < 20 AND page_compression_saving_pct - row_compression_saving_pct < 10 THEN 'Row' ELSE 'Page' END AS recommended_data_compression FROM all_estimates WHERE row_compression_saving_pct > 0 AND page_compression_saving_pct > 0 ) SELECT table_name ,index_name ,select_pct ,update_pct ,row_compression_saving_pct ,page_compression_saving_pct ,recommended_data_compression ,CASE WHEN index_name IS NULL THEN 'alter table ' + table_name + ' rebuild with ( data_compression = ' + recommended_data_compression + ')' ELSE 'alter index ' + index_name + ' on ' + table_name + ' rebuild with ( data_compression = ' + recommended_data_compression + ')' END AS [statement] FROM recommend_compression ORDER BY table_name --CLEAN UP DROP TABLE index_estimates;