Let us have a look at SQL Server compression. By this I mean data compression, not the backup compression that is also available in SQL Server nowadays.

This feature can be implemented without affecting the programs that use the database, because it only changes the way SQL Server stores its data internally. SQL Server compresses the data as it is entered into the database and uses the compressed data for all its internal operations. And YES, the data is still compressed when it is put into memory, so more rows fit in the same memory space, which increases performance. Only when the data is sent to the application that requested it does SQL Server decompress it. Compressing and decompressing data is a CPU-intensive task, which can lead to an increase of around 15% in CPU usage, but the savings on storage and the larger number of rows per page, both on disk and in memory, will usually outweigh that increase.
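Enabling compression is a single rebuild per table or index; nothing in the application changes. A minimal sketch (the table and index names below are hypothetical examples):

-- Compress a table (its heap or clustered index) with page compression
ALTER TABLE dbo.CUSTTRANS REBUILD WITH (DATA_COMPRESSION = PAGE);

-- Compress a single non-clustered index with row compression
ALTER INDEX I_362DATEIDX ON dbo.CUSTTRANS REBUILD WITH (DATA_COMPRESSION = ROW);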

SQL Server versions with compression

Data compression became available in SQL Server 2008 and was further improved in SQL Server 2008 R2 (which added Unicode compression), and it has been carried forward in every version since, up to and including the latest, SQL Server 2016. Because it is an advanced feature, it is only available in the Enterprise Edition (and in the Developer Edition, which isn't to be used in a production environment).
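If you're not sure which edition you are running, you can check it from T-SQL:

-- EngineEdition 3 is the Enterprise-class engine (Enterprise, Developer, Evaluation)
SELECT SERVERPROPERTY('Edition') AS edition
	,SERVERPROPERTY('EngineEdition') AS engine_edition;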

Types of compression

There are 2 types of compression available in SQL Server: row compression and page compression. The latter implements the former and more, but let me explain both of them to you.

Row compression

This type of compression does what it says: it compresses the rows of data in a table or index. It does this in several ways, which I'll discuss below.

Metadata compression

Metadata is information about the columns; for some data types the metadata is larger than the actual data (for example BIT).

Removing unused space in fixed-size data types

Row compression removes the space that isn't used in fixed-size strings, which means that a CHAR(10) holding only 1 character effectively becomes a CHAR(1). Remember, it is ROW compression, so the actual used space is determined per value, not per column over the entire table/index. Unfortunately Dynamics Ax doesn't use fixed-size strings (it uses NVARCHAR instead), so we cannot gain anything by this.

Fortunately SQL Server also treats numeric values as (a form of) fixed size, and it will not store leading and trailing zeros if there are any. In Dynamics Ax all amounts are stored as NUMERIC(32,16), but if the actual amount is $1,95, with compression applied only the 1,95 is stored, removing the space that would normally be used to pad it out to (32,16). The same applies to INT and BIGINT, and DATE and DATETIME are internally handled as integers, which also saves on storage.

Not affected by row compression are memo fields (NVARCHAR(MAX)), images (VARBINARY(MAX)) and GUIDs (UNIQUEIDENTIFIER).
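If you want to see the effect on fixed-size data types for yourself, a small test like the sketch below (the table and values are made up) shows the difference with sp_spaceused:

-- Build a test table with fixed-size columns and fill it with small values
CREATE TABLE dbo.RowCompressionDemo (
	id INT IDENTITY(1, 1) PRIMARY KEY
	,code CHAR(10) NOT NULL
	,amount NUMERIC(32, 16) NOT NULL
	);

INSERT INTO dbo.RowCompressionDemo (code, amount)
SELECT TOP (100000) 'A', 1.95
FROM sys.all_objects a
CROSS JOIN sys.all_objects b;

EXEC sp_spaceused 'dbo.RowCompressionDemo'; -- size before

ALTER TABLE dbo.RowCompressionDemo REBUILD WITH (DATA_COMPRESSION = ROW);

EXEC sp_spaceused 'dbo.RowCompressionDemo'; -- considerably smaller after

DROP TABLE dbo.RowCompressionDemo;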

Standard Compression Scheme for Unicode

SQL Server also uses an implementation of the Standard Compression Scheme for Unicode (SCSU) algorithm to compress Unicode values, because SQL Server normally stores every Unicode character in 2 bytes, regardless of the language of the data. Depending on the language of your data you can save up to 50% on the storage of all NVARCHAR() and NCHAR() columns. The languages with the highest compression rate are Western European languages such as English, Dutch, German, French and Spanish, but also Hindi. At the lower end we find languages with more characters, like Japanese, which has a compression rate of about 15%.

Dynamics Ax uses Unicode for all strings stored in its database. I'll repeat that: ALL strings. So if your language is, for example, English, this means that ALL your stored text can be stored in half the space it is using right now. Great news, I would say.
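You can demonstrate this in isolation with a quick sketch (made-up English text in an NVARCHAR column; Unicode compression is applied as part of row compression):

CREATE TABLE dbo.UnicodeCompressionDemo (
	id INT IDENTITY(1, 1) PRIMARY KEY
	,txt NVARCHAR(60) NOT NULL
	);

INSERT INTO dbo.UnicodeCompressionDemo (txt)
SELECT TOP (100000) N'The quick brown fox jumps over the lazy dog'
FROM sys.all_objects a
CROSS JOIN sys.all_objects b;

EXEC sp_spaceused 'dbo.UnicodeCompressionDemo'; -- size before

ALTER TABLE dbo.UnicodeCompressionDemo REBUILD WITH (DATA_COMPRESSION = ROW);

EXEC sp_spaceused 'dbo.UnicodeCompressionDemo'; -- the English text roughly halves

DROP TABLE dbo.UnicodeCompressionDemo;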

Page compression

Page compression uses row compression as mentioned before, but it adds the following two techniques.

Prefix compression

For each column, SQL Server determines a prefix value that is used (fully or partially) by the most rows on the page. It then places this value in the page header and replaces the row values with a reference to it (or to part of it). For example, if a column stores the values aaabb, aaaab and abcdd, SQL Server can pick aaabb as the prefix; each row then only needs to record how many leading characters of the prefix it matches, plus the remaining characters: 5, 3ab and 1bcdd. (The Microsoft MSDN documentation visualizes this with before-and-after images of a page.)

Dictionary compression

After prefix compression, SQL Server scans the whole page for values that occur repeatedly, in any column, stores each of them once in a dictionary area directly after the page header, and replaces the occurrences with references to that dictionary entry.

 

Yes, you can do all that by hand for every table, index and compression type… Or use the script below to do all the hard work for you and even deliver the scripts to update the indexes:

--COLLECT INDEX DATA
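-- (sys.dm_db_index_operational_stats is reset at every SQL Server restart, so run this on an instance that has been up long enough to show representative usage)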
IF object_id('index_estimates') IS NOT NULL
	DROP TABLE index_estimates
GO

CREATE TABLE index_estimates (
	database_name SYSNAME NOT NULL
	,[schema_name] SYSNAME NOT NULL
	,table_name SYSNAME NOT NULL
	,index_id INT NOT NULL
	,update_pct DECIMAL(5, 2) NOT NULL
	,select_pct DECIMAL(5, 2) NOT NULL
	,CONSTRAINT pk_index_estimates PRIMARY KEY (
		database_name
		,[schema_name]
		,table_name
		,index_id
		)
	);
GO

INSERT INTO index_estimates
SELECT db_name() AS database_name
	,schema_name(t.schema_id) AS [schema_name]
	,t.NAME
	,i.index_id
	,i.leaf_update_count * 100.0 / (i.leaf_delete_count + i.leaf_insert_count + i.leaf_update_count + i.range_scan_count + i.singleton_lookup_count + i.leaf_page_merge_count) AS UpdatePct
	,i.range_scan_count * 100.0 / (i.leaf_delete_count + i.leaf_insert_count + i.leaf_update_count + i.range_scan_count + i.singleton_lookup_count + i.leaf_page_merge_count) AS SelectPct
FROM sys.dm_db_index_operational_stats(db_id(), NULL, NULL, NULL) i
INNER JOIN sys.tables t ON i.object_id = t.object_id
INNER JOIN sys.dm_db_partition_stats p ON t.object_id = p.object_id
WHERE i.leaf_delete_count + i.leaf_insert_count + i.leaf_update_count + i.range_scan_count + i.singleton_lookup_count + i.leaf_page_merge_count > 0
	AND p.index_id < 2
	AND i.range_scan_count * 1.0 / (i.leaf_delete_count + i.leaf_insert_count + i.leaf_update_count + i.range_scan_count + i.singleton_lookup_count + i.leaf_page_merge_count) > .75 -- only consider tables with 75% or greater select percentage (the * 1.0 forces decimal instead of integer division)
ORDER BY t.NAME
	,i.index_id
GO


--PREPARE ROW AND PAGE COMPRESSION
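-- (both staging tables match the result set of sp_estimate_data_compression_savings)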
IF OBJECT_ID('page_compression_estimates') IS NOT NULL
	DROP TABLE page_compression_estimates;
GO

CREATE TABLE page_compression_estimates (
	[object_name] SYSNAME NOT NULL
	,[schema_name] SYSNAME NOT NULL
	,index_id INT NOT NULL
	,partition_number INT NOT NULL
	,[size_with_current_compression_setting(KB)] BIGINT NOT NULL
	,[size_with_requested_compression_setting(KB)] BIGINT NOT NULL
	,[sample_size_with_current_compression_setting(KB)] BIGINT NOT NULL
	,[sample_size_with_requested_compression_setting(KB)] BIGINT NOT NULL
	,CONSTRAINT pk_page_compression_estimates PRIMARY KEY (
		[object_name]
		,[schema_name]
		,index_id
		)
	);
GO

IF OBJECT_ID('row_compression_estimates') IS NOT NULL
	DROP TABLE row_compression_estimates;
GO

CREATE TABLE row_compression_estimates (
	[object_name] SYSNAME NOT NULL
	,[schema_name] SYSNAME NOT NULL
	,index_id INT NOT NULL
	,partition_number INT NOT NULL
	,[size_with_current_compression_setting(KB)] BIGINT NOT NULL
	,[size_with_requested_compression_setting(KB)] BIGINT NOT NULL
	,[sample_size_with_current_compression_setting(KB)] BIGINT NOT NULL
	,[sample_size_with_requested_compression_setting(KB)] BIGINT NOT NULL
	,CONSTRAINT pk_row_compression_estimates PRIMARY KEY (
		[object_name]
		,[schema_name]
		,index_id
		)
	);
GO

--DYNAMICALLY GENERATE OUTCOME
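-- (build and execute one sp_estimate_data_compression_savings call per table, for both the 'row' and the 'page' mode)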
DECLARE @script_template NVARCHAR(max) = 'insert ##compression_mode##_compression_estimates exec sp_estimate_data_compression_savings ''##schema_name##'',''##table_name##'',NULL,NULL,''##compression_mode##''';
DECLARE @executable_script NVARCHAR(max);
DECLARE @schema SYSNAME
	,@table SYSNAME
	,@compression_mode NVARCHAR(20);

DECLARE cur CURSOR FAST_FORWARD
FOR
SELECT i.[schema_name]
	,i.[table_name]
	,em.estimate_mode
FROM index_estimates i
CROSS JOIN (
	VALUES ('row')
		,('page')
	) AS em(estimate_mode)
GROUP BY i.[schema_name]
	,i.[table_name]
	,em.estimate_mode;

OPEN cur;

FETCH NEXT
FROM cur
INTO @schema
	,@table
	,@compression_mode;

WHILE (@@FETCH_STATUS = 0)
BEGIN
	SET @executable_script = REPLACE(REPLACE(REPLACE(@script_template, '##schema_name##', @schema), '##table_name##', @table), '##compression_mode##', @compression_mode);

	PRINT @executable_script;

	EXEC (@executable_script);

	FETCH NEXT
	FROM cur
	INTO @schema
		,@table
		,@compression_mode;
END

CLOSE cur;

DEALLOCATE cur;

--SHOW RESULTS
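-- (combine the usage statistics with both estimates: mostly-read indexes get page compression, frequently updated ones row compression, and generate the rebuild statements)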
WITH all_estimates
AS (
	SELECT '[' + i.schema_name + '].[' + i.table_name + ']' AS table_name
		,CASE 
			WHEN i.index_id > 0
				THEN '[' + idx.NAME + ']'
			ELSE NULL
			END AS index_name
		,i.select_pct
		,i.update_pct
		,CASE 
			WHEN r.[sample_size_with_current_compression_setting(KB)] > 0
				THEN 100 - r.[sample_size_with_requested_compression_setting(KB)] * 100.0 / r.[sample_size_with_current_compression_setting(KB)]
			ELSE 0.0
			END AS row_compression_saving_pct
		,CASE 
			WHEN p.[sample_size_with_current_compression_setting(KB)] > 0
				THEN 100 - p.[sample_size_with_requested_compression_setting(KB)] * 100.0 / p.[sample_size_with_current_compression_setting(KB)]
			ELSE 0.0
			END AS page_compression_saving_pct
	FROM index_estimates i
	INNER JOIN row_compression_estimates r ON i.schema_name = r.schema_name
		AND i.table_name = r.object_name
		AND i.index_id = r.index_id
	INNER JOIN page_compression_estimates p ON i.schema_name = p.schema_name
		AND i.table_name = p.object_name
		AND i.index_id = p.index_id
	INNER JOIN sys.indexes idx ON i.index_id = idx.index_id
		AND object_name(idx.object_id) = i.table_name
	)
	,recommend_compression
AS (
	SELECT table_name
		,index_name
		,select_pct
		,update_pct
		,row_compression_saving_pct
		,page_compression_saving_pct
		,CASE 
			WHEN update_pct = 0
				THEN 'Page'
			WHEN update_pct >= 20
				THEN 'Row'
			WHEN update_pct > 0
				AND update_pct < 20
				AND page_compression_saving_pct - row_compression_saving_pct < 10
				THEN 'Row'
			ELSE 'Page'
			END AS recommended_data_compression
	FROM all_estimates
	WHERE row_compression_saving_pct > 0
		AND page_compression_saving_pct > 0
	)
SELECT table_name
	,index_name
	,select_pct
	,update_pct
	,row_compression_saving_pct
	,page_compression_saving_pct
	,recommended_data_compression
	,CASE 
		WHEN index_name IS NULL
			THEN 'alter table ' + table_name + ' rebuild with ( data_compression = ' + recommended_data_compression + ')'
		ELSE 'alter index ' + index_name + ' on ' + table_name + ' rebuild with ( data_compression = ' + recommended_data_compression + ')'
		END AS [statement]
FROM recommend_compression
ORDER BY table_name

--CLEAN UP
DROP TABLE index_estimates;
DROP TABLE row_compression_estimates;
DROP TABLE page_compression_estimates;

 

All code samples, walkthroughs and other instructions are provided as-is. Use any and all information provided in this blog at your own risk. It is never advised to import or write code in a production environment without rigorous testing in a test or development environment.
All opinions expressed in this blog are solely my own and do not necessarily reflect the opinions of my employer.
The names of actual companies and products mentioned in this blog may be the trademarks of their respective owners.
Microsoft and Dynamics AX are registered trademarks of Microsoft Corporation in the United States and other countries.


Comments
  1. Vesa Juvonen, November 20, 2015

     Hi,
     Thank you for a good, informational article.

     One important thing I would add is that you must also define the compression settings using the AX admin features, because if they are not defined there, the next time a full database synchronization is done all compression settings made at the SQL Server level are gone. And as a side effect, your sync will take A LOT OF TIME, because SQL Server has to decompress all previously compressed tables and indexes.
