Written by 08:57 Automation, Database administration, Work with data

How to Automate Data Collection on the SQL Server Database Growth

Introduction

Often, there is a need to control the growth of all tables and files of all databases.

In this article, we are going to explore an example of how to automate data collection on the growth of SQL Server database tables and files.

Solution

  1. Create a view about the size of all tables for each database
    USE [DATABASE_NAME]
    GO
    
    SET ANSI_NULLS ON
    GO
    
    SET QUOTED_IDENTIFIER ON
    GO
    
    
    CREATE view [inf].[vTableSize] as
    with pagesizeKB as (
    	SELECT low / 1024 as PageSizeKB
    	FROM master.dbo.spt_values
    	WHERE number = 1 AND type = 'E'
    )
    ,f_size as (
    	select p.[object_id], 
    		   sum([total_pages]) as TotalPageSize,
    		   sum([used_pages])  as UsedPageSize,
    		   sum([data_pages])  as DataPageSize
    	from sys.partitions p join sys.allocation_units a on p.partition_id = a.container_id
    	left join sys.internal_tables it on p.object_id = it.object_id
    	WHERE OBJECTPROPERTY(p.[object_id], N'IsUserTable') = 1
    	group by p.[object_id]
    )
    ,tbl as (
    	SELECT
    	  t.[schema_id],
    	  t.[object_id],
    	  i1.rowcnt as CountRows,
    	  (COALESCE(SUM(i1.reserved), 0) + COALESCE(SUM(i2.reserved), 0)) * (select top(1) PageSizeKB from pagesizeKB) as ReservedKB,
    	  (COALESCE(SUM(i1.dpages), 0) + COALESCE(SUM(i2.used), 0)) * (select top(1) PageSizeKB from pagesizeKB) as DataKB,
    	  ((COALESCE(SUM(i1.used), 0) + COALESCE(SUM(i2.used), 0))
    	    - (COALESCE(SUM(i1.dpages), 0) + COALESCE(SUM(i2.used), 0))) * (select top(1) PageSizeKB from pagesizeKB) as IndexSizeKB,
    	  ((COALESCE(SUM(i1.reserved), 0) + COALESCE(SUM(i2.reserved), 0))
    	    - (COALESCE(SUM(i1.used), 0) + COALESCE(SUM(i2.used), 0))) * (select top(1) PageSizeKB from pagesizeKB) as UnusedKB
    	FROM sys.tables as t
    	LEFT OUTER JOIN sysindexes as i1 ON i1.id = t.[object_id] AND i1.indid < 2
    	LEFT OUTER JOIN sysindexes as i2 ON i2.id = t.[object_id] AND i2.indid = 255
    	WHERE OBJECTPROPERTY(t.[object_id], N'IsUserTable') = 1
    	OR (OBJECTPROPERTY(t.[object_id], N'IsView') = 1 AND OBJECTPROPERTY(t.[object_id], N'IsIndexed') = 1)
    	GROUP BY t.[schema_id], t.[object_id], i1.rowcnt
    )
    SELECT
      @@Servername AS Server,
      DB_NAME() AS DBName,
      SCHEMA_NAME(t.[schema_id]) as SchemaName,
      OBJECT_NAME(t.[object_id]) as TableName,
      t.CountRows,
      t.ReservedKB,
      t.DataKB,
      t.IndexSizeKB,
      t.UnusedKB,
      f.TotalPageSize*(select top(1) PageSizeKB from pagesizeKB) as TotalPageSizeKB,
      f.UsedPageSize*(select top(1) PageSizeKB from pagesizeKB) as UsedPageSizeKB,
      f.DataPageSize*(select top(1) PageSizeKB from pagesizeKB) as DataPageSizeKB
    FROM f_size as f
    inner join tbl as t on t.[object_id]=f.[object_id]
    
    GO
  2. Create a specific database and determine a table for storing information about the growth of all database tables:
    USE [DATABASE_NAME]
    GO
    
    SET ANSI_NULLS ON
    GO
    
    SET QUOTED_IDENTIFIER ON
    GO
    
    SET ANSI_PADDING ON
    GO
    
    CREATE TABLE [srv].[TableStatistics](
    	[Row_GUID] [uniqueidentifier] NOT NULL CONSTRAINT [DF_TableStatistics_Row_GUID]  DEFAULT (newid()),
    	[ServerName] [nvarchar](255) NOT NULL,
    	[DBName] [nvarchar](255) NOT NULL,
    	[SchemaName] [nvarchar](255) NOT NULL,
    	[TableName] [nvarchar](255) NOT NULL,
    	[CountRows] [bigint] NOT NULL,
    	[DataKB] [int] NOT NULL,
    	[IndexSizeKB] [int] NOT NULL,
    	[UnusedKB] [int] NOT NULL,
    	[ReservedKB] [int] NOT NULL,
    	[InsertUTCDate] [datetime] NOT NULL CONSTRAINT [DF_TableStatistics_InsertUTCDate]  DEFAULT (getutcdate()),
    	[Date]  AS (CONVERT([date],[InsertUTCDate])) PERSISTED,
    	[CountRowsBack] [bigint] NULL,
    	[CountRowsNext] [bigint] NULL,
    	[DataKBBack] [int] NULL,
    	[DataKBNext] [int] NULL,
    	[IndexSizeKBBack] [int] NULL,
    	[IndexSizeKBNext] [int] NULL,
    	[UnusedKBBack] [int] NULL,
    	[UnusedKBNext] [int] NULL,
    	[ReservedKBBack] [int] NULL,
    	[ReservedKBNext] [int] NULL,
    	[AvgCountRows]  AS ((([CountRowsBack]+[CountRows])+[CountRowsNext])/(3)) PERSISTED,
    	[AvgDataKB]  AS ((([DataKBBack]+[DataKB])+[DataKBNext])/(3)) PERSISTED,
    	[AvgIndexSizeKB]  AS ((([IndexSizeKBBack]+[IndexSizeKB])+[IndexSizeKBNext])/(3)) PERSISTED,
    	[AvgUnusedKB]  AS ((([UnusedKBBack]+[UnusedKB])+[UnusedKBNext])/(3)) PERSISTED,
    	[AvgReservedKB]  AS ((([ReservedKBBack]+[ReservedKB])+[ReservedKBNext])/(3)) PERSISTED,
    	[DiffCountRows]  AS (([CountRowsNext]+[CountRowsBack])-(2)*[CountRows]) PERSISTED,
    	[DiffDataKB]  AS (([DataKBNext]+[DataKBBack])-(2)*[DataKB]) PERSISTED,
    	[DiffIndexSizeKB]  AS (([IndexSizeKBNext]+[IndexSizeKBBack])-(2)*[IndexSizeKB]) PERSISTED,
    	[DiffUnusedKB]  AS (([UnusedKBNext]+[UnusedKBBack])-(2)*[UnusedKB]) PERSISTED,
    	[DiffReservedKB]  AS (([ReservedKBNext]+[ReservedKBBack])-(2)*[ReservedKB]) PERSISTED,
    	[TotalPageSizeKB] [int] NULL,
    	[TotalPageSizeKBBack] [int] NULL,
    	[TotalPageSizeKBNext] [int] NULL,
    	[UsedPageSizeKB] [int] NULL,
    	[UsedPageSizeKBBack] [int] NULL,
    	[UsedPageSizeKBNext] [int] NULL,
    	[DataPageSizeKB] [int] NULL,
    	[DataPageSizeKBBack] [int] NULL,
    	[DataPageSizeKBNext] [int] NULL,
    	[AvgDataPageSizeKB]  AS ((([DataPageSizeKBBack]+[DataPageSizeKB])+[DataPageSizeKBNext])/(3)) PERSISTED,
    	[AvgUsedPageSizeKB]  AS ((([UsedPageSizeKBBack]+[UsedPageSizeKB])+[UsedPageSizeKBNext])/(3)) PERSISTED,
    	[AvgTotalPageSizeKB]  AS ((([TotalPageSizeKBBack]+[TotalPageSizeKB])+[TotalPageSizeKBNext])/(3)) PERSISTED,
    	[DiffDataPageSizeKB]  AS (([DataPageSizeKBNext]+[DataPageSizeKBBack])-(2)*[DataPageSizeKB]) PERSISTED,--shows as the casting is changed
    	[DiffUsedPageSizeKB]  AS (([UsedPageSizeKBNext]+[UsedPageSizeKBBack])-(2)*[UsedPageSizeKB]) PERSISTED,--shows as the casting is changed
    	[DiffTotalPageSizeKB]  AS (([TotalPageSizeKBNext]+[TotalPageSizeKBBack])-(2)*[TotalPageSizeKB]) PERSISTED,--shows as the casting is changed
     CONSTRAINT [PK_TableStatistics] PRIMARY KEY CLUSTERED 
    (
    	[Row_GUID] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]
    
    GO
    
    SET ANSI_PADDING ON
    GO

    TotalPageSizeKB indicates the table size.

    The total of TotalPageSizeKB of all the database tables + the size of system tables = the size of database data.

  3. Determine the procedure for collecting the information:
    USE [DATABASE_NAME]
    GO
    
    SET ANSI_NULLS ON
    GO
    
    SET QUOTED_IDENTIFIER ON
    GO
    
    CREATE PROCEDURE [srv].[InsertTableStatistics]
    AS
    BEGIN
    	SET NOCOUNT ON;
    	SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
    
    	declare @dt date=CAST(GetUTCDate() as date);
        declare @dbs nvarchar(255);
    	declare @sql nvarchar(max);
    
    	select [name]
    	into #dbs
    	from sys.databases;
    
    	while(exists(select top(1) 1 from #dbs))
    	begin
    		select top(1)
    		@dbs=[name]
    		from #dbs;
    
    		set @sql=
    		N'INSERT INTO [srv].[TableStatistics]
    	         ([ServerName]
    			   ,[DBName]
    	         ,[SchemaName]
    	         ,[TableName]
    	         ,[CountRows]
    	         ,[DataKB]
    	         ,[IndexSizeKB]
    	         ,[UnusedKB]
    	         ,[ReservedKB]
    			 ,[TotalPageSizeKB]
    			 ,[UsedPageSizeKB]
    			 ,[DataPageSizeKB])
    	   SELECT [Server]
    		  ,[DBName]
    	         ,[SchemaName]
    	         ,[TableName]
    	         ,[CountRows]
    	         ,[DataKB]
    	         ,[IndexSizeKB]
    	         ,[UnusedKB]
    	         ,[ReservedKB]
    			 ,[TotalPageSizeKB]
    			 ,[UsedPageSizeKB]
    			 ,[DataPageSizeKB]
    		FROM ['+@dbs+'].[inf].[vTableSize];';
    
    		exec sp_executesql @sql;
    
    		delete from #dbs
    		where [name]=@dbs;
    	end
    
    	drop table #dbs;
    
    	declare @dt_back date=CAST(DateAdd(day,-1,@dt) as date);
    
    	;with tbl1 as (
    		select [Date], 
    			   [CountRows],
    			   [DataKB],
    			   [IndexSizeKB],
    			   [UnusedKB],
    			   [ReservedKB],
    			   [ServerName], 
    			   [DBName], 
    			   [SchemaName], 
    			   [TableName],
    			   [TotalPageSizeKB],
    			   [UsedPageSizeKB],
    			   [DataPageSizeKB]
    		from [srv].[TableStatistics]
    		where [Date]=@dt_back
    	)
    	, tbl2 as (
    		select [Date], 
    			   [CountRows], 
    			   [CountRowsBack],
    			   [DataKBBack],
    			   [IndexSizeKBBack],
    			   [UnusedKBBack],
    			   [ReservedKBBack],
    			   [ServerName], 
    			   [DBName], 
    			   [SchemaName], 
    			   [TableName],
    			   [TotalPageSizeKBBack],
    			   [UsedPageSizeKBBack],
    			   [DataPageSizeKBBack]
    		from [srv].[TableStatistics]
    		where [Date]=@dt
    	)
    	update t2
    	set t2.[CountRowsBack]		=t1.[CountRows],
    		t2.[DataKBBack]			=t1.[DataKB],
    		t2.[IndexSizeKBBack]	=t1.[IndexSizeKB],
    		t2.[UnusedKBBack]		=t1.[UnusedKB],
    		t2.[ReservedKBBack]		=t1.[ReservedKB],
    		t2.[TotalPageSizeKBBack]=t1.[TotalPageSizeKB],
    		t2.[UsedPageSizeKBBack]	=t1.[UsedPageSizeKB],
    		t2.[DataPageSizeKBBack]	=t1.[DataPageSizeKB]
    	from tbl1 as t1
    	inner join tbl2 as t2 on t1.[Date]=DateAdd(day,-1,t2.[Date])
    	and t1.[ServerName]=t2.[ServerName]
    	and t1.[DBName]=t2.[DBName]
    	and t1.[SchemaName]=t2.[SchemaName]
    	and t1.[TableName]=t2.[TableName];
    
    	;with tbl1 as (
    		select [Date], 
    			   [CountRows], 
    			   [CountRowsNext],
    			   [DataKBNext],
    			   [IndexSizeKBNext],
    			   [UnusedKBNext],
    			   [ReservedKBNext],
    			   [ServerName], 
    			   [DBName], 
    			   [SchemaName], 
    			   [TableName],
    			   [TotalPageSizeKBNext],
    			   [UsedPageSizeKBNext],
    			   [DataPageSizeKBNext]
    		from [srv].[TableStatistics]
    		where [Date]=@dt_back
    	)
    	, tbl2 as (
    		select [Date], 
    			   [CountRows],
    			   [DataKB],
    			   [IndexSizeKB],
    			   [UnusedKB],
    			   [ReservedKB],
    			   [ServerName], 
    			   [DBName], 
    			   [SchemaName], 
    			   [TableName],
    			   [TotalPageSizeKB],
    			   [UsedPageSizeKB],
    			   [DataPageSizeKB]
    		from [srv].[TableStatistics]
    		where [Date]=@dt
    	)
    	update t1
    	set t1.[CountRowsNext]		=t2.[CountRows],
    		t1.[DataKBNext]			=t2.[DataKB],
    		t1.[IndexSizeKBNext]	=t2.[IndexSizeKB],
    		t1.[UnusedKBNext]		=t2.[UnusedKB],
    		t1.[ReservedKBNext]		=t2.[ReservedKB],
    		t1.[TotalPageSizeKBNext]=t2.[TotalPageSizeKB],
    		t1.[UsedPageSizeKBNext]	=t2.[UsedPageSizeKB],
    		t1.[DataPageSizeKBNext]	=t2.[DataPageSizeKB]
    	from tbl1 as t1
    	inner join tbl2 as t2 on t1.[Date]=DateAdd(day,-1,t2.[Date])
    	and t1.[ServerName]=t2.[ServerName]
    	and t1.[DBName]=t2.[DBName]
    	and t1.[SchemaName]=t2.[SchemaName]
    	and t1.[TableName]=t2.[TableName];
    END
    GO

    This solution can be modified to collect data on the sizes of tables of all the databases from all the required instances of MS SQL Server.

  4. Define the view by the collected information:
    USE [DATABASE_NAME]
    GO
    
    SET ANSI_NULLS ON
    GO
    
    SET QUOTED_IDENTIFIER ON
    GO
    
    create view [srv].[vTableStatisticsShort] as 
    with d as (select DateAdd(day,-1,max([Date])) as [Date] from [srv].[TableStatistics])
    SELECT t.[ServerName]
          ,t.[DBName]
          ,t.[SchemaName]
          ,t.[TableName]
          ,t.[CountRows]
          ,t.[DataKB]
          ,t.[IndexSizeKB]
          ,t.[UnusedKB]
          ,t.[ReservedKB]
          ,t.[InsertUTCDate]
          ,t.[Date]
          ,t.[CountRowsBack]
          ,t.[CountRowsNext]
          ,t.[DataKBBack]
          ,t.[DataKBNext]
          ,t.[IndexSizeKBBack]
          ,t.[IndexSizeKBNext]
          ,t.[UnusedKBBack]
          ,t.[UnusedKBNext]
          ,t.[ReservedKBBack]
          ,t.[ReservedKBNext]
          ,t.[AvgCountRows]
          ,t.[AvgDataKB]
          ,t.[AvgIndexSizeKB]
          ,t.[AvgUnusedKB]
          ,t.[AvgReservedKB]
          ,t.[DiffCountRows]
          ,t.[DiffDataKB]
          ,t.[DiffIndexSizeKB]
          ,t.[DiffUnusedKB]
          ,t.[DiffReservedKB]
          ,t.[TotalPageSizeKB]
          ,t.[TotalPageSizeKBBack]
          ,t.[TotalPageSizeKBNext]
          ,t.[UsedPageSizeKB]
          ,t.[UsedPageSizeKBBack]
          ,t.[UsedPageSizeKBNext]
          ,t.[DataPageSizeKB]
          ,t.[DataPageSizeKBBack]
          ,t.[DataPageSizeKBNext]
          ,t.[AvgDataPageSizeKB]
          ,t.[AvgUsedPageSizeKB]
          ,t.[AvgTotalPageSizeKB]
          ,t.[DiffDataPageSizeKB]
          ,t.[DiffUsedPageSizeKB]
          ,t.[DiffTotalPageSizeKB]
      FROM d
      inner join [SRV].[srv].[TableStatistics] as t on d.[Date]=t.[Date]
      where t.[CountRowsBack] is not null
    	and	t.[CountRowsNext] is not null
    GO

    Here I would like to draw your attention to Diff. If it is higher than 0, it means that the table grows faster every day.

    The collection is supposed to be made once every 24 hours.

    In the same way, we can automate the collection of the file growth of all the databases using the following view:

    USE [DATABASE_NAME]
    GO
    
    SET ANSI_NULLS ON
    GO
    
    SET QUOTED_IDENTIFIER ON
    GO
    
    select	t2.[DB_Name] as [DBName]
    			,t1.FileId 
    			,t1.NumberReads
    			,t1.BytesRead
    			,t1.IoStallReadMS
    			,t1.NumberWrites
    			,t1.BytesWritten
    			,t1.IoStallWriteMS 
    			,t1.IoStallMS
    			,t1.BytesOnDisk
    			,t1.[TimeStamp]
    			,t1.FileHandle
    			,t2.[Type_desc]
    			,t2.[FileName]
    			,t2.[Drive]
    			,t2.[Physical_Name]
    			,t2.[Ext]
    			,t2.[CountPage]
    			,t2.[SizeMb]
    			,t2.[SizeGb]
    			,t2.[Growth]
    			,t2.[GrowthMb]
    			,t2.[GrowthGb]
    			,t2.[GrowthPercent]
    			,t2.[is_percent_growth]
    			,t2.[database_id]
    			,t2.[State]
    			,t2.[StateDesc]
    			,t2.[IsMediaReadOnly]
    			,t2.[IsReadOnly]
    			,t2.[IsSpace]
    			,t2.[IsNameReserved]
    			,t2.[CreateLsn]
    			,t2.[DropLsn]
    			,t2.[ReadOnlyLsn]
    			,t2.[ReadWriteLsn]
    			,t2.[DifferentialBaseLsn]
    			,t2.[DifferentialBaseGuid]
    			,t2.[DifferentialBaseTime]
    			,t2.[RedoStartLsn]
    			,t2.[RedoStartForkGuid]
    			,t2.[RedoTargetLsn]
    			,t2.[RedoTargetForkGuid]
    			,t2.[BackupLsn]
    from fn_virtualfilestats(NULL, NULL) as t1
    inner join [inf].[ServerDBFileInfo] as t2 on t1.[DbId]=t2.[database_id] and t1.[FileId]=t2.[File_Id]
    GO

Result

In this article, we explored the example of automating data collection about the size and growth of all SQL Server tables and files of all the databases. It provides us with the complete control over changing the size of both the database files and its tables, as well as taking timely measures to reduce the table or file, to increase the data storage device or to split information into multiple data storage devices.

 

Tags: , Last modified: September 23, 2021
Close