Storing Files in SQL Database Using FILESTREAM – Part 2

Total: 6 Average: 3.7

In my previous article, I described how to configure FILESTREAM in SQL Server, create FILESTREAM-enabled database and tables. Moreover, I demonstrated how to insert and delete data from the FILESTREAM table.

In this article, I am going to demonstrate how to insert multiple files in a FILESTREAM table using T-SQL.

In this demo, we are going to use the PowerShell module to populate the list of files and store it in the SQL table.

Pre-requisite Checks and Useful Queries to Get FILESTREAM Configurations

For this demo, I am using:

  1. SQL Version: SQL Server 2017
  2. Database: FileStream_Demo database
  3. Tools: PowerShell, SQL Server Management Studio, SQL Server Data Tools.

In my previous article, I have created a database named FileStream_Demo. FILESTREAM feature is enabled On SQL Server instance, and the database has the T-SQL and Win32 access level permission.

To review the FILESTREAM access level settings, execute the following query:

Use FileStream_Demo
Go
SELECT Host_Name() as 'Server Name' ,NAME as 'Database Configuration',
       CASE
         WHEN value = 0 THEN 'FILESTREAM is Disabled'
         WHEN value = 1 THEN
         'Enabled for T-SQL'
         WHEN value = 2 THEN
         'Enabled for T-SQL and Win32'
       END AS 'FILESTREAM Option'
FROM   sys.configurations
WHERE NAME = 'filestream access level'
Go

The output of the query is as follows:

To review Database Files and location of the FILESTREAM data container, execute the following query:

Use FileStream_Demo
Go
SELECT Host_Name() as 'Server Name',NAME As 'Filegroup Name',        type_desc as 'Filegroup Type',       physical_name as 'Database File Location'  FROM   sys.database_files

The output of the query is as follows:

Insert Multiple Files using SQL Script

To insert multiple files in an SQL table:

  1. Create two SQL tables named, Document_List and  Document_Content. The Document_Content table has the FileStreamCol column with the VARBINARY(MAX) data type and the FILESTREAM column attribute. The content of files within the directory will be converted in VARBINARY(MAX) and stored in the FileStreamCol column of the Document_Content table.
  2. Create a dynamic SQL query that iterates through the Document_Location table to get the path of files and insert files in the Document_Content tables.
  3. Wrap the entire T-SQL code in a stored procedure.

Create SQL Tables

Firstly, create a global temporary table to store the details of the files. For this, execute the following query in the FileStream_Demo database.

USE [FileStream_Demo]
GO
Create table Document_List
(
    ID int identity(1,1) Primary Key clustered,
    fullname Varchar(max),
    name Varchar(max),
    attributes Varchar(250),
    CreationTime datetime,
    LastAccessTime datetime,
    LastWriteTime datetime,
    Length numeric(10,2)
)

Moreover, create a table to store the Files in the table. Execute the following query to create a physical table:

USE [FileStream_Demo]
GO
CREATE TABLE [dbo].[Document_Content ](
	[ID] [uniqueidentifier] ROWGUIDCOL  NOT NULL,
	[RootDirectory] [varchar](max) NULL,
	[FileName] [varchar](max) NULL,
	[FileAttribute] [varchar](150) NULL,
	[FileCreateDate] [datetime] NULL,
	[FileSize] [numeric](10, 5) NULL,
	[FileStreamCol] [varbinary](max) FILESTREAM  NULL,
UNIQUE NONCLUSTERED 
(
	[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY] FILESTREAM_ON [Dummy-Documents]
GO

To improve the performance of the select query, add a clustered index on the FileName and FileType columns of the Document_Content table. For this, execute the following code:

USE [FileStream_Demo]
GO
CREATE CLUSTERED INDEX [ICX_Document_Content_FileName] ON [dbo].[Document_Content]
(
	[FileName] ASC,
	[FileType] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] FILESTREAM_ON [Dummy-Documents]
GO

Create PowerShell Module to Populate File Details

Once the tables are created, execute the PowerShell script to insert details of files in the Document_List table.  The PowerShell script runs within the T-SQL stored procedure, hence to write the entire code in the SQL procedure, you need to create a PowerShell function. The directory path is a mandatory input parameter of the function. The script gets the list of files, resides in the directory parameter used to execute the PowerShell function.

The code is as follows:

  1. Create a function and declare mandatory input parameters. The code is as follows:
    function global:getFileList
    {
    param(
        [Parameter(Position=0,mandatory=$true)]
        [string[]] $FilePath
    )
  2. Construct a string that has an “Insert” query. See the following code:
    $sqltmplt=@'
    INSERT INTO ##Document_List(
    	fullname,	name,	attributes,	CreationTime,	LastAccessTime,	LastWriteTime,
        Length
    ) 
    VALUES (
    	'{0}',
    	'{1}',
    	'{2}',
    	'{3}',
    	'{4}',
    	'{5}',
        '{6}'
    )
    '@
  3. Get the list of files using Get-ChildItem -Recurse command format the output of the command. The code is as follows:
    Get-ChildItem -Recurse $Directorypath | 
    select @{Label="FullName";Expression={split-path($_.FullName)}},	name,	attributes,	CreationTime,	LastAccessTime,	LastWriteTime,@{Label="Length";Expression={$_.Length / 1MB -as [int] }}
  4. Using the For-Each loop, store the output in the Document_content table. To run the query on the FileStream_Demo database, the script uses Invoke-Sqlcmd. The code is as follows:
    ForEach-Object {
    		$SQL = $sqltmplt -f $_.FullName, $_.name, $_.attributes, $_.CreationTime, $_.LastAccessTime, $_.LastWriteTime,$_.Length	
    		Invoke-sqlcmd -Query $SQL -ServerInstance TTI412-VM\SQL2017 -database FileStream_Demo
    	}

Entire code of the PowerShell function will look like as the following:

function global:getFileList
{
param(
    [Parameter(Position=0,mandatory=$true)]
    [string[]] $FilePath
)
Write-Output "Inserting files"
$sqltmplt=@'
INSERT INTO dbo.Document_List(
	fullname,	name,	attributes,	CreationTime,	LastAccessTime,	LastWriteTime,
    Length    
) 
VALUES (
	'{0}',
	'{1}',
	'{2}',
	'{3}',
	'{4}',
	'{5}',
    '{6}'
)
'@
Invoke-Sqlcmd -Query "Truncate Table Document_List" -ServerInstance TTI412-VM\SQL2017 -database FileStream_Demo
Get-ChildItem -Recurse $FilePath  | 
select @{Label="FullName";Expression={split-path($_.FullName)}},name,attributes,	CreationTime,	LastAccessTime,	LastWriteTime,@{Label="Length";Expression={$_.Length / 1MB -as [int] }}|
	ForEach-Object 
{
$SQL = $sqltmplt -f $_.FullName, $_.name,$_.attributes, $_.CreationTime, $_.LastAccessTime, $_.LastWriteTime,$_.Length		

        Invoke-sqlcmd -Query $SQL -ServerInstance TTI412-VM\SQL2017 -database FileStream_Demo
}
Write-Output "File Inserted successfully... Below Is a list of files."
}

To use the PowerShell function within the SQL Stored procedure, we need to register the above script as PowerShell Module. For this, create a directory named getFileList at C:\Windows\System32\WindowsPowerShell\v1.0\Modules. To register any PowerShell script as a module, the script and directory names should be the same. Hence save the above script asgetFileList.psm1 in the getFileList directory.

Now, when we executed the PowerShell script from T-SQL, we need to import the getFileList module. For this, add the following code in the PowerShell profile. PowerShell profile will be created at the C:\Windows\System32\WindowsPowerShell\v1.0 location.

import-module getFileList

If the profile does not exist, execute the following command to create a profile.

New-Item -Type File -Path $PROFILE.AllUsersAllHosts -Force

Create a Stored Procedure to Import Files

Once we store the file list and information in SQL Table, we will insert the files into the Document_Content table.

To do this task effectively, create a parameterized stored procedure named sp_Insert_Documents. It will use the FileLocation parameter that is of the varchar datatype. The procedure populates the list of files from the location given in the parameter and inserts all the files in the Document_Content table.

Step 1: Change the configuration parameter.

To run the PowerShell command using T-SQL, enable the xp_cmdshell configuration option. It is an advanced configuration option; hence before enabling xp_cmdshell, enable the Show advanced option configuration option. For this, execute the following T-SQL commands in sequence.

use master
go
exec sp_configure 'show advanced option',1
reconfigure with override

Exec sp_configure 'xp_cmdshell',1
Reconfigure with override

Step 2: Use PowerShell script to populate the file list within T-SQL code

To execute a PowerShell script using T-SQL, use the xp_cmdshell procedure. It executes the PowerShell command, which populates a list of files and its details in the Document_List table.
The code is as follows:

declare @PSScript varchar(2500)
set @PSScript= 'powershell.exe getFileList ''' + @FileLoc +'''' 
exec xp_cmdshell @PSScript

Step 3: Create a dynamic SQL query to get the file location

Create a dynamic SQL query which iterates through the Document_List table, loads the content of the file, located at the path given in the FullName column, converts it into the VARBINAR(MAX) column and inserts it into the Document_Content table. Along with File, the script inserts File Name, File attribute, File Size, and File Type into the Document_Content table. The script uses the case expression to determine the file type.

The code is as follows:

SET @FileCount = (SELECT Count(*)
                  FROM   Document_List)
                  WHILE ( @i < @FileCount )       
BEGIN
   SET @FileName = (SELECT TOP 1 name
                    FROM   Document_List) 

   /* Concate DirectoryLocation and FileName column to generate FQDN. */    
   SET @FileName = (SELECT TOP 1  Name 
                        FROM   Document_List) 
   SET @FileLocation = (SELECT TOP 1  fullname 
                        FROM   Document_List where name= @FileName)
   SET @FileAttribute = (SELECT TOP 1  attributes 
                        FROM   Document_List where name= @FileName)
   SET @FileCreateDate = (SELECT TOP 1  CreationTime 
                        FROM   Document_List where name= @FileName) 
   SET @FileSize = (SELECT TOP 1  Length 
                        FROM   Document_List where name= @FileName) 
   SET @FileType = (SELECT TOP 1  CASE
               WHEN ( name     LIKE '%jpg%' )
                     OR ( name LIKE '%png%' )
                     OR ( name LIKE '%jpg%' )
                     OR ( name LIKE '%bmp%' ) THEN 'Images'

				WHEN ( name  LIKE '%txt%' )THEN 'Text Files'
                     When ( name LIKE '%xls%' )THEN 'Text Files'
                     When ( name LIKE '%doc%' ) THEN 'Text Files'
               ELSE 'Other Files'
             END                AS 'File Type'
                        FROM   Document_List where name= @FileName) 
   SET @SQLText = 'Insert into Document_Content (ID, RootDirectory, FileName, FileAttribute,FileCreateDate,FileSize,FileType,FileStreamCol)      
   Select NEWID(),
   ''' + @FileLocation + ''',
   ''' + @FileName + ''',
   ''' + @FileAttribute + ''',
   ''' + @FileCreateDate + ''',
   ''' + @FileSize + ''',
   ''' + @FileType + ''',
    bulkColumn
    from Openrowset(Bulk '''+ @FileLocation + ''', Single_Blob) 
   as tb'
   EXEC Sp_executesql @SQLText
   DELETE FROM Document_List WHERE name = @FileName
   SET @I = @I + 1
END

Step 4: Wrap up the entire SQL code in a stored procedure

Create a parameterized stored procedure named sp_Insert_Files and wrap the code in it.

The code of Stored Procedure is as follows:

use FileStream_Demo
go
Create Procedure sp_Insert_Files
@FileLoc varchar(max)
as 
begin 
DECLARE @FileCount INT
DECLARE @I INT = 0
DECLARE @FileName NVARCHAR(max)
DECLARE @SQLText NVARCHAR(max)
declare @PSScript varchar(2500)
DECLARE @FileLocation NVARCHAR(max)
declare @FileAttribute varchar(50)
declare @FileCreateDate varchar(50)
declare @FileSize varchar(10)
declare @FileType varchar(20)
set @PSScript= 'powershell.exe getFileList ''' + @FileLoc +'''' 
exec xp_cmdshell @PSScript
SET @FileCount = (SELECT Count(*)
                  FROM   Document_List)
                  WHILE ( @i < @FileCount )       
BEGIN
   /* Get the File Name from Document_Name table */
   SET @FileName = (SELECT TOP 1 name
                    FROM   Document_List) 

   /* Populate File details from Document_List table*/    
   SET @FileName = (SELECT TOP 1  Name 
                        FROM   Document_List) 
   SET @FileLocation = (SELECT TOP 1  fullname 
                        FROM   Document_List where name= @FileName)
   SET @FileAttribute = (SELECT TOP 1  attributes 
                        FROM   Document_List where name= @FileName)
   SET @FileCreateDate = (SELECT TOP 1  CreationTime 
                        FROM   Document_List where name= @FileName) 
   SET @FileSize = (SELECT TOP 1  Length 
                        FROM   Document_List where name= @FileName) 
/*Determine type of file*/   
SET @FileType = (SELECT TOP 1  CASE
               WHEN ( name     LIKE '%jpg%' )
                     OR ( name LIKE '%png%' )
                     OR ( name LIKE '%jpg%' )
                     OR ( name LIKE '%bmp%' ) THEN 'Images'

				WHEN ( name  LIKE '%txt%' )THEN 'Text Files'
                     When ( name LIKE '%xls%' )THEN 'Text Files'
                     When ( name LIKE '%doc%' ) THEN 'Text Files'
               ELSE 'Other Files'
             END                AS 'File Type'
                        FROM   Document_List where name= @FileName) 
   SET @SQLText = 'Insert into Document_Content (ID, RootDirectory, FileName, FileAttribute,FileCreateDate,FileSize,FileType,FileStreamCol)      
   Select NEWID(),
   ''' + @FileLocation + ''',
   ''' + @FileName + ''',
   ''' + @FileAttribute + ''',
   ''' + @FileCreateDate + ''',
   ''' + @FileSize + ''',
   ''' + @FileType + ''',
    bulkColumn
    from Openrowset(Bulk '''+ @FileLocation + ''', Single_Blob) 
   as tb'
   EXEC Sp_executesql @SQLText
   DELETE FROM Document_List WHERE name = @FileName
   SET @I = @I + 1
END
End

Insert Files using Stored Procedure

Now test the stored procedure. I added a few files to the E:\Files directory. Insert the files into the SQL table by executing the stored procedure. The code is as follows:

use FileStream_Demo
go
exec sp_Insert_Files 'E:\Files'

Let’s verify that files have been copied to table. For this, execute the following code:

select 
    RootDirectory as 'File Location',
    FileName as 'File Name',
    FileAttribute as 'Attribute',
    FileCreateDate as 'Attribute',
    FileSize as 'File Size',
    FileType as 'File Type',
    FileStreamCol as 'File Content'
from Document_Content where FileType='Images'

The output of the query is as follows:

To access the file on the FILESTREAM data store using Win32 API, use the Pathname () method of FILESTREAM. With the Pathname () method, we can identify the logical path to detect the file in the FILESTREAM data store uniquely.

The code is as follows:

select 
    RootDirectory as 'File Location',
    FileName as 'File Name',
    FileAttribute as 'Attribute',
    FileCreateDate as 'Attribute',
    FileSize as 'File Size',
    FileType as 'File Type',
  FileStreamCol.PathName() AS FilePath
from Document_Content where FileName='RowDesign.png'

The output of the query is as follows:

Let’s navigate to the FILESTREAM data container (E:\Dummy-Documents) to verify that files have been inserted. See the following screenshot:

As you can see, all the files have been inserted in SQL tables and the FileStream container.

Summary

In this article, I have covered:

  1. Useful query to verify the pre-requisites of the FILESTREAM feature.
  2. How to register a PowerShell function as a module.
  3. Explain the PowerShell code to insert file list in SQL table using PowerShell script.
  4. Explained the code of stored procedure to insert multiple Files in SQL Table.
  5. Useful queries to gather a list of documents, stored in FILESTREAM container.

In future articles, I am going to explain how to backup and restore FILESTREAM enabled database.

Stay Tuned!

Nisarg Upadhyay