Storing Binary Data Types in SQL Server

Introduction

Daily work rarely requires storing binary data directly to database columns. However, it is very useful in some cases.

Contrary to popular opinion, byte arrays can help with significantly more than just storing large binary objects (documents, multimedia, etc.). Also, they can be used to store hash values and sample data for faster searching/high-level analysis. Or, they might contain bytes that are in ON/OFF states in some electronic relay. As soon as we start to think about hardware data stored in databases, applications become more apparent.

CodingSight - Storing Binary Data Types in SQL Server

Unlike VARCHAR data types where you have to take care of collation and code pages, binary data types are series of bytes (sometimes called byte arrays in object-oriented programming languages), either fixed (BINAR ) or variable (VARBINAR ) in size.

To understand details on binary types better, we will first do a short introduction to hexadecimal numbers, which are how this data is internally stored.

Hexadecimal Numbers

If you skipped the class on hexadecimal numbers in high school, a good introduction can be found on a dedicated Wikipedia page. There, you can get familiar with this numbering format.

For understanding this article, it is important to know that SQL Server Management Studio displays binary data in hexadecimal format with the prefix “0x”.

There is no big difference between hexadecimal and decimal numbering format. The hexadecimal format uses base-16 signs (0-9 and A-F) instead of the base-10 of decimal notation (0-9). Values A-F are numbers 10-15 from decimal numbering notation.

This is why we use hexadecimal notation. Since one byte contains 8 bits, allowing for 256 discrete integer numbers, it is beneficial to present bytes in hex format. If we are targeting range 0-256, it is represented as 00-FF in hexadecimal notation. Prefix in Management studio is for the reading clarity, to stress that we are showing hexadecimal numbers and not the default decimal values.

Hexadecimal Numbers

Manual Value Conversion Using CAST()

Since binary values are, in their strict sense, strings, we can convert them from number to character format using CAST or CONVERT SQL methods.

Have a look at the example that uses the CAST method:

SELECT CAST('HexTest' AS VARBINARY);                 
SELECT CAST(0x48657854657374 AS VARCHAR);  
Manual Value Conversion Using CAST()

Using Conversion Styles with CONVERT()

CONVERT() method, unlike CAST(), has an additional option to use conversion styles.

Conversion styles are templates for rules used in the conversion process. CONVERT() is mostly used in date/time operations. When the data is in a non-standard format, it can be used in binary value conversion. Note that binary data types do not support automatic data type conversion without proper parameter values in place. Then SQL Server will throw an exception.

If we look at the CONVERT() method definition, we see it takes two mandatory and one optional parameter.

The first parameter is the target data type, and the second one is the value from which we would like to convert from. The third parameter, in our case, can value 1 or 2. Value 1 means that CONVERT() should consider the input string as a hexadecimal string in text format, and value 2 means that you want to skip the 0x prefix.

Have a look at the examples displaying that behavior:

DECLARE @MyString NVARCHAR(500)='0x48657854657374';

SELECT CONVERT(VARBINARY(MAX), @MyString );    
-- String value is directly converted to binary value - we wanted is to change the datatype 
-- and not convert "0x.." prefix to the hexadecimal value

SELECT CONVERT(VARBINARY(MAX), @MyString, 1);  
Using Conversion Styles with CONVERT()

Difference Between BINARY and VARBINARY

With binary data, we can use two kinds of data types – fixed size and variable size. Or, they are BINARY and VARBINARY.

If we use the fixed-size variable, content is always extended to its defined size with padding of 0x00… – there is no padding in variable length. Using sum operation on these variables is no addition executed. Values are appended one to another. The same is like with string types.

To demonstrate prefix behaviour we will use two simple examples with the binary sum operation:

SELECT CAST('T' AS BINARY(1)) + CAST('e' AS BINARY(1)) + CAST('s' AS BINARY(1)) + CAST('t' AS BINARY(1)); 
SELECT CAST('T' AS BINARY(2)) + CAST('e' AS BINARY(2)) + CAST('s' AS BINARY(2)) + CAST('t' AS BINARY(2)); 
Difference Between BINARY and VARBINARY

Each value in the BINARY(2) statement is postfixed with 0x00 values.

Using Integer Values with Binary Data Types

SQL Server comes with in-built methods to convert between numeric types and binary types. We demonstrated this where we turned the Test string into the binary format, and then back to the BIGINT format, without using the ASCII() function:

SELECT CAST('Test' AS VARBINARY(MAX));
SELECT CAST(CAST('Test' AS VARBINARY(MAX)) AS BIGINT);
Using Integer Values with Binary Data Types

Simple Conversion Between Character and Hexadecimal Values

To convert between charter and hexadecimal values, it is useful to write a custom function that would perform this operation consistently. One possible approach is below:

-- DROP FUNCTION dbo.FN_CH_HEX(@InputValue CHAR(1)

CREATE OR ALTER FUNCTION dbo.FN_CH_HEX(@InputValue CHAR(1))
RETURNS CHAR(2)
AS
BEGIN
    RETURN(CONVERT(CHAR(2), CAST(@InputValue AS BINARY(1)), 2));
END;

-- SELECT dbo.FN_CH_HEX('A') 
Simple Conversion Between Character and Hexadecimal Values

This time we used the parameter value 2 in the CONVERT() function. It shows that this operation should not be mapped to ASCII code and be displayed without the 0x… prefix.

Example Case Study: Storing Photos in SQL Server Binary Type

We usually approach this problem by implementing a custom windows/web application or writing a custom SSIS package with C# code. In this example, I will use the SQL language only. It can be more useful if you don’t have access to the database front-end tools.

To store pictures in the database table, we need to create a table that will hold them. The table must include columns holding the picture name and picture binary content:

-- DROP TABLE T_BINARY_DATA 

CREATE TABLE T_BINARY_DATA 
(
   PICTURE_ID INT IDENTITY(1,1) PRIMARY KEY,
   PICTURE_NAME NVARCHAR(100),
   PICTURE_FILE_NAME NVARCHAR(500),
   PICTURE_DATA VARBINARY(MAX)
)
GO

To enable loading of binary data to SQL Server instance, we need to configure the server with two options:

  • Enable the OLE Automation Procedures option
  • Assing the BulkAdmin privilege to the user executing the image import process.

The script below will do the task under the high-privileged user of SQL Server instance:

USE MASTER
GO
EXEC sp_configure 'show advanced options', 1; 
GO
RECONFIGURE; 
GO
EXEC sp_configure 'Ole Automation Procedures', 1; 
GO
RECONFIGURE; 
GO
-- Add 'bulkadmin' to the correct user
ALTER SERVER ROLE [bulkadmin] ADD MEMBER [NT AUTHORITY\SYSTEM] 
GO 

Now we can start to write the import and export procedure:

-- DROP PROCEDURE dbo.proc_ImportBinary 
-- DROP PROCEDURE dbo.proc_ExportBinary 

CREATE PROCEDURE dbo.proc_ImportBinary 
(
     @PICTURE_NAME      NVARCHAR(100)
   , @FOLDER_PATH       NVARCHAR(500)
   , @PICTURE_FILE_NAME NVARCHAR(500)
   )
AS
BEGIN
   DECLARE @OutputPath NVARCHAR(4000);
   DECLARE @TSQLDYN    NVARCHAR(4000);
   
   SET @OutputPath = CONCAT(@OutputPath,'\',@PICTURE_FILE_NAME)
   SET @TSQLDYN = 'INSERT INTO T_BINARY_DATA(PICTURE_NAME,PICTURE_FILE_NAME,PICTURE_DATA) '
                + 'SELECT ' + '''' + @PICTURE_NAME + '''' + ',' + '''' + @PICTURE_FILE_NAME + '''' + ', * ' 
				+ '  FROM Openrowset( Bulk ' + '''' + @OutputPath + '''' + ', Single_Blob) as img'

   EXEC (@TSQLDYN)   
END
GO


CREATE PROCEDURE dbo.proc_ExportBinary (
     @PICTURE_NAME      NVARCHAR(100)
   , @FOLDER_PATH       NVARCHAR(500)
   , @PICTURE_FILE_NAME NVARCHAR(500)
   )
AS
BEGIN
   DECLARE @Binary     VARBINARY (max);
   DECLARE @OutputPath NVARCHAR(4000);
   DECLARE @Obj        INT
 
   SELECT @Binary = (
         SELECT CONVERT(VARBINARY(max), PICTURE_DATA , 1)
           FROM T_BINARY_DATA 
          WHERE PICTURE_NAME  = @PICTURE_NAME
         );
 
   SET @OutputPath = CONCAT(@FOLDER_PATH, '\', @PICTURE_FILE_NAME);
         
    BEGIN TRY
     EXEC sp_OACreate 'ADODB.Stream', @Obj OUTPUT;
     EXEC sp_OASetProperty @Obj ,'Type',1;
     EXEC sp_OAMethod @Obj,'Open';
     EXEC sp_OAMethod @Obj,'Write', NULL, @Binary;
     EXEC sp_OAMethod @Obj,'SaveToFile', NULL, @OutputPath, 2;
     EXEC sp_OAMethod @Obj,'Close';
     EXEC sp_OADestroy @Obj;
    END TRY
    
 BEGIN CATCH
  EXEC sp_OADestroy @Obj;
 END CATCH
 
   SET NOCOUNT OFF
END
GO

Now we can use these procedures from any client application in a very simple manner.

Let’s imagine we have pictures in the C:\Pictures\Inp folder. To load these pictures, we need to execute the following code:

-- Load picture to table row
exec dbo.proc_ImportBinary ‘MyPic’, ‘C:\Pictures\Inp’, ‘MyPic.jpg’ 

In the similar manner we can export data, to the C:\Pictures\Out folder:

exec dbo.proc_ExportBinary ‘MyPic’, ‘C:\Pictures\Out’, ‘MyPic.jpg’

Conclusion

The choice between binary objects or alternative means of storing binary data in a database (for instance, storing file paths in a database and retrieving them from the disk/cloud storage) depends on multiple factors.

The general rule is that if the file is less than 256 kilobytes in size, you should store it in the VARBINARY columns. If binary files are larger than one megabyte, you should store them on the file system. If you have FILESTREAM available in SQL Server versions 2008 and above, it keeps the files under transactional control as a logical part of the database.

If you decide to store binary files in the SQL Server table, use a separate table only for binary content. Then you can optimize its storage location and access the engine, probably using separate Files and FileGroups for this table. The detailed information is available in the official Microsoft article.

In any case, test both approaches and use the one that suits your needs best.

Josip Saban
Latest posts by Josip Saban (see all)

Josip Saban

An experienced database developer and software manager with more than 15 years in the field, both in corporate and start-up environments. Josip has strong problem-solving skills and a proven track-record of successfully implementing strategic solutions to meet changing business needs, combined with the ability to build key relationships and communicate well with stakeholders, most of it in finance industry. Focused on Microsoft tool stack, specifically SQL Server, especially in data warehouse modelling and development. from version 2000 to latest versions. He is interested in database architect and engineering management positions, with strong affinity to data-driven industries.