Written by 17:47 SQL Server

SUBSTRING Command in SQL: A Primer

Сodingsight - SQL Substring

Similar to my article regarding the SQL Server STUFF command, today we will be examining the SUBSTRING SQL server command. It is similar to the STUFF command in regards to the parameters it accepts.

The parameters accepted by SUBSTRING are as follows:

SUBSTRING ( STRING_VALUE , STARTING_POSITION, LENGTH)

The SUBSTRING command is prevalent in most major standard databases like SQL Server, MySQL, Oracle, and Postgres.

Let’s explore the SUBSTRING function with several examples.

Section 1 – Simple SUBSTRING Examples

In this first example, a string value of “ABC” will be passed to the SUBSTRING function. Each call will have the starting position argument incremented by 1. The length argument will stay 1 for each call.

SELECT SUBSTRING('ABC',1,1); --RETURNS : A

SELECT SUBSTRING('ABC',2,1); --RETURNS : B

SELECT SUBSTRING('ABC',3,1); --RETURNS : C

From these examples we can see the general functionality of the SUBSTRING function, each increment in the starting position shifts the letter returned. In execution #1 the value is A, #2 the value is B and in #3 the value is C.

An interesting feature of the SUBSTRING function is that the length parameter is non-breaking if it exceeds the length of the value you’re parsing. For example, the value “1000” has a length of 4 digits. If parsed with the following query specifying a length of 6, SUBSTRING returns the original string.

SELECT SUBSTRING('1000',1,6); --RETURNS : 1000

Section 2 – Parsing Data with the SUBSTRING Function

For the next example, let’s assume that we have a database for keeping data about used cars. However, instead of using a multi-table relational data model, the database contains a primary key and an SKU or stock-keeping unit only. This SKU can be used to find multiple fields around the features of a car by using the following key:

TOY-CEL-R-1990-150
-------------------------------------------------------
MAKE-MODEL-COLOR-YEAR-MILEAGE

*In these examples mileage is represented at a scale of 1/1000, i.e. 100 = 100,000

IF NOT EXISTS (SELECT * FROM sys.databases WHERE name = 'CODESIGHT_DEMOS')
BEGIN
  CREATE DATABASE CODESIGHT_DEMOS;
END;
GO

USE CODESIGHT_DEMOS;

IF OBJECT_ID('USEDCARS') IS NOT NULL DROP TABLE USEDCARS
CREATE TABLE USEDCARS
(
ID INT IDENTITY(1,1) PRIMARY KEY,
CAR_SKU VARCHAR(30)
)

--MAKE--MODEL--COLOR--YEAR--MILEAGE *1000
INSERT INTO USEDCARS
VALUES('TOY-CEL-R-1990-150')

INSERT INTO USEDCARS
VALUES('JEP-WRG-W-2019-15')

INSERT INTO USEDCARS
VALUES('FRD-ESC-G-1998-80')

SELECT * FROM USEDCARS

The output of the used car table is:

IDCAR_SKU
1TOY-CEL-R-1990-150
2JEP-WRG-W-2019-15
3FRD-ESC-G-1998-80

Using the SUBSTRING function, these values can be parsed by assigning the correct starting position and length parameters to the CAR_SKU field.

SELECT SUBSTRING(CAR_SKU,1,3) AS MAKE,
SUBSTRING(CAR_SKU,5,3) AS MODEL,
SUBSTRING(CAR_SKU,9,1) AS COLOR,
SUBSTRING(CAR_SKU,11,4) AS YEAR,
SUBSTRING(CAR_SKU,16,3) * 1000 AS MILEAGE
FROM USEDCARS;
MAKEMODELCOLORYEARMILEAGE
TOYCELR1990150000
JEPWRGW201915000
FRDESCG199880000

Section 3 – Abstracting the SUBSTRING SQL Logic to Views

Using the SQL SUBSTRING function, we were able to parse out the various features of each vehicle in our database. Let’s take this 1 step farther and translate that output into more meaningful data by building a view on the SUBSTRING query.

In a relational data model, this will be achieved by using joins to incorporate the CASE WHEN statement to translate the values.

CREATE VIEW  VW_CAR_INVENTORY

AS

SELECT 

CASE WHEN SUBSTRING(CAR_SKU,1,3) = 'TOY' THEN 'TOYOTA'
     WHEN SUBSTRING(CAR_SKU,1,3) = 'JEP' THEN 'JEEP'
	 WHEN SUBSTRING(CAR_SKU,1,3) = 'FRD' THEN 'FORD'
	 END AS MAKE,

CASE WHEN SUBSTRING(CAR_SKU,5,3) = 'CEL' THEN 'CELICA'
     WHEN SUBSTRING(CAR_SKU,5,3) = 'WRG' THEN 'WRANGLER'
	 WHEN SUBSTRING(CAR_SKU,5,3) = 'ESC' THEN 'ESCAPE'
    END AS MODEL,

CASE WHEN SUBSTRING(CAR_SKU,9,1) = 'R' THEN 'RED'
     WHEN SUBSTRING(CAR_SKU,9,1) = 'W' THEN 'WHITE'
	 WHEN SUBSTRING(CAR_SKU,9,1) = 'G' THEN 'GREEN'
    END AS COLOR,

SUBSTRING(CAR_SKU,11,4) AS YEAR,

SUBSTRING(CAR_SKU,16,3) * 1000 AS MILEAGE

FROM USEDCARS
MAKEMODELCOLORYEARMILEAGE
TOYOTACELICARED1990150000
JEEPWRANGLERWHITE201915000
FORDESCAPEGREEN199880000

Section 4 – SUBSTRING Alternatives for Integer Values

The SUBSTRING function is explicitly for string values, so the following query attempting to parse an integer value with SUBSTRING will fail.

SELECT SUBSTRING(1000,1,1); 

--Msg 8116, Level 16, State 1, Line 78 Argument data type int is invalid for argument 1 of substring function.

The alternative when parsing integer values is the LEFT or RIGHT function, although this approach loses some of the flexibility presented with the SUBSTRING function.

SELECT LEFT(10000,1); --RETURNS : 1
SELECT RIGHT(1000,1) ;  --RETURNS : 0

Situationally you could also explicitly cast the integer value as a string and substring the casted value:

SELECT (SUBSTRING(CAST(1000 AS VARCHAR(4)),1,1)) AS CASTED_SUBSTRING -- RETURNS : 1

Summary

This article covered the parameters required to use the SUBSTRING function which is a strong argument for a starting position and a length. With a non-relation data model, we used the SUBSTRING method to extract the car features from a 1 field table and construct a view with some transformation logic built into it.

The SUBSTRING method is a great function to know when performing ETL or parsing the data inside a database. It can be used to extract meaningful key data points from the already existing data inside of your database.

Tags: , , Last modified: March 15, 2022
Close