Storing Files in SQL Database Using FILESTREAM – Part 1

Microsoft introduced FILESTREAM in SQL Server 2008 to store and manage unstructured files more effectively. Before FILESTREAM was introduced, the following approaches were used to store such data in SQL Server:

  1. Store the unstructured files in a VARBINARY or IMAGE column of a SQL Server table. This approach maintains transactional consistency and reduces file management complexity, but when the client application reads data from the SQL table, it uses SQL Server memory, which leads to poor performance.
  2. Instead of storing the entire file in the SQL table, store only the physical location of the unstructured file. This approach gives a huge performance improvement, but it does not ensure transactional consistency, and file management is difficult too. Both approaches are sketched below.
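For illustration, here is a minimal T-SQL sketch of both approaches; the table and column names are hypothetical, not from the original article:

-- Approach 1: the file content itself lives in a VARBINARY(MAX) column
CREATE TABLE dbo.Documents
(
    Id      INT IDENTITY(1,1) PRIMARY KEY,
    [Name]  NVARCHAR(255) NOT NULL,
    Content VARBINARY(MAX) NOT NULL
);

-- Approach 2: only the file's physical location is stored; the file stays on disk
CREATE TABLE dbo.DocumentPaths
(
    Id       INT IDENTITY(1,1) PRIMARY KEY,
    [Name]   NVARCHAR(255) NOT NULL,
    FilePath NVARCHAR(400) NOT NULL  -- e.g. a UNC path to the unstructured file
);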

Read More

SQL Database Administrator Interview Tips

The interview is the most significant step of SQL database administrator recruitment. During the interview, a technical person or interviewer asks questions about the technical experience of a candidate. The purpose of these questions is to understand the candidate’s experience with SQL Server, and the answers will measure your knowledge and experience. If you cannot prove your experience in this interview, the decision on your employment will most probably be negative. First of all, you should stay calm during the interview; this advice is the most significant key to success. The first question of an interview is usually about your past experience and skills. In answering it, you should describe your previous and current job responsibilities and skills. Make sure you demonstrate all of your knowledge and experience to the interviewer, because this introduction makes the first impression on your potential employer. Read More

Moving Existing Table From Primary Filegroup to Different Filegroup

In this article, I am going to explain how to move a table from the Primary filegroup to a Secondary filegroup. First, let’s understand what data files, filegroups, and filegroup types are.

Database Files and Filegroups

When SQL Server is installed on a server, it creates a Primary data file and a Log file to store data. The Primary data file stores data and database objects like tables, indexes, stored procedures, etc. Log files store the information required to recover transactions. Data files can be grouped together into filegroups, as sketched below.
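As a preview of the technique covered in the article, here is a minimal T-SQL sketch; the database name, file path, and table names are hypothetical. A table can be moved by rebuilding its clustered index on the target filegroup:

-- Create a secondary filegroup and add a data file to it
ALTER DATABASE SampleDB ADD FILEGROUP Secondary;
ALTER DATABASE SampleDB
    ADD FILE (NAME = SampleDB_Secondary,
              FILENAME = 'D:\SQLData\SampleDB_Secondary.ndf')
    TO FILEGROUP Secondary;

-- Move the table by rebuilding its clustered index on the new filegroup
CREATE CLUSTERED INDEX IX_Orders_Id
    ON dbo.Orders (Id)
    WITH (DROP_EXISTING = ON)
    ON Secondary;

Read More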

Move Datafiles in SQL Server

Introduction

There are a number of situations which would warrant the movement of database files or transaction log files from one volume to another on the same server. These may include:

  1. The need to format the volume, assuming it was not formatted properly when SQL Server was installed.
  2. The need to move to a new volume, assuming the limits of the underlying storage have been reached.
  3. The need to improve performance by managing I/O.
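As a preview, here is a minimal sketch of the standard approach; the database name, logical file name, and target path are hypothetical:

-- Point the logical file at its new location
ALTER DATABASE SampleDB
    MODIFY FILE (NAME = SampleDB_Data, FILENAME = 'E:\SQLData\SampleDB.mdf');

ALTER DATABASE SampleDB SET OFFLINE;
-- Move the physical .mdf file to E:\SQLData at the OS level, then:
ALTER DATABASE SampleDB SET ONLINE;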

Read More

Overview of DMV Tool using DBFS FUSE with Docker Containers on Linux Machine

This article is intended for database administrators and system administrators who plan to configure the DBFS FUSE file system on a Linux machine using Docker containers. I am going to present the step-by-step procedure for setting up and configuring the DBFS FUSE file system to access local or remote SQL database instances in CentOS Docker containers.

Introduction to Docker, DBFS, FUSE, and the DMV Tool

To make the most of this material, you’ll need some knowledge of Docker containers and SQL Server. I recommend referring to the previous articles that explored these topics:

Introduction to Docker and Docker toolbox

Docker and SQL Server on Linux

Now, let’s start with the basic concepts of Docker, DBFS, FUSE, and the DMV Tool.

Docker

Docker allows us to package and run an application in a loosely isolated environment called a container. Docker is an open-platform engine that facilitates faster delivery, faster development, and faster deployment in an isolated environment.

DBFS and FUSE

DBFS (Database File System) mounts a virtual file system on top of database objects using the FUSE technique, which provides an interface between a virtual file system (VFS) and the kernel module. This setup lets us access those objects as files using the dbfs program.

A big part of SQL Server monitoring is to ensure that SQL Server is performing optimally and that it is highly available. Traditionally, to get system metrics or performance information, one could rely on GUI admin tools such as SSMS or command-line tools such as SQLCMD to run queries.

Now, there is another option. We can query the information using a new experimental Linux tool known as DBFS. It enables us to access live DMVs mounted to a virtual file system through the FUSE file-system concept. Thus, we can explore information about the database (Dynamic Management Views) using native bash commands.

There is no need to use any other tools or run SQL queries. The meta-data configuration file is used internally by the DBFS process to query the SQL Server databases.

Features:

  1. Access data in JSON or CSV format.
  2. Compatibility with grep, sed, and awk.
  3. Real-time DMV data.
  4. Cross-platform support (Windows and Linux).

Note: This tool is currently only available for Ubuntu, Red Hat, and CentOS. SUSE coming soon!

DMV Tool

The DMV tool is a command-line, cross-platform tool that provides access to live data by mounting SQL query output files or DMVs (Dynamic Management Views) to a virtual directory using FUSE, so that the file structure can be queried as an object.

DBFS Concepts

In Linux, everything is a file. FUSE (Filesystem in Userspace) is a loadable kernel module available for all major Linux distros. It provides a way for non-privileged users to create their own file systems without touching the existing kernel code. This is achieved by mounting a file system in user space through an interface known as the FUSE module.

Implementation Flow Diagram

In this section, the implementation steps are represented in a diagram.

Getting Started

See the following steps to configure the DBFS:

  1. Log in to the console with the SU account (Super User Mode).
  2. Install the wget utility. GNU Wget is a non-interactive file-retrieval utility for downloading files from the Web or via FTP.
# yum install wget
  3. To get Docker set up on CentOS, download the .rpm package for the corresponding release and install it manually using the yum command. Browse the URL to download the latest docker-ce file for the Docker version that you want to install.

  4. Download the rpm package using the wget utility:
# wget https://download.docker.com/linux/centos/7/x86_64/stable/Packages/docker-ce-18.03.0.ce-1.el7.centos.x86_64.rpm
  5. Install the Docker CE package using the following yum command:
# yum install docker-ce-18.03.0.ce-1.el7.centos.x86_64.rpm
  6. Check the docker package information using the yum info command:
# yum info docker
  7. Use the systemctl command to start, verify, and enable the Docker service:
# systemctl start docker
# systemctl status docker
# systemctl enable docker
  8. Verify that Docker is installed correctly by running the welcome image:
# docker run hello-world
  9. Download the latest CentOS image from the library. Use the docker pull command to pull the CentOS image, the community edition of the RHEL Linux distribution. We can also request a specific image using tags, or just append a colon ‘:’ and type latest to get the most current CentOS image:
# docker pull centos:latest
  10. Once the pull is complete, create a container from the downloaded image. We’ll do that with docker run:
# docker run -it centos:latest bash

Here, -it refers to an interactive terminal, centos:latest is the reference to the image, and bash defines the shell that will open when the container starts.

Now, we can see the status of all Docker containers with the command docker ps -a:

# docker ps -a
  11. Inside the container, install the wget utility:
# yum install wget
  12. Download the DBFS (dbfs-0.2.5-0) package using the wget utility:
# wget https://github.com/Microsoft/dbfs/releases/download/0.2.5/dbfs-0.2.5-0.x86_64.rpm
  13. Download the EPEL repository using the wget utility:
# wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
  14. Set up the EPEL package:
# rpm -ivh epel-release-latest-7.noarch.rpm
  15. Install the DBFS package:
# yum install dbfs-0.2.5-0.x86_64.rpm
  16. Verify the DBFS package information:
# yum info dbfs
  17. Quit the Docker container session by typing exit, then commit the Docker image. This commits all the previous changes made to the centos:latest image; the committed image will include the wget utility, the EPEL repository, and the DBFS package.
# exit
# docker commit 7af55818f007
  18. Add a tag to the committed image using the docker tag command. Let us name the tag centos:dbfs. We can see the new image using docker images:
# docker tag 88cd8b6c264890547bc0fa24047f5248f2f6c40b5147a580c18a78018f8c664c centos:dbfs
# docker images

The Docker image is ready with the DBFS configuration.

Let’s see how the DBFS FUSE program executes DMVs and custom queries remotely using the configuration file.

Getting Ready

We will use the image centos:dbfs in this section. Ensure that this image is available and has the required packages and configuration to build the FUSE file system.

The following command is executed to launch a container from the centos:dbfs image with a bash console:

# docker run -it --device /dev/fuse --cap-add SYS_ADMIN centos:dbfs bash

To mount a FUSE-based file system, we need to combine both the --cap-add and --device options with the docker run command. We can mount FUSE file systems inside a container, but FUSE must be enabled on the host system; Docker will not implicitly enable or load kernel modules on the host.
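As a quick sanity check (my suggestion, not part of the original walkthrough), you can confirm on the host, not inside the container, that the FUSE kernel module is available:

# lsmod | grep fuse    # check whether the fuse module is already loaded
# modprobe fuse        # load it if the previous command printed nothing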

Let us take a look at the other parameters of the docker run command:

--cap-add SYS_ADMIN grants the container the capability to perform a range of system administration operations.

The --device flag exposes a specific device to the container. By default, the container can read from and write to these devices.

The DBFS mount command should not be started as the root user. To start dbfs, you need to create a normal user account. Go ahead and create the user “docker” with the useradd command. This command is a low-level utility for adding user accounts in Linux.

The useradd command in Linux performs the following things:

  • Edits /etc/passwd, /etc/shadow, /etc/group, and /etc/gshadow files.
  • Creates and populates the home directory.
  • Sets permissions and ownership on the home directory.
# useradd docker
# passwd docker

Grant full permissions on the /tmp folder. To let the user docker have full read and write access to all files and associated directories and subdirectories in the /tmp directory, use the following commands:

# mkdir /tmp/dbfs
# mkdir /tmp/dbfs/sql
# chown -R docker /tmp
# chmod -R u+rwX /tmp

The chown command makes the user docker the owner of the /tmp directory and everything in it. The chmod command then grants that user full access: r gives read permission, w gives write permission, and X gives execute permission on directories but not on regular files.

Now, without exiting the container session, log in to the user session; dbfs should always run with user privileges. I’m using the docker user credentials created above to open the user session. We can see that the working directory (pwd) is set to /home/docker.
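A likely command sequence for this step; the walkthrough describes it only in prose, so this is a reconstruction using the docker user created above:

# su - docker
$ pwd
/home/docker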

Let’s prepare the configuration file. It requires the following parameters:

  • [Server friendly name]
  • hostname=[HOSTNAME]
  • username=[DATABASE_LOGIN]
  • password=[PASSWORD]
  • version=[VERSION]
  • customQueriesPath=[PATH_TO_CUSTOM_QUERY_FOLDER]

In the following example, I’m connecting to a SQL Server instance on the Windows server HQDBT01.

  • Hqdbt01 is the server’s friendly name.
  • The dump directory for custom query output is [mount directory]/[server friendly name]/customQueries/. For example, if /tmp/dmv is the mount directory, the path to the dump directory is /tmp/dmv/hqdbt01/customQueries/.
  • Hqdbt01 is the hostname of the SQL instance.
  • The username and password are the SQL instance access credentials. I would recommend defining a dedicated database user and granting the required permissions to that user; the VIEW SERVER STATE permission must be granted for the user to query DMVs.
  • If the version is set to 16 or higher (SQL Server 2016 or later), the data is fetched as JSON. Otherwise, the data is exported as tab-separated values (TSV).
  • CustomQueriesPath is the folder to which the custom SQL query files will be written.
[hqdbt01]
hostname=hqdbt01
username=sa
password=Api11401$sd
version=16
customQueriesPath=/tmp/dbfs/sql

Change the directory to /tmp and create a directory, dmv, as the mount directory. Now, FUSE the file system as flat files using the dbfs client:

$ cd /tmp/
$ mkdir dmv
$ dbfs -c ./dmvconfig.tool -m ./dmv

The DMV data can also be accessed using various data extraction commands.

$ cat dm_exec_sessions

List only those databases where the database recovery model is set to SIMPLE:

$ cat databases | grep SIMPLE

List the database name and size columns for the databases where the recovery model is set to SIMPLE. We can see that the output of one program is piped to another program for further processing. The output is formatted as tab-separated values:

$ cat databases | grep SIMPLE | awk '{print $1,$2,$5}' | column -t

Configure DBFS to Run Custom SQL Queries

Creating a custom SQL query is pretty straightforward. As per the configuration file, go to the path /tmp/dbfs/sql and write the SQL statement in a SQL file, demo.sql.

# vi /tmp/dbfs/sql/demo.sql

Type the following SQL code in the demo.sql:

SELECT name, user_access_desc, is_read_only, state_desc, recovery_model_desc
FROM sys.databases;

Save the file by pressing Escape and typing :wq.

The query output (JSON) is then dumped into a file and mounted to the virtual directory. Notably, the Docker container doesn’t have any SQL tools installed on the source image; still, we can fetch the SQL data.

Browse the directory /tmp/dmv/hqdbt01/customQueries:

$ cd /tmp/dmv/hqdbt01/customQueries/
$ cat demo.sql | awk '{print $1,$2,$3,$4}' | column -t

Let’s update the demo.sql file with the following SQL statements or SQL of your choice:

SELECT c.session_id
     , c.auth_scheme
     , c.node_affinity
     , s.login_name
     , DB_NAME(s.database_id) AS database_name
     , CASE s.transaction_isolation_level
         WHEN 0 THEN 'Unspecified'
         WHEN 1 THEN 'Read Uncommitted'
         WHEN 2 THEN 'Read Committed'
         WHEN 3 THEN 'Repeatable Read'
         WHEN 4 THEN 'Serializable'
         WHEN 5 THEN 'Snapshot'
       END AS transaction_isolation_level
     , s.status
     , c.most_recent_sql_handle
FROM sys.dm_exec_connections c
INNER JOIN sys.dm_exec_sessions s
    ON c.session_id = s.session_id;

Browse the directory /tmp/dmv/hqdbt01/customQueries and view the file:

$ cat demo.sql

The Docker image is now configured and tested with the DBFS FUSE technique, mounting DMVs and custom queries as virtual directories inside the container.

Adding an Existing Project to GitHub Using the Command Line

The first step is to save the Docker image to a compressed file. This is the file that we will upload to the GitHub repository.

# docker save centos:dbfs > centos7_dbfs.tar

Install Git on CentOS using yum

# yum install git

Let’s initialize or re-initialize the local directory as a Git repository

git init

Add the .tar files to the local repository. This stages them for the first commit

git add centos7_dbfs.tar

Commit the file that we’ve staged in the local repository

git commit -m "centos7 built with dbfs configuration"

Add the URL to the remote repository where the local repository will be pushed

git remote add origin <remote repository URL>

Push the changes in your local repository to GitHub.

git push -u origin master

Wrapping Up

In a nutshell, Docker can get more applications running on the same hardware than other technologies. It makes it easy for developers to create ready-to-run container applications quickly, and it makes managing and deploying applications much easier.

We have installed the Docker engine, provisioned Docker Linux containers, and installed the DBFS FUSE package. We have seen the complete life cycle of Docker containers. It is that simple to build, use, and test programs using the Agile process. Do you agree?

Implementing Paging Using OFFSET FETCH NEXT in SQL Server

A database can have hundreds of thousands of records. It is easy to insert and select these records via database management systems such as SQL Server or MySQL. However, it is not easy to display thousands of records on a single webpage or in a desktop application. Space and memory constraints make it difficult to display a huge number of records all at once.

A common solution to this problem is to implement paging. (Note: this is not the memory paging implemented by operating systems.) Paging in programming refers to displaying data via a series of pages. A random Google search may return thousands of results; Google uses paging to display them. If you scroll down a Google page with search results, you would see the following:

Here you can see the number of pages that the search result is divided into. You can click the Next link to see more pages.

In this article, we will see how the OFFSET and FETCH NEXT operators can be used to implement paging in front-end applications. We will begin with a simple example using OFFSET FETCH NEXT (sketched below) and will then see how it can be practically used with the help of a stored procedure.
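As a taste of what the article covers, here is a minimal example; the table and column names are hypothetical. It fetches the second page of results with a page size of 10:

SELECT Id, ProductName
FROM dbo.Products
ORDER BY Id                 -- OFFSET/FETCH requires an ORDER BY clause
OFFSET 10 ROWS              -- skip the first page (rows 1-10)
FETCH NEXT 10 ROWS ONLY;    -- return rows 11-20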

Read More

Basics of sys.dm_exec_requests

Performance monitoring and troubleshooting in SQL Server is a vast topic. In SQL Server 2005, dynamic management views, also known as DMVs, were introduced and became an essential tool for diagnosing SQL Server performance problems. At the same time, we can use dynamic management views with Azure SQL Database. Some of them differ from the SQL Server on-premises versions, but the logic of how they work is still the same. Microsoft has very good documentation about dynamic management views; the only thing you need to be careful about is the version and product applicability of each view.
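For a quick first look, here is a minimal query against this DMV; the session_id filter is a common convention for skipping most system sessions:

SELECT session_id, status, command, wait_type, wait_time, blocking_session_id
FROM sys.dm_exec_requests
WHERE session_id > 50;   -- user sessions usually have IDs above 50

Read More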

Main Usage of sys.dm_os_wait_stats

As you know, the main responsibility of the database administrator lies in monitoring SQL Server performance and intervening in time. You can find several SQL Server performance monitoring tools on the market, but sometimes we need additional information about SQL Server performance to diagnose and troubleshoot performance issues. So we must know SQL Server Dynamic Management Views well enough to handle SQL Server issues.

A Dynamic Management View (DMV) is a concept that helps us discover SQL Server engine performance metrics. DMVs were first announced in the SQL Server 2005 version and have continued in all versions of SQL Server afterward. In this post, we will talk about a particular DMV that every database administrator must know well: sys.dm_os_wait_stats.
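For a first look, here is a minimal query that lists the top waits by accumulated wait time; the column names come from the DMV itself:

SELECT TOP (10)
       wait_type,
       waiting_tasks_count,
       wait_time_ms,
       signal_wait_time_ms
FROM sys.dm_os_wait_stats
ORDER BY wait_time_ms DESC;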

Read More

Traditional and Natively Compiled Scalar-Valued User-Defined Functions

In the software programming world, there are several approaches that help developers with effortless software development. Design patterns, object-oriented programming, and test-driven development are some of them. Regarding code reuse in particular: code is written once and can be used in many places. It is a basic concept of software engineering and a structure necessary for modular programming.

For this reason, user-defined functions in SQL Server help us avoid rewriting T-SQL queries. At the same time, they improve code readability. In this post, we will discuss the pros and cons of traditional and natively compiled scalar user-defined functions and run a simple performance test.
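To make the comparison concrete, here is a minimal sketch of both variants; the function names and logic are mine, not the article’s, and the natively compiled version assumes SQL Server 2016 or later with a database configured for In-Memory OLTP:

-- Traditional scalar user-defined function
CREATE FUNCTION dbo.ufn_FullName (@First NVARCHAR(50), @Last NVARCHAR(50))
RETURNS NVARCHAR(101)
AS
BEGIN
    RETURN CONCAT(@First, N' ', @Last);
END;
GO

-- Natively compiled variant of the same function
CREATE FUNCTION dbo.ufn_FullName_Native (@First NVARCHAR(50), @Last NVARCHAR(50))
RETURNS NVARCHAR(101)
WITH NATIVE_COMPILATION, SCHEMABINDING
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'English')
    RETURN CONCAT(@First, N' ', @Last);
END;
GO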

Read More

Run Excel Macro using SSIS Script Task

When we export data from a SQL Server table to an Excel file using a SQL Server Integration Services package, the data in the Excel file columns is exported in text format. Even if the column values are integer or decimal, they are stored as text. For example, when I retrieve the values of the “CreditLimit” column from the product table, the data format changes automatically and is stored as text in the Excel file.

Read More