Applying SQL Transformations and Handling Missing Values in Azure ML

In this article, we will introduce SQL transformations in action. We will also see how to handle missing values in our dataset.

Consider a movie rating dataset that contains records of different movies along with the average user rating for each movie. The ratings are numeric and range from 1 (the lowest) to 10 (the highest), though no movie has ever achieved a rating of 10. Suppose that we want to convert the numeric ratings into categorical ratings. For instance, we want to replace ratings of 1-3 with the categorical value “poor” and ratings of 4-6 with “average”, while ratings of 7-10 get the value “good”. We can accomplish this with an SQL transformation in Azure ML Studio.
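The mapping described above can be sketched with a `CASE` expression. The snippet below is only an illustration: it runs the query through Python's built-in `sqlite3` module, and the table name `movies`, the columns `title` and `rating`, the sample rows, and the `'unknown'` label for missing ratings are all assumptions made for this example. Because average ratings can be fractional, the boundaries are written as `< 4` and `< 7`.

```python
import sqlite3

# Hypothetical table, columns, and sample data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, rating REAL)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?)",
    [("Movie A", 2.5), ("Movie B", 5.0), ("Movie C", 8.1), ("Movie D", None)],
)

# Map numeric ratings to categories; a missing rating gets its own
# (assumed) label instead of silently falling into the ELSE branch.
query = """
SELECT title,
       CASE
           WHEN rating IS NULL THEN 'unknown'
           WHEN rating < 4 THEN 'poor'
           WHEN rating < 7 THEN 'average'
           ELSE 'good'
       END AS rating_category
FROM movies
ORDER BY title
"""
rows = conn.execute(query).fetchall()
for title, category in rows:
    print(title, category)
```

The `IS NULL` branch comes first on purpose: a comparison against a missing value is never true, so without that branch a movie with no rating would end up labelled “good”.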

Basics of SQL Server Task Automation

This is an introductory article about automation in SQL Server, focused primarily on the basic concepts. We will discuss some standard practices and a few examples to help beginners get started with SQL Server automation.

This article also highlights the importance of automating SQL Server tasks to save the time and effort required to do these tasks manually.

Additionally, we will look at cases in which it is not a good idea to automate SQL Server tasks, even though automation saves time and effort.

Calculate the median by using Transact SQL

The statistical median is the value that separates a dataset into halves: one half comprises the greater values, and the other comprises the lesser ones. For a given dataset, it can be thought of as the “middle” value. For example, in the dataset {1, 3, 3, 4, 5, 6, 7, 8, 9}, the median is 5, which is both the fifth largest and the fifth smallest number in the dataset.

To calculate the median of any dataset, we first need to arrange all of its values in sorted order. After arranging the data, we must determine the middle value. If the dataset contains an odd number of values, then the middle value is the median.
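The sort-then-pick-the-middle procedure can be sketched in a few lines. This is a plain Python illustration, not the Transact-SQL approach of the full article; the function name `median` is hypothetical, and the handling of an even number of values (averaging the two middle values) is the usual convention, assumed here because the text above covers only the odd case.

```python
def median(values):
    """Return the median of a non-empty collection of numbers."""
    ordered = sorted(values)  # step 1: arrange the values in order
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty dataset is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle value is the median.
        return ordered[mid]
    # Even count (assumed convention): average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([1, 3, 3, 4, 5, 6, 7, 8, 9]))
```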

Parameter Sniffing Primer

Introduction

Developers are often told to use stored procedures in order to avoid the so-called ad hoc queries which can result in unnecessary bloating of the plan cache. You see, when recurrent SQL code is written inconsistently or when there’s code that generates dynamic SQL on the fly, SQL Server has a tendency to create an execution plan for each individual execution. This may decrease overall performance by:

  1. Demanding a compilation phase for every code execution.

  2. Bloating the Plan Cache with too many plan handles that may not be reused.

Auto Create Statistics and Auto Update Statistics

Statistics are lightweight objects that the SQL Server query optimizer uses to determine the optimal way to retrieve data from a table. The optimizer uses the histogram of column statistics to choose the optimal query execution plan. If a query uses a predicate on a column that already has statistics, the query optimizer can get all the required information from those statistics to determine the optimal way to execute the query. SQL Server creates statistics in two ways:

  1. When a new index is created on a column.
  2. If the AUTO_CREATE_STATISTICS option is enabled.

In this article, the Auto Create Statistics and Auto Update Statistics options are analyzed. They are database-specific and can be configured using SQL Server Management Studio or a T-SQL query.

T-SQL SET Operators Part 2: INTERSECT and EXCEPT

In my previous article, I explained the basics of set operators, their types, and prerequisites for their use. I also talked about UNION and UNION ALL operators, their usage and differences.

In this article, we’re going to learn the following:

  1. EXCEPT and INTERSECT operators.
  2. Difference between INTERSECT and INNER JOIN.
  3. The detailed explanation of INTERSECT and EXCEPT with an example.

EXCEPT and INTERSECT operators were introduced in SQL Server 2005. Both are set operators used to combine the result sets generated by two queries and retrieve the desired output.
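As a rough illustration of what the two operators return, the sketch below runs them against SQLite through Python's `sqlite3` module rather than against SQL Server. The tables `a` and `b`, the column `id`, and the sample values are all hypothetical.

```python
import sqlite3

# Hypothetical tables and sample data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER)")
conn.execute("CREATE TABLE b (id INTEGER)")
conn.executemany("INSERT INTO a VALUES (?)", [(1,), (2,), (2,), (3,)])
conn.executemany("INSERT INTO b VALUES (?)", [(2,), (3,), (4,)])

# INTERSECT: distinct rows returned by both queries.
common = conn.execute(
    "SELECT id FROM a INTERSECT SELECT id FROM b ORDER BY id"
).fetchall()

# EXCEPT: distinct rows from the first query that the second does not return.
only_a = conn.execute(
    "SELECT id FROM a EXCEPT SELECT id FROM b ORDER BY id"
).fetchall()

print(common)
print(only_a)
```

Note that the duplicate value 2 in table `a` appears only once in the INTERSECT result, because both operators return distinct rows.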

How to Properly Use the T-SQL IsNumeric Function

This article is focused on the T-SQL (Transact-SQL) IsNumeric function and its proper use in day-to-day SQL scripting tasks.

We will also see why it is important to understand how and why IsNumeric can be used – both incorrectly and correctly.

There may be some better alternatives to IsNumeric depending on the context. However, in the cases we’re going to cover in this article, I see this function as the best possible choice.

T-SQL Datetime Data Type

Introduction

Data types are attributes that specify the kind of data that objects such as columns, local variables, expressions, and parameters can hold. Across the RDBMS world, data types are typically grouped into string, numeric, and date data types.

T-SQL supports six date and time data types:

  1. Datetime
  2. Smalldatetime
  3. Date
  4. Time
  5. Datetime2
  6. Datetimeoffset

The first two data types are considered legacy versions of the newer ones. In this article, we focus on the date data types and, specifically, on the datetime and datetime2 data types available in SQL Server. Table 1 gives details of the various date and time data types available in SQL Server.