An Example of Using LEAD and LAG Functions from SQL Server 2012

I recently encountered a technical interview question and it was something like this:

A company has its door access logging system and the company policy dictates that every employee’s card-in/card-out records shall reflect his/her true presence in or absence from office. But somehow, some employees skip cardings or tailgate behind others. Therefore, the logging system may have messy records with multiple “card-in’s” before one “card-out” or records that cannot be paired. Your task is to write a script to calculate the exact amount of time that the employee is in office.

As I’ve dwelled too long on the database administration side, I actually had no freaking clue how to write an effcient script without resorting to the RBAR appoach, so I acknowledged my defeat presented my hnoest thoughts and admitted that I would need Google for a lot more help 🙂

Now I am back at my shabby cozy little cubicle from my potential employer’s shiny high-end downtown office building, I did start Googling and found out the two newly-introduced functions LEAD and LAG from SQL Server 2012 might just be the solution that I was looking for.

First of all, for the sake of explanation, I fabricate some fake entry logs with courtesy again to two of the most prominent NBA’s superstars:

USE TestDB
GO

CREATE TABLE CardingLogs
(
	CardingTime DATETIME, --time holder get carded by the system
	Name VARCHAR(50), --name of the card holder
	InOut VARCHAR(4) --with 'in' meaning card-in and 'out' meaning card-out
)
GO

INSERT CardingLogs VALUES
('2014-03-30T07:30:00', 'Lebron James', 'in'),
('2014-03-30T07:45:00', 'Lebron James', 'in'),
('2014-03-30T08:00:00', 'Lebron James', 'in'),
('2014-03-30T09:00:00', 'Lebron James', 'out'),
('2014-03-30T10:00:00', 'Lebron James', 'out'),
('2014-03-30T11:00:00', 'Lebron James', 'out'),
('2014-03-30T12:00:00', 'Lebron James', 'out'),
('2014-03-30T14:00:00', 'Lebron James', 'in'),
('2014-03-30T15:00:00', 'Lebron James', 'in'),
('2014-03-30T16:00:00', 'Lebron James', 'out'),
('2014-03-30T17:00:00', 'Lebron James', 'in')
GO

INSERT CardingLogs VALUES
('2014-03-30T07:30:00', 'Dwyane Wade', 'out'),
('2014-03-30T07:45:00', 'Dwyane Wade', 'in'),
('2014-03-30T08:00:00', 'Dwyane Wade', 'out'),
('2014-03-30T09:00:00', 'Dwyane Wade', 'in'),
('2014-03-30T10:00:00', 'Dwyane Wade', 'out'),
('2014-03-30T11:00:00', 'Dwyane Wade', 'in'),
('2014-03-30T12:00:00', 'Dwyane Wade', 'out'),
('2014-03-30T14:00:00', 'Dwyane Wade', 'out'),
('2014-03-30T15:00:00', 'Dwyane Wade', 'in'),
('2014-03-30T16:00:00', 'Dwyane Wade', 'out'),
('2014-03-30T17:00:00', 'Dwyane Wade', 'out')
GO

And then if you’d like to see what’s inside the table with SELECT * FROM CardingLogs, you’ll get the below output:

As illustrated by the red arrow, the programming logic is to find if the current ‘out’ record is ‘leading’ an immediate ‘in’ record behind; or put it another way, if the current ‘in’ is ‘lagged’ by an immediate ‘out’ record in front of itself. And hence the function LEAD() and LAG() come in handy. My approach would be use the lag function to mark all the ‘out’ rows (in green) which can be described as the first ‘out’ after an ‘in’ (or a series of ‘in’s’). Below query

SELECT *,
IsFO = 
CASE
	WHEN (InOut = 'out' AND LAG(InOut, 1, 0) OVER (PARTITION BY Name ORDER BY CardingTime ASC) = 'in') THEN 1
	ELSE 0
END
FROM CardingLogs

shows the “marked” rows as follows:
MarkedRows1

Now what we need to do is to calculate the time diffierences between all rows marked with ‘1’ and their immediate preceeding rows marked with ‘0’ as illustrated by the green box in the above resultset. And which function can you think of that can do this job with a breeze? You betcha it is the LAG() again! And the final script with a little help from our buddy common table expression is like this:

WITH TMP AS
(
SELECT DATEDIFF(MINUTE, LAG(CardingTime, 1, 0) OVER(PARTITION BY Name ORDER BY CardingTime ASC), CardingTime) AS TimeInOffice,
*,
IsFO = 
CASE
	WHEN (InOut = 'out' AND LAG(InOut, 1, 0) OVER (PARTITION BY Name ORDER BY CardingTime ASC) = 'in') THEN 1
	ELSE 0
END
FROM CardingLogs
)
SELECT Name, SUM(TimeInOffice) AS TotalTimeInOffice FROM tmp
WHERE IsFO = 1
GROUP BY Name

The final resultset:

CardingResultSet

And if you’d like a better understanding of LEAD and LAG function, do not hesitate to consult Microsoft’s official explanation of them here.

Posted in Common Table Expression, T-SQL Script | Tagged , , | Leave a comment

Two Examples of Recursion with T-SQL

T-SQL is very good for dealing with set-based data, but it doesn’t shy away from recursion either. You can use either user defined functions (UDF) or common table expression (CTE) to achieve this and I put up hereunder two examples to solve the claissic problem of finding the organization levels of employees and their managers.

I use well-known names from the NBA franchise Miami Heat to provide some raw data:

CREATE TABLE Employees
(
	EmployeeID INT ,
	EmployeeName VARCHAR(50),
	ReportsTo INT
)
Go

INSERT Employees VALUES
(1,'Micky Arison', NULL),
(2, 'Pat Riley', 1),
(3, 'Erik Spoelstra', 2),
(4, 'Dwyane Wade', 3),
(5, 'Lebron James', 3),
(6, 'Chris Bosh', 4),
(7, 'Ray Allen', 5),
(8, 'Shane Battier', 4),
(9, 'Norris Cole', 8),
(10, 'Michael Beaseley', 8),
(11, 'Mario Chalmers', 8)
GO

And by running a simple T-SQL statement SELECT * FROM Employees ORDER BY EmployeeID, you shall have the following resultset:

Employees2
And the hierachy of this “organization” shall be something like this:
Employees

To recursively get an employee’s level in the organization, I created the below functions. The first one is for retrieving EmployeeID by EmployeeName and the second one implements the recursion logic of searching upwards in the organization tree.

CREATE FUNCTION dbo.fn_GetEmployeeID(@EmployeeName AS VARCHAR(50))
RETURNS INT
AS
BEGIN
	DECLARE @EmployeeID AS INT
	--Assuming EmployeeName is unique
	SELECT @EmployeeID = EmployeeID FROM Employees
	WHERE EmployeeName = @EmployeeName
	RETURN @EmployeeID
END
GO
CREATE FUNCTION dbo.fn_GetLevel(@EmployeeID AS INT, @Level AS INT = 0)
RETURNS INT
AS
BEGIN
	DECLARE @ReportsTo AS INT
	SELECT @ReportsTo = ReportsTo FROM Employees
	WHERE EmployeeID = @EmployeeID
	IF @ReportsTo IS NOT NULL
		RETURN dbo.fn_GetLevel(@ReportsTo, @Level+1) 
	RETURN @Level+1
END
GO

With data populated and functions created, now you can use T-SQL statement SELECT dbo.fn_GetLevel(dbo.fn_GetEmployeeID('Lebron James'), DEFAULT) to retrieve the all-star’s level at Miami Heat and aslo statement
SELECT EmployeeID, EmployeeName, dbo.fn_GetLevel(EmployeeID) AS [EmployeeLevel] FROM Employees
to get the below full table with every empoyee’s level info:
EmployeeLevel

Another quick way to realize the recursion logic is to use CTE (common table expression) which can reference from the top node of the organization tree with the rest of the rows by using key words UNION ALL:

WITH tmp (EmployeeID, EmployeeName, EmployeeLevel) AS
(
	SELECT E.EmployeeID, E.EmployeeName, 1 AS EmployeeLevel FROM Employees E
	WHERE ReportsTo IS NULL
	
	UNION ALL
	
	SELECT E.EmployeeID, E.EmployeeName, T.EmployeeLevel + 1 AS EmployeeLevel
	FROM Employees E INNER JOIN tmp T on E.ReportsTo = T.EmployeeID
)
SELECT * FROM tmp

And the final output is exactly same as shown by the prior approach that uses UDF. Hopefully this will be helpful for some new learners on T-SQL.

Posted in Common Table Expression, Recursion, T-SQL Script, User Defined Function | Tagged , , | Leave a comment

T-SQL Stored Procedure to Get DB’s Recovery Model

As a lazy DBA, I am not a fan of writing ad-hoc queries, therefore I keep wrapping up codes which I think that will come in handy some time later. Here is the piece of T-SQL that checks the recovery model of all or a specific DB on an instance.


/*
.SYNOPSIS
Stored procedure to get recovery model(s) of a specific DB(s).

.SQL Version
2005+

.Last Modified
2014-03-19 by Nick Xu

.Example
EXEC sp__GetRecoveryModel
EXEC sp__GetRecoveryModel @dbname = 'AdventureWorks'
*/

--For generic setting
USE master
GO

IF EXISTS (SELECT * FROM sysobjects WHERE name = N'sp__GetRecoveryModel' AND type = N'P')
BEGIN
	DROP PROC sp__GetRecoveryModel
END
GO


CREATE PROC sp__GetRecoveryModel
@dbname AS VARCHAR(50) = NULL
AS
BEGIN
	IF @dbname is NOT NULL
	BEGIN
		SELECT [DatabaseName] = name,
		   [RecoveryModel] = recovery_model_desc 
		FROM   sys.databases
		WHERE name = @dbname
	END
	ELSE
	BEGIN
		SELECT [DatabaseName] = name,
		   [RecoveryModel] = recovery_model_desc 
		FROM   sys.databases
		ORDER BY [RecoveryModel] DESC
	END
END
GO




Posted in Recovery Model, SQL Server, T-SQL Script | Tagged | Leave a comment

T-SQL Stored Procedure to Kill all Connections to a DB

I’ve encountered the below error message many times when I was trying to do an adhoc restore operation on a DB server:

Msg 3101, Level 16, State 1, Line 2
Exclusive access could not be obtained because the database is in use.
Msg 3013, Level 16, State 1, Line 2
RESTORE DATABASE is terminating abnormally.

And to quickly gain exclusive access to the target DB, I wrote down this proc to kick out all connections that are bound to the target DB and don’t hate me because I used cursors 🙂

/*
Stored procedure to kill processes that are connected to a specific DB

SQL Version: 2005+

Last modified on 2014-03-17 by Nick Xu

--Usage
EXEC sp__KillDBConnections 'DBADB'

*/


--For generic setting
USE master
GO

IF EXISTS (SELECT * FROM sysobjects WHERE name = N'sp__KillDBConnections' AND type = N'P')
BEGIN
	DROP PROC sp__KillDBConnections
END
GO



CREATE PROC sp__KillDBConnections
@dbname AS VARCHAR(100) = NULL
AS
BEGIN
	DECLARE @DatabaseName VARCHAR(50);
	DECLARE @Spid VARCHAR(20);
	DECLARE @Command VARCHAR(50);

	SET @DatabaseName = @dbname;

	IF @DatabaseName IS NULL
	BEGIN
		PRINT 'Please specify to which DB you want to kill all connections...'
		RETURN
	END

	--Select all SPIDs except the SPID for this connection and kill them
	DECLARE SpidCursor CURSOR FOR
	SELECT spid FROM master.dbo.sysprocesses
	WHERE dbid = DB_ID(@DatabaseName) AND spid != @@spid

	open SpidCursor
	FETCH NEXT FROM SpidCursor INTO @Spid
	while @@fetch_status = 0
	BEGIN
		SET @Command = 'kill ' + RTRIM(@Spid) + ';';
		EXEC(@Command);
		FETCH NEXT FROM SpidCursor INTO @Spid
	END

	CLOSE SpidCursor
	DEALLOCATE SpidCursor

END
GO
Posted in T-SQL Script | Tagged , | Leave a comment

Powershell Function to Get SQL Server’s Network Protocols

#Function to get SQL Server network protocols
Function Get-SQLProtocols
{
	Param([String]$ComputerName = $env:COMPUTERNAME)

	[reflection.assembly]::LoadWithPartialName("Microsoft.SqlServer.Smo") | out-null

	$Server = New-Object "Microsoft.SqlServer.Management.Smo.Server" $ComputerName

	$VersionMajor = $Server.VersionMajor

	Get-WmiObject -ComputerName $ComputerName -NameSpace root\Microsoft\SqlServer\ComputerManagement$VersionMajor -Class ClientNetworkProtocol | Select-Object ProtocolName, ProtocolDisplayName, ProtocolOrder

}#End of function

The output of the above function usually looks like this:

ProtocolName ProtocolDisplayName ProtocolOrder
sm
Shared Memory
1
tcp
TCP/IP
2
np
Named Pipes
3
via
VIA
0

Number ‘0’ means corresponding protocol is ‘disabled’, while the NO. ‘1’ has the highest priority to be applied when a DB connection is being established.

Posted in Database Connection, Powershell | Leave a comment

Powershell Function to Get Largest Files on a Drive or in a Folder

Ofentimes when a disk space shortage alert pops up from a specific server, I’d like to know quickly which files are the largest that I might have a way to shrink or move or delete. Here are the two funtions that I use to immediately locate the full list of top N largest files at a specific location (be it a hard drive on a server or a UNC path):

#Function to get the largest N files on a specific computer's drive
Function Get-LargestFilesOnDrive
{
	Param([String]$ComputerName = $env:COMPUTERNAME,[Char]$Drive = 'C', [Int]$Top = 10)
	Get-ChildItem -Path \\$ComputerName\$Drive$ -Recurse | Select-Object Name, @{Label='SizeMB'; Expression={"{0:N0}" -f ($_.Length/1MB)}} , DirectoryName,  Length | Sort-Object Length -Descending  | Select-Object Name, DirectoryName, SizeMB -First $Top | Format-Table -AutoSize -Wrap
}

#Function to get the largest N files on a specific UNC path and its sub-paths
Function Get-LargestFilesOnPath
{
	Param([String]$Path = '.\', [Int]$Top = 10)
	Get-ChildItem -Path $Path -Recurse | Select-Object Name, @{Label='SizeMB'; Expression={"{0:N0}" -f ($_.Length/1MB)}} , DirectoryName,  Length | Sort-Object Length -Descending  | Select-Object Name, DirectoryName, SizeMB -First $Top | Format-Table -AutoSize -Wrap
}
Posted in Powershell | Tagged | Leave a comment

Powershell Function to Get Sizes of All Databases on an Instance

Below is a function that I modified from one in Idera’s SQL Server Powershell Scripts toolset:

#Function to get sizes of all DB's hosted on an instance
Function Get-SQLDBSizes
{
	param (
		[string]$InstaceName = "$(Read-Host 'Please use format ServerName(IPAddress)\InstanceName' [e.g. 127.0.0.1\instance])"
	)

	begin {
		[void][reflection.assembly]::LoadWithPartialName("Microsoft.SqlServer.Smo")
	}
	process {
		try {
			Write-Verbose "Connect to SQL Server using IP address, instance and Windows authentication..."

			$dbs = @()

			Write-Verbose "Creating SMO Server object..."
			$smoServer = new-object Microsoft.SqlServer.Management.Smo.Server $InstaceName

			# Use Windows Authentication by setting LoginSecure to TRUE
			Write-Verbose "Setting Windows Authentication mode..."
			$smoServer.ConnectionContext.set_LoginSecure($True)

			# Output object that contains a of the databases
			foreach ($Database in $smoServer.Databases) {
				$db = New-Object -TypeName PSObject -Property @{
					'DBOwner' = $Database.Owner
					'DBName' = $Database.Name
					'DBSize(MB)' = "{0:N0}" -f $Database.Size
				}
				$dbs += $db
			}
			Write-Output $dbs | Format-Table -AutoSize
		}
		catch [Exception] {
			Write-Error $Error[0]
			$err = $_.Exception
			while ( $err.InnerException ) {
				$err = $err.InnerException
				Write-Output $err.Message
			}
		}
	}
}#End of function
Posted in Powershell | Tagged , | Leave a comment

Powershell Function to Reboot a Computer with Warning Messages

I am fully aware that a simple

Restart-Computer "ServerName"

command can easily do the trick, but it means doing this trick a bit too easy and too harzardly. I’d like to see my own function to promt up some warnings and confirmations before I actually do the reboot. I learned this lesson by typing a typo yesterday and accidentally restarting another (fortunately dev) box.

Function Reboot-Computer
{
	param([String[]]$ComputerName = $env:COMPUTERNAME)
	
	Write-Host "WARNING: YOU ARE TRYING TO REBOOT COMPUTER(S): $ComputerName!" -ForegroundColor "Yellow" 
	Write-Host "PLEASE CONFIRM THAT YOUR ACTION IS WELL-INTENTIONED!" -ForegroundColor "Red" 
	Restart-Computer -ComputerName $ComputerName -Confirm  
}
Posted in Powershell | Leave a comment

T-SQL Stored Procedure to Get the Rough Row Count(s) of Specific Table(s)

If you’d like to get the most accurate of row count of any table, select count(*) is still your BFF. However, in some situations, the speed of retrieving a rough estimate of the row count from a very large table is more emphasized upon. And hence here goes the script:

/*
.Synopsis
Stored procedure to (quickly) get the rough estimate(s) of row count(s) for specific table(s).

The accuracy of row counts depends on how often the stats update is run on the target instance.

.Original link
http://www.sqlservercentral.com/articles/T-SQL/67624/
http://www.brentozar.com/archive/2014/02/count-number-rows-table-sql-server/


.SQL Version
2005+

.Last Modified
2014-03-19 by Nick Xu

.Example
EXEC sp__GetTableRowCount @dbname = 'AdventureWorks'

*/


--For generic setting
USE master
GO

IF EXISTS (SELECT * FROM sysobjects WHERE name = N'sp__GetTemplateProc' AND type = N'P')
BEGIN
	DROP PROC sp__GetTableRowCount
END
GO


CREATE PROC sp__GetTableRowCount
@dbname VARCHAR(100) = 'master', @tablename VARCHAR(100) = NULL
AS
BEGIN

	-- Shows all user tables and row counts for the current database 
	-- Remove is_ms_shipped = 0 check to include system objects 
	-- i.index_id < 2 indicates clustered index (1) or hash table (0) 
	
	DECLARE @sql AS VARCHAR(1000)
	
	
	IF @tablename IS NULL
	BEGIN
		SET @sql = 
		'SELECT '''+ @dbname + ''' as [DBName], o.name AS [TableName], ddps.row_count AS [RowCount] 
		FROM [' + @dbname + '].sys.indexes AS i 
		INNER JOIN [' + @dbname + '].sys.objects AS o ON i.OBJECT_ID = o.OBJECT_ID 
		INNER JOIN [' + @dbname + '].sys.dm_db_partition_stats AS ddps ON i.OBJECT_ID = ddps.OBJECT_ID 
		AND i.index_id = ddps.index_id 
		WHERE i.index_id < 2 
		ORDER BY [RowCount] DESC'
		--SELECT @sql
		EXEC(@sql)
	END
	ELSE
	BEGIN
		SET @sql = 
		'SELECT '''+ @dbname + ''' as [DBName], o.name AS [TableName], ddps.row_count AS [RowCount] 
		FROM [' + @dbname + '].sys.indexes AS i 
		INNER JOIN [' + @dbname + '].sys.objects AS o ON i.OBJECT_ID = o.OBJECT_ID 
		INNER JOIN [' + @dbname + '].sys.dm_db_partition_stats AS ddps ON i.OBJECT_ID = ddps.OBJECT_ID 
		AND i.index_id = ddps.index_id 
		WHERE i.index_id < 2 AND o.name= ''' + @tablename + '''' 
		--SELECT @sql
		EXEC(@sql)
	END
END
GO


Posted in SQL Server, T-SQL Script | Tagged , | Leave a comment

T-SQL Stored Procedure to Get the Number of Connections and Login Names per DB

/*
Script to get the number of connections and their login names on an instance

Last modified by Nick Xu on 2014-03-05

Usage:
EXEC sp__GetDBConnections 'master'
EXEC sp__GetDBConnections
*/



--For generic setting
USE master
GO

IF EXISTS(SELECT * FROM sysobjects WHERE name = 'sp__GetDBConnections' AND type = 'P')
BEGIN
	DROP PROD sp__GetDBConnections
END
GO



CREATE PROCEDURE sp__GetDBConnections
@dbname VARCHAR(100) = NULL --display all DB's connections IF db name is not specified
AS
BEGIN

	--return full list IF @dbanme is NULL
	IF @dbname is NULL
	BEGIN
		SELECT 
			DB_NAME(dbid) AS DBName, 
			COUNT(dbid) AS [#Connections],
			loginame AS Login_Name
		FROM
			sys.sysprocesses
		WHERE 
			dbid > 0
		GROUP BY 
			dbid, loginame
		ORDER BY [#Connections] desc
	END
	ELSE
	BEGIN
		SELECT 
			DB_NAME(dbid) as [DBName], 
			COUNT(dbid) as [#Connections],
			loginame as Login_Name
		FROM
			sys.sysprocesses
		WHERE 
			DB_NAME(dbid) = @dbname
		GROUP BY 
			dbid, loginame
		ORDER BY [#Connections] desc
	END
END
GO


Posted in Database Connection, T-SQL Script | Tagged | Leave a comment