Collecting Index Usage Statistics

Knowing when and how often your indexes are used can really come in handy. Indexes that are never utilized aren’t worth keeping around, and knowing their usage patterns can be a big help when making decisions about things such as filegroup placement and compression settings. SQL Server 2005 brought a great advancement in the form of the sys.dm_db_index_usage_stats DMV, which returns the index usage statistics the server keeps automatically. While it’s a step forward, I feel it still leaves a few things to be desired:

  • The values are cumulative since the server was last restarted
  • Once the server is shut down or restarted, there’s no way to get the previous values back
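
For reference, here’s the kind of raw output we’re starting from – a hedged example querying the DMV directly for the current database (any database will do):

-- raw, cumulative usage values; everything resets when the instance restarts
SELECT OBJECT_NAME(s.object_id) AS TableName,
	i.name AS IndexName,
	s.user_seeks, s.user_scans, s.user_lookups, s.user_updates
FROM sys.dm_db_index_usage_stats s
INNER JOIN sys.indexes i ON s.object_id = i.object_id AND s.index_id = i.index_id
WHERE s.database_id = DB_ID();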

With this in mind, I came up with my own system for archiving non-cumulative index usage stats on a daily basis. Microsoft introduced the Management Data Warehouse in SQL Server 2008, which provides a way to track these statistics over time (along with many others), but I’ve been using my own method since before MDW existed. This system also works in SQL Server Express Edition, which doesn’t support MDW. I’ll describe each part below, but if you just want the code right now, download it here.

Tables

IndexUsageStats_LastCumulative
As I mentioned before, the output from sys.dm_db_index_usage_stats is cumulative since the last time the server was restarted, and I was looking for a way to find daily values that weren’t cumulative. Restarting the server immediately after collecting the values would have accomplished this, but users probably wouldn’t be very happy :). Instead, I keep track of the previous cumulative values and subtract them from the current ones. If the output shows an index was used 7 times on Tuesday at midnight, and the same number was 12 on Wednesday at midnight, then it must have been used 5 times in that 24-hour period. IndexUsageStats_LastCumulative holds the previous cumulative values for all indexes to make this calculation possible.
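
The arithmetic is nothing fancy – here’s a minimal sketch of the Tuesday/Wednesday example, including the restart case the procedure below has to handle:

-- cumulative values from two consecutive midnights
DECLARE @Tuesday BIGINT = 7;
DECLARE @Wednesday BIGINT = 12;

-- if the current value is ever smaller than the previous one, the server
-- was restarted in between, so the current value IS the day's usage
SELECT CASE
	WHEN @Wednesday - @Tuesday < 0 THEN @Wednesday
	ELSE @Wednesday - @Tuesday
END AS UsesInLast24Hours; -- returns 5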

Names
The only true way to identify an object in SQL Server is by its name, but DMVs typically return object IDs instead of names. It’s easy to translate between names and IDs, but over time names and IDs can change:
- If an object is dropped and re-created with the same name, chances of it having the same ID are virtually zero.
- If an object is renamed, its ID will not change, but of course its name will.
Using either IDs or names exclusively can make historical values useless over time – you might be left with an ID that’s no longer valid or a name that changed a while back. To give myself more options for tracking changes over time, I store both. Since storing each object’s name in a character data type would consume a lot of space and be repetitive, I prefer to map object names to numeric values, and this table facilitates that.
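
Translating between the two is a one-liner at any given moment – the catch is that neither answer is guaranteed to stay valid forever. (The table name and ID below are hypothetical.)

-- name to ID, and ID back to name, as of right now
SELECT OBJECT_ID(N'dbo.Sales') AS CurrentID;
SELECT OBJECT_NAME(245575913) AS CurrentName; -- a made-up ID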

IndexUsageStats
This table does the heavy lifting for our system – it contains all the historical index usage statistics, keyed by both object ID and name. I’ve found it to be a great candidate for compression if you’re running Enterprise Edition, but it’s still pretty compact even when uncompressed thanks to the Names table.
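
If you do have Enterprise Edition, enabling compression is a one-time operation along these lines – PAGE compression is my assumption here, and it’s worth estimating the savings first:

-- estimate what page compression would save on the history table
EXEC sp_estimate_data_compression_savings
	@schema_name = 'dbo',
	@object_name = 'IndexUsageStats',
	@index_id = NULL,
	@partition_number = NULL,
	@data_compression = 'PAGE';

-- then apply it
ALTER TABLE dbo.IndexUsageStats REBUILD WITH (DATA_COMPRESSION = PAGE);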

Stored Procedures

CollectIndexUsageStats
The stored procedure that drives statistics collection. You’ll want to set up a job to run this – mine runs at midnight. It iterates through each online database in the instance collecting usage stats, then compares them to the previous cumulative values and writes the differences to the IndexUsageStats table. You’ll find comments inline.
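
If you’d rather script the job than click through the Agent UI, a minimal sketch looks something like this (the job and schedule names are my own invention):

-- create a SQL Agent job that runs the collector daily at midnight
EXEC msdb.dbo.sp_add_job @job_name = N'Collect Index Usage Stats';

EXEC msdb.dbo.sp_add_jobstep @job_name = N'Collect Index Usage Stats',
	@step_name = N'Run collector',
	@subsystem = N'TSQL',
	@database_name = N'iDBA',
	@command = N'EXEC dbo.CollectIndexUsageStats;';

EXEC msdb.dbo.sp_add_jobschedule @job_name = N'Collect Index Usage Stats',
	@name = N'Daily at midnight',
	@freq_type = 4,         -- daily
	@freq_interval = 1,     -- every day
	@active_start_time = 0; -- 00:00:00

EXEC msdb.dbo.sp_add_jobserver @job_name = N'Collect Index Usage Stats';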

Views

vw_IndexUsageStats
Since the data contained in the IndexUsageStats table is all numeric, it’s not the easiest to browse. This view joins back to the Names table so things are a little more user-friendly. Most of the queries I write are against this view. Object IDs aren’t returned here as they’re typically not needed; when they are, I’ll write more detailed queries against IndexUsageStats directly.
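
As an example, here’s the sort of thing I mean – a sketch that hunts for indexes that were written to but never read over the last 30 days:

-- indexes with writes but zero reads in the past 30 days
SELECT DatabaseName, SchemaName, TableName, IndexName,
	SUM(User_Updates) AS TotalWrites
FROM dbo.vw_IndexUsageStats
WHERE StatsDate >= DATEADD(DAY, -30, SYSDATETIME())
GROUP BY DatabaseName, SchemaName, TableName, IndexName
HAVING SUM(User_Seeks + User_Scans + User_Lookups) = 0
	AND SUM(User_Updates) > 0
ORDER BY TotalWrites DESC;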

A few notes before the code:

  • As always, this code is offered as-is with no warranty whatsoever. Do not deploy this in a production environment before you have tested it and understand exactly what it does.
  • I recommend putting all these objects in their own database. If you already have a DBA database for management information that should work fine too. My database is called “iDBA” because I inherited it with that name.
  • This was written and tested on SQL Server 2008 and 2008 R2. It can probably be tweaked to run on 2005 by swapping the datetime2 data type for datetime and writing substitute queries for the MERGE statement (see the sketch after the code below).
-----------------------------------------------------------
-- Name: CollectIndexUsageStats.sql
--
-- Description: Collects index usage statistics over time.
--
-- Author: Bob Pusateri, http://www.bobpusateri.com
--
-- THIS SCRIPT IS PROVIDED "AS-IS" WITHOUT ANY WARRANTY.
-- DO NOT RUN THIS ON A PRODUCTION SYSTEM UNTIL YOU HAVE
--   COMPLETE UNDERSTANDING OF THE TASKS IT PERFORMS AND
--   HAVE TESTED IT ON A DEVELOPMENT SYSTEM.
-----------------------------------------------------------

-- I recommend putting these objects in a separate database
-- (or your existing DBA database if you have one)
--
-- If your DBA database is not named 'iDBA', you'll want to replace it with your DB name
--     It is referenced several times in the script below
USE [iDBA];
GO

-- stores cumulative data from sys.dm_db_index_usage_stats DMV
CREATE TABLE [dbo].[IndexUsageStats_LastCumulative] (
	[ServerNameID] [int] NOT NULL,
	[DatabaseID] [smallint] NOT NULL,
	[ObjectID] [int] NOT NULL,
	[IndexID] [int] NOT NULL,
	[LoadTime] [datetime2](0) NOT NULL,
	[User_Seeks] [bigint] NOT NULL,
	[User_Scans] [bigint] NOT NULL,
	[User_Lookups] [bigint] NOT NULL,
	[User_Updates] [bigint] NOT NULL,
	[System_Seeks] [bigint] NOT NULL,
	[System_Scans] [bigint] NOT NULL,
	[System_Lookups] [bigint] NOT NULL,
	[System_Updates] [bigint] NOT NULL,
	CONSTRAINT [PK_IUS_C] PRIMARY KEY CLUSTERED ([ServerNameID],[DatabaseID],[ObjectID],[IndexID])
);
GO

-- used for Server/DB/Schema/Table/Index name mapping
CREATE TABLE [dbo].[Names] (
	[ID] [int] IDENTITY(1,1) NOT NULL,
	[Value] [nvarchar](260) NOT NULL,
	CONSTRAINT [PK_Names] PRIMARY KEY CLUSTERED ([ID])
);
GO

-- stores historical usage statistics
CREATE TABLE [dbo].[IndexUsageStats] (
	[StatsDate] [datetime2](0) NOT NULL,
	[ServerNameID] [int] NOT NULL,
	[DatabaseID] [smallint] NOT NULL,
	[ObjectID] [int] NOT NULL,
	[IndexID] [int] NOT NULL,
	[DatabaseNameID] [int] NOT NULL,
	[SchemaNameID] [int] NOT NULL,
	[TableNameID] [int] NOT NULL,
	[IndexNameID] [int] NULL,
	[User_Seeks] [bigint] NOT NULL,
	[User_Scans] [bigint] NOT NULL,
	[User_Lookups] [bigint] NOT NULL,
	[User_Updates] [bigint] NOT NULL,
	[System_Seeks] [bigint] NOT NULL,
	[System_Scans] [bigint] NOT NULL,
	[System_Lookups] [bigint] NOT NULL,
	[System_Updates] [bigint] NOT NULL,
	CONSTRAINT [PK_IUS] PRIMARY KEY CLUSTERED ([StatsDate],[ServerNameID],[DatabaseID],[ObjectID],[IndexID]),
	CONSTRAINT [FK_IUS_Names_DB] FOREIGN KEY([DatabaseNameID]) REFERENCES [dbo].[Names] ([ID]),
	CONSTRAINT [FK_IUS_Names_Index] FOREIGN KEY([IndexNameID]) REFERENCES [dbo].[Names] ([ID]),
	CONSTRAINT [FK_IUS_Names_Schema] FOREIGN KEY([SchemaNameID]) REFERENCES [dbo].[Names] ([ID]),
	CONSTRAINT [FK_IUS_Names_Table] FOREIGN KEY([TableNameID]) REFERENCES [dbo].[Names] ([ID]),
	CONSTRAINT [CK_IUS_PositiveValues] CHECK ([User_Seeks]>=(0) AND [User_Scans]>=(0) AND [User_Lookups]>=(0)
		AND [User_Updates]>=(0) AND [System_Seeks]>=(0) AND [System_Scans]>=(0) AND [System_Lookups]>=(0)
		AND [System_Updates]>=(0))
);
GO

-- collects usage statistics
-- I run this once daily (can be run more often if you like)
CREATE PROCEDURE [dbo].[CollectIndexUsageStats]
AS
BEGIN
	BEGIN TRY
		SET NOCOUNT ON;

		-- get current stats for all online databases

		SELECT database_id, name
		INTO #dblist
		FROM sys.databases
		WHERE [state] = 0
			AND database_id != 2; -- skip TempDB

		CREATE TABLE #t (
			StatsDate DATETIME2(0),
			ServerName SYSNAME,
			DatabaseID SMALLINT,
			ObjectID INT,
			IndexID INT,
			DatabaseName SYSNAME,
			SchemaName SYSNAME,
			TableName SYSNAME,
			IndexName SYSNAME NULL,
			User_Seeks BIGINT,
			User_Scans BIGINT,
			User_Lookups BIGINT,
			User_Updates BIGINT,
			System_Seeks BIGINT,
			System_Scans BIGINT,
			System_Lookups BIGINT,
			System_Updates BIGINT
		);

		DECLARE @DBID INT;
		DECLARE @DBNAME SYSNAME;
		DECLARE @Qry NVARCHAR(2000);

		-- iterate through each DB, generate & run query
		WHILE (SELECT COUNT(*) FROM #dblist) > 0
		BEGIN
			SELECT TOP (1) @DBID=database_id, @DBNAME=[name]
			FROM #dblist ORDER BY database_id;

			SET @Qry = '
				INSERT INTO #t
				SELECT
					SYSDATETIME() AS StatsDate,
					@@SERVERNAME AS ServerName,
					s.database_id AS DatabaseID,
					s.object_id AS ObjectID,
					s.index_id AS IndexID,
					''' + REPLACE(@DBNAME, '''', '''''') + ''' AS DatabaseName,
					c.name AS SchemaName,
					o.name AS TableName,
					i.name AS IndexName,
					s.user_seeks,
					s.user_scans,
					s.user_lookups,
					s.user_updates,
					s.system_seeks,
					s.system_scans,
					s.system_lookups,
					s.system_updates
				FROM sys.dm_db_index_usage_stats s
				INNER JOIN ' + QUOTENAME(@DBNAME) + '.sys.objects o ON s.object_id = o.object_id
				INNER JOIN ' + QUOTENAME(@DBNAME) + '.sys.schemas c ON o.schema_id = c.schema_id
				INNER JOIN ' + QUOTENAME(@DBNAME) + '.sys.indexes i ON s.object_id = i.object_id AND s.index_id = i.index_id
				WHERE s.database_id = ' + CONVERT(NVARCHAR,@DBID) + ';
				';

			EXEC sp_executesql @Qry;

			DELETE FROM #dblist WHERE database_id = @DBID;
		END -- db while loop

		DROP TABLE #dblist;

		BEGIN TRAN;

		-- create ids for Server Name by inserting new ones into dbo.Names
		INSERT INTO iDBA.dbo.Names (Value)
		SELECT DISTINCT RTRIM(LTRIM(t.ServerName)) AS ServerName
		FROM #t t
		LEFT JOIN iDBA.dbo.Names n ON t.ServerName = n.Value
		WHERE n.ID IS NULL AND t.ServerName IS NOT NULL
		ORDER BY RTRIM(LTRIM(t.ServerName));

		-- same as above for DatabaseName
		INSERT INTO iDBA.dbo.Names (Value)
		SELECT DISTINCT RTRIM(LTRIM(t.DatabaseName)) AS DatabaseName
		FROM #t t
		LEFT JOIN iDBA.dbo.Names n ON t.DatabaseName = n.Value
		WHERE n.ID IS NULL AND t.DatabaseName IS NOT NULL
		ORDER BY RTRIM(LTRIM(t.DatabaseName));

		-- SchemaName
		INSERT INTO iDBA.dbo.Names (Value)
		SELECT DISTINCT RTRIM(LTRIM(t.SchemaName)) AS SchemaName
		FROM #t t
		LEFT JOIN iDBA.dbo.Names n ON t.SchemaName = n.Value
		WHERE n.ID IS NULL AND t.SchemaName IS NOT NULL
		ORDER BY RTRIM(LTRIM(t.SchemaName));

		-- TableName
		INSERT INTO iDBA.dbo.Names (Value)
		SELECT DISTINCT RTRIM(LTRIM(t.TableName)) AS TableName
		FROM #t t
		LEFT JOIN iDBA.dbo.Names n ON t.TableName = n.Value
		WHERE n.ID IS NULL AND t.TableName IS NOT NULL
		ORDER BY RTRIM(LTRIM(t.TableName));

		-- IndexName
		INSERT INTO iDBA.dbo.Names (Value)
		SELECT DISTINCT RTRIM(LTRIM(t.IndexName)) AS IndexName
		FROM #t t
		LEFT JOIN iDBA.dbo.Names n ON t.IndexName = n.Value
		WHERE n.ID IS NULL AND t.IndexName IS NOT NULL
		ORDER BY RTRIM(LTRIM(t.IndexName));

		-- Calculate Deltas
		INSERT INTO iDBA.dbo.IndexUsageStats (StatsDate, ServerNameID, DatabaseID, ObjectID,
			IndexID, DatabaseNameID, SchemaNameID, TableNameID, IndexNameID, User_Seeks, User_Scans,
			User_Lookups, User_Updates, System_Seeks, System_Scans, System_Lookups, System_Updates)
		SELECT
			t.StatsDate,
			s.ID AS ServerNameID,
			t.DatabaseID,
			t.ObjectID,
			t.IndexID,
			d.ID AS DatabaseNameID,
			c.ID AS SchemaNameID,
			b.ID AS TableNameID,
			i.ID AS IndexNameID,
			CASE
				-- if the previous cumulative value is greater than the current one, the server has been reset
				-- just use the current value
				WHEN t.User_Seeks - ISNULL(lc.User_Seeks,0) < 0 THEN t.User_Seeks
				-- if the prev value is less than the current one, then subtract to get the delta
				ELSE t.User_Seeks - ISNULL(lc.User_Seeks,0)
			END AS User_Seeks,
			CASE
				WHEN t.User_Scans - ISNULL(lc.User_Scans,0) < 0 THEN t.User_Scans
				ELSE t.User_Scans - ISNULL(lc.User_Scans,0)
			END AS User_Scans,
			CASE
				WHEN t.User_Lookups - ISNULL(lc.User_Lookups,0) < 0 THEN t.User_Lookups
				ELSE t.User_Lookups - ISNULL(lc.User_Lookups,0)
			END AS User_Lookups,
			CASE
				WHEN t.User_Updates - ISNULL(lc.User_Updates,0) < 0 THEN t.User_Updates
				ELSE t.User_Updates - ISNULL(lc.User_Updates,0)
			END AS User_Updates,
			CASE
				WHEN t.System_Seeks - ISNULL(lc.System_Seeks,0) < 0 THEN t.System_Seeks
				ELSE t.System_Seeks - ISNULL(lc.System_Seeks,0)
			END AS System_Seeks,
			CASE
				WHEN t.System_Scans - ISNULL(lc.System_Scans,0) < 0 THEN t.System_Scans
				ELSE t.System_Scans - ISNULL(lc.System_Scans,0)
			END AS System_Scans,
			CASE
				WHEN t.System_Lookups - ISNULL(lc.System_Lookups,0) < 0 THEN t.System_Lookups
				ELSE t.System_Lookups - ISNULL(lc.System_Lookups,0)
			END AS System_Lookups,
			CASE
				WHEN t.System_Updates - ISNULL(lc.System_Updates,0) < 0 THEN t.System_Updates
				ELSE t.System_Updates - ISNULL(lc.System_Updates,0)
			END AS System_Updates
		FROM #t t
		INNER JOIN iDBA.dbo.Names s ON t.ServerName = s.Value
		INNER JOIN iDBA.dbo.Names d ON t.DatabaseName = d.Value
		INNER JOIN iDBA.dbo.Names c ON t.SchemaName = c.Value
		INNER JOIN iDBA.dbo.Names b ON t.TableName = b.Value
		LEFT JOIN iDBA.dbo.Names i ON t.IndexName = i.Value
		LEFT JOIN iDBA.dbo.IndexUsageStats_LastCumulative lc
			ON s.ID = lc.ServerNameID
			AND t.DatabaseID = lc.DatabaseID
			AND t.ObjectID = lc.ObjectID
			AND t.IndexID = lc.IndexID
		ORDER BY StatsDate, ServerName, DatabaseID, ObjectID, IndexID;

		-- Update last cumulative values with the current ones
		MERGE INTO iDBA.dbo.IndexUsageStats_LastCumulative lc
		USING #t t
		INNER JOIN iDBA.dbo.Names s ON t.ServerName = s.Value
		ON s.ID = lc.ServerNameID
			AND t.DatabaseID = lc.DatabaseID
			AND t.ObjectID = lc.ObjectID
			AND t.IndexID = lc.IndexID
		WHEN MATCHED THEN
			UPDATE SET
				lc.LoadTime = t.StatsDate,
				lc.User_Seeks = t.User_Seeks,
				lc.User_Scans = t.User_Scans,
				lc.User_Lookups = t.User_Lookups,
				lc.User_Updates = t.User_Updates,
				lc.System_Seeks = t.System_Seeks,
				lc.System_Scans = t.System_Scans,
				lc.System_Lookups = t.System_Lookups,
				lc.System_Updates = t.System_Updates
		WHEN NOT MATCHED BY TARGET THEN
			INSERT (ServerNameID, DatabaseID, ObjectID, IndexID, LoadTime, User_Seeks, User_Scans,
				User_Lookups, User_Updates, System_Seeks, System_Scans,
				System_Lookups, System_Updates)
			VALUES (s.ID, t.DatabaseID, t.ObjectID, t.IndexID, t.StatsDate, t.User_Seeks, t.User_Scans,
				t.User_Lookups, t.User_Updates, t.System_Seeks, t.System_Scans,
				t.System_Lookups, t.System_Updates)
		WHEN NOT MATCHED BY SOURCE
			THEN DELETE;

		COMMIT TRAN;

	END TRY
	BEGIN CATCH

		IF @@TRANCOUNT > 0
			ROLLBACK TRAN;

		DECLARE @ErrorNumber INT;
		DECLARE @ErrorSeverity INT;
		DECLARE @ErrorState INT;
		DECLARE @ErrorProcedure NVARCHAR(126);
		DECLARE @ErrorLine INT;
		DECLARE @ErrorMessage NVARCHAR(2048);

		SELECT @ErrorNumber = ERROR_NUMBER(),
			   @ErrorSeverity = ERROR_SEVERITY(),
			   @ErrorState = ERROR_STATE(),
			   @ErrorProcedure = ERROR_PROCEDURE(),
			   @ErrorLine = ERROR_LINE(),
			   @ErrorMessage = ERROR_MESSAGE();

		RAISERROR (@ErrorMessage, @ErrorSeverity, @ErrorState);

	END CATCH
END
GO

-- displays usage statistics
CREATE VIEW [dbo].[vw_IndexUsageStats]
AS
SELECT
	s.StatsDate,
	vn.Value AS ServerName,
	dbn.Value AS DatabaseName,
	sn.Value AS SchemaName,
	tn.Value AS TableName,
	dn.Value AS IndexName,
	s.IndexID,
	s.User_Seeks,
	s.User_Scans,
	s.User_Lookups,
	s.User_Updates,
	s.System_Seeks,
	s.System_Scans,
	s.System_Lookups,
	s.System_Updates
FROM dbo.IndexUsageStats s
INNER JOIN dbo.Names vn ON s.ServerNameID = vn.ID
INNER JOIN dbo.Names dbn ON s.DatabaseNameID = dbn.ID
INNER JOIN dbo.Names sn ON s.SchemaNameID = sn.ID
INNER JOIN dbo.Names tn ON s.TableNameID = tn.ID
LEFT JOIN dbo.Names dn ON s.IndexNameID = dn.ID;
GO
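
As promised in the notes above, here’s an untested sketch of what a 2005-compatible substitute for the MERGE might look like. It’s meant to replace the MERGE block inside CollectIndexUsageStats (where #t is still in scope) and does the same work as three separate statements; you’d also swap datetime2(0) for datetime in the table definitions.

-- (2005 compatibility sketch) update rows that already exist
UPDATE lc
SET LoadTime = t.StatsDate,
	User_Seeks = t.User_Seeks,
	User_Scans = t.User_Scans,
	User_Lookups = t.User_Lookups,
	User_Updates = t.User_Updates,
	System_Seeks = t.System_Seeks,
	System_Scans = t.System_Scans,
	System_Lookups = t.System_Lookups,
	System_Updates = t.System_Updates
FROM iDBA.dbo.IndexUsageStats_LastCumulative lc
INNER JOIN iDBA.dbo.Names s ON s.ID = lc.ServerNameID
INNER JOIN #t t ON t.ServerName = s.Value
	AND t.DatabaseID = lc.DatabaseID
	AND t.ObjectID = lc.ObjectID
	AND t.IndexID = lc.IndexID;

-- insert rows that don't exist yet
INSERT INTO iDBA.dbo.IndexUsageStats_LastCumulative (ServerNameID, DatabaseID,
	ObjectID, IndexID, LoadTime, User_Seeks, User_Scans, User_Lookups,
	User_Updates, System_Seeks, System_Scans, System_Lookups, System_Updates)
SELECT s.ID, t.DatabaseID, t.ObjectID, t.IndexID, t.StatsDate,
	t.User_Seeks, t.User_Scans, t.User_Lookups, t.User_Updates,
	t.System_Seeks, t.System_Scans, t.System_Lookups, t.System_Updates
FROM #t t
INNER JOIN iDBA.dbo.Names s ON t.ServerName = s.Value
WHERE NOT EXISTS (
	SELECT 1 FROM iDBA.dbo.IndexUsageStats_LastCumulative lc
	WHERE lc.ServerNameID = s.ID AND lc.DatabaseID = t.DatabaseID
		AND lc.ObjectID = t.ObjectID AND lc.IndexID = t.IndexID
);

-- delete rows no longer present in the current collection
DELETE lc
FROM iDBA.dbo.IndexUsageStats_LastCumulative lc
INNER JOIN iDBA.dbo.Names s ON s.ID = lc.ServerNameID
WHERE NOT EXISTS (
	SELECT 1 FROM #t t
	WHERE t.ServerName = s.Value AND t.DatabaseID = lc.DatabaseID
		AND t.ObjectID = lc.ObjectID AND t.IndexID = lc.IndexID
);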

2012: New Year, New Goals

Happy New Year to all! At the start of 2011, I laid out some goals for the year and did an update on them back in July. Now that it’s 2012 it’s a great time to take a look at what I accomplished (and fell short on) last year and also set some new targets. My 2011 goals were as follows:

Speak More – I spoke at two SQL Saturdays this year, #67 in Chicago and #101 in Kansas City. I had hoped to make it to more than 2, but I still met my goal for the year and am hoping to continue the trend.

Travel More – I flew to Kansas City when I spoke there, so I’ll consider this goal completed as well.

Become eligible to take the MCM exam – I achieved this when I passed 70-451 back in April. Check!

Attend training – I considered this done when I attended the SQLskills Immersion Event in Chicago in May.

Keep up the blogging – I had kind of a mid-year lull in blog posts, but I like to think I finished off strong. I had 39 posts in 2011 versus 45 in 2010 – and I didn’t even blog for the entire year in 2010. Definitely a fail here.

Learn IO – As I mentioned in July, the plan was for my team to get a new SAN this year. We now have the SAN – in fact, it was delivered just last week. Unfortunately, with the holidays, apparently nobody from the vendor was available to install it, so as far as I know it is currently sitting in our datacenter, still in boxes. Hopefully it gets set up soon so I can start working with it, but in the meantime this is a fail as well.

Help out more – I’m still around on Twitter and chip in on #sqlhelp when I can, but everything going on at work keeps me from being as active as I would like. I’m going to have to find a way to increase my activity. Fail.

All in all I consider myself to be 4 for 7. I did some great things in 2011, but not quite as much as I had hoped. That being said, I think it’s OK that I didn’t do everything I wanted to, as those things will automatically roll over into next year. Here’s what I hope to accomplish in 2012:

Continue all the above – While I listed the above as goals, many of them are more of a journey than a destination. For example, training is a process that should have no end because there will always be something else out there to learn. The same goes for speaking, traveling, blogging, and helping out. As for my goal of learning more about IO (and SAN management in particular), I should get that opportunity very soon once our new hardware is installed.

More MCM Progress – Becoming eligible to take the Microsoft Certified Master exam was great, but that’s no stopping point. I need to keep studying, practicing, and working toward passing the exams themselves! I don’t feel comfortable saying I want to become an MCM in 2012, so instead my goal is to continue progressing toward it.

Attend the PASS Summit – I should have done this long ago but never got the opportunity, so one of my goals for this year is to attend the PASS Summit. Perhaps this year I can be on the other side of all the tweets coming from the keynote! An added bonus would be if I could speak there, and I will definitely be submitting a few abstracts once the call for speakers opens.

Blog smarter, not harder – While my previous goal of keeping up my blogging is all well and good, I don’t just want to focus on the number of posts I write – anyone can write a lot of short and useless things and call it a victory in terms of quantity. I’m generally happy with what I put up here – if I don’t like a post then it doesn’t go live, and I’ve left a few on the cutting room floor or delayed them so they can be improved. In 2012 I plan to continue this trend and ensure that when you come here you will find informational, helpful, and thought-provoking material. As always, I welcome your comments and do my best to respond to them in short order. If you, my readers, aren’t happy, then there’s really no point in me having a blog to begin with!

So those are my goals for the year. May you all have a happy, healthy, and prosperous 2012!

T-SQL Tuesday #025: Trick or ...?

This month’s T-SQL Tuesday is hosted by Allen White (blog | @SQLRunr) and is an invitation to show off a nifty T-SQL trick you use to make your job easier. Here’s one of my favorites as of late:

Some of the source systems we copy from shard their data across multiple similar tables. For example, instead of having 1 table named dbo.Sales, they might have 30 of them named dbo.Sales_001 through dbo.Sales_030. If there were a rhyme or reason to this sharding it might be seen as an advantage, but unfortunately neither I nor anyone else on the team has ever found a method to this madness, and the vendor won’t divulge how they do it. As if that’s not bad enough, additional tables can pop up out of nowhere, and not all of these tables are guaranteed to contain the same columns. Table dbo.Sales_031 may appear tomorrow with only a subset of the columns from the previous 30 tables, or with new columns not present in any of the others.

To keep this all straight, I have a process that compares the columns in all applicable tables and generates a view that combines their contents using UNION ALL. I’ve actually blogged about an aspect of this before, but today I have a function containing a nice little trick that’s proven very helpful for me. Since the tables I’m generating a view of aren’t guaranteed to have the same columns, a simple “SELECT *” from each of them won’t work – unioning tables together requires the same number of columns with compatible data types in the same order. Instead I have to generate a SELECT statement that explicitly lists all columns for every table. T-SQL can accomplish this with cursors and loops, but then I found FOR XML PATH, which made life a lot simpler.

FOR XML PATH has many features which are outside the scope of this post, but the one behavior we’re going to exploit today is that you can pass a row tag name that’s blank. When you do this, instead of sandwiching each row between tags, it simply concatenates all the row values together into one string. If you add commas, you’ve got a great way to list out all the columns of a table which you can then form into a SELECT statement. Without any further ado, here’s the demo code:

-- create our demo table with a few columns
CREATE TABLE dbo.MyTable (
   ID INT NOT NULL,
   Value VARCHAR(20),
   Notes NVARCHAR(MAX)
);

-- select the table's columns from system tables
SELECT c.name
FROM sys.columns c
INNER JOIN sys.tables t ON c.object_id = t.object_id
WHERE t.name = 'MyTable'
ORDER BY c.column_id;

-- now let's format it as XML
SELECT c.name
FROM sys.columns c
INNER JOIN sys.tables t ON c.object_id = t.object_id
WHERE t.name = 'MyTable'
ORDER BY c.column_id
FOR XML PATH;

-- now pass a blank string argument to PATH
-- note how the <row> tags are now removed but column tags remain
SELECT c.name
FROM sys.columns c
INNER JOIN sys.tables t ON c.object_id = t.object_id
WHERE t.name = 'MyTable'
ORDER BY c.column_id
FOR XML PATH('');

-- because the expression below has no column name, no column tags are
-- generated; assigning the result to NVARCHAR gives us a plain string
DECLARE @s NVARCHAR(MAX);

-- build a comma-separated list of the table's columns
SET @s = (
      SELECT ',' + QUOTENAME(c.name) -- comma to separate names
      FROM sys.columns c
      INNER JOIN sys.tables t ON c.object_id = t.object_id
      WHERE t.name = 'MyTable'
      ORDER BY c.column_id
      FOR XML PATH('') -- XML that really isn't
   );

-- remove the leading comma
SELECT SUBSTRING(@s, 2, LEN(@s));

Wrap the final output with a SELECT and a table name and you’re good to go. There are tons more uses for this beyond creating lists of columns – basically, any time you want to turn column values into a delimited list, this method can make it happen.
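
To make that last step concrete, here’s a hedged sketch that continues from the @s variable above (same batch):

-- build and run the full SELECT statement
DECLARE @sql NVARCHAR(MAX);
SET @sql = N'SELECT ' + SUBSTRING(@s, 2, LEN(@s)) + N' FROM dbo.MyTable;';
EXEC sp_executesql @sql;

Happy coding!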

SQL Saturday #119 Open for Registration!

The third time’s a charm! SQL Saturday #119 (Chicago 2012) has been posted on the SQL Saturday website! It will once again be held at the DeVry University campus in Addison, IL and will be on May 19, 2012.

SQL Saturday in Chicago has a special place in my heart for a few reasons. The first SQL Saturday I ever attended was SQL Saturday #31 back in 2010, the very first one held in Chicago. I didn’t have this blog back then, but I had an excellent time. Last year was SQL Saturday #67, which I did blog about. I was also very fortunate to be part of the great group of people who helped put it together, and I’m back for more this time around. The rest of our team consists of:

- Ted Krueger (blog | @onpnt)
- Aaron Lowe (blog | @vendoran)
- Bill Lescher (@blescher)
- Rich Rousseau (@zigzag219)
- Wendy Pastrick (blog | @wendy_dance)

Register Today!

Registration is open, so sign up today! We had a great turnout last year and we’re hoping to fill up again! The event itself is free, but there is a $10.00 charge for lunch (it won’t be pizza!).

Why You Need To Be There

  • It’s free knowledge! Yes there is a charge for lunch, but that’s optional. The training itself costs nothing.
  • Meet other awesome DBAs! Most companies have very few DBAs, sometimes only one. This can lead to feeling isolated, like the whole world of developers is against you. You’re not alone though – there are tons of us out there! Community events are by far the best way to meet people in this field – where else can you hang out all day with a bunch of people as passionate about databases as you are?
  • Did I mention it’s free?

Call for Speakers

Have you spoken at work, a user group, or a SQL Saturday before? Have you never tried but are looking to take the plunge? The call for speakers is open until April 19, 2012, so please submit an abstract (or two or three) if you’re at all interested!

I can’t wait to see you there in May!

Christmas Gifts I'd Like to See

Well, we’re back at the top of the month again, and this time around for Meme Monday we’re to talk about what we’d love to see under the tree from Microsoft. This got my creative juices flowing, and I came up with three wishes:

First and foremost, if I had one wish that I could wish this holiday season, it would be that all the children of the world would join hands and sing together in the spirit of harmony and peace.

If I had two wishes that I could wish for this holiday season, it would be for all of the children of the world to join hands and sing, and for SQL Server to natively support object-level restores.

When working with a large database, native backups take a good deal of time to create and restore. For this reason, VLDBs are likely to be backed up using non-native methods such as SAN snapshots, but let’s forget about those and stick with what Microsoft includes in the box. If a single table needs to be restored from a backup, the only native way to accomplish this at the moment is to restore the entire database somewhere else and then copy the desired table back to the production server. This takes a great deal of disk space (the entire size of the database) as well as the time necessary to restore all that data. If available, filegroup backups can make it possible to restore only the specific filegroup(s) necessary for said table(s), but even that would be restoring more data than is necessary, since filegroups typically contain more than one table.
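
For the curious, a piecemeal restore of just the needed filegroup looks roughly like this – all names and paths are hypothetical, and note that the primary filegroup has to come along for the ride:

-- restore only the primary filegroup plus the one containing our table
RESTORE DATABASE BigDB_Copy
	FILEGROUP = N'PRIMARY', FILEGROUP = N'SalesFG'
	FROM DISK = N'X:\Backups\BigDB.bak'
	WITH PARTIAL, NORECOVERY,
		MOVE N'BigDB_Data' TO N'X:\Data\BigDB_Copy.mdf',
		MOVE N'BigDB_Sales' TO N'X:\Data\BigDB_Copy_Sales.ndf',
		MOVE N'BigDB_Log' TO N'X:\Data\BigDB_Copy.ldf';

-- roll forward with log backups as needed, then bring it online
RESTORE LOG BigDB_Copy FROM DISK = N'X:\Backups\BigDB_Log.trn'
	WITH RECOVERY;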

The best solution is to be able to restore an individual table or other object directly from a backup. Since only the necessary data would be read out of the backup, this would keep the required time and disk space to a minimum. This feature is already offered by several third-party vendors, but having it built in would be a huge help, as you wouldn’t have to sit down with your boss and make the case for purchasing yet another tool.

If I had three wishes that I could wish for this holiday season, it would be again for the children, for object-level restores, and for the ability to run integrity checks on a backup.

The time it takes for DBCC CHECKDB to complete increases along with the size of the database being checked, and as databases get larger the general recommendation for integrity checks shifts towards running them on another server instead of the production machine. Once again limiting ourselves to options provided natively by Microsoft, we would have to restore a full backup of our very large database on another machine and run DBCC CHECKDB on it. As before, this will be expensive both in terms of time and disk space.
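
In practice the native option looks something like this sketch (names and paths are hypothetical):

-- restore the full backup on a non-production server...
RESTORE DATABASE BigDB_Check
FROM DISK = N'X:\Backups\BigDB.bak'
WITH MOVE N'BigDB_Data' TO N'X:\Data\BigDB_Check.mdf',
	MOVE N'BigDB_Log' TO N'X:\Data\BigDB_Check.ldf',
	RECOVERY;

-- ...then run the integrity check there instead of on production
DBCC CHECKDB (N'BigDB_Check') WITH NO_INFOMSGS, ALL_ERRORMSGS;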

A few years ago Paul Randal mentioned in one of his blog posts that during his time at Microsoft he developed and patented a way to run DBCC CHECKDB on a backup without restoring it. I can’t imagine why Microsoft would sit on this technology, as I know it would be a huge help to me, and I’m sure many others could benefit from it as well. Hopefully it can someday be included as a feature.