Posts belonging to Category SQL Server 2000



Grouping and Rounding Dates

DSC01618Ever need to group dates and round them to intervals?  I had this exact need for a report on wait stats I was working on for an upcoming post.  I wanted the flexibility to be able to group a list of dates by either minute, hour, date, etc.  And then also round those dates within those intervals.  For example, round a list of dates to every 15 minutes, every 3 hours, or every 2 days.  At first thought this sounded like something that might be a little tricky to accomplish.  But as I’ll show in the examples below, this is quite easy.

Start With The First of the Month

To resolve this issue, I started by using the logic that I typically when I need to take a date and change it to the first day of a month or quarter.  The logic I use for that utilizes the DATEADD and DATEDIFF functions.  I’ll skip going into how those work since Books Online covers that pretty well. 

To find the start of a month or quarter, these functions can be used together to be compare a date and time against the value 0.  Casting 0 as datetime returns the value 1900-01-01 00:00:00.000.  Using the two functions together, use DATEDIFF to determine how many months or quarters have occurred since 1900-01-01.  And then use DATEADD to add those months or quarters to the date 1900-01-01.  Doing this will provide you with the the first day in either the month or the quarter.

The examples below can be used to demonstrate this:


SELECT DATEADD(M, DATEDIFF(M, 0, '20100310'), 0)
GO

SELECT DATEADD(Q, DATEDIFF(Q, 0, '20100310'), 0)
GO

Returning the following results:

DateAddDateDiff

Round to the Desired Interval

Now that’s the easy part which leaves the next part, where dates and times need to be rounded not to the 1st of the month put to periods that are determined at execution time.  As I mentioned above, I need to change the logic above such that I can get dates to round to every 1 week or 2 days or 3 hours.

On the surface, this sounded a lot harder than it turned out to be.  Because if you use the % to find the remainder of division, you can easy round any date or time to any logical grouping of dates or times. 

Suppose you need to round a datetime to the nearest 4 hours.  Find the distance in hours between 1900-01-01 and the datetime in question.  Take that number and find the modulo using your rounding interval.  In this case that value is 4.  Then subtract the remainder from the distance.  And use the new distance value in the DATEADD function from 1900-01-01.  Hopefully, that didn’t lose too many people.

Rather than explain it another way, take a look at the following T-SQL statements where the datetime is rounded down to the nearest 2, 3, and 4 hour intervals.


--Round down to nearest 2 hours.
SELECT DATEADD(HH, DATEDIFF(HH, 0, '2010-03-10 03:00:00.000')-(DATEDIFF(HH, 0, '2010-03-10 03:00:00.000')%2), 0)
GO

--Round down to nearest 3 hours.
SELECT DATEADD(HH, DATEDIFF(HH, 0, '2010-03-10 03:00:00.000')-(DATEDIFF(HH, 0, '2010-03-10 03:00:00.000')%3), 0)
GO

--Round down to nearest 4 hours.
SELECT DATEADD(HH, DATEDIFF(HH, 0, '2010-03-10 03:00:00.000')-(DATEDIFF(HH, 0, '2010-03-10 03:00:00.000')%4), 0)
GO

These statements produce the following results:

DateAddDateDiff2 

As you’ll hopefully see, this is a very simple technique.  Imagine replacing the hour, day, or month date parts with a variable and then the interval to round the dates with a variable.  Doing that and the expression can be used in a stored procedure or report to group any set of dates along any set of date or time groupings.

Hope you find this as useful as I did.  And in the next couple days, I hope to have something up that uses this technique to make it even easier to use in your own code by providing fully working examples.

Please, no TRIM()?

Nose TrimA couple weeks back I was asked, “How come Microsoft has yet to put TRIM() in SQL?”

I don’t really know why there isn’t a TRIM() function in SQL Server. There is an LTRIM for function removing leading spaces. And an RTRIM function for removing trailing spaces. But there isn’t a TRIM function.

Let’s be honest, the why doesn’t matter to me – I don’t want the function to be a part of SQL Server. So Instead of figuring out why, I’m going to write about my opinions. This is a great place for them, this being my blog and all.

There are two main reasons that I am against having a TRIM function in SQL Server…

What Do You Know

First, a large majority of people tend to stick with what they know. Once you find something that works, why not stick with it until there is a reason not to use it any longer. Unfortunately this can have some unintended consequences.

Interactive PlaypenMost of the developers that have built the applications that I consult on were originally designed by application developers versus SQL Server developers. This distinction is important because a lot of these developers are then aware of the .NET or Java function for TRIM in the languages that they are using. In their cases, using it will not have a seriously negative impact on their environments.

But that “knowledge of a function” doesn’t translate well into SQL Server. In SQL Server, the use of TRIM would have a serious impact on the effectiveness of indexing being used to query the database. In case you don’t believe me, this will be demonstrated below.

What Do You Need

Secondly, how often do leading spaces need to be removed from a field? Sure, I’ve removed them while loading data into a database in an ETL process. But most often, leading space saved to a field in a database is just “bad” data that I’d rather clean up.

Are there reasons to have leading spaces in a field? Probably, but a vast majority of the time they won’t be there and they aren’t going to be needed.

By this logic, most of the time only trailing spaces will need to be removed from a field.

Considering the Two Reasons

If leading spaces don’t need to removed but people know about TRIM from their primary programming language, will they be inclined to find out if there is a function the only removes trailing spaces? Or would a person be more inclined to just use what they know. I understand the nature of laziness and pretty confident that people would end up just using TRIM().

Depending on the spaces that need to be removed can have a serious impact on performance. As an example of the performance impact let’s consider the following query.

USE AdventureWorks

SELECT * FROM Person.Contact
WHERE EmailAddress = 'gustavo0@adventure-works.com'

When executed, the query returns the following execution plan:

Trim001

You’ll notice that since there is an index on the column EmailAddress, that the Query Optimizer chooses an Index Seek on the column to find the value(s) the are being filtered. And then it looks up the rest of the columns with a Key Lookup operation.

Suppose though that the column had trailing spaces in it that needed to be removed when querying the data. To do this, you’d need to use the RTRIM() function. The query in this case would likely look like this.

USE AdventureWorks

SELECT * FROM Person.Contact
WHERE RTRIM(EmailAddress) = 'gustavo0@adventure-works.com'

And when executed the query would return the following execution plan:

Trim002

And in this case, the Query Optimizer has chosen to use an Index Scan. It can do this because SQL Server knows how the value begins and can scan for similar values within the index. And then follow-up with a Key Lookup for the rest of the values in the query.

But what we really are talking about is a function that removes the leading and the trailing spaces. To do this you would need to use the LTRIM() function in conjunction with the RTRIM() function. And now the query would look something like this:

USE AdventureWorks

SELECT * FROM Person.Contact
WHERE LTRIM(RTRIM(EmailAddress)) = 'gustavo0@adventure-works.com'

Running that query would produce the following execution plan:

Trim003

As you see now, the execution plan is quite different from the previous two executions. Now, the Query Optimizer doesn’t know the first letters of the values in the EmailAddress column. With the existing index being useless, the query doesn’t have any other good candidate to find the values from the WHERE clause. This results in the query just doing a Clustered Index Scan.

Of course, the last execution plan looks like it is simpler, but closer scrutiny of the execution would show that the IO between each of the queries has some significant differences.

--EmailAddress = 'gustavo0@adventure-works.com'Table 'Contact'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

--RTRIM(EmailAddress) = 'gustavo0@adventure-works.com'
Table 'Contact'. Scan count 1, logical reads 178, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

--LTRIM(RTRIM(EmailAddress)) = 'gustavo0@adventure-works.com'
Table 'Contact'. Scan count 1, logical reads 569, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Based on my people going with what they know, the addition of a TRIM() function would likely result in some serious performance issues with some queries. This would likely happen just as a matter of development without any thought as to the effect and difference between the performance of RTRIM(), LTRIM(), and TRIM().

Some People Want TRIM

Now my opinion isn’t the only one that is out there. Quite often developers I work with ask about the TRIM() function. And Pinal Dave (blogging at SQLAuthority) put in a Connect item regarding adding a TRIM() function. He’s had a number of posts regarding this topic and ways to get around the lack of a TRIM() function.

While I appreciate the discussions I’ve had with him, one of those time on the cursed “it depends”, I’ll have to disagree with him on adding a TRIM() function. The pain in this discovery from people just coming to SQL Server is worth the benefit of unnecessary performance death that could occur by overuse of a TRIM() function. I actually would encourage people to vote down this Connect item.

Checking EXISTS – What’s Your Practice?


Old microphone
Michael Swart (Database Whisperer) posted a great write up and interview with Brad Schulz that is worth checking out.  Michael’s post links to a post from Brad on the use of EXIST and what is required for the SELECT portion of the statementHope you are still with me.

Definitely worth the read and +1  for my blog subscriber list.

As he demonstrates, it doesn’t matter whether you use SELECT *, SELECT 1 or SELECT 1/0.  If you want to know what this means, please read his post.

From SQL Server’s perspective, it doesn’t consider any of this information in preparing the results.  The only thing that matters and this is from a practices perspective is to select a style for writing EXISTS statements and sticking to it.

Rounding Up to the Nearest 1,000

A co-worker came over and asked how to round up to the nearest 1,000.  He tried to use ROUND but found that it always split between going up or going down depending on the value and the report he was working on always had to round up.

Almost before he finished asking, actually before he finished asking, I said, “CEILING”.  Books Online defines the CEILING function as the following:

Returns the smallest integer greater than, or equal to, the specified numeric expression.

A quick example from Books Online:

SELECT CEILING($123.45), CEILING($-123.45), CEILING($0.0)

GO

The example provides the following output:

image

It’s a fairly simple function, but there isn’t an option to break this out to the tens, hundreds, or thousands.  To do this, I suggested he modify the base value that he needed by 1,000 to get the CEILING value and then multiply it back by the base value of 1,000.

The following is the example code that I showed him:

DECLARE @Value decimal(12,3)


SET @Value = 9321.12


SELECT @Value as Original

    ,CEILING(@Value/1000)*1000 as Ceiling_Pos

    ,CEILING(-@Value/1000)*1000 as Ceiling_Neg

    ,ROUND(@Value, -3) as Round_Pos

    ,ROUND(-@Value, -3) as Round_Neg

In the second column, the CEILING function produced the result he was looking for:

image

I included the use of the ROUND function to show that it did not produce the required result.  This method also works if you need to round up to the nearest 250, 50, 5, or 333rd.  Ok… it only mostly works with the 333rd and has a rounding issue, but that would just be a goofy requirement so I can live with that.

Hopefully, this can help others with their rounding up needs.

Scripting Object Level Permissions

A while ago while making changes to a client database, I needed to determine all of the object level permissions that were in the database I was working on.  The system stored procedure sp_helprotect turned out to be pretty useful in obtaining this information. 

The procedure accepts four parameters:


sp_helprotect [ [ @name = ] 'object_statement' ]
[ , [ @username = ] 'security_account' ]
[ , [ @grantorname = ] 'grantor' ]
[ , [ @permissionarea = ] 'type' ]

They do pretty much what would be expected.  Provide @name of an object and the results are limited to those for that object only.

Of course, I wasn’t a huge fan of the output from the procedure since I couldn’t use it directly to rebuild all of the permissions.  Below is the T-SQL statement that I put together to provide what I needed.


SET NOCOUNT ON
GO

DECLARE @Permissions TABLE
(
Owner nvarchar(128)
,Object nvarchar(128)
,Grantee nvarchar(128)
,Grantor nvarchar(128)
,ProtectType nvarchar(10)
,[Action] nvarchar(10)
,[Column] nvarchar(128)
)

INSERT INTO @Permissions
EXEC sp_helprotect

SELECT RTRIM(UPPER(ProtectType)) + Space(1)
   + RTRIM(UPPER([Action])) + Space(1)
   + CASE WHEN Object <> '.' THEN 'ON ' + QUOTENAME(Owner) + '.' + QUOTENAME(Object)
   + Space(1) ELSE Space(0)
   END
   + CASE WHEN (PATINDEX('%All%', [Column]) = 0) and ([Column] <> '.') THEN ' ('
+ [Column] + ')' ELSE Space(0) END
   + 'TO ' + QUOTENAME(Grantee)
   + CHAR(10) + 'GO' + CHAR(10)
FROM @Permissions
ORDER BY CASE WHEN [Action] = 'CONNECT' THEN 0 ELSE 1 END, Owner, Object, Grantee

Hope this helps anyone with similar needs.

Last Time SQL Server Restarted

Not a really hard thing to figure out, but for some crazy reason I always forget.  So, hopefully actually writing it out for the world to see will help me remember.

Since tempdb is created every time that SQL Server restarts, the created date for that database can be used to determine the last time the SQL Server restarted.

SELECT create_date FROM sys.databases WHERE name = ‘tempdb

Yeah… really hard, hehe.

Incrementing Values

Occasionally, I run across procedures that mimic the functionality of the IDENTITY property.  There are always various reasons for these procedures to exist, some valid and some not quite so.  Recently while trying to tune one of these procedures that someone had added interesting locking hints to a creative rewrite to the procedure was suggested.

Usually these procedure have a SELECT and then an UPDATE in the procedure to get and set the value.  In the suggested solution the OUTPUT variable would be included in the SET statement.  I’ve attached two sample scripts that includes this type of functionality.  The "IDENTITY_OLD" is the typical way that this is implemented and "IDENTITY_NEW" contains the new method.  The new method is as such:

Update Configuration 
Set value = value + 1, @ID = value + 1 
Where tablename = @tablename

This code change hasn’t been implemented in the test or production environments yet, because I am still early in reviewing this.  I haven’t been able to find anything wrong with this yet and maybe there isn’t anything to fear.  A few searches on the internet hasn’t turned up anything that resembles the solution, so I ask… does anyone see anything wrong with this?

EDITED: Better scripts make better examples.

Microsoft Certification Second Shot Exam, It’s Back…

It turns out that the Microsoft Certfication Second Shot Exam (FAQ) is back…

For a limited time, you can get a free, second shot at any IT professional, developer, or Microsoft Dynamics certification exam. Just register for this offer before your first exam, and you will get two chances to pass. But this offer won’t last forever.

The limited time is September 15, 2007, and January 30, 2008, so there is plenty of time to cram and exam.  The nice thing is this isn’t limited to a single exam, you can get a voucher for any and all exams taken while through January.

This is pretty great, almost as good as playing Halo 3 will be next Tuesday.  Riiiight!

This blog entry brought to you by the procedure "sp_MSForEachDB"

My undocumented stored procedure of the day is sp_MSForEachDB. This procedure allows SQL statments to loop through all database names on a server with a dynamic SQL string. The procedure accepts the variable @command1 for the dynamic SQL string, and the database name is passed into the string through the use of "?". Unlike using the sp_ExecuteSQL you are able build the SQL string

One thing to note with this stored procedure is that is does not switch contexts while it executes. So all of the SQL statements will run in the context of the master database. Getting around this issue is fairly trivial using either three-part naming of database objects or through the USE command.

For example executing sp_HelpDB for each database can be achieved through three-part naming:

sp_MSForEachDB ‘?..sp_HelpDB’

or the USE command

sp_MSForEachDB ‘Use ? exec sp_HelpDB’

The sp_HelpDB can be quite useful, I recently ran a script similar to the one below to get a list of IDs that appeared in a table in all of our client databases.

Create Table #Foo
(
DBName sysname,
FooType varchar(20),
FooID varchar(20)
)
Exec sp_MSForEachDB ‘Use ? If Object_ID(”Foo”) Is NOT NULL Insert Into #Foo Select ”?”, FooType, FooID From Foo’
Select * From #Foo

Hope this helps the many (or few rather) people that read this.

Room with a View

Every so often someone will get me on the topic of why "Select *" is an official Bad Practice. There are various arguements against "Select *" but one that a lot of programmers may not be aware of is the relationship between "Select *" and views.

When a view is created with "Select *" it references the tables that are part of the view and creates metadata to translate "Select *" into a listing of the columns that the view needs to retrieve. While this is a way to get around the work of typing out the columns for the result set there are unintended circumstances that can occur.
For an example, create the following table and view:

create table tbl_foo ([roo] int, [too] int, [moo] int)
create view vw_foo as select * from tbl_foo

If select from this view, you’ll get a list back of all of the columns in the table as expected. Now alter the table to include an additional column, such as:

alter table tbl_foo add [BOO] int

Now when you select from the view only the columns roo, too, and moo will return. The column BOO is not included in the select list. As I mentioned before when a view is created, SQL Server obtains metadata about to define what columns "Select *" references. In order to return the columns BOO in the view, the metadata needs to be refreshed. SQL Server does include a stored procedure, the procedure sp_refreshview will refresh the metadata of the view.

exec sp_refreshview ‘vw_foo’

Select from the view now will return all of the columns from the table. 
Being able to fix the metadata issue that table schema changes cause to "Select *" views doesn’t offset the additional step required to go in after the fact table changes and refresh each affected view. Views need to return specific sets of data, be naming the columns that are returned mistakes can be avoided that might sentance you to WTF.
I can imagine the guy who avoids refreshing views to prevent new columns from returning in the views. Then one day someone updates the sales table and refreshes the view, and suddenly the CEO wants to know why there are credit card numbers on the corporate website.