• We’re currently investigating an issue related to the forum theme and styling that is impacting page layout and visual formatting. The problem has been identified, and we are actively working on a resolution. There is no impact to user data or functionality, this is strictly a front-end display issue. We’ll post an update once the fix has been deployed. Thanks for your patience while we get this sorted.

Cool SQL query optimization (SQL Server 7)

Mucman

Diamond Member
Played around with SQL Profiler and found out how a customer was crashing Cold Fusion on all of our servers.

I realized that it was a DB issue because several servers running Cold Fusion would die. I connected to the SQL server and saw that its
CPU was pegged. Cold Fusion DB drivers must suck because the entire Cold Fusion Application server dies when the DB connection dies.

After running SQL Profiler on the DB server for 10 minutes, I saw about 40 queries sent to it (during this time the performance slowed to
a crawl). The queries were the following :


SELECT *
FROM Users
WHERE Users.UserID IN (SELECT UserID FROM UserSeason WHERE ShowID = 77)
OR Users.UserID IN (SELECT UserID FROM UserBest WHERE ShowID = 77)
OR Users.UserID IN (SELECT UserID FROM UserInd WHERE ShowID = 77)
ORDER BY UserLastName, UserFirstName


I tried the query in Query Analyzer and it peformed at 4-8 seconds per query!!! In less than a minute I optimized it with
the following


SELECT *
FROM Users
WHERE Users.UserID IN ((SELECT UserID FROM UserSeason WHERE ShowID = 77) UNION
(SELECT UserID FROM UserBest WHERE ShowID = 77) UNION
(SELECT UserID FROM UserInd WHERE ShowID = 77))
ORDER BY UserLastName, UserFirstName


This runs for less than a second! What is that? A 500% increase in performance

Anyways... I let the client know and he is pretty happy about it 🙂
 
You could probably increase performance further by using UNION ALL instead of UNION.
The fact that you are using IN on the subquery means it doesn't matter whether duplicates occur.
 
Originally posted by: Haircut
You could probably increase performance further by using UNION ALL instead of UNION.
The fact that you are using IN on the subquery means it doesn't matter whether duplicates occur.

Keep in mind that it wasn't my query in the first place 🙂

Going to have to look at UNION ALL now.

 
A join is almost always much faster than a nested select.

I bet this brings it below 0.5s:

SELECT [insert users columns here]
FROM Users u, UserSeason us
where u.UserID = us.UserID
and us.ShowID = 77
UNION
SELECT [insert users columns here]
From Users u,UserBest ub
where u.UserID = ub.UserID
and ub.ShowID = 77
UNION
SELECT [insert users columns here]
From Users u, UserInd ui
where u.UserID = ui.UserID
and ui.ShowID = 77
--- not sure what to do with the order by; could put this result in a temp table & select from there or do the sorting outside SQL; throw in nolock's as appropriate...

or alternatively, this should work even better (if I got the null checking right)
select distinct [insert users columns here]
from users u
left outer join userseason us on us.userID = u.userID and us.ShowID = 77
left outer join userbest ub on ub.userID = u.userID and ub.ShowID = 77
left outer join userind ui on ui.userID = u.userID and ui.ShowID = 77
where us.ShowID <> null or ub.ShowID <> null or ui.ShowID <> null
order by UserLastName, UserFirstName
 
What kinda database server are you using? A good db server would catch this query and optimize it on the run...
 
they said MS SQL 7

But I suspect most DBs will have the same problem. The nested selects completely kill the optimizer.
 
Argh, I wish I could save the "Execution Plans" as an image.

Anyways... here are some stats for the queries :


SELECT *
FROM Users
WHERE Users.UserID IN (SELECT UserID FROM UserSeason WHERE ShowID = 77)
OR Users.UserID IN (SELECT UserID FROM UserBest WHERE ShowID = 77)
OR Users.UserID IN (SELECT UserID FROM UserInd WHERE ShowID = 77)
ORDER BY UserLastName, UserFirstName


~ 7 seconds


SELECT *
FROM Users
WHERE Users.UserID IN ((SELECT UserID FROM UserSeason WHERE ShowID = 77) UNION
(SELECT UserID FROM UserBest WHERE ShowID = 77) UNION
(SELECT UserID FROM UserInd WHERE ShowID = 77))
ORDER BY UserLastName, UserFirstName


0-1 second


select *
from users u
left outer join userseason us on us.userID = u.userID and us.ShowID = 77
left outer join userbest ub on ub.userID = u.userID and ub.ShowID = 77
left outer join userind ui on ui.userID = u.userID and ui.ShowID = 77
where us.ShowID IS NOT null or ub.ShowID IS NOT null or ui.ShowID IS NOT null
order by UserLastName, UserFirstName


4 seconds


SELECT [insert users columns here]
FROM Users u, UserSeason us
where u.UserID = us.UserID
and us.ShowID = 77
UNION
SELECT [insert users columns here]
From Users u,UserBest ub
where u.UserID = ub.UserID
and ub.ShowID = 77
UNION
SELECT [insert users columns here]
From Users u, UserInd ui
where u.UserID = ui.UserID
and ui.ShowID = 77


This won't work since the UNION operator requires the fields to be exactly the same for each set that it
is "UNIONing"

 
The times still sound really high (I assume there isn't a huge # of rows returned or there wouldn't be any <1s case).... is UserID indexed in all relevant tables? Or better yet, is there a combined index on ShowID and UserID (in that order)? If not, WARNING: Queries will block for awhile while indices are being added.

Also I know temp tables are always bad in the long run, but how does this do?

create table #userlist(userid int) -- or whatever type userid is

insert into #userlist
select userid from UserSeason (nolock) where ShowID = 77

insert into #userlist
select userid from UserBest (nolock) where ShowID = 77

insert into #userlist
select userid from UserInd (nolock) where ShowID = 77

select distinct [users columns]
from Users (nolock), #userlist
where Users.UserID = #userlist.UserID

drop table #userlist
 
Yes, UserID is indexed in all of the tables and it is in fact a combined index (order ShowID,UserID).

Your query works. It took 2s the first time, but <0 every other time. I still prefer my query with the UNIONS though 🙂. I wish I could
post the pics of the "Execution Plan" from Query Analyzer.

There is one problem with your last query though... all of the other queries return 83 results, while your last one is returning 115.

I find this interesting how one can figure out multiple ways of fetching the same data though 🙂. For some reason I always find query
writing and enjoyable experience... the more complex the more enjoyable 😛
 
Back
Top