Sunday, February 19, 2012

IN versus OR operator in WHERE clause AND IN with lots of values

Is there any preference to using IN versus OR operator in a WHERE clause?
The following query:
SELECT * FROM Customer WHERE CustID = 1 OR CustID = 5 OR CustID = 10
can also be written as:
SELECT * FROM Customer WHERE CustID IN (1, 5, 10)
Certainly the latter query is concise. Also I have a clustered index on
CustID. Should I prefer former or latter?
Also if I need construct a query with IN/OR operator that contains 200 or
300 values in it, will performance degrade? Is there an alternative? The
BETWEEN operator is not an option for me.
Thanks.
Raghu/..> Certainly the latter query is concise. Also I have a clustered index on
> CustID. Should I prefer former or latter?
What does "should" mean? At what point does conciseness beat out
performance? Did you try both queries and compare performance? Did you see
any difference in the execution plans?
> Also if I need construct a query with IN/OR operator that contains 200 or
> 300 values in it, will performance degrade?
Did you try it?
> Is there an alternative?
Stuff your list into a single-column (one row per list value) table, maybe
even a #temp table, and perform an inner join?
--
Aaron Bertrand
SQL Server MVP
http://www.aspfaq.com/|||You raised all good questions. However I am still in the early stages.
1) Is there a difference between IN and OR in the way I used it?
2) When using lots of values, will there be a difference between using IN or
OR operator?
If the only answer to above question is to try, then I will do that. I am
looking for some logical explanation.
Thanks.
Raghu/..
"Aaron Bertrand - MVP" <aaron@.TRASHaspfaq.com> wrote in message
news:%23KyzM$e5DHA.2432@.TK2MSFTNGP10.phx.gbl...
> > Certainly the latter query is concise. Also I have a clustered index on
> > CustID. Should I prefer former or latter?
> What does "should" mean? At what point does conciseness beat out
> performance? Did you try both queries and compare performance? Did you
see
> any difference in the execution plans?
> > Also if I need construct a query with IN/OR operator that contains 200
or
> > 300 values in it, will performance degrade?
> Did you try it?
> > Is there an alternative?
> Stuff your list into a single-column (one row per list value) table, maybe
> even a #temp table, and perform an inner join?
> --
> Aaron Bertrand
> SQL Server MVP
> http://www.aspfaq.com/
>|||IN is the same as a bunch of = OR'd together as far as the query engine is
concerned
Greg Jackson
PDX, Oregon|||>I am looking for some logical explanation.
Interesting choice of words.
You have a set of values that you want to use
in the IN clause; yet, that set of values is not
in your database -- whose purpose in life is to
help you play with sets of data. Was it illogical
to play with that set of data without the aid
of the database which you have at hand?
Furthermore, is it going to be illogical
to send that set of data through
the sql statement compiler?
You will likely find this post:
From: Delbert Glass (delbert@.noincoming.com)
Subject: Re: IN clause very slow
Newsgroups: microsoft.public.sqlserver.programming
Date: 2002-07-26 13:50:16 PST
(and the thread containing it) of interest.
You might even wonder
whether or not the sql statement compiler
as been improved since that time.
Ponder:
Isn't the principal illogicalness of sending
a set of data through the sql statement compiler
independent of how much effort is incurred?
Bye,
Delbert Glass
"Raghu" <Raghu@.nospamzzzqcsi.com> wrote in message
news:eWQmNPf5DHA.1804@.TK2MSFTNGP12.phx.gbl...
> You raised all good questions. However I am still in the early stages.
> 1) Is there a difference between IN and OR in the way I used it?
> 2) When using lots of values, will there be a difference between using IN
or
> OR operator?
> If the only answer to above question is to try, then I will do that. I am
> looking for some logical explanation.
> Thanks.
> Raghu/..
> "Aaron Bertrand - MVP" <aaron@.TRASHaspfaq.com> wrote in message
> news:%23KyzM$e5DHA.2432@.TK2MSFTNGP10.phx.gbl...
> > > Certainly the latter query is concise. Also I have a clustered index
on
> > > CustID. Should I prefer former or latter?
> >
> > What does "should" mean? At what point does conciseness beat out
> > performance? Did you try both queries and compare performance? Did you
> see
> > any difference in the execution plans?
> >
> > > Also if I need construct a query with IN/OR operator that contains 200
> or
> > > 300 values in it, will performance degrade?
> >
> > Did you try it?
> >
> > > Is there an alternative?
> >
> > Stuff your list into a single-column (one row per list value) table,
maybe
> > even a #temp table, and perform an inner join?
> >
> > --
> > Aaron Bertrand
> > SQL Server MVP
> > http://www.aspfaq.com/
> >
> >
>|||>SELECT * FROM Customer WHERE CustID = 1 OR CustID = 5 OR CustID = 10
>can also be written as:
>SELECT * FROM Customer WHERE CustID IN (1, 5, 10)
For these examples,
one can expect the IN version to be converted to the OR version.
(Take a guess what you will see if you point
your mouse at the different operators in the plans.)
You could also write:
SELECT * FROM Customer WHERE CustID = 1
UNION ALL
SELECT * FROM Customer WHERE CustID = 5
UNION ALL
SELECT * FROM Customer WHERE CustID = 10
depending on what version of which database engine
you are using there can be a big difference
between how well the different versions execute.
Bye,
Delbert Glass
"Raghu" <Raghu@.nospamzzzqcsi.com> wrote in message
news:efrJJ5e5DHA.2576@.TK2MSFTNGP11.phx.gbl...
> Is there any preference to using IN versus OR operator in a WHERE clause?
> The following query:
> SELECT * FROM Customer WHERE CustID = 1 OR CustID = 5 OR CustID = 10
> can also be written as:
> SELECT * FROM Customer WHERE CustID IN (1, 5, 10)
> Certainly the latter query is concise. Also I have a clustered index on
> CustID. Should I prefer former or latter?
> Also if I need construct a query with IN/OR operator that contains 200 or
> 300 values in it, will performance degrade? Is there an alternative? The
> BETWEEN operator is not an option for me.
> Thanks.
> Raghu/..
>|||> 2) When using lots of values, will there be a difference between using IN or
> OR operator?
As Jaxon mentioned, the Query Processor will convert the IN to a bunch of OR's before the optimizer
gets is hand on the query. However, I've seen cases where a LOT of values produces bad performance
(search KB). In this case, creating a table and do JOIN/IN/EXISTS against that table should be
better.
--
Tibor Karaszi, SQL Server MVP
Archive at: http://groups.google.com/groups?oi=djq&as_ugroup=microsoft.public.sqlserver
"Raghu" <Raghu@.nospamzzzqcsi.com> wrote in message news:eWQmNPf5DHA.1804@.TK2MSFTNGP12.phx.gbl...
> You raised all good questions. However I am still in the early stages.
> 1) Is there a difference between IN and OR in the way I used it?
> 2) When using lots of values, will there be a difference between using IN or
> OR operator?
> If the only answer to above question is to try, then I will do that. I am
> looking for some logical explanation.
> Thanks.
> Raghu/..
> "Aaron Bertrand - MVP" <aaron@.TRASHaspfaq.com> wrote in message
> news:%23KyzM$e5DHA.2432@.TK2MSFTNGP10.phx.gbl...
> > > Certainly the latter query is concise. Also I have a clustered index on
> > > CustID. Should I prefer former or latter?
> >
> > What does "should" mean? At what point does conciseness beat out
> > performance? Did you try both queries and compare performance? Did you
> see
> > any difference in the execution plans?
> >
> > > Also if I need construct a query with IN/OR operator that contains 200
> or
> > > 300 values in it, will performance degrade?
> >
> > Did you try it?
> >
> > > Is there an alternative?
> >
> > Stuff your list into a single-column (one row per list value) table, maybe
> > even a #temp table, and perform an inner join?
> >
> > --
> > Aaron Bertrand
> > SQL Server MVP
> > http://www.aspfaq.com/
> >
> >
>|||Good idea.
I'd never heard of this trick.
However, I'm not so proud as not to piggy back on it....:-)
If creating a Table to dump the values into, I would recommend a Table
Variable as opposed to a Temp Table.
Cheers,
Greg Jackson
PDX, Oregon|||> If creating a Table to dump the values into, I would recommend a Table
> Variable as opposed to a Temp Table.
I suggest trying both. In this case, a table variable seems reasonable, I
admit. Just to point out that table variables are sometimes slower than temp
tables (one reason is that the optimizer doesn't care about collecting any
statistics for temp variables).
--
Tibor Karaszi, SQL Server MVP
Archive at:
http://groups.google.com/groups?oi=djq&as_ugroup=microsoft.public.sqlserver
"Jaxon" <GregoryAJackson@.hotmail.com> wrote in message
news:uvzsqdp5DHA.2300@.TK2MSFTNGP10.phx.gbl...
> Good idea.
> I'd never heard of this trick.
> However, I'm not so proud as not to piggy back on it....:-)
> If creating a Table to dump the values into, I would recommend a Table
> Variable as opposed to a Temp Table.
>
> Cheers,
>
> Greg Jackson
> PDX, Oregon
>|||cool.
again, another thing I did not know.
thanks
GAJ|||True, I also tend to recommend #temp tables when version hasn't been
indicated (such as this case), and even when version is 2000 but it's not
explicitly stated that the code ONLY has to run on 2000 servers with 80
compat.
--
Aaron Bertrand
SQL Server MVP
http://www.aspfaq.com/
"Tibor Karaszi" <tibor_please.no.email_karaszi@.hotmail.nomail.com> wrote in
message news:#abpwkp5DHA.2300@.TK2MSFTNGP10.phx.gbl...
> > If creating a Table to dump the values into, I would recommend a Table
> > Variable as opposed to a Temp Table.
> I suggest trying both. In this case, a table variable seems reasonable, I
> admit. Just to point out that table variables are sometimes slower than
temp
> tables (one reason is that the optimizer doesn't care about collecting any
> statistics for temp variables).
> --
> Tibor Karaszi, SQL Server MVP
> Archive at:
>
http://groups.google.com/groups?oi=djq&as_ugroup=microsoft.public.sqlserver
>
> "Jaxon" <GregoryAJackson@.hotmail.com> wrote in message
> news:uvzsqdp5DHA.2300@.TK2MSFTNGP10.phx.gbl...
> > Good idea.
> >
> > I'd never heard of this trick.
> >
> > However, I'm not so proud as not to piggy back on it....:-)
> >
> > If creating a Table to dump the values into, I would recommend a Table
> > Variable as opposed to a Temp Table.
> >
> >
> > Cheers,
> >
> >
> > Greg Jackson
> > PDX, Oregon
> >
> >
>|||First recall *where* the set of values is,
and then ponder the "getting that set of values from there into a table
variable".
Bye,
Delbert Glass|||If this is the case, is it not the job of sql server to provide this kind of
optimaztion?
"Tibor Karaszi" <tibor_please.no.email_karaszi@.hotmail.nomail.com> wrote in
message news:erEkd3k5DHA.2748@.TK2MSFTNGP09.phx.gbl...
> > 2) When using lots of values, will there be a difference between using
IN or
> > OR operator?
> As Jaxon mentioned, the Query Processor will convert the IN to a bunch of
OR's before the optimizer
> gets is hand on the query. However, I've seen cases where a LOT of values
produces bad performance
> (search KB). In this case, creating a table and do JOIN/IN/EXISTS against
that table should be
> better.
> --
> Tibor Karaszi, SQL Server MVP
> Archive at:
http://groups.google.com/groups?oi=djq&as_ugroup=microsoft.public.sqlserver
>
> "Raghu" <Raghu@.nospamzzzqcsi.com> wrote in message
news:eWQmNPf5DHA.1804@.TK2MSFTNGP12.phx.gbl...
> > You raised all good questions. However I am still in the early stages.
> >
> > 1) Is there a difference between IN and OR in the way I used it?
> > 2) When using lots of values, will there be a difference between using
IN or
> > OR operator?
> >
> > If the only answer to above question is to try, then I will do that. I
am
> > looking for some logical explanation.
> >
> > Thanks.
> > Raghu/..
> >
> > "Aaron Bertrand - MVP" <aaron@.TRASHaspfaq.com> wrote in message
> > news:%23KyzM$e5DHA.2432@.TK2MSFTNGP10.phx.gbl...
> > > > Certainly the latter query is concise. Also I have a clustered index
on
> > > > CustID. Should I prefer former or latter?
> > >
> > > What does "should" mean? At what point does conciseness beat out
> > > performance? Did you try both queries and compare performance? Did
you
> > see
> > > any difference in the execution plans?
> > >
> > > > Also if I need construct a query with IN/OR operator that contains
200
> > or
> > > > 300 values in it, will performance degrade?
> > >
> > > Did you try it?
> > >
> > > > Is there an alternative?
> > >
> > > Stuff your list into a single-column (one row per list value) table,
maybe
> > > even a #temp table, and perform an inner join?
> > >
> > > --
> > > Aaron Bertrand
> > > SQL Server MVP
> > > http://www.aspfaq.com/
> > >
> > >
> >
> >
>|||>> If this is the case, is it not the job of sql server to provide this kind
of optimaztion? <<
Good question. The answer is, to some extent yes. However, there are some
factors one needs to consider.
First, redundancy in SQL is one problem. SQL as a language has a some
superfluous clauses and constructs and developing cost based optimization
for all diverse permutations is not typically easy. And this impacts not
just SQL Server, but development of all DBMSs that implements some flavor of
SQL as the primary data manipulation language.
Another fact with performance is that, apart from the DBMS optimization a
lot of factors which are implementation specific. Achieving perfection with
optimization, even with principled approaches, for all applicable
implementation scenarios is truly difficult, if not unattainable. Also, note
that the query manager and optimizer in SQL Server 2000 are far advanced
than some of those of the contemporaries. Yes, there is room to improve and
it gets better as the product evolves.
--
Anith

No comments:

Post a Comment