Friday, March 9, 2012

Inconsistent Horrible Performance on UPDATE

We're trying to figure out what is causing an incredible inconsistency in
our performance on the same UPDATE query.
We have a table with ~75 million records. It is indexed on a field (idField)
that breaks the table into about 25 chunks of ~2-3 million records each, with
idField values running from 25000 to 25025.
If we run an update against one of these chunks (WHERE idField = 25000), it
runs for upwards of 13 hours.
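For illustration, the update is essentially of this shape (table and column
names here are placeholders, not our real schema):

-- Direct update of one ~2-3 million row chunk; placeholder names.
UPDATE bigTable
SET    someField = someField + 1   -- the real SET list is wider than this
WHERE  idField = 25000;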
If we instead select those 2-3 million records out, run the update on them,
delete them from the original table, and then insert the changed rows back,
the whole thing can be done in under 2 hours.
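The workaround is roughly this (again just a sketch with placeholder names):

-- Copy the chunk out, change it, then swap it back in; placeholder names.
SELECT * INTO #chunk FROM bigTable WHERE idField = 25000;

UPDATE #chunk SET someField = someField + 1;

DELETE FROM bigTable WHERE idField = 25000;

INSERT INTO bigTable SELECT * FROM #chunk;

DROP TABLE #chunk;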
We've been finding SQL Server doesn't perform well with recordsets over 5
million rows. On tables with lots of writing we often set fill factors VERY
low (~25%); wasting 30 or so GB for the sake of speed is often the route we
have to take.
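For example, the kind of index rebuild we use to force a low fill factor
(table and index names are placeholders):

-- Rebuild the chunk index leaving ~75% free space per page; placeholder names.
DBCC DBREINDEX ('bigTable', 'IX_bigTable_idField', 25);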
We suspected page splitting, but found 0 page splits occurring during these
updates; what we did find is that the reads and writes for these transactions
are maxed out.
Also, Enterprise Manager shows the wait type as PAGEIOLATCH_EX and _SH.
What is causing this? All the drives are defragmented.
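This is how we've been watching the waits while the update runs (spid 55 is
just an example; use whatever spid the update session has):

-- Wait and I/O info for the running update's session.
SELECT spid, lastwaittype, waittime, physical_io, cpu
FROM   master..sysprocesses
WHERE  spid = 55;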
Also, with the databases in simple recovery mode, why are the transaction
logs filling up?
Thanks for any information on these problems.
|||The transaction log filling up is normal: the update is rather large, the
complete action has to be logged, and the log space can only be reclaimed at
a checkpoint once the transaction is complete.
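If the log growth itself is the problem, one common approach (just a sketch,
using the old-style SET ROWCOUNT mechanism and placeholder names) is to break
the update into smaller transactions so the log can be cleared between them:

-- Update the chunk in 100,000-row batches; each batch is its own transaction.
SET ROWCOUNT 100000;
WHILE 1 = 1
BEGIN
    UPDATE bigTable
    SET    statusField = 1
    WHERE  idField = 25000
      AND  statusField <> 1;   -- skip rows already updated so the loop ends
    IF @@ROWCOUNT = 0 BREAK;
END
SET ROWCOUNT 0;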
Could you provide the DDL for the table (including the indexes on it), the
DML (statements used during the large update), and a brief server
description: disk, memory, CPU?
I suspect that the logging and the index updates are what is taking so long,
but without those specifics it's tough to give advice.
-Lars
"Kory Yingling" <kory@.removeme-mlsc.com> wrote in message
news:e7p#5$pWDHA.2568@.tk2msftngp13.phx.gbl...
> We're trying to figure out what is causing an incredible inconsistency in
> our query performance on the same update. query
> We have a table with ~75 million records, it is indexed on a field which
> breaks the table into about 25 chunks of ~2 - 3 million records.
> If we run an update on a particular chunk of this table ( WHERE idField => 25000 ) and idField is set up in 2-3 million chunks with idFields from
25000
> to 25025 It runs for upwards of 13 hours.
> If we select out these 2-3 millions records, run an update on them delete
> them from the original table, then insert them back changed it can be done
> in under 2 hours.
> We've been finding SQL server isn't performing well with recrodsets over 5
> million. Fillfactors on tables with lots of writing are often set VERY
low
> ~25%, wasting 30 or so GB for the sake of speed is often the route we must
> go.
> We suspected page splitting, but found 0 page splits occurring during
these
> updates, but found the reading and writing of these transactions to be
maxed
> out.
> Also The Enterprise Manager shows the Wait Type as PAGEIOLATCH_EX and _SH.
> What is causing this? All the drives are defragmented.
> Also with the databases in Simple Recovery mode, why are the transaction
> logs filling up?
> Thanks for any information on these problems.
|||Yes. MS has at least two KB articles (I don't have the numbers offhand)
regarding parallelism problems; you might want to search for them. The
workaround is to disable parallelism for the affected queries. I have also
avoided it by rewriting the query (avoiding outer / self joins involving
large datasets). I've only experienced the problem in SELECTs involving
large resultsets - we don't do any mass updates.
Note - we're on SQL Server 7, but I believe the basic issues apply to SS2K
as well.
"Kory Yingling" <kory@.removeme-mlsc.com> wrote in message
news:%23xBZ9%23qWDHA.2444@.tk2msftngp13.phx.gbl...
> Some of these problems we've traced to when the optimizer decides to
> parallelize the process. We believe we've encountered a bug where the
> parallel processing with CXPacket and PAGEIOLATCH waits cause the process
to
> go into a dead lock of some sort but the disk usage and cpu usage for the
> process continues to be used with the disk usage at about an 8:1 ratio
> slowing the entire system dramatically.
> Has Anyone else run into problems with parallelism flaking out on large
> updates?
|||If you are having trouble with a parallel execution plan, try it with
OPTION (MAXDOP 1) to restrict the plan to 1 CPU.
There are some serious issues with Hyper-Threading and parallel execution
plans; if this applies to you, disable HT.
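i.e., tack the hint onto the end of the statement, something like this (the
update shown is just the placeholder example from earlier in the thread):

-- Same chunk update, but forced to a serial plan.
UPDATE bigTable
SET    someField = someField + 1
WHERE  idField = 25000
OPTION (MAXDOP 1);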
Also, in high-row-count operations the plan may spool to tempdb, so you need
to verify a high-performance disk system for both the main app and tempdb
data files, and a separate disk drive for each heavily used log file.
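To check where the tempdb files actually sit, the standard system procedure
will do:

-- List tempdb's data and log files and their physical locations.
USE tempdb;
EXEC sp_helpfile;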
|||Well, I found OPTION (MAXDOP 1) alleviated the problem, taking 13-hour
queries that never completed down to under a minute! This has been a
nightmare causing a lot of problems, with very little information out there
pointing to this workaround.
We have every database on its own 3-disk RAID 0 stripe, including the system
drive, tempdb, and logs, so I don't think we could break it down any further.
This is on a dual Xeon 2.8GHz box with 2GB of RAM and Hyper-Threading (4
logical CPUs). Is it possible, using Standard Edition SQL Server and the 3GB
switch, to get SQL Server to use 3GB of memory? I think we'll grab 2 more GB
of memory and see if we can get it to use some of it.
All our updates (and most queries) that touch over 500,000 records (which
most do - some are upwards of 30 million or so) now run more as expected
with parallelism forced off. Pretty much a waste of dual CPUs with
Hyper-Threading.
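If we decide we want this everywhere rather than per query, I believe the
server-wide setting does the same thing (this is just the standard
sp_configure option, nothing specific to our setup):

-- Cap parallelism for the whole instance instead of hinting each query.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 1;
RECONFIGURE;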
"joe chang" <jchang6@.yahoo.com> wrote in message
news:01a601c35abb$10276fb0$a501280a@.phx.gbl...
> if you are have trouble with a parallel execution plan,
> try it with OPTION(MAXDOP 1) to restrict the plan to 1 CPU.
> There are some serious issues with Hyper-threading and
> parallel execution plans, if this is applicable, disable
> HT.
> also, in high row count operations, the plan may spool to
> the tempdb, you need to verify a high-performance disk
> system for both the main app & tempdb data files, and
> separate disk drive for each heavy use log file
> >--Original Message--
> >Some of these problems we've traced to when the optimizer
> decides to
> >parallelize the process. We believe we've encountered a
> bug where the
> >parallel processing with CXPacket and PAGEIOLATCH waits
> cause the process to
> >go into a dead lock of some sort but the disk usage and
> cpu usage for the
> >process continues to be used with the disk usage at about
> an 8:1 ratio
> >slowing the entire system dramatically.
> >
> >Has Anyone else run into problems with parallelism
> flaking out on large
> >updates?
> >
> >
> >"Kory Yingling" <kory@.removeme-mlsc.com> wrote in message
> >news:e7p%235$pWDHA.2568@.tk2msftngp13.phx.gbl...
> >> We're trying to figure out what is causing an
> incredible inconsistency in
> >> our query performance on the same update. query
> >>
> >> We have a table with ~75 million records, it is indexed
> on a field which
> >> breaks the table into about 25 chunks of ~2 - 3 million
> records.
> >>
> >> If we run an update on a particular chunk of this table
> ( WHERE idField => >> 25000 ) and idField is set up in 2-3 million chunks
> with idFields from
> >25000
> >> to 25025 It runs for upwards of 13 hours.
> >>
> >> If we select out these 2-3 millions records, run an
> update on them delete
> >> them from the original table, then insert them back
> changed it can be done
> >> in under 2 hours.
> >>
> >> We've been finding SQL server isn't performing well
> with recrodsets over 5
> >> million. Fillfactors on tables with lots of writing
> are often set VERY
> >low
> >> ~25%, wasting 30 or so GB for the sake of speed is
> often the route we must
> >> go.
> >>
> >> We suspected page splitting, but found 0 page splits
> occurring during
> >these
> >> updates, but found the reading and writing of these
> transactions to be
> >maxed
> >> out.
> >>
> >> Also The Enterprise Manager shows the Wait Type as
> PAGEIOLATCH_EX and _SH.
> >> What is causing this? All the drives are defragmented.
> >>
> >> Also with the databases in Simple Recovery mode, why
> are the transaction
> >> logs filling up?
> >>
> >> Thanks for any information on these problems.
> >>
> >>
> >
> >
> >.
> >
