sqlserver2000 date: Incorrect query stats from full text engine

Hi
We have a table that is approx 800,000 rows with about 6 columns of which 3
are full text indexed.
This table is full text queried then the results are used to filter other
tables in a
"select * from maintable where category=.. and primaryid in (select key from
CONTAINSTABLE(fulltexttable, Keywords, 'FORMSOF(INFLECTIONAL, "glass")'))"
Every so often we get a slow perfoming query with these. Putting them into
Management Studio and asking for a query plan shows its the remote scan
consuming all the time. Hovering the mouse over the "Remote Scan" to get the
statistics gives wildly inaccurate estimates of number of rows, and then also
gives impossible number of "Actual Rows Returned" e.g. on this query I have
just done it has claims to have returned 13,350,965 rows - this from a table
that only has approx 800,000 rows.
The indexed columns may have duplicate words but checking the specific
example the searched for word definately appears less that 800,000 times in
the whole table.
Any one got any suggestions?
Thanks
Chris
Just to say, is MS SQL 2005 Standard Edition SP2 64 bit edition
|||The statistics returned my SQL FTS are not accurate as you have discovered.
You should work on tuning other parts of the query to remove spooling, etc.
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
"chrisredburn" <chrisredburn@.discussions.microsoft.com> wrote in message
news:DC403EF1-81FC-42FE-BF36-AA8457E48755@.microsoft.com...
> Hi
> We have a table that is approx 800,000 rows with about 6 columns of which
> 3
> are full text indexed.
> This table is full text queried then the results are used to filter other
> tables in a
> "select * from maintable where category=.. and primaryid in (select key
> from
> CONTAINSTABLE(fulltexttable, Keywords, 'FORMSOF(INFLECTIONAL, "glass")'))"
> Every so often we get a slow perfoming query with these. Putting them
> into
> Management Studio and asking for a query plan shows its the remote scan
> consuming all the time. Hovering the mouse over the "Remote Scan" to get
> the
> statistics gives wildly inaccurate estimates of number of rows, and then
> also
> gives impossible number of "Actual Rows Returned" e.g. on this query I
> have
> just done it has claims to have returned 13,350,965 rows - this from a
> table
> that only has approx 800,000 rows.
> The indexed columns may have duplicate words but checking the specific
> example the searched for word definately appears less that 800,000 times
> in
> the whole table.
> Any one got any suggestions?
> Thanks
> Chris
|||In the execution plan returned for this query, the remote scan cost is 90% of
the query. The plan says that the next stage that the FTS gets passed to is
a filter that takes 1% and reduces the 13,000,000 down to 110 rows (the final
number of rows returned). The next largest part of the query, at 4%, is an
Index Seek. The rest is made up of 1 merge join, 2 inner joins and a few
filters, but as these are only working on a small (approx 600) number of rows
they aren't appearing to take any time up.
Surely if the statistics are wrong, then the query planneris going to start
making bad choices about how to plan the query?
"Hilary Cotter" wrote:

> The statistics returned my SQL FTS are not accurate as you have discovered.
> You should work on tuning other parts of the query to remove spooling, etc.
> --
> Hilary Cotter
> Looking for a SQL Server replication book?
> http://www.nwsu.com/0974973602.html
> Looking for a FAQ on Indexing Services/SQL FTS
> http://www.indexserverfaq.com
>
> "chrisredburn" <chrisredburn@.discussions.microsoft.com> wrote in message
> news:DC403EF1-81FC-42FE-BF36-AA8457E48755@.microsoft.com...
>
>
|||A profiler trace of the query, showing SQL:BatchCompleted and
SQL:FullTextQuery events for the query, has...
SQL:FullTextQuey Duration = 164596
SQL:BatchCompleted Duration = 165274
|||I'd recommend look at a couple of 2 options:
1) use top_n_by_rank parameter if don't have to retrieve all of the results
but just a window (say top 100). The "n" value for the should be in this case
100*avg selectivity of the non-ft part (or actually smaller - the smaller,
the better)
2) If this is a plan issue, consider using OPTIMIZE FOR parameter to hint
the FT search string to CONTAINSTABLE.
3) make sure your index is not too fragmented; reorganization of an index
will improve card estimates. FTS cardinality estimation typically works ok
for single terms but worse for expressions.
Regards,
-Denis.
"chrisredburn" wrote:
[vbcol=seagreen]
> In the execution plan returned for this query, the remote scan cost is 90% of
> the query. The plan says that the next stage that the FTS gets passed to is
> a filter that takes 1% and reduces the 13,000,000 down to 110 rows (the final
> number of rows returned). The next largest part of the query, at 4%, is an
> Index Seek. The rest is made up of 1 merge join, 2 inner joins and a few
> filters, but as these are only working on a small (approx 600) number of rows
> they aren't appearing to take any time up.
> Surely if the statistics are wrong, then the query planneris going to start
> making bad choices about how to plan the query?
>
> "Hilary Cotter" wrote:

Wednesday, March 21, 2012

Incorrect query stats from full text engine

No comments:

Post a Comment

sqlserver2000 date

Blog Archive

About Me