I have never seen anything like this, so I am quite baffled:
I have a large (6 million rows) table in a data warehouse. Because of a new user requirement, I ran an update on that table to update two columns changing the value from null to a 'real' value.
I ran the update and it completed in 14 minutes.
Now I query the table searching for a count of the records where the value in one of the columns is null. And I keep getting different answers; the results vary by as much as 100,000 records.
Here are the scripts:
CREATE TABLE TASKTRN (
TASKTRNKEY VARCHAR(10) NOT NULL,
TASKHDRKEY VARCHAR(10) NOT NULL,
TASKDTLKEY VARCHAR(10) NULL,
RECEIPTKEY VARCHAR(20) NULL,
RECEIPTLINE VARCHAR(5) NULL,
TASKTYPE INT NOT NULL
)
GO
ALTER TABLE TASKTRN ADD
CONSTRAINT PK_TASKTRN PRIMARY KEY CLUSTERED (TASKTRNKEY)
GO
CREATE INDEX TASKTRN_TASKHDRKEY ON TASKTRN (TASKHDRKEY)
GO
CREATE INDEX TASKTRN_RECEIPTKEY ON TASKTRN (RECEIPTKEY)
GO
/****** Object: Table [dbo].[TASK_TMP] Script Date: 03/21/2003 12:27:40 ******/
if not exists (select * from dbo.sysobjects where id = object_id(N'[dbo].[TASK_TMP]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
BEGIN
CREATE TABLE [TASK_TMP] (
[TASKTRNKEY] [char] (10) NOT NULL ,
[TASKHDRKEY] [char] (10) NULL ,
[RECEIPTKEY] [char] (10) NULL ,
[RECEIPTLINE] [char] (5) NULL
) ON [PRIMARY]
END
Print 'Created Temp table'
CREATE INDEX TASK_TMP_TASKTRNKEY ON TASK_TMP (TASKTRNKEY)
CREATE INDEX TASK_TMP_TASKHDRKEY ON TASK_TMP (TASKHDRKEY)
Print 'created Indexes'
INSERT INTO TASK_TMP
SELECT TASKTRNKEY, TASKHDRKEY, RECEIPTKEY, RECEIPTLINE
FROM TASKTRN
WHERE TASKTYPE = 1 AND RECEIPTKEY IS NULL
Print 'Insert Records'
UPDATE TASK_TMP
SET RECEIPTKEY = B.RECEIPTKEY, RECEIPTLINE = B.RECEIPTLINE
FROM
TASK_TMP A JOIN
(SELECT TASKHDRKEY, RECEIPTKEY, RECEIPTLINE
FROM TASKTRN
WHERE TASKTYPE = 1 AND RECEIPTKEY IS NOT NULL) B ON
A.TASKHDRKEY = B.TASKHDRKEY
Print 'Updated null values in temp table'
UPDATE TASKTRN
SET RECEIPTKEY = B.RECEIPTKEY, RECEIPTLINE = B.RECEIPTLINE
FROM
TASKTRN A JOIN
TASK_TMP B ON
A.TASKTRNKEY = B.TASKTRNKEY
Print 'Updated null values in permanent table'
This is the SQL that generates the disparate results:
select count(tasktrnkey) From tasktrn_roc where tasktype = 1 and receiptkey is null
Does anyone have any idea what may be going on?
Regards,
Hugh Scottbad statistics after the update, maybe?
after the update (or large inserts), try running sp_updatestats and see if it affects your results.
running DBCC INDEXDEFRAG might be a good idea as well
-isaac|||Yep,
I did that and still came up with some funky results. Now the mystery deepens a little further.
I ran a two different queries:
SELECT COUNT(TASKTRNKEY) WHERE RECEIPTKEY IS NULL AND TASKTYPE = 1
SELECT COUNT(TASKTRNKEY) WHERE RECEIPTKEY IS NOT NULL AND TASKTYPE = 1
In theory the sum of these two queries should add up to:
SELECT COUNT(TASKTRNKEY) WHERE TASKTYPE = 1
Didn't work; the sum of the results from the first two queries is slightly less than twice the actual number of records where TASKTYPE = 1.
Finally, I ran this query:
SELECT
CASE
WHEN RECEIPTKEY IS NULL THEN 'Null'
ELSE 'Not Null'
END as 'RECEIPTKEY',
COUNT(TASKTRNKEY)
FROM
TASKTRN
WHERE
TASKTYPE = 1
GROUP BY
CASE
WHEN RECEIPTKEY IS NULL THEN 'Null'
ELSE 'Not Null'
END
This returned what I expected it to return. But I am baffled to explain why or why the other select statements return such bizarre and conflicting results.
Regards,
Hugh Scott
Originally posted by isaacfain
bad statistics after the update, maybe?
after the update (or large inserts), try running sp_updatestats and see if it affects your results.
running DBCC INDEXDEFRAG might be a good idea as well
-isaac
No comments:
Post a Comment