Result is showing differences when keys match exactly

kevin · October 10, 2007 9:56PM

On a few of my tables, the comparison comes up with items Missing on one side and Extra items on the other side but when reviewing the key information, the key fields are identical and therefore should have been match... it does not seem to be able to match up the data all the time although it does work for some records on the table.

This occurs on Tables or Views and in keys which are one or many fields so i can't seem to narrow it down to anything in particular.

Anything I'm missing ?

Robert C · October 11, 2007 3:13AM

Hmm... that obviously shouldn't be happening if the keys do actually match. What datatype are the columns that make up your comparison key?

Also, am I right in thinking you're comparing two live databases here, rather than backups?

Thanks,
Robert

Charles · October 19, 2007 6:44PM

I'm experiencing the same thing, though this is the 1st situation where I know for sure it's happening.

I'm comparing two tables with identical schemas, one on the local (2005) machine, and one on a remote (2000) machine.

I'm using a 2-field index as a key, and because the tables involved are large (1.2 billion rows) and the suspect portions are relatively small, I'm using a WHERE clause on the key to limit the area checked. (Yes, this is the same pair of tables I discuss 3 threads after this one)

When I limit the fields to compare to JUST the keys, the comparison results are thus (a only, diff, b only, same):

92599, 0, 0, 566192 (total 658,791)

This is what I expect to see. However, if I allow it to match ALL fields, then these are the numbers I get:

636788, 36, 544189, 21967 (total 1,202,980)

Needless to say, this freaked me out big time until I figured out that it wasn't real.

So, at this point, I can't use the program to sync these two tables; all I can do is use it to generate inserts for the keys, send it to a file, and manipulate it to build a list of keys to feed into a query to bcp out of the one database and into the other.

The behavior is duplicable every time right now; I'd be happy to work with you in the next couple of days to track down what's causing the glitch.

BTW, after reading another post I think I should add --though it shouldn't have anything at all to do with it -- that the 2005 table is partitioned, the 2000 is not. All data being compared is in the same partition.

Chris Spencer · October 19, 2007 7:17PM

Charles,

This definitely sounds like something isn't working as it should. To help us attempt to replicate this it would be useful to know the datatypes of the 2 fields in the key. Presumably the WHERE clause is filtering using both these fields. The more information you can give us the better.

Regards
Chris

Charles · October 19, 2007 8:46PM

Table Definition:

CREATE TABLE &#91;dbo&#93;.&#91;cider_detail&#93;&#40;
	&#91;trsource&#93; &#91;char&#93;&#40;1&#41; COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
	&#91;sourceid&#93; &#91;int&#93; NOT NULL,
	&#91;subscriberid&#93; &#91;smallint&#93; NOT NULL,
	&#91;logon&#93; &#91;varchar&#93;&#40;10&#41; COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
	&#91;clientnum&#93; &#91;varchar&#93;&#40;10&#41; COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
	&#91;accountnum&#93; &#91;varchar&#93;&#40;10&#41; COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
	&#91;accountid&#93; &#91;int&#93; NULL,
	&#91;datetime&#93; &#91;datetime&#93; NULL,
	&#91;Date&#93; &#91;datetime&#93; NULL,
	&#91;Time&#93; &#91;datetime&#93; NULL,
         ...and a bunch more &#40;27 fields total&#41;
&#41; ON &#91;psCider&#93;&#40;&#91;trsource&#93;&#41;

Indices:

CREATE NONCLUSTERED INDEX &#91;IX_cider_detail_sourceid&#93; ON &#91;dbo&#93;.&#91;cider_detail&#93; 
&#40;
	&#91;trsource&#93; ASC,
	&#91;sourceid&#93; ASC
&#41;WITH &#40;PAD_INDEX  = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF&#41; ON &#91;psCider&#93;&#40;&#91;trsource&#93;&#41;

CREATE CLUSTERED INDEX &#91;ix-clust_cider_detail_sub_date&#93; ON &#91;dbo&#93;.&#91;cider_detail&#93; 
&#40;
	&#91;subscriberid&#93; ASC,
	&#91;Date&#93; ASC,
	&#91;Time&#93; ASC
&#41;WITH &#40;PAD_INDEX  = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF&#41; ON &#91;psCider&#93;&#40;&#91;trsource&#93;&#41;

(of course, on the 2000 machine, they're on a filegroup, rather than a partition scheme)

WHERE clause:

trsource='f' and sourceid between
 49428637 and 50087444

trsource/sourceid form a unique key for this group of records (though it is not guaranteed unique for all values of trsource)

That's all I can think of now that might be relevant; please feel free to request any other specific info.

richardjm · October 22, 2007 3:40AM

Just looking at your numbers there I'm not sure the WHERE clause is working as you expect it to on the database that is in 'a'.

566192 (your key only total) = 544189 + 36 + 21967 + (3 which I assume is the database being used)

So the in1 (a) values are the rest of the database.

I'd check how you've got the WHERE clause setup in the tables & views tab of the options. Apart from that have a look at the .sdc file itself which is in XML and there is a set of 'User actions' at the bottom of the file where the WHERE clause should also be - to make sure it's in there correctly.

kevin · October 22, 2007 10:16AM

Sorry guys, we fixed our issue but I neglected to post a resolution.

We discovered a couple of option settings that managed to resolve the issue although I can't say for sure why. If you are in Edit Project and click on options, we checked all the boxes in the Comparison Behaviour section. Specifically Trim Trailing Spaces and Force Binary Collation.

Hope that helps

Charles · October 22, 2007 11:09AM

kevin wrote:

We discovered a couple of option settings that managed to resolve the issue

You managed to duplicate the problem!

You couldn't duplicate it with those options set...

I had started off with none of the comparison options checked. I checked the top two (trim, force), no change. I tried it with the 3rd (show) checked, still no result. I unchecked "show" just so I could force the 4th (use) to be checked, then re-checked it so that all 4 boxes show checked...and still no diffrence. It didn't solve the problem on my system.

Charles · October 25, 2007 2:12PM

Any more ideas? This is a pretty important table to us, and it would be nice to know that the product we're considering buying will do what we need it to.

Thanks!

richardjm · October 26, 2007 4:17AM

I've created your table on a test server and populated it with 100,000 rows of test data covering the WHERE clause ranges and I can't seem to get it to fail in the way you're seeing.

Like I said before it's as if when you include all the extra columns for comparison it stops using the WHERE clause. I was confused by your comment of 'Needless to say, this freaked me out big time until I figured out that it wasn't real.' - what do you mean it wasn't real?

Also can I check you're using SQL Data Compare 6.

If you can send me your .sdc project files to richard.mitchell@red-gate.com
I'll have a look at them and see if there is anything obvious wrong with the user actions.

Not sure what else to suggest at the moment as it doesn't seem to be a schema problem. The only other thing is if you have strange characters in your other columns names. You can run a profiler on the database in both situations to see exactly what SQL SDC6 is sending to request the rows - that may help.

Charles · October 26, 2007 5:17PM

I'm sorry, Richard, I totally missed your post right before Kevin's! My fault.

richardjm wrote:

I was confused by your comment of 'Needless to say, this freaked me out big time until I figured out that it wasn't real.' - what do you mean it wasn't real?

Machine A is replicating to machine B. I knew that there were records on A that never made it over to B (92k, as confirmed later by doing the Key-only compare), but it was rather shocking to see it initially report not only 500k of those, but another 600k that supposedly existed on B but weren't on A! (I also spot-checked those records and verified that they DID indeed exist on A, contrary to what it claimed.)

richardjm wrote:

Also can I check you're using SQL Data Compare 6.

Yes, build #1124. BTW, I like the photo that appears after 30 seconds in the about box. It's a bit classier than one of my products where it brings up a cheap pong game with everyone's heads as the balls. 8)

richardjm wrote:

If you can send me your .sdc project files

Okay. I glanced through it and didn't see anything conclusive. I'll pass it on to you.

richardjm wrote:

You can run a profiler on the database in both situations

That's a good suggestion. I also might play with some other things to simplify it and see where I can get the problem to drop out. For example, I'm using a compound key, but my WHERE clause limits the 1st field of the key to a single value, so there's no real need to use that field in the join. Anyway, I'll need to wait until tonight, because the queries murder my data load (in addition to guaranteeing inconsistencies).

Thanks again!

Charles · October 26, 2007 5:54PM

Charles wrote:

BTW, It's a bit classier than one of my products where it brings up a cheap pong game

Oh, how fun -- it *is* a game! Turns into a little puzzle after a while. I like it!

Charles · October 26, 2007 9:52PM

Problem solved. Yes, I was doing something stupid. :oops:

I had been making changes to the WHERE clause and didn't want to lose track of them, so the code looked something like this:

fld1 = 'x' and 
fld2 between 100 and 300
--fld2 between 50 and 600

Bad decision to use that style of comment -- you never know what's going to be appended to it! Indeed, it was coming out as:

fld1 = 'x' and 
fld2 between 100 and 300
--fld2 between 50 and 600 ORDER BY fld1,fld2

Of course, the ORDER BY was commented out!

Adding an extra blank line after the comment would fix the problem, but if you must include a comment, it's safer to use this style:

fld1 = 'x' and 
fld2 between 100 and 300
/*fld2 between 50 and 600*/

so there's a definite closure.

Strangely, list compares work much better when the two lists are in the same order.... :roll:

Thanks for the help, guys!

richardjm · October 29, 2007 3:42AM

I just saw your email with the details. Like I said I'd never have thought of people putting comments into the WHERE clause however we should certainly cope with it.

The SQL Profiler would have shown that up in it's trace I'm sure of it.

We must replace that game at some point too, there's a nice bug in it where it's only actually solvable 50% of the time :twisted:

Glad it's working for you and I'll place a bug in our system to get it patched for the next release.

Result is showing differences when keys match exactly

Comments

Product Learning

Community Forums

Events & Friends

Simple Talk