Identifying Watermarks problems on Paraccel (Actian Matrix)
Posted by Diego em Julho 8, 2014
Watermarks control the next available value on an identity column. Every time you insert a value on a table that contains an identity, paraccel queries the identity highwater table to check which the next available value on a particular slice is. It’s important to notice that once that value has been retrieved, paraccel, when inserting on the destination table, does not check if it clashes with an existing one (remember that constraints are not enforced).
That being said, if the watermarks get “out of sync” with the actual table, you may end up with duplicate value on an identity field, which is never a good thing.
Now, I won’t get into the details of how it can happen (believe me, it can), instead, I’ll focus on how to identify these clashes.
So, the watermarks are controlled per table\col\slice and to check it, first you will need to get the table id, which is hidden on the pg_class table (you have to specify the oid field, otherwise it will not show with “*”):
select oid, * from pg_class where relname = ‘<table>’ –1006571
You can also check the max value of you key:
select max(key) from <table> –160 016
Now, let’s check the highwater values by querying the stv_identity_highwater using the tableid:
select * FROM stv_identity_highwater where tbl = 1006571
That’s nice, it tells us for example that the next row inserted on slice 0 will get a value above 159871 (next next value is dependent on the watermark and the number of slices on the cluster), but we can’t really identity any problems just using this table.
There is a second table we can use, called stv_blocklist, which contains (among other things – of course) the maximum key value contained on each slice (It also contains the “num_values” (number of rows) on that slice so I guess here is the explanation why counts are extremely fast. Just a guess, I may be completely wrong here).
So if we query this table and compare its data with the previous one, we can easily identify problems.
where tbl = 1006571 and col=0 and slice in (0,5,9) order by col, slice
In this example, slice 0 is OK because the watermark is 159871 and the max value on the slice is 159683, but slice 5 has a problem because the watermark is 159840 and the max value on the slice is 159868.
We can also write a query that does this verification for us, and it would look something like this:
select * from (
select h.tbl as tbl_id,h.col as col,h.slice as slice
, max(h.highwater) as highwater, max(b.maxvalue) as maxvalue
, case when max(b.maxvalue) > max(h.highwater) then 1 else 0 end AS invalid
from stv_identity_highwater h
left join stv_blocklist b on
h.tbl = b.tbl
and h.col = b.col
and h.slice = b.slice
where h.tbl = 1006571
group by h.tbl,h.col,h.slice
) X where invalid = 1
Once the invalid water marks have been identified, a few steps must be taken to fix them, which I’ll focus on another post.