Row count based on table + index + statistics histogram

121 Views Asked by At

I have nothing more than the table and the column name.
Based on that, I'm looking for the row count grouped by the given columnname. Yes, I could just run ...

select columnname, count_big(*) from schema.tablename group by columnname

But that means eating up valuable resources. (IO etc.)
I wonder, can't I not just collect what I need from the statistics histogram?
After all, all I need is a narrow index on the given column.
But first of all, I need to collect the name of the narrow index.
For this task I run this query :

select st.name, si.name, si.index_id, sc.column_id
from sys.tables st
    join sys.indexes si on si.[object_id] = st.[object_id]
    join sys.index_columns sic on sic.[index_id] = si.[index_id] and sic.[object_id] = si.[object_id]
    join sys.columns sc on sic.column_id = sc.column_id and sc.[object_id] = sic.[object_id]
where st.[object_id] = object_id('schemaname.tablename', 'U')
and lower(sc.[name]) = 'columnname'

This query gives me every index name, even the composite wide indexes. And that's not quite what I need.
You might say, it doesn't matter, since both index histograms will contain the exact row count for the attribute of the column.
And you'd be right.

Here is the really ugly problem.
Statistics are updated automatically when roughly 20% of the tables content changes. But I want my numbers now. So I'll have to run a update statistics. Running the update statistics on a wide index takes time and eats up resources I meant to save.
So, I need my 1 column narrow index. (Yeay, first real good use of a narrow index! ) Because that takes no time at all to update.

Back to my query.
If I could get my query to show me how many columns the individual index holds, I'd just set the predicate something like "columncount = 1".

Do you know what to add to the statement above, in order to show the columns contained within the index?

But why all this?
I need a report showing how many rows in every staging table are new and how many have been processed by my ETL.
I need this report on a daily basis.
The column I'm always looking for will have a couple distinct values, so the histogram for the narrow index will have max 5 steps.

1

There are 1 best solutions below

0
On

I got it.

select schema_name(st.schema_id), st.name, si.[name], si.index_id, sc.column_id, ccnt.colcnt
from sys.tables st
    join sys.indexes si on si.[object_id] = st.[object_id]
    join sys.index_columns sic on sic.[index_id] = si.[index_id] and sic.[object_id] = si.[object_id]
    join sys.columns sc on sic.column_id = sc.column_id and sc.[object_id] = sic.[object_id]
    join (select isi.index_id, isi.[object_id], count_big(*) as colcnt
            from sys.indexes isi 
                join sys.index_columns isic on isic.[index_id] = isi.[index_id] and isic.[object_id] = isi.[object_id]
                join sys.columns isc on isic.column_id = isc.column_id and isc.[object_id] = isic.[object_id]
            where isic.is_included_column = 0
            and isi.[object_id] = object_id('schemaname.tablename', 'U')
            group by isi.index_id, isi.[object_id]
            ) as ccnt on ccnt.index_id = si.index_id
where sic.is_included_column = 0
and lower(sc.[name]) = 'columname'
and st.[object_id] = ccnt.[object_id]
and ccnt.colcnt = 1