> ## Documentation Index
> Fetch the complete documentation index at: https://private-7c7dfe99-mintlify-1d264819.mintlify.site/llms.txt
> Use this file to discover all available pages before exploring further.

> System table containing sampled filter and index selectivity statistics collected from `MergeTree` read pipelines during query execution.

# system.predicate_statistics_log

<Info>
  **Querying in ClickHouse Cloud**

  The data in this system table is held locally on each node in ClickHouse Cloud. Obtaining a complete view of all data, therefore, requires the `clusterAllReplicas` function. See [here](/reference/system-tables/overview#system-tables-in-clickhouse-cloud) for further details.
</Info>

<h2 id="description">
  Description
</h2>

Contains sampled selectivity statistics collected while reading from `MergeTree` tables. The table is populated only when [`predicate_statistics_sample_rate`](/reference/settings/session-settings#predicate_statistics_sample_rate) is greater than `0`.

Use this table to inspect how selective user predicates are in real workloads and how many granules remain after primary-key or skip-index filtering. The data is intended as input for workload-driven index and projection recommendations.

<h2 id="row-shapes">
  Row shapes
</h2>

A single query can produce two kinds of rows in `system.predicate_statistics_log`:

* **Filter rows**, emitted per prewhere/filter step in `MergeTreeSelectProcessor`. They populate `predicate_expression`, `input_rows`, `passed_rows`, `filter_selectivity`, and the whole-predicate columns `total_input_rows`, `total_passed_rows`, `total_selectivity`. Index-related columns are empty.
* **Index rows**, emitted per read step in `ReadFromMergeTree`. They populate the `index_names`, `index_types`, `total_granules`, `granules_after`, and `index_selectivities` arrays, one entry per index stage (primary key, partition, skip indexes). Predicate-related columns are empty.

Filter rows and index rows for the same query share the same `query_id` and `table`, so they can be joined when both are needed.

<h2 id="sampling-and-overhead">
  Sampling and overhead
</h2>

Sampling is controlled by [`predicate_statistics_sample_rate`](/reference/settings/session-settings#predicate_statistics_sample_rate):

* `0` disables collection.
* `1` samples every query.
* `N > 1` samples approximately `1 / N` of queries, hashed by `query_id`.

Lower values produce more data but add CPU work on the read path and more writes to the system log. After enabling the setting, use [`SYSTEM FLUSH LOGS`](/reference/statements/system#flush-logs) if you need rows to appear immediately.

<h2 id="columns">
  Columns
</h2>

* `hostname` ([LowCardinality(String)](/reference/data-types/lowcardinality)) — Hostname of the server executing the query.
* `event_date` ([Date](/reference/data-types/date)) — Event date.
* `event_time` ([DateTime](/reference/data-types/datetime)) — Timestamp when this log entry was written.
* `database` ([LowCardinality(String)](/reference/data-types/lowcardinality)) — Database name of the target table.
* `table` ([LowCardinality(String)](/reference/data-types/lowcardinality)) — Table name of the target table.
* `query_id` ([String](/reference/data-types/string)) — Query ID for linking back to query\_log.
* `predicate_expression` ([String](/reference/data-types/string)) — Whole filter expression handled by this prewhere/filter step (ActionsDAG dump).
* `input_rows` ([UInt64](/reference/data-types/int-uint)) — Rows entering this prewhere/filter step.
* `passed_rows` ([UInt64](/reference/data-types/int-uint)) — Rows surviving this prewhere/filter step.
* `filter_selectivity` ([Float64](/reference/data-types/float)) — Selectivity of this step: passed\_rows / input\_rows.
* `total_input_rows` ([UInt64](/reference/data-types/int-uint)) — Rows entering the first prewhere step (total rows read from granules).
* `total_passed_rows` ([UInt64](/reference/data-types/int-uint)) — Rows surviving all prewhere steps (rows delivered to the query).
* `total_selectivity` ([Float64](/reference/data-types/float)) — Selectivity of the whole predicate: total\_passed\_rows / total\_input\_rows.
* `index_names` ([Array(LowCardinality(String))](/reference/data-types/array)) — Names of indexes applied, e.g. \['PrimaryKey', 'idx\_bf\_status'] (index rows only).
* `index_types` ([Array(LowCardinality(String))](/reference/data-types/array)) — Types of indexes applied: PrimaryKey, Skip, MinMax, Partition (index rows only).
* `total_granules` ([Array(UInt64)](/reference/data-types/array)) — Granules entering each index stage (index rows only).
* `granules_after` ([Array(UInt64)](/reference/data-types/array)) — Granules remaining after each index stage (index rows only).
* `index_selectivities` ([Array(Float64)](/reference/data-types/array)) — Per-index selectivity: granules\_after / total\_granules (index rows only).

<h2 id="example">
  Example
</h2>

```sql theme={null}
SET predicate_statistics_sample_rate = 1;

SELECT *
FROM hits
WHERE URL LIKE '%/product/%' AND EventDate >= today() - 7
FORMAT Null;

SYSTEM FLUSH LOGS predicate_statistics_log;

SELECT
    query_id,
    predicate_expression,
    round(filter_selectivity, 3) AS step_selectivity,
    round(total_selectivity, 3) AS query_selectivity,
    index_names,
    index_selectivities
FROM system.predicate_statistics_log
WHERE table = 'hits'
ORDER BY event_time DESC
LIMIT 10;
```

<h2 id="see-also">
  See also
</h2>

* [`predicate_statistics_sample_rate`](/reference/settings/session-settings#predicate_statistics_sample_rate)
* [system.query\_log](/reference/system-tables/query_log)
