I have product-related data with an FTS index on it for full-text search. I know there are multiple ways to produce facet results, but I haven't found any with adequate performance for 15+ groups. Has anyone found an efficient way to store, filter, and facet the data that doesn't require tons of queries and gives group -> hits + count?
My dataset isn't huge (60k records), and I realize there are probably better solutions, but it would still be very nice if everything could be done in one database.
Approach tried: https://bun.uptrace.dev/postgres/faceted-full-text-search-tsvector.html#creating-a-table with 40k items, running:
explain analyze
SELECT
    split_part(word, ':', 1) AS attr,
    split_part(word, ':', 2) AS value,
    ndoc AS count
FROM ts_stat($$ SELECT tsvagg FROM product $$)
ORDER BY word;
Execution time ~3 s:
"Sort (cost=814.41..839.41 rows=10000 width=100) (actual time=2897.014..3195.353 rows=162944 loops=1)"
" Sort Key: word"
" Sort Method: external merge Disk: 27752kB"
" -> Function Scan on ts_stat (cost=0.03..150.03 rows=10000 width=100) (actual time=1246.314..1349.076 rows=162944 loops=1)"
"Planning Time: 0.033 ms"
"Execution Time: 3230.321 ms"
One way to improve the performance of faceted search with FTS is to precompute the facets and store them as part of your data model. This avoids repeatedly parsing and analyzing the FTS index on every query, at the cost of extra storage. The downside is that you need to keep the facets table up to date as your product data changes, but this can be managed with database triggers or batch processes.
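The trigger option could be sketched roughly like this. This is only an illustration: it assumes a hypothetical `product_facet(product_id, attr, value)` side table, and that `product` has columns such as `brand` and `category` from which facets are derived; adapt the attribute list to your actual schema.

```sql
-- Rebuild the precomputed facet rows for a product whenever it changes.
-- Assumed side table: product_facet (product_id int, attr text, value text).
CREATE OR REPLACE FUNCTION sync_product_facets() RETURNS trigger AS $$
BEGIN
  DELETE FROM product_facet WHERE product_id = NEW.id;

  INSERT INTO product_facet (product_id, attr, value)
  SELECT NEW.id, 'brand', NEW.brand          -- one row per facet attribute
  UNION ALL
  SELECT NEW.id, 'category', NEW.category;

  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER product_facets_sync
AFTER INSERT OR UPDATE ON product
FOR EACH ROW EXECUTE FUNCTION sync_product_facets();
```

For bulk imports it is usually cheaper to disable the trigger and rebuild the facet table in one batch `INSERT ... SELECT` afterwards.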
The layout you will be looking for is something like this:
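A minimal sketch of such a layout, one row per (product, attribute, value). Names here are assumptions: `product.id`, a `tsv` tsvector column on `product`, and the `product_facet` table itself are all hypothetical and should be mapped onto your schema.

```sql
-- Precomputed facet table: one row per (product, attribute, value).
CREATE TABLE product_facet (
    product_id int  NOT NULL REFERENCES product (id),
    attr       text NOT NULL,   -- e.g. 'brand', 'color'
    value      text NOT NULL,
    PRIMARY KEY (product_id, attr, value)
);

CREATE INDEX ON product_facet (attr, value);

-- Facet counts for the products matching an FTS query, in a single pass:
SELECT f.attr, f.value, count(*) AS count
FROM product_facet f
JOIN product p ON p.id = f.product_id
WHERE p.tsv @@ websearch_to_tsquery('english', 'phone')
GROUP BY f.attr, f.value
ORDER BY f.attr, f.value;
```

Because the grouping now runs over plain indexed columns instead of `ts_stat()` re-scanning every tsvector, it only touches the rows that match the search, which should scale far better at 60k records.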