To get optimum performance with PostgreSQL full text search you need to create a column to store the tsvector values with an index on it. Then you need to fill this column with the tsv values and create a trigger to update the field on INSERT and UPDATE.
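Concretely, that setup looks something like this (a sketch, assuming a documents table with a text column as in the example):

ALTER TABLE documents ADD COLUMN tsv tsvector;
UPDATE documents SET tsv = to_tsvector('english', coalesce(text, ''));
CREATE INDEX tsv_idx ON documents USING gin(tsv);
CREATE TRIGGER tsv_update BEFORE INSERT OR UPDATE ON documents
FOR EACH ROW EXECUTE FUNCTION tsvector_update_trigger(tsv, 'pg_catalog.english', text);

The built-in tsvector_update_trigger keeps the column current on writes, so the index always reflects the text.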
I've been playing around with full text search in Postgres, and I took this sort of approach when starting out, but then realized I could just have the index be an expression.
So instead of (per the example)
CREATE INDEX tsv_idx ON documents USING gin(tsv);
doing something like
CREATE INDEX tsv_idx ON documents USING gin(to_tsvector('english', text));
Is there any reason you wouldn't do this? For multi-language you'd have to detect the language of the text, but there's no reason you couldn't parameterize that too.
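For instance, something like this should work, assuming a hypothetical lang column of type regconfig holding the detected configuration (the two-argument form of to_tsvector is immutable, so it's allowed in an expression index):

CREATE INDEX tsv_idx ON documents USING gin(to_tsvector(lang, text));
-- searches then have to repeat the same expression for the index to be used
SELECT * FROM documents
WHERE to_tsvector(lang, text) @@ to_tsquery('english', 'some & query');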
You can precompute to_tsvector() in parallel ahead of time if you're storing it in a dedicated column. CREATE INDEX runs on a single thread, including the part where it evaluates to_tsvector() for each row. If you ever need to recreate the index, it'll be faster if the tsvector is in a dedicated column.
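For example, something like this (a sketch, assuming an integer id column), with each UPDATE run from its own connection:

UPDATE documents SET tsv = to_tsvector('english', text) WHERE id < 1000000;
UPDATE documents SET tsv = to_tsvector('english', text) WHERE id >= 1000000 AND id < 2000000;
-- ...and so on for the remaining ranges; the index build then only reads the precomputed values
CREATE INDEX tsv_idx ON documents USING gin(tsv);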
I have a table with 30 million documents using pgsql's full text index. Creating the index takes ages, and search performance is generally very poor. The difference between creating the index with the precomputed column versus creating the index with the expression in the index itself (which is how I originally did it) was substantial.
No, there's no real reason why you would not use a functional index. The reason so many tutorials use a dedicated column is simply lack of up-to-date information, I think.
That is the way I do it also. I have used Lucene and Solr a lot in the past, but I now find Postgres text search to be more than adequate for my needs and it does make software development and deployment simpler.
You're not including the title from the JSON column or setting weights in your alternate version. I'm not sure if that can be included in the index or not, but that might be the reasoning.
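For reference, including the title and weights amounts to something like this when filling the stored column (a sketch, assuming the title sits under a 'title' key in a hypothetical payload jsonb column):

UPDATE documents SET tsv =
    setweight(to_tsvector('english', coalesce(payload->>'title', '')), 'A') ||
    setweight(to_tsvector('english', coalesce(text, '')), 'D');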
If you want to preprocess your document, or aggregate different parts of it with different labels and weights, it can be helpful to store the result in a separate column.
Of course you can always just index your preprocessing/aggregating function and call it every time you want to search, but depending on how expensive that is, it might be in your interest to do it upfront and make your searches a bit quicker.
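A rough sketch of both options, assuming a hypothetical aggregating function and the same columns as above (payload jsonb holding the title, text holding the body):

-- hypothetical preprocessing/aggregating function; declared IMMUTABLE so it can be indexed
CREATE FUNCTION document_tsv(title text, body text) RETURNS tsvector
LANGUAGE sql IMMUTABLE AS $$
  SELECT setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
         setweight(to_tsvector('english', coalesce(body, '')), 'D')
$$;

-- option 1: index the expression and call the function at search time
CREATE INDEX tsv_expr_idx ON documents USING gin(document_tsv(payload->>'title', text));

-- option 2: do the work upfront, store the result in the dedicated column, and index that
UPDATE documents SET tsv = document_tsv(payload->>'title', text);
CREATE INDEX tsv_idx ON documents USING gin(tsv);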