Using .matches() on integer types

349 Views Asked by At

I have a Rails 4 application backed by Postgres and one of my tables looks like this:

ActiveRecord::Schema.define(version: 20141117132948) do
  # These are extensions that must be enabled in order to support this database
  enable_extension "plpgsql"
  ...
  create_table "boards", force: true do |t|
    ...
    t.string "part_name", null: false
    t.string "details", limit: 1024
    t.integer "sku", limit: 8, null: false
    t.integer "ct_id", limit: 8, null: false, default: 0
    ...
  end
  ...
end

And I want to be able to find all records that match a given search term. For example, if the search term is 123 then I want to be able to find all boards whose part_name,details,sku or ct_id contains 123. For example, I would like a board with sku of '9371239583' to be matched since it contains the sequence '123' in it.

This is the code that I'm using to find records that match.

t = Board.arel_table
q = params[:q] # This is the search term, e.g. '123'

@boards = @boards.where(
  t[:part_name].matches("%#{q}%")
  .or(t[:details].matches("%#{q}%"))
  .or(t[:sku].matches("%#{q}%"))   # This doesn't work because 'sku' is an integer
  .or(t[:ct_id].matches("%#{q}%")) # same here
)

I know that the SQL statement select * from boards where cast(sku as text) like '%123%' works fine but I can't find a way to tie both things together.

Is there a way to solve this problem without making changes to the DB schema?

1

There are 1 best solutions below

2
On BEST ANSWER

One of your problems is that the default index type in postgres is btree, and although it's pretty good at LIKE 'blah%', it's not very good at LIKE '%blah' [1]. To fix this, you should use a gist or gin index with the pg_trgm extension [2]. To install this, you should need to run the following command once on your database:

create extension pg_trgm;

Another is that you're comparing multiple columns at once. One way of handling this would be to index on an expression [3]. The disadvantage of this (and the solution I'll give you in general) is that it will slow down updates significantly.

CREATE INDEX board_concat_idx ON boards USING gin ((part_name || ' ' || details || ' ' || sku || ' ' || ct_id) gin_trgm_ops)

Finally, you should have the following performant indexed query:

SELECT * FROM boards WHERE (part_name || ' ' || details || ' ' || sku || ' ' || ct_id) LIKE '%123%'

Note - I have concatenated these columns with spaces, but you should be sure to do it with a character/sequence that will never show up in the columns or query. This is important, because you don't want part_name to end in 1, and details to start with 23, and get a match.

PS. I'm hoping you're on at least postgres 9.1. Also, I'm pretty curious why you need this.

  1. http://www.postgresql.org/docs/9.3/static/indexes-types.html
  2. http://www.depesz.com/2011/02/19/waiting-for-9-1-faster-likeilike/
  3. http://www.postgresql.org/docs/9.3/static/indexes-expressional.html