PostgreSQLFullTextTrgm

From Request Tracker Wiki
Revision as of 04:59, 6 April 2011 by 95.173.94.255 (talk) (filter by trigrams (index) & like (to remove false matches))
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search

An experimental PostgreSQL full text trigram based setup

This article describes a setup of PostgreSQL full text in a non native mode using trigrams based matching. PostgreSQL full text can't be used for substring searching in standard way. Version 8.4.x supports prefix matching, but it is still not sufficient for me. Inspired by http://kaiv.wordpress.com/2007/12/11/postgresql-substring-search/ and using know-how from PostgreSQLFullText I'm trying a fusion now :).

Read the page PostgreSQLFullText first! Changes are described bellow. You can use my script rt-mysql2pg to prepare database for full text without a tedious work.

  • Patch for SearchBuilder (I have placed the modified version into <rt-prefix>/local/lib/DBIx/SearchBuilder.pm.):
--- SearchBuilder.pm.orig	2011-03-24 16:26:16.000000000 +0100
+++ SearchBuilder.pm	2011-03-30 17:11:18.000000000 +0200
@@ -932,11 +932,33 @@
 
     }
 
-    my $clause = {
+    my @clause = ( {
         field => $QualifiedField,
         op => $args{'OPERATOR'},
         value => $args{'VALUE'},
-    };
+    } );
+
+    # Use FULLTEXT for large Attachments.Content and
+    # ObjectCustomFieldValues.Largecontent in PostgreSQL.
+    if ( $QualifiedField =~ m/^(?: Attachments_\d+\.Content
+		| ObjectCustomFieldValues_\d+\.Largecontent )$/xi) {
+	if ( $args{'OPERATOR'} =~ m/^(?:NOT )?I?LIKE$/
+		&& $args{'VALUE'} =~ m/^'%.*%'$/ ) {
+	    my $not = lc(substr($args{'OPERATOR'}, 0, 3)) eq 'not';
+	    my $value = $args{'VALUE'};
+	    $value  =~ s/^'%(.*)%'$/'$1'/;
+	    $value  = $not ? "(!! text_to_trgm_tsquery($value))" : "text_to_trgm_tsquery($value)";
+	    my $field = $QualifiedField;
+	    $field =~ s/\.(?:Content|Largecontent)$/.trigrams/;
+	    @clause = ( '(',
+		{
+		    field => $field,
+		    op => '@@',
+		    value => $value,
+		},
+		'AND', @clause, ')' );
+	}
+    }
 
     # Juju because this should come _AFTER_ the EA
     my @prefix;
@@ -945,10 +967,10 @@
     }
 
     if ( lc( $args{'ENTRYAGGREGATOR'} || "" ) eq 'none' || !@$restriction ) {
-        @$restriction = (@prefix, $clause);
+        @$restriction = (@prefix, @clause);
     }
     else {
-        push @$restriction, $args{'ENTRYAGGREGATOR'}, @prefix, $clause;
+        push @$restriction, $args{'ENTRYAGGREGATOR'}, @prefix, @clause;
     }
 
     return ( $args{'ALIAS'} );
  • To prepare already converted PostgreSQL database run:
rt-mysql2pg -v --dst-dsn dbi:Pg:dbname=rt3 --fulltext --vacuum

-- zito