This document is for PGroonga 2.0.0 or later. If you are using PGroonga 1.X, see the PGroonga 1.X documentation.
&@* operator
Since 1.2.1.
The &~? operator is deprecated since 1.2.2. Use the &@* operator instead.
The &@* operator performs similar search.
column &@* document
column is the column to be searched. Its type is text, text[] or varchar.
document is the document used as the similar search query. Its type is text when column is text or text[], and varchar when column is varchar.
Similar search finds records whose content is similar to document. If document is short, similar search may return records that are less similar.
You need to specify one of the following operator classes to use this operator:
pgroonga_text_full_text_search_ops_v2: Default for text
pgroonga_text_array_full_text_search_ops_v2: Default for text[]
pgroonga_varchar_full_text_search_ops_v2: For varchar
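The varchar operator class is not marked as a default in the list above, so it has to be named in the index definition. The following is a minimal sketch of that case. The table name tagged_memos, its columns, and the index name are hypothetical and exist only for illustration.

```sql
-- Hypothetical table with a varchar column (not part of the sample schema).
CREATE TABLE tagged_memos (
  id integer,
  title varchar(255)
);

-- Name the operator class explicitly after the column.
CREATE INDEX pgroonga_tagged_memos_title_index ON tagged_memos
  USING pgroonga (title pgroonga_varchar_full_text_search_ops_v2);

-- For a varchar column the document is varchar too; the cast makes that explicit.
SELECT * FROM tagged_memos
 WHERE title &@* 'Groonga is a full text search engine'::varchar;
```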
Here are the sample schema and data used in the examples:
CREATE TABLE memos (
  id integer,
  content text
);
CREATE INDEX pgroonga_content_index ON memos
  USING pgroonga (content);
INSERT INTO memos VALUES (1, 'PostgreSQL is a relational database management system.');
INSERT INTO memos VALUES (2, 'Groonga is a fast full text search engine that supports all languages.');
INSERT INTO memos VALUES (3, 'PGroonga is a PostgreSQL extension that uses Groonga as index.');
INSERT INTO memos VALUES (4, 'There is groonga command.');
You can search for records that are similar to the specified document with the &@* operator:
SELECT * FROM memos WHERE content &@* 'Mroonga is a MySQL extension taht uses Groonga';
--  id |                            content
-- ----+----------------------------------------------------------------
--   3 | PGroonga is a PostgreSQL extension that uses Groonga as index.
-- (1 row)
You can't use similar search with a sequential scan. If the query is executed with a sequential scan, you get the following error:
SELECT * FROM memos WHERE content &@* 'Mroonga is a MySQL extension taht uses Groonga';
-- ERROR: pgroonga: operator &@* is available only in index scan
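One way to avoid this error is to steer the planner away from sequential scans for the session and then check the plan. The following is a sketch that uses the standard PostgreSQL planner setting enable_seqscan. The exact plan that EXPLAIN prints depends on your data and PostgreSQL version, so no output is shown.

```sql
-- Discourage sequential scans for the current session.
SET enable_seqscan = off;

-- Inspect the plan: it should use pgroonga_content_index
-- (an index scan or a bitmap index scan) rather than "Seq Scan on memos".
EXPLAIN
SELECT * FROM memos WHERE content &@* 'Mroonga is a MySQL extension taht uses Groonga';

-- Restore the default planner behavior when done.
RESET enable_seqscan;
```

Setting enable_seqscan to off does not forbid sequential scans outright. It only makes the planner treat them as very expensive, so the PGroonga index must exist on the searched column for this to help.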
For similar search against Japanese documents, you should use the TokenMecab tokenizer instead of the default TokenBigram tokenizer:
CREATE INDEX pgroonga_content_index ON memos
  USING pgroonga (content)
  WITH (tokenizer='TokenMecab');
TokenMecab tokenizes target documents into words, which improves the precision of similar search.
See also CREATE INDEX USING pgroonga for how to specify the TokenMecab tokenizer.