This document is for PGroonga 2.x and 3.x. If you're using PGroonga 1.x, see the PGroonga 1.x document instead.
Auto complete is a useful feature for building an easy-to-use search box. PGroonga provides features for implementing auto complete.
You can implement auto complete by combining the following searches:
Only for Japanese: Prefix RK search for auto complete by readings
Loose full text search
Here is the sample schema:
CREATE TABLE terms (
term text,
readings text[]
);
Auto complete candidate terms are stored in term. Readings of each term are stored in readings. Because the type of readings is text[], multiple readings can be stored for a single term.
Here is the sample index definition:
CREATE INDEX pgroonga_terms_prefix_search ON terms USING pgroonga
(readings pgroonga_text_array_term_search_ops_v2);
CREATE INDEX pgroonga_terms_full_text_search ON terms USING pgroonga
(term)
WITH (tokenizer = 'TokenBigramSplitSymbolAlphaDigit');
The above indexes are required for prefix RK search and full text search.
The TokenBigramSplitSymbolAlphaDigit tokenizer is suitable for loose full text search.
Prefix RK search is a prefix search variant. It supports searching katakana by romaji, hiragana or katakana. It's useful for Japanese.
Here is the sample data for prefix RK search:
INSERT INTO terms (term, readings) VALUES ('牛乳', ARRAY['ギュウニュウ', 'ミルク']);
Note that you need to insert only katakana in readings. This is required to search auto complete candidate terms with prefix RK search.
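The katakana-only requirement could be enforced at the schema level. The following is a minimal sketch, not part of PGroonga itself: the constraint name terms_readings_katakana_only is hypothetical, and it assumes a UTF-8 database and that your readings only ever use the characters ァ to ヶ plus the prolonged sound mark ー.

```sql
-- Hypothetical constraint: reject readings that contain anything other
-- than katakana (ァ-ヶ) or the prolonged sound mark (ー).
-- All array elements are concatenated and checked with one regular expression.
ALTER TABLE terms
  ADD CONSTRAINT terms_readings_katakana_only
  CHECK (array_to_string(readings, '') ~ '^[ァ-ヶー]*$');
```

With such a constraint in place, an INSERT that puts hiragana or romaji into readings would fail instead of silently producing a term that prefix RK search cannot find. Adjust the character class if your readings need other characters.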
Use the &^~ operator against readings for prefix RK search. Here are some examples of prefix RK search:
Prefix RK search with romaji
Prefix RK search with hiragana
Prefix RK search with katakana
You can search "牛乳" as auto complete candidate of "gyu" (romaji) by prefix RK search:
SELECT term FROM terms WHERE readings &^~ 'gyu';
-- term
-- ------
-- 牛乳
-- (1 row)
You can search "牛乳" as auto complete candidate of "ぎゅう" (hiragana) by prefix RK search:
SELECT term FROM terms WHERE readings &^~ 'ぎゅう';
-- term
-- ------
-- 牛乳
-- (1 row)
You can search "牛乳" as auto complete candidate of "ギュウ" (katakana) by prefix RK search:
SELECT term FROM terms WHERE readings &^~ 'ギュウ';
-- term
-- ------
-- 牛乳
-- (1 row)
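If you want to confirm that a prefix RK search actually uses the pgroonga_terms_prefix_search index, you can inspect the query plan. This is a minimal sketch; it assumes the sample table is small enough that the planner might otherwise choose a sequential scan, so sequential scan is disabled for the session.

```sql
-- Assumption: with only a few rows, the planner may prefer a sequential
-- scan, so disable it temporarily to make index usage visible.
SET enable_seqscan = off;

-- Show the plan instead of running the query.
EXPLAIN SELECT term FROM terms WHERE readings &^~ 'gyu';

-- Restore the default planner setting.
RESET enable_seqscan;
```

If the index is used, the plan should mention pgroonga_terms_prefix_search. The exact plan text depends on your PostgreSQL and PGroonga versions.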
There is an advanced usage of readings. If the reading of a synonym is stored in readings, you can also find the term as an auto complete candidate by that synonym:
SELECT term FROM terms WHERE readings &^~ 'mi';
-- term
-- ------
-- 牛乳
-- (1 row)
"ミルク" is a synonym of "牛乳". You can search "牛乳" by "mi" as an auto complete candidate term because "ミルク" is stored in the readings column.
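The same technique applies to any term that has a common synonym. As a sketch, the row below is made-up sample data that is not part of the original sample: "クルマ" is stored as a synonym reading for "自動車".

```sql
-- Hypothetical sample row: the first reading is the term's own reading,
-- the second is the reading of a synonym.
INSERT INTO terms (term, readings) VALUES ('自動車', ARRAY['ジドウシャ', 'クルマ']);

-- Prefix RK search by the synonym's reading in romaji.
SELECT term FROM terms WHERE readings &^~ 'kuru';
```

Because every element of readings is searched, one term can be reached through as many readings as you store for it.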
Use the &@ operator against term for loose full text search. Here is an example:
INSERT INTO terms (term, readings) VALUES ('auto-complete', ARRAY['オートコンプリート']);
SELECT term FROM terms WHERE term &@ 'mpl';
-- term
-- ---------------
-- auto-complete
-- (1 row)
The result contains auto-complete as an auto complete candidate term.
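To combine both searches for a single search box input, one possible approach is to run them together and merge the results. This is a minimal sketch, not the only way to combine them: it uses UNION so that a term matched by both searches appears only once, and the literal 'mi' stands in for whatever the user typed.

```sql
-- Candidates found by readings (prefix RK search)...
SELECT term FROM terms WHERE readings &^~ 'mi'
UNION
-- ...plus candidates found by loose full text search on the term itself.
SELECT term FROM terms WHERE term &@ 'mi';
```

In an application, you would typically pass the input as a bind parameter rather than embedding it in the SQL text, and add a LIMIT to keep the candidate list short.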