This is a document for PGroonga 2.X and 3.X. See PGroonga 1.x document when you're using old PGroonga.
Create data in PostgreSQL, provide it as an API using PostgREST, and then easily create an "API for convenient data search". This is a beginner's guide to PostgREST with PGroonga.
Here, we will use PostgreSQL, which is usually available on most computersπ
createdb api
psql api
To make use of the convenient features of PGroonga, creating indexes that match the features is crucial.
In this example, we will create a memo table and create indexes that enable the following search functionalities for each column:
Treating uppercase and lowercase characters the same.
Creating a text bigram index, it allows words to be matched even when they are used within the middle of another word. For example, using 'ppl' as a search keyword can match 'Apple'.
Now, let's try it out!
CREATE EXTENSION IF NOT EXISTS pgroonga;
CREATE TABLE memos (
id integer,
title text,
content text
);
-- Please don't mind the randomness of the sample text π
INSERT INTO memos VALUES (1, 'PostgreSQL is a relational database management system.','Cool!');
INSERT INTO memos VALUES (2, 'Groonga is a fast full text search engine that supports all languages.','Fantastic!');
INSERT INTO memos VALUES (3, 'PGroonga is a PostgreSQL extension that uses Groonga as index.','Interesting!');
INSERT INTO memos VALUES (4, 'There is groonga command.','Is that so?');
CREATE INDEX pgroonga_title_search_index ON memos USING pgroonga (title)
WITH (
normalizers = 'NormalizerNFKC150',
tokenizer = 'TokenNgram("unify_symbol", false, "unify_alphabet", false, "unify_digit", false)'
);
CREATE INDEX pgroonga_content_search_index ON memos USING pgroonga (content)
WITH (
normalizers = 'NormalizerNFKC150',
tokenizer = 'TokenBigramSplitSymbolAlphaDigit'
);
To create the necessary permissions for PostgREST, you can follow these steps:
CREATE ROLE web_user nologin;
GRANT USAGE ON SCHEMA public TO web_user;
GRANT SELECT ON memos TO web_user;
CREATE ROLE authenticator noinherit login password 'mypassword';
GRANT web_user to authenticator;
vi memo.conf
Inside of the file:
db-uri = "postgres://authenticator:mypassword@localhost:5432/api"
db-schemas = "public"
db-anon-role = "web_user"
postgrest memo.conf
Oh, for the installation method of PostgREST, please refer to https://postgrest.org/en/stable/explanations/install.html π
Open your browser and access the following:
http://localhost:3000/memos
Result:
[
{"id":1,"title":"PostgreSQL is a relational database management system.","content":"Cool!"},
{"id":2,"title":"Groonga is a fast full text search engine that supports all languages.","content":"Fantastic!"},
{"id":3,"title":"PGroonga is a PostgreSQL extension that uses Groonga as index.","content":"Interesting!"},
{"id":4,"title":"There is groonga command.","content":"Is that so?"}
]
What? Is it really this easy to have a working REST API endpoint? It's too convenient! π€―
(However, please note that due to the permissions set up in this case, you can only perform SELECT
operations. So, you can search but not add, modify, or delete data. π)
This is the method to perform a LIKE
search using the standard functionality of PostgreSQL.
(By the way, normally, indexes do not work for PostgreSQL's partial match, but if you create an index with PGroonga, it will work. It's like magic! π)
Open your browser and access the following:
http://localhost:3000/memos?title=like.*data*
[{"id":1,"title":"PostgreSQL is a relational database management system.","content":"Cool!"}]
Open your browser and access the following:
http://localhost:3000/memos?content=like.*tastic*
[{"id":2,"title":"Groonga is a fast full text search engine that supports all languages.","content":"Fantastic!"}]
βοΈ With the standard LIKE search functionality, searching for 'γ·γ§γ¦' in katakana will not match hiragana.
Now, since the operators &@~
required for searching with PGroonga are not directly available in PostgREST, we will create a stored function to enable their usage.
psql api
Execute the following SQL statement:
CREATE FUNCTION find_title(keywords text) RETURNS SETOF memos AS $$
BEGIN
RETURN QUERY SELECT * FROM memos WHERE title &@~ keywords;
END;
$$ LANGUAGE plpgsql;
CREATE FUNCTION find_content(keywords text) RETURNS SETOF memos AS $$
BEGIN
RETURN QUERY SELECT * FROM memos WHERE content &@~ keywords;
END;
$$ LANGUAGE plpgsql;
NOTE: You need to restart PostgREST when create a new functions
Before proceeding to the next section, please restart your PostgREST by pressing Ctrl + C
to stop the currently running PostgREST instance, and then run it again using the following command:
postgrest memo.conf
When calling stored functions in PostgREST, the URL format is /rpc/function_name
.
Open your browser and access the following:
http://localhost:3000/rpc/find_title?keywords=command
The following results will be returned.
[{"id":4,"title":"There is groonga command.","content":"Is that so?"}]
By the way, using a browser to hit the URL is easier than using curl because dealing with encoding can be cumbersome.
$ curl --get --data-urlencode keywords=command http://localhost:3000/rpc/find_title
[{"id":4,"title":"There is groonga command.","content":"Is that so?"}]
Unlike LIKE
search, PGroonga offers case-insensitive searching by default.
Open your browser and access the following:
http://localhost:3000/rpc/find_title?keywords=Groonga
[
{"id":2,"title":"Groonga is a fast full text search engine that supports all languages.","content":"Fantastic!"},
{"id":3,"title":"PGroonga is a PostgreSQL extension that uses Groonga as index.","content":"Interesting!"},
{"id":4,"title":"There is groonga command.","content":"Is that so?"}
]
Open your browser and access the following:
http://localhost:3000/rpc/find_title?keywords=Groonga command
[{"id":4,"title":"There is groonga command.","content":"Is that so?"}]
Open your browser and access the following:
http://localhost:3000/rpc/find_title?keywords=nga OR search
[
{"id":2,"title":"Groonga is a fast full text search engine that supports all languages.","content":"Fantastic!"},
{"id":3,"title":"PGroonga is a PostgreSQL extension that uses Groonga as index.","content":"Interesting!"},
{"id":4,"title":"There is groonga command.","content":"Is that so?"}
]
Open your browser and access the following:
http://localhost:3000/rpc/find_title?keywords=nga -pg
[
{"id":2,"title":"Groonga is a fast full text search engine that supports all languages.","content":"Fantastic!"},
{"id":4,"title":"There is groonga command.","content":"Is that so?"}
]
When you want to allow various additional searches, you can create multiple stored functions.
Here is an example of a stored function that performs a dynamic search by passing the column name:
CREATE OR REPLACE FUNCTION search_col(column_name text, keyword text)
RETURNS SETOF memos
LANGUAGE plpgsql
AS $$
BEGIN
IF column_name IN ('title', 'content') THEN -- Check if the column name is valid
RETURN QUERY EXECUTE format('SELECT * FROM memos WHERE %I &@~ ''%s''', column_name, keyword);
ELSE
RAISE EXCEPTION 'Invalid column name'; -- Return an error if the column name is invalid
END IF;
END;
$$;
CREATE FUNCTION
Stored function to search all columns with a keyword:
CREATE OR REPLACE FUNCTION memo_search(keyword text)
RETURNS SETOF memos
LANGUAGE plpgsql
AS $$
BEGIN
RETURN QUERY EXECUTE format('
SELECT *
FROM memos
WHERE title &@~ $1 OR content &@~ $1
') USING keyword;
END;
$$;
At times, you may want to conduct a search solely using keywords, rather than specifying particular fields. Let's explore how you can accomplish this.
Consider a personal library stored in a database table books
:
CREATE TABLE books (
id INTEGER,
title TEXT,
author TEXT
);
INSERT INTO books VALUES (1, 'Adventures of Sherlock Holmes', 'Arthur Conan Doyle');
INSERT INTO books VALUES (2, 'The Hound of the Baskervilles', 'Arthur Conan Doyle');
INSERT INTO books VALUES (3, 'The Memoirs of Sherlock Holmes', 'Arthur Conan Doyle');
INSERT INTO books VALUES (4, 'The Lion, the Witch, and the Wardrobe', 'C. S. Lewis');
Suppose you want to find books with the author name containing 'Conan Doyle' and titles that include 'Sherlock'. Normally, you would execute the following SQL query:
SELECT * FROM books WHERE author LIKE '%Conan Doyle%' and title LIKE '%Sherlock%';
-- id | title | author
-- ----+--------------------------------+--------------------
-- 1 | Adventures of Sherlock Holmes | Arthur Conan Doyle
-- 3 | The Memoirs of Sherlock Holmes | Arthur Conan Doyle
-- (2 rows)
However, if you're aiming for a Google-like keyword search experience, you would want to achieve the same results with a keyword string such as 'conan doyle sherlock'.
To create this functionality, you will need to design multiple array indexes. Here's how you can proceed:
CREATE INDEX pg_multi_book_index on books USING pgroonga
((ARRAY[title, author]) pgroonga_text_array_full_text_search_ops_v2)
WITH (
normalizers = 'NormalizerNFKC150
(
"unify_kana", true,
"unify_to_romaji", true,
"unify_hyphen_and_prolonged_sound_mark", true
)',
tokenizer = 'TokenNgram("unify_symbol", false, "unify_alphabet", false, "unify_digit", false)'
);
To emulate a Google-like search experience within your database, you can create a stored function that accepts a keyword and returns the relevant records from the books
table:
CREATE OR REPLACE FUNCTION gsearch(keyword text)
RETURNS SETOF books
LANGUAGE plpgsql
AS $$
BEGIN
RETURN QUERY EXECUTE format('
SELECT *
FROM books
WHERE ARRAY[title, author] &@~ $1
') USING keyword;
END;
$$;
You'll also need to grant the appropriate permissions to allow users to access the books table. Use the following SQL command:
GRANT SELECT ON books TO web_user;
Now, you can perform a keyword-based search directly from your web browser. Simply navigate to the following URL:
http://localhost:3000/rpc/gsearch?keyword=conan doyle sherlock
[
{"id":1,"title":"Adventures of Sherlock Holmes","author":"Arthur Conan Doyle"},
{"id":3,"title":"The Memoirs of Sherlock Holmes","author":"Arthur Conan Doyle"}
]
This seamless and user-friendly approach to searching through your data with keywords is both practical and exciting. It's a fantastic way to enhance your search capabilities, don't you think?
PGroonga has features to implement auto complete which is explained in the auto complete how to section.
Here we will explore how to implement this using PostgREST and just a simple HTML with JavaScript.
CREATE TABLE terms (
term text,
readings text[]
);
CREATE INDEX pgroonga_terms_prefix_search ON terms USING pgroonga
(readings pgroonga_text_array_term_search_ops_v2);
CREATE INDEX pgroonga_terms_full_text_search ON terms USING pgroonga
(term)
WITH (tokenizer = 'TokenBigramSplitSymbolAlphaDigit');
INSERT INTO terms (term, readings) VALUES ('PostgreSQL', ARRAY['sql', 'postgres']);
INSERT INTO terms (term, readings) VALUES ('Groonga', ARRAY['elasticsearch','meilisearch']);
INSERT INTO terms (term, readings) VALUES ('PGroonga', ARRAY['postgresql','extension']);
GRANT SELECT ON terms TO web_user;
CREATE OR REPLACE FUNCTION autocomplete(keyword text) RETURNS SETOF text AS $$
DECLARE
result text[];
BEGIN
IF keyword = '' THEN
RETURN QUERY SELECT unnest(result);
ELSE
RETURN QUERY SELECT term FROM terms WHERE readings &@ keyword;
END IF;
END;
$$ LANGUAGE plpgsql;
Create following HTML file:
vi index.html
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>PGroonga Auto Complete Search</title>
<link rel="stylesheet"
href="https://cdn.jsdelivr.net/npm/@tarekraafat/autocomplete.js@10.2.7/dist/css/autoComplete.min.css">
<style>
.center-container {
display: flex;
justify-content: center;
align-items: center;
min-height: 100vh;
}
</style>
</head>
<body>
<div class="center-container">
<form name="search" id="searchForm">
<input type="search" size="60" maxlength="60" name="key" id="autoComplete">
</form>
<script src="https://cdn.jsdelivr.net/npm/@tarekraafat/autocomplete.js@10.2.7/dist/autoComplete.min.js">
</script>
<script type="text/javascript">
const searchForm = document.getElementById('searchForm');
searchForm.addEventListener('submit', async (event) => {
event.preventDefault();
const query = document.getElementById('autoComplete').value;
if (query.length === 0) {
return;
}
const params = new URLSearchParams({keywords: query});
const response = await fetch(`http://localhost:3000/rpc/find_title?${params.toString()}`);
const result = await response.json();
const preElement = document.getElementById('output');
if (result.length > 0) {
const formattedValue = JSON.stringify(result, null, 2);
preElement.textContent = formattedValue;
} else {
preElement.textContent = 'No results found';
}
});
const dataSource = async (query) => {
const params = new URLSearchParams({keyword: query});
const source = await fetch(`http://localhost:3000/rpc/autocomplete?${params.toString()}`);
return await source.json();
};
const autoCompleteJS = new autoComplete({
placeHolder: "Enter a keyword",
data: {
src: dataSource
},
searchEngine: (query, record) => record,
events: {
input: {
selection: (event) => {
const selection = event.detail.selection.value;
autoCompleteJS.input.value = selection;
}
}
},
submit: true
});
</script>
<pre id="output"></pre>
</div>
</body>
</html>
Run your PostgREST service using following command:
postgrest memo.conf
Open index.html
with your browser.
Type something and it will show the suggestions.
When you press Search
button, it will performe keyword search on memos table title data.