> Full Neon documentation index: https://neon.com/docs/llms.txt

# Full Text Search using tsvector with Neon Postgres

A step-by-step guide describing how to implement full text search with tsvector in Postgres

The `tsvector` type enables you to use full text search on your text content in Postgres. Full text search allows you to search text content in a more flexible way than using `LIKE`. Full text search also supports features like _stemming_, which means searching for the word "run" will match variations like "ran" and "running".

## Steps

- Set up a table with a `tsvector` column
- Execute your first full text search
- Search for multiple words
- Rank the results
- Create a GIN index

## Set up a table with a `tsvector` column

To set up full text search, you need to create a column of type `tsvector` that will enable full text search. You can run the following `CREATE TABLE` statement in the [Neon SQL Editor](https://neon.com/docs/get-started/query-with-neon-sql-editor) or from a client such as [psql](https://neon.com/docs/connect/query-with-psql-editor) that is connected to Neon. This statement will create a table with a column `searchable` of type `tsvector`.

```sql
CREATE TABLE documents (
  id SERIAL PRIMARY KEY,
  title TEXT,
  body TEXT,
  searchable tsvector
);
```

Next, insert two new rows into the `documents` table. The [to_tsvector()] ([https://www.postgresql.org/docs/current/textsearch-controls.html](https://www.postgresql.org/docs/current/textsearch-controls.html)) function takes in a language and the text content to tokenize. In the following example, the text content is the `title` and `body` columns concatenated together.

```sql
INSERT INTO documents (title, body, searchable)
  VALUES (
    'PostgreSQL Full-Text Search',
    'This is an introduction to full-text search in PostgreSQL.',
    to_tsvector('english', 'PostgreSQL Full-Text Search This is an introduction to full-text search in PostgreSQL.')
  );

INSERT INTO documents (title, body, searchable)
  VALUES (
    'My Mashed Potatoes Recipe',
    'These amazing homemade mashed potatoes are perfectly rich and creamy, full of great flavor, easy to make, and always a crowd fave!',
    to_tsvector('english', 'My Mashed Potatoes Recipe These amazing homemade mashed potatoes are perfectly rich and creamy, full of great flavor, easy to make, and always a crowd fave!')
  );
```

Once you have inserted the new rows, try running `SELECT * FROM documents;`. You will see that the data stored in the `searchable` column looks like the following.

```
'full':3,12 'full-text':2,11 'introduct':9 'postgresql':1,16 'search':5,14 'text':4,13
```

Internally, `to_tsvector()` uses a parser to break the text content into tokens for easier searching.

## Execute your first full text search

You can execute a full text search query using a `WHERE` clause with `@@ to_tsquery('english', 'content here')` as shown below. The following query returns the "mashed potatoes" row because, although the word "flavorful" does not appear in that row's text content, the word "flavor" does. And "flavor" matches "flavorful".

```sql
SELECT
    *
  FROM documents
  WHERE searchable @@ to_tsquery('english', 'flavorful');
```

Similarly, the following query returns the "PostgeSQL" row, even though the word "searching" does not appear in that row's text content, but "search" does.

```sql
SELECT
    *
  FROM documents
  WHERE searchable @@ to_tsquery('english', 'searching');
```

The `@@` operator is a special operator which compares the `tsvector` value stored in the `searchable` column with the `tsquery` value provided in the query. The `tsquery` type is different from the `tsvector` type. For example, if you run the following command, Postgres will return the string "searching".

```sql
SELECT to_tsquery('english', 'searching');
```

## Search for multiple words

If you try using `to_tsquery()` with multiple words, like `to_tsquery('english', 'searching text');`, Postgres will throw the following error.

```
ERROR: syntax error in tsquery: "searching text" (SQLSTATE 42601)
```

That's because the input to `to_tsquery()` must be tokens separated by `tsquery` operators like `&`. The correct way to search for "searching" and "text" would be `to_tsquery('english', 'searching & text');`. To make full text search easier to work with, Postgres also has a `phraseto_tsquery()` function that converts text into a `tsquery` with no need for operators. The following query will successfully return the "PostgreSQL" row.

```sql
SELECT
    *
  FROM documents
  WHERE searchable @@ phraseto_tsquery('english', 'searching text');
```

`tsquery` also supports negations. For example, the following query will search for rows whose text content matches "searching" and does **not** match "text".

```sql
SELECT
    *
  FROM documents
  WHERE searchable @@ to_tsquery('english', 'searching & !text');
```

Postgres also supports a `websearch_to_tsquery()` function, which uses an alternative syntax that doesn't require putting operators between all tokens. `websearch_to_tsquery()` supports negations by prefixing a token with `-`. The following query also searches for rows whose text content matches "searching" and does **not** match "text".

```sql
SELECT
    *
  FROM documents
  WHERE searchable @@ websearch_to_tsquery('english', 'searching -text');
```

## Rank the results

Postgres provides two functions for ranking the results, allowing you to sort by which results are the best match. The following statement sorts rows that match "searching text" using the `ts_rank()` function, which counts the number of tokens that match.

```sql
SELECT
    id,
    title,
    ts_rank(searchable, websearch_to_tsquery('english', 'searching text')) AS rank
  FROM documents
  WHERE searchable @@ websearch_to_tsquery('english', 'searching text')
  ORDER BY rank DESC;
```

To see how sorting works in practice, insert two more rows as follows. The first row contains 6 tokens that match "search" and "text", so that row should show up first.

```sql
INSERT INTO documents (title, body, searchable)
  VALUES (
    'PostgreSQL Text Search',
    'A comprehensive, searchable guide in plain text format. Covers full text search in PostgreSQL',
    to_tsvector('english', 'PostgreSQL Text Search A comprehensive, searchable guide in plain text format. Covers full text search in PostgreSQL')
  );
```

Running the `ts_rank()` `SELECT` statement with these two new rows outputs rows in the following order. "PostgreSQL Text Search" appears first because it has the most occurrences of tokens that match "search" and "text".

```sql
2	PostgreSQL Text Search	0.34941113
1	PostgreSQL Full-Text Search	0.3054688
```

Postgres also has a `ts_rank_cd()` function which uses an alternative ranking algorithm based on _cover density_. `ts_rank_cd()` also takes proximity of matching tokens into consideration, so the "PostgreSQL Text Search" row will rank slightly lower with `ts_rank_cd()` because there's more words between the matching tokens.

```sql
SELECT
    id,
    title,
    ts_rank_cd(searchable, websearch_to_tsquery('english', 'searching text')) AS rank
  FROM documents
  WHERE searchable @@ websearch_to_tsquery('english', 'searching text')
  ORDER BY rank DESC;
```

```
1	PostgreSQL Full-Text Search	0.21666667
2	PostgreSQL Text Search	0.21428572
```

## Create a GIN index

[GIN indexes](https://www.postgresql.org/docs/current/gin-intro.html) allow you to index your `tsvector` properties, which can make your full text search queries faster as your data grows. Just be careful, [GIN indexes can slow down your updates](https://pganalyze.com/blog/gin-index). Below is how you can create a GIN index on the `searchable` column.

```sql
CREATE INDEX searchable_idx ON documents USING GIN(searchable);
```

To test out the GIN index, let's first insert 100 copies of the "mashed potatoes" document. Sometimes Postgres decides to skip using indexes and use a sequential scan instead when a query matches most of the table.

```sql
DO $$
BEGIN
  FOR i IN 1..100 LOOP
    INSERT INTO documents (title, body, searchable)
    VALUES (
      'My Mashed Potatoes Recipe',
      'These amazing homemade mashed potatoes are perfectly rich and creamy, full of great flavor, easy to make, and always a crowd fave!',
      to_tsvector('english', 'My Mashed Potatoes Recipe These amazing homemade mashed potatoes are perfectly rich and creamy, full of great flavor, easy to make, and always a crowd fave!')
    );
  END LOOP;
END $$;
```

Next, you can run an `EXPLAIN ANALYZE` query (or just click the **Explain** button in the Neon SQL Editor) to confirm that Postgres is using your GIN index.

```sql
EXPLAIN ANALYZE
SELECT
    id, title
  FROM documents
WHERE searchable @@ to_tsquery('english', 'search');
```

The `EXPLAIN ANALYZE` query should produce output that resembles the following. The `Bitmap Index Scan on searchable_idx` means that Postgres is using a GIN index rather than a sequential scan to answer the query.

```
Bitmap Heap Scan on documents  (cost=8.54..13.10 rows=2 width=29) (actual time=0.021..0.022 rows=2 loops=1)
  Recheck Cond: (searchable @@ '''search'''::tsquery)
  Heap Blocks: exact=1
  ->  Bitmap Index Scan on searchable_idx  (cost=0.00..8.54 rows=2 width=0) (actual time=0.009..0.009 rows=2 loops=1)
        Index Cond: (searchable @@ '''search'''::tsquery)
Planning Time: 0.095 ms
Execution Time: 0.105 ms
```
