Waiting for PostgreSQL 11 – Indexes with INCLUDE columns and their support in B-tree

On 7th of April 2018, Teodor Sigaev committed patch:

Indexes with INCLUDE columns and their support in B-tree
 
This patch introduces INCLUDE clause to index definition.  This clause
specifies a list of columns which will be included as a non-key part in
the index.  The INCLUDE columns exist solely to allow more queries to
benefit from index-only scans.  Also, such columns don't need to have
appropriate operator classes.  Expressions are not supported as INCLUDE
columns since they cannot be used in index-only scans.
 
Index access methods supporting INCLUDE are indicated by amcaninclude flag
in IndexAmRoutine.  For now, only B-tree indexes support INCLUDE clause.
 
In B-tree indexes INCLUDE columns are truncated from pivot index tuples
(tuples located in non-leaf pages and high keys).  Therefore, B-tree indexes
now might have variable number of attributes.  This patch also provides
generic facility to support that: pivot tuples contain number of their
attributes in t_tid.ip_posid.  Free 13th bit of t_info is used for indicating
that.  This facility will simplify further support of index suffix truncation.
The changes of above are backward-compatible, pg_upgrade doesn't need special
handling of B-tree indexes for that.
 
Bump catalog version
 
Author: Anastasia Lubennikova with contribition by Alexander Korotkov and me
Reviewed by: Peter Geoghegan, Tomas Vondra, Antonin Houska, Jeff Janes,
             David Rowley, Alexander Korotkov
Discussion: https://www.postgresql.org/message-id/flat/.4010101@postgrespro.ru

Continue reading Waiting for PostgreSQL 11 – Indexes with INCLUDE columns and their support in B-tree

Waiting for PostgreSQL 11 – Support parallel btree index builds.

Support parallel btree index builds.
 
 
To make this work, tuplesort.c and logtape.c must also support
parallelism, so this patch adds that infrastructure and then applies
it to the particular case of parallel btree index builds.  Testing
to date shows that this can often be 2-3x faster than a serial
index build.
 
The model for deciding how many workers to use is fairly primitive
at present, but it's better than not having the feature.  We can
refine it as we get more experience.
 
Peter Geoghegan with some help from Rushabh Lathia.  While Heikki
Linnakangas is not an author of this patch, he wrote other patches
without which this feature would not have been possible, and
therefore the release notes should possibly credit him as an author
of this feature.  Reviewed by Claudio Freire, Heikki Linnakangas,
Thomas Munro, Tels, Amit Kapila, me.
 
Discussion: http://postgr.es/m/CAM3SWZQKM=Pzc=CAHzRixKjp2eO5Q0Jg1SoFQqeXFQ647JiwqQ@mail.gmail.com
Discussion: http://postgr.es/m/CAH2-Wz=AxWqDoVvGU7dq856S4r6sJAj6DBn7VMtigkB33N5eyg@mail.gmail.com

Continue reading Waiting for PostgreSQL 11 – Support parallel btree index builds.

Waiting for 9.6 – Bloom index contrib module

On 1st of April, Teodor Sigaev committed patch:

Bloom index contrib module
 
Module provides new access method. It is actually a simple Bloom filter
implemented as pgsql's index. It could give some benefits on search
with large number of columns.
 
Module is a single way to test generic WAL interface committed earlier.
 
Author: Teodor Sigaev, Alexander Korotkov
Reviewers: Aleksander Alekseev, Michael Paquier, Jim Nasby

Continue reading Waiting for 9.6 – Bloom index contrib module

Waiting for 9.6 – Allow usage of huge maintenance_work_mem for GIN build.

On 2nd of September, Teodor Sigaev committed patch:

Allow usage of huge maintenance_work_mem for GIN build.
 
Currently, in-memory posting list during GIN build process is limited 1GB
because of using repalloc. The patch replaces call of repalloc to repalloc_huge.
It increases limit of posting list from 180 millions
(1GB / sizeof(ItemPointerData)) to 4 billions limited by maxcount/count fields
in GinEntryAccumulator and subsequent calls. Check added.
 
Also, fix accounting of allocatedMemory during build to prevent integer
overflow with maintenance_work_mem > 4GB.
 
Robert Abraham <robert.abraham86@googlemail.com> with additions by me

Continue reading Waiting for 9.6 – Allow usage of huge maintenance_work_mem for GIN build.

Waiting for 9.5 – Add support for index-only scans in GiST.

On 26th of March, Heikki Linnakangas committed patch:

Add support for index-only scans in GiST.
 
This adds a new GiST opclass method, 'fetch', which is used to reconstruct
the original Datum from the value stored in the index. Also, the 'canreturn'
index AM interface function gains a new 'attno' argument. That makes it
possible to use index-only scans on a multi-column index where some of the
opclasses support index-only scans but some do not.
 
This patch adds support in the box and point opclasses. Other opclasses
can added later as follow-on patches (btree_gist would be particularly
interesting).
 
Anastasia Lubennikova, with additional fixes and modifications by me.

Continue reading Waiting for 9.5 – Add support for index-only scans in GiST.

Waiting for 9.5 – array_offset() and array_offsets()

On 18th of March, Alvaro Herrera committed patch:

array_offset() and array_offsets()
 
These functions return the offset position or positions of a value in an
array.
 
Author: Pavel Stěhule
Reviewed by: Jim Nasby

Continue reading Waiting for 9.5 – array_offset() and array_offsets()

Waiting for 9.5 – Use abbreviated keys for faster sorting of text datums.

On 19th of January, Robert Haas committed patch:

Use abbreviated keys for faster sorting of text datums.
 
This commit extends the SortSupport infrastructure to allow operator
classes the option to provide abbreviated representations of Datums;
in the case of text, we abbreviate by taking the first few characters
of the strxfrm() blob.  If the abbreviated comparison is insufficent
to resolve the comparison, we fall back on the normal comparator.
This can be much faster than the old way of doing sorting if the
first few bytes of the string are usually sufficient to resolve the
comparison.
 
There is the potential for a performance regression if all of the
strings to be sorted are identical for the first 8+ characters and
differ only in later positions; therefore, the SortSupport machinery
now provides an infrastructure to abort the use of abbreviation if
it appears that abbreviation is producing comparatively few distinct
keys.  HyperLogLog, a streaming cardinality estimator, is included in
this commit and used to make that determination for text.
 
Peter Geoghegan, reviewed by me.

Continue reading Waiting for 9.5 – Use abbreviated keys for faster sorting of text datums.

Waiting for 9.5 – BRIN: Block Range Indexes.

On 7th of November, Alvaro Herrera committed patch:

BRIN is a new index access method intended to accelerate scans of very
large tables, without the maintenance overhead of btrees or other
traditional indexes.  They work by maintaining "summary" data about
block ranges.  Bitmap index scans work by reading each summary tuple and
comparing them with the query quals; all pages in the range are returned
in a lossy TID bitmap if the quals are consistent with the values in the
summary tuple, otherwise not.  Normal index scans are not supported
because these indexes do not store TIDs.
 
As new tuples are added into the index, the summary information is
updated (if the block range in which the tuple is added is already
summarized) or not; in the latter case, a subsequent pass of VACUUM or
the brin_summarize_new_values() function will create the summary
information.
 
For data types with natural 1-D sort orders, the summary info consists
of the maximum and the minimum values of each indexed column within each
page range.  This type of operator class we call "Minmax", and we
supply a bunch of them for most data types with B-tree opclasses.
Since the BRIN code is generalized, other approaches are possible for
things such as arrays, geometric types, ranges, etc; even for things
such as enum types we could do something different than minmax with
better results.  In this commit I only include minmax.
 
Catalog version bumped due to new builtin catalog entries.
 
There's more that could be done here, but this is a good step forwards.
 
Loosely based on ideas from Simon Riggs; code mostly by Álvaro Herrera,
with contribution by Heikki Linnakangas.
 
Patch reviewed by: Amit Kapila, Heikki Linnakangas, Robert Haas.
Testing help from Jeff Janes, Erik Rijkers, Emanuel Calvo.
 
PS:
  The research leading to these results has received funding from the
  European Union's Seventh Framework Programme (FP7/2007-2013) under
  grant agreement n° 318633.

Continue reading Waiting for 9.5 – BRIN: Block Range Indexes.