There exists an extension to PostgreSQL, which lets you use hypothetical indexes.
What are there? That's simple – these are indexes that don't really exist. So what good are they?
Let's see.
There exists an extension to PostgreSQL, which lets you use hypothetical indexes.
What are there? That's simple – these are indexes that don't really exist. So what good are they?
Let's see.
On 26th of March, Heikki Linnakangas committed patch:
Add support for index-only scans in GiST. This adds a new GiST opclass method, 'fetch', which is used to reconstruct the original Datum from the value stored in the index. Also, the 'canreturn' index AM interface function gains a new 'attno' argument. That makes it possible to use index-only scans on a multi-column index where some of the opclasses support index-only scans but some do not. This patch adds support in the box and point opclasses. Other opclasses can added later as follow-on patches (btree_gist would be particularly interesting). Anastasia Lubennikova, with additional fixes and modifications by me.
Continue reading Waiting for 9.5 – Add support for index-only scans in GiST.
On 18th of March, Alvaro Herrera committed patch:
array_offset() and array_offsets() These functions return the offset position or positions of a value in an array. Author: Pavel Stěhule Reviewed by: Jim Nasby
Continue reading Waiting for 9.5 – array_offset() and array_offsets()
On 19th of January, Robert Haas committed patch:
Use abbreviated keys for faster sorting of text datums. This commit extends the SortSupport infrastructure to allow operator classes the option to provide abbreviated representations of Datums; in the case of text, we abbreviate by taking the first few characters of the strxfrm() blob. If the abbreviated comparison is insufficent to resolve the comparison, we fall back on the normal comparator. This can be much faster than the old way of doing sorting if the first few bytes of the string are usually sufficient to resolve the comparison. There is the potential for a performance regression if all of the strings to be sorted are identical for the first 8+ characters and differ only in later positions; therefore, the SortSupport machinery now provides an infrastructure to abort the use of abbreviation if it appears that abbreviation is producing comparatively few distinct keys. HyperLogLog, a streaming cardinality estimator, is included in this commit and used to make that determination for text. Peter Geoghegan, reviewed by me.
Continue reading Waiting for 9.5 – Use abbreviated keys for faster sorting of text datums.
The general knowledge is that numerics are slower than integers/float, but offer precision and ranges that are better.
While I understand what is slow, I don't really know how much slower numerics are. So let's test it.
On 7th of November, Alvaro Herrera committed patch:
BRIN is a new index access method intended to accelerate scans of very large tables, without the maintenance overhead of btrees or other traditional indexes. They work by maintaining "summary" data about block ranges. Bitmap index scans work by reading each summary tuple and comparing them with the query quals; all pages in the range are returned in a lossy TID bitmap if the quals are consistent with the values in the summary tuple, otherwise not. Normal index scans are not supported because these indexes do not store TIDs. As new tuples are added into the index, the summary information is updated (if the block range in which the tuple is added is already summarized) or not; in the latter case, a subsequent pass of VACUUM or the brin_summarize_new_values() function will create the summary information. For data types with natural 1-D sort orders, the summary info consists of the maximum and the minimum values of each indexed column within each page range. This type of operator class we call "Minmax", and we supply a bunch of them for most data types with B-tree opclasses. Since the BRIN code is generalized, other approaches are possible for things such as arrays, geometric types, ranges, etc; even for things such as enum types we could do something different than minmax with better results. In this commit I only include minmax. Catalog version bumped due to new builtin catalog entries. There's more that could be done here, but this is a good step forwards. Loosely based on ideas from Simon Riggs; code mostly by Álvaro Herrera, with contribution by Heikki Linnakangas. Patch reviewed by: Amit Kapila, Heikki Linnakangas, Robert Haas. Testing help from Jeff Janes, Erik Rijkers, Emanuel Calvo. PS: The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007-2013) under grant agreement n° 318633.
Continue reading Waiting for 9.5 – BRIN: Block Range Indexes.
Every now and then someone asks, on irc or mailing lists, some question which shows deep misunerstanding (or lack of understanding) of timestamps – especially the ones with time zones.
Since I got bitten by this before, let me describe what timestamps are, how to work with them, and what are the most common pitfalls that you can encounter.
On 23rd of March, Andrew Dunstan committed patch:
Introduce jsonb, a structured format for storing json. The new format accepts exactly the same data as the json type. However, it is stored in a format that does not require reparsing the orgiginal text in order to process it, making it much more suitable for indexing and other operations. Insignificant whitespace is discarded, and the order of object keys is not preserved. Neither are duplicate object keys kept - the later value for a given key is the only one stored. The new type has all the functions and operators that the json type has, with the exception of the json generation functions (to_json, json_agg etc.) and with identical semantics. In addition, there are operator classes for hash and btree indexing, and two classes for GIN indexing, that have no equivalent in the json type. This feature grew out of previous work by Oleg Bartunov and Teodor Sigaev, which was intended to provide similar facilities to a nested hstore type, but which in the end proved to have some significant compatibility issues. Authors: Oleg Bartunov, Teodor Sigaev, Peter Geoghegan and Andrew Dunstan. Review: Andres Freund
Continue reading Waiting for 9.4 – Introduce jsonb, a structured format for storing json.
Last time I wrote about what explain output shows. Now I'd like to talk more about various types of “nodes" / operations that you might see in explain plans.
On 9th of April, Tom Lane committed patch:
Support indexing of regular-expression searches in contrib/pg_trgm. This works by extracting trigrams from the given regular expression, in generally the same spirit as the previously-existing support for LIKE searches, though of course the details are far more complicated. Currently, only GIN indexes are supported. We might be able to make it work with GiST indexes later. The implementation includes adding API functions to backend/regex/ to provide a view of the search NFA created from a regular expression. These functions are meant to be generic enough to be supportable in a standalone version of the regex library, should that ever happen. Alexander Korotkov, reviewed by Heikki Linnakangas and Tom Lane
One day later Tom Lane added support for the same operations using GiST indexes (original patch was working only with GIN).