Counters table
It would seem - what is easier? We set up a separate plate, in it - an entry with a counter. We need to get a new identifier - read from there to write a new value - do it
UPDATE
...
Do n't do that ! Because tomorrow you will have to solve problems:
- persistent overlapping locks when
UPDATE
see PostgreSQL Antipatterns: fighting hordes of "dead" - gradual degradation of the speed of access to the data of the counter table
see PostgreSQL Antipatterns: updating a large table under load - ... and the need to clean it up with active transactions that will bother you,
see DBA: when VACUUM passes, we clean the table manually
SEQUENCE object
For such tasks, PostgreSQL provides a separate entity -
SEQUENCE
. It is nontransactional, that is, it does not cause locks , but two "parallel" transactions will certainly receive different values .
To get the next ID from a sequence, just use the function
nextval
:
SELECT nextval('seq_name'::regclass);
Sometimes you need to get several IDs at once - for streaming recording via COPY, for example. Using for this
setval(currval() + N)
is fundamentally wrong ! For the simple reason that between calls to the "inner" ( currval
) and "outer" ( setval
) functions, a concurrent transaction could change the current value of the sequence. The correct way is to call the nextval
required number of times:
SELECT
nextval('seq_name'::regclass)
FROM
generate_series(1, N);
Serial pseudo
It is not very convenient to work with sequences in "manual" mode. But our typical task is to ensure the insertion of a new record with a new sequence-ID! Especially for this purpose, PostgreSQL is invented
serial
, which, when generating a table, "expands" into something like .
There is no need to remember the name of the automatically generated sequence linked to the field, there is a function for this . The same function can be used in your own substitutions - for example, if there is a need to make a common sequence for several tables at once.
However, since working with the sequence is nontransactional, if the identifier from it was received by a rollbacked transaction, then the sequence of IDs in the saved table records will be "leaky"id integer NOT NULL DEFAULT nextval('tbl_id_seq')
pg_get_serial_sequence(table_name, column_name)
DEFAULT
...
GENERATED columns
Starting with PostgreSQL 10 , it is possible to declare an identity column (
GENERATED AS IDENTITY
) that conforms to the SQL: 2003 standard. In the variant, the GENERATED BY DEFAULT
behavior is equivalent serial
, but with GENERATED ALWAYS
everything more interesting:
CREATE TABLE tbl(
id
integer
GENERATED ALWAYS AS IDENTITY
);
INSERT INTO tbl(id) VALUES(DEFAULT);
-- : 10 .
INSERT INTO tbl(id) VALUES(1);
-- ERROR: cannot insert into column "id"
-- DETAIL: Column "id" is an identity column defined as GENERATED ALWAYS.
-- HINT: Use OVERRIDING SYSTEM VALUE to override.
Yes, in order to insert a specific value "across" such a column, you will have to make extra efforts with
OVERRIDING SYSTEM VALUE
:
INSERT INTO tbl(id) OVERRIDING SYSTEM VALUE VALUES(1);
-- : 11 .
Note that now we have two identical values โโin the table
id = 1
- that is, GENERATED does not impose additional UNIQUE conditions and indices , but is purely a declaration, as well as serial
.
In general, on modern PostgreSQL versions the use of serial is deprecated, with the preferred replacement for
GENERATED
. Except, perhaps, the situation of support for cross-version applications working with PGs below 10.
Generated UUID
Everything is fine as long as you work within one database instance. But when there are several of them, there is no adequate way to synchronize the sequences (however, this does not prevent you from โinadequatelyโ synchronizing them , if you really want to). This is where the type
UUID
and functions for generating values for it come to the rescue . I usually use it uuid_generate_v4()
as the most "casual" one.
Hidden system fields
tableoid / ctid
Sometimes, when fetching records from a table, you need to somehow address a specific "physical" record, or find out from which particular section a particular record was obtained when accessing the "parent" table using inheritance .
In this case, the hidden system fields present in each record will help us :
tableoid
stores theoid
-id of the table - that is,tableoid::regclass::text
gives the name of a particular table-sectionctid
- "physical" address of the record in the format(<>,<>)
For example,
ctid
it can be used for operations with a table without a primary key , but tableoid
for the implementation of certain kinds of foreign keys.
oid
Up to 11 PostgreSQL was possible to declare when you create the attribute table
WITH OIDS
:
CREATE TABLE tbl(id serial) WITH OIDS;
Each entry in this table gets an additional hidden field
oid
with a globally unique value within the database - as it was organized for system tables like pg_class
, pg_namespace
...
When you insert a record in a table generated value is returned immediately to the result of the query:
INSERT INTO tbl(id) VALUES(DEFAULT);
: OID 16400 11 .
Such a field is invisible for a "normal" table query:
SELECT * FROM tbl;
id
--
1
It, like other system fields, must be requested explicitly:
SELECT tableoid, ctid, xmin, xmax, cmin, cmax, oid, * FROM tbl;
tableoid | ctid | xmin | xmax | cmin | cmax | oid | id
---------------------------------------------------------
16596 | (0,1) | 572 | 0 | 0 | 0 | 16400 | 1
True, the value
oid
is only 32 bits , so it is very easy to get an overflow, after which oid
it will not even be possible to create any table (it needs a new one !). Therefore, since PostgreSQL 12, it is WITH OIDS
no longer supported .
"Fair" time clock_timestamp
Sometimes, during a long execution of a query or procedure, you want to bind the "current" time to the record. Failure awaits anyone who tries to use the function to do this
now()
- it will return the same value throughout the entire transaction .
To get the "right now" time, there is a function
clock_timestamp()
(and another bunch of its brothers). The difference in behavior of these functions can be seen on the example of a simple query:
SELECT
now()
, clock_timestamp()
FROM
generate_series(1, 4);
now | clock_timestamp
-------------------------------+-------------------------------
2020-08-19 16:26:05.626629+03 | 2020-08-19 16:26:05.626758+03
2020-08-19 16:26:05.626629+03 | 2020-08-19 16:26:05.626763+03
2020-08-19 16:26:05.626629+03 | 2020-08-19 16:26:05.626764+03
2020-08-19 16:26:05.626629+03 | 2020-08-19 16:26:05.626765+03