❮ Postgresql Expressions Postgresql Constraints ❯

PostgreSQL Data Types

In this section, we will discuss PostgreSQL's data types, which are the types we set for each field when creating tables.

Benefits of setting data types:

PostgreSQL provides a rich set of data types. Users can create new data types in the database using the CREATE TYPE command. PostgreSQL has many data types, and below we will explain them in detail.

Numeric Types

Numeric types consist of 2-byte, 4-byte, or 8-byte integers, as well as 4-byte or 8-byte floating-point numbers and decimal numbers with optional precision.

The table below lists the available numeric types.

Name	Storage Size	Description	Range
smallint	2 bytes	Small-range integer	-32768 to +32767
integer	4 bytes	Commonly used integer	-2147483648 to +2147483647
bigint	8 bytes	Large-range integer	-9223372036854775808 to +9223372036854775807
decimal	Variable	User-specified precision, exact	Up to 131072 digits before the decimal point; up to 16383 digits after the decimal point
numeric	Variable	User-specified precision, exact	Up to 131072 digits before the decimal point; up to 16383 digits after the decimal point
real	4 bytes	Variable precision, inexact	6 decimal digits precision
double precision	8 bytes	Variable precision, inexact	15 decimal digits precision
smallserial	2 bytes	Auto-incrementing small-range integer	1 to 32767
serial	4 bytes	Auto-incrementing integer	1 to 2147483647
bigserial	8 bytes	Auto-incrementing large-range integer	1 to 9223372036854775807

Monetary Types

The money type stores currency amounts with fixed decimal precision.

Values of type numeric, int, and bigint can be converted to money. It is not recommended to use floating-point numbers for monetary types due to the possibility of rounding errors.

Name	Storage Size	Description	Range
money	8 bytes	Currency amount	-92233720368547758.08 to +92233720368547758.07

Character Types

The table below lists the character types supported by PostgreSQL:

No.	Name & Description
1	character varying(n), varchar(n) Variable length with limit
2	character(n), char(n) Fixed-length, blank-padded
3	text Variable length with no limit

Date/Time Types

The table below lists the date and time types supported by PostgreSQL.

Name	Storage Size	Description	Min Value	Max Value	Resolution
timestamp [ (p) ] [ without time zone ]	8 bytes	Date and time (no time zone)	4713 BC	294276 AD	1 millisecond / 14 digits
timestamp [ (p) ] with time zone	8 bytes	Date and time, with time zone	4713 BC	294276 AD	1 millisecond / 14 digits
date	4 bytes	Date (no time of day)	4713 BC	5874897 AD	1 day
time [ (p) ] [ without time zone ]	8 bytes	Time of day (no date)	00:00:00	24:00:00	1 millisecond / 14 digits
time [ (p) ] with time zone	12 bytes	Time of day (no date), with time zone	00:00:00+1459	24:00:00-1459	1 millisecond / 14 digits
interval [ fields ] [ (p) ]	12 bytes	Time interval	-178000000 years	178000000 years	1 millisecond / 14 digits

Boolean Type

PostgreSQL supports the standard boolean data type.

The boolean type has three states: "true", "false", and a third "unknown" state, represented by NULL.

Name	Storage Format	Description
boolean	1 byte	true/false

Enumerated Types

An enumerated type is a data type consisting of a set of named values called elements, members, enumeral, or enumerators.

Enumerated types in PostgreSQL are similar to enum types in C.

Unlike other types, enumerated types need to be created using the CREATE TYPE command.

CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');

To create a type for the days of the week, as follows:

CREATE TYPE week AS ENUM ('Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun');

Like other types, once created, enumerated types can be used in table and function definitions.

CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');
CREATE TABLE person (
    name text,
    current_mood mood
);
INSERT INTO person VALUES ('Moe', 'happy');
SELECT * FROM person WHERE current_mood = 'happy';
 name | current_mood 
------+--------------
 Moe  | happy
(1 row)

Geometric Types

Geometric data types represent two-dimensional spatial objects.

The table below lists the geometric types supported by PostgreSQL.

The most fundamental type: point. It is the basis for all other types.

Name	Storage Size	Description	Representation
point	16 bytes	Point on a plane	(x,y)
line	32 bytes	Infinite line (not fully implemented)	((x1,y1),(x2,y2))
lseg	32 bytes	Finite line segment	((x1,y1),(x2,y2))
box	32 bytes	Rectangle	((x1,y1),(x2,y2))
path	16+16n bytes	Closed path (similar to polygon)	((x1,y1),...)
path	16+16n bytes	Open path	[(x1,y1),...]
polygon	40+16n bytes	Polygon (similar to closed path)	((x1,y1),...)
circle	24 bytes	Circle	<(x,y),r> (center and radius)

Network Address Types

PostgreSQL offers data types to store IPv4, IPv6, and MAC addresses.

It is better to use these types instead of plain text types to store network addresses, as they provide input error checking and specialized operators and functions.

Name	Storage Size	Description
cidr	7 or 19 bytes	IPv4 or IPv6 network
inet	7 or 19 bytes	IPv4 or IPv6 host and network
macaddr	6 bytes	MAC address

When sorting inet or cidr data types, IPv4 addresses always come before IPv6 addresses, including IPv4 addresses encapsulated or mapped within IPv6 addresses, such as ::10.2.3.4 or ::ffff:10.4.3.2.

Bit String Types

Bit strings are strings of 1's and 0's. They can be used to store and visualize bit masks. We have two SQL bit types: bit(n) and bit varying(n), where n is a positive integer.

Data of bit type must exactly match the length n; trying to store shorter or longer data is an error. Bit varying type data is of variable length up to the maximum n; longer strings will be rejected. Writing bit without a length is equivalent to bit(1), and bit varying without a length means no length limit.

Text Search Types

Full-text search is the process of searching through a collection of natural language documents to locate those that match a query.

PostgreSQL provides two data types to support full-text search:

Number	Name & Description
1	tsvector A tsvector value is a sorted list of distinct lexemes, which are words that have been normalized to merge different variants of the same word.
2	tsquery tsquery stores the searched words and uses Boolean operators &(AND),	(OR), and !(NOT) to combine them, with parentheses to emphasize operator grouping.

UUID Type

The UUID data type is used to store universally unique identifiers (UUIDs) as defined by RFC 4122, ISO/IEC 9834-8:2005, and related standards. (Some systems refer to this data type as a globally unique identifier, or GUID.) This identifier is a 128-bit value generated by an algorithm that ensures it cannot be duplicated by any other identifier generated by the same algorithm within the same module. Therefore, for distributed systems, this identifier provides better uniqueness guarantees than sequences, which can only guarantee uniqueness within a single database.

A UUID is written as a sequence of lower-case hexadecimal digits, separated into groups by hyphens, specifically a group of 8 digits followed by three groups of 4 digits followed by a group of 12 digits, totaling 32 digits representing the 128 bits. An example of a standard UUID is as follows:

a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11

XML Type

The XML data type can be used to store XML data. The advantage of storing XML data in an XML type over a text type is that it can check for well-formedness of the input values and also supports functions for type-safe operations on the data. To use this data type, compilation must be done with configure --with-libxml.

XML can store "documents" as defined by the XML standard, as well as "content" fragments defined by the XMLDecl? content in the XML standard. Roughly, this means that content fragments can have multiple top-level elements or character nodes. The expression xmlvalue IS DOCUMENT can be used to determine if a specific XML value is a complete document or a content fragment.

Creating XML Values

Use the function xmlparse to produce values of type XML from character data:

XMLPARSE (DOCUMENT '<?xml version="1.0"?><book><title>Manual</title><chapter>...</chapter></book>')
XMLPARSE (CONTENT 'abc<foo>bar</foo><bar>foo</bar>')

JSON Type

The JSON data type can be used to store JSON (JavaScript Object Notation) data. Such data can also be stored as text, but the JSON data type ensures that each stored value is a valid JSON value.

Additionally, there are functions to work with JSON data:

Example	Example Result
array_to_json('{{1,5},{99,100}}'::int[])	[[1,5],[99,100]]
row_to_json(row(1,'foo'))	{"f1":1,"f2":"foo"}

Array Type

PostgreSQL allows columns to be defined as variable-length multidimensional arrays.

Array types can be any built-in type, user-defined type, enumerated type, or composite type.

Declaring Arrays

Arrays can be declared when creating a table, as follows:

CREATE TABLE sal_emp (
    name            text,
    pay_by_quarter  integer[],
    schedule        text[][]
);

pay_by_quarter is a one-dimensional array of integers, and schedule is a two-dimensional array of text.

We can also use the "ARRAY" keyword, as shown below:

CREATE TABLE sal_emp (
   name text,
   pay_by_quarter integer ARRAY[4],
   schedule text[][]
);

Inserting Values

Values are inserted using curly braces {}, with elements separated by commas within the braces:

INSERT INTO sal_emp
    VALUES ('Bill',
    '{10000, 10000, 10000, 10000}',
    '{{"meeting", "lunch"}, {"training", "presentation"}}');

INSERT INTO sal_emp
    VALUES ('Carol',
    '{20000, 25000, 25000, 25000}',
    '{{"breakfast", "consulting"}, {"meeting", "lunch"}}');

Accessing Arrays

Now we can run some queries on this table.

First, we demonstrate how to access an element of an array. This query retrieves the names of employees whose salary changed in the second quarter:

SELECT name FROM sal_emp WHERE pay_by_quarter[1] <> pay_by_quarter[2];

 name
-------
 Carol
(1 row)

Array subscripts are written in square brackets.

Modifying Arrays

We can modify the values of arrays:

UPDATE sal_emp SET pay_by_quarter = '{25000,25000,27000,27000}' WHERE name = 'Carol';

UPDATE sal_emp SET pay_by_quarter = '{25000,25000,27000,27000}'
    WHERE name = 'Carol';

Alternatively, you can use the ARRAY constructor syntax:

UPDATE sal_emp SET pay_by_quarter = ARRAY[25000,25000,27000,27000]
    WHERE name = 'Carol';

Retrieving from Arrays

To search for a value within an array, you must check each element of the array.

For example:

SELECT * FROM sal_emp WHERE pay_by_quarter[1] = 10000 OR
                            pay_by_quarter[2] = 10000 OR
                            pay_by_quarter[3] = 10000 OR
                            pay_by_quarter[4] = 10000;

Alternatively, you can use the following statement to find rows where all elements of the array equal 10000:

SELECT * FROM sal_emp WHERE 10000 = ALL (pay_by_quarter);

Or, you can use the generate_subscripts function. For example:

SELECT * FROM
   (SELECT pay_by_quarter,
           generate_subscripts(pay_by_quarter, 1) AS s
      FROM sal_emp) AS foo
 WHERE pay_by_quarter[s] = 10000;

Composite Types

Composite types represent the structure of a row or a record; it is essentially just a list of field names and their data types. PostgreSQL allows composite types to be used like simple data types. For instance, a field in a table can be declared as a composite type.

Declaring Composite Types

Here are two simple examples of defining composite types:

CREATE TYPE complex AS (
    r       double precision,
    i       double precision
);

CREATE TYPE inventory_item AS (
    name            text,
    supplier_id     integer,
    price           numeric
);

The syntax is similar to CREATE TABLE, except that only field names and types can be declared here.

Once the type is defined, you can use it to create a table:

CREATE TABLE on_hand (
    item      inventory_item,
    count     integer
);

INSERT INTO on_hand VALUES (ROW('fuzzy dice', 42, 1.99), 1000);

Inputting Composite Type Values

To write a composite type value as a text constant, enclose the field values in parentheses and separate them with commas. You can enclose any field value in double quotes if it contains commas or parentheses.

The general format for a composite type constant is as follows:

'( val1 , val2 , ... )'

An example is:

'("fuzzy dice",42,1.99)'

Accessing Composite Types

To access a field of a composite type, write a dot and the field name, very similar to selecting a field from a table name. Since this is very similar to selecting a field from a table name, you often need to use parentheses to avoid confusion for the parser. For example, you might want to select some subfields from the on_hand example table, like this:

SELECT item.name FROM on_hand WHERE item.price > 9.99;

This will not work because, according to SQL syntax, item is selected from a table name, not a field name. You must write it like this:

SELECT (item).name FROM on_hand WHERE (item).price > 9.99;

Or if you also need to use the table name (for example, in a multi-table query), write it like this:

SELECT (on_hand.item).name FROM on_hand WHERE (on_hand.item).price > 9.99;

Now the parentheses correctly resolve to a reference to the item field, and you can select subfields from it.

Range Types

Range data types represent a range of values of a certain element type.

For example, a timestamp range might be used to represent the time range for which a meeting room is booked. PostgreSQL's built-in range types include:

int4range — integer range
int8range — bigint range
numrange — numeric range
tsrange — timestamp without time zone range
tstzrange — timestamp with time zone range
daterange — date range

Additionally, you can define your own range types.

CREATE TABLE reservation (room int, during tsrange);
INSERT INTO reservation VALUES
    (1108, '[2010-01-01 14:30, 2010-01-01 15:30)');

-- Contains
SELECT int4range(10, 20) @> 3;

-- Overlaps
SELECT numrange(11.1, 22.2) && numrange(20.0, 30.0);

-- Extract upper bound
SELECT upper(int8range(15, 25));

-- Calculate intersection
SELECT int4range(10, 20) * int4range(15, 25);

-- Check if range is empty
SELECT isempty(numrange(1, 5));

Range values must follow these formats:

(lower bound, upper bound)
(lower bound, upper bound]
[lower bound, upper bound)
[lower bound, upper bound]
empty

Parentheses or brackets indicate whether the lower and upper bounds are inclusive or exclusive. Note that the last format is 'empty', representing an empty range (a range with no values).

-- Includes 3, excludes 7, and includes all points between
SELECT '[3,7)'::int4range;

-- Excludes 3 and 7, but includes all points between
SELECT '(3,7)'::int4range;

-- Includes only the single value 4
SELECT '[4,4]'::int4range;

-- Excludes any points (normalized to 'empty')
SELECT '[4,4)'::int4range;

Object Identifier Types

PostgreSQL uses object identifiers (OIDs) internally as primary keys for various system tables.

Additionally, the system does not add an OID system column to user-created tables (unless the table is created with WITH OIDS or the configuration parameter default_with_oids is enabled). The oid type represents an object identifier. There are several aliases for oid: regproc, regprocedure, regoper, regoperator, regclass, regtype, regconfig, and regdictionary.

Name	Reference	Description	Example
oid	any	Numeric object identifier	564182
regproc	pg_proc	Function name	sum
regprocedure	pg_proc	Function with argument types	sum(int4)
regoper	pg_operator	Operator name	+
regoperator	pg_operator	Operator with argument types	*(integer,integer) or -(NONE,integer)
regclass	pg_class	Relation name	pg_type
regtype	pg_type	Data type name	integer
regconfig	pg_ts_config	Text search configuration	english
regdictionary	pg_ts_dict	Text search dictionary	simple

Pseudo-Types

PostgreSQL's type system includes a number of entries for special purposes, known as pseudo-types. Pseudo-types cannot be used as column data types, but they can be used to declare a function's parameter or result type. Pseudo-types are useful when a function does not simply accept and return a standard SQL data type.

The following table lists all pseudo-types:

Name	Description
any	Indicates a function accepts any input data type.
anyelement	Indicates a function accepts any data type.
anyarray	Indicates a function accepts any array data type.
anynonarray	Indicates a function accepts any non-array data type.
anyenum	Indicates a function accepts any enum data type.
anyrange	Indicates a function accepts any range data type.
cstring	Indicates a function accepts or returns a null-terminated C string.
internal	Indicates a function accepts or returns a server-internal data type.
language_handler	A procedural language call handler declared to return language_handler.
fdw_handler	An external data wrapper declared to return fdw_handler.
record	Identifies a function that returns an undeclared row type.
trigger	A trigger function declared to return trigger.
void	Indicates a function that does not return a value.
opaque	A deprecated type formerly used for all the above purposes.

For more information, refer to: PostgreSQL Data Types

❮ Postgresql Expressions Postgresql Constraints ❯