Monday, March 25

Having Group By Clauses — elein’s GeneralBits

Elein's PlanetPostgreSQL
Some people go to great lengths to avoid GROUP BY and HAVING clauses in their queries. The error messages are fussy but they are usually right. GROUP BY and HAVING key words are essential for good SQL reporting. The primary reason for GROUP BY is to reduce the number of rows, usually by aggregation. It produces only one row for each matching grouping from the input. This allows you to make sophisticated calculations via ordinary SQL. Fruit Example: We have some fruit: item | source | amt | fresh_until ---------+---------+-----+------------ bananas | Chile | 50 | 2019-05-01 bananas | Bolivia | 25 | 2019-04-15 bananas | Chile | 150 | 2019-07-10 apples | USA-WA | 75 | 2019-07-01 apples | USA-CA | 75 | 2019-08-15 apples | Canada | 80 | 2019-08-01 (more…)

JOIN LATERAL

Elein's PlanetPostgreSQL, PostgreSQL
 LATERAL The primary feature of LATERAL JOIN is to enable access elements of a main query in a subquery which can be very powerful. Several common uses of LATERAL are to: denormalize arrays into parent child tables aggregation across several tables row or action generation. Note, however, that the subquery will execute for each main query row since the values used in the subquery will change. This might make a slower query. USAGE SELECT <target list> FROM <table> JOIN LATERAL (<subquery using table.column>) as foo; Here are three examples of using LATERAL.  Obviously there are many more.: Normalizing In the Normalization example we have a table (denorm) containing ids and an array of other ids. We want to to flatten the arrays, creating (more…)

Webinar: Achieving High Availability with PostgreSQL [Follow Up]

Liaqat's PlanetPostgreSQL
High availability has become a requirement for any application these days. Achieving high availability in PostgreSQL is not just about replicating data across different nodes, but to do so in a way that optimizes the replication process. 2ndQuadrant hosted the "Achieving High Availability with PostgreSQL" webinar to explain High Availability concepts in PostgreSQL, best practices, how to configure application setup, backups, and more. The webinar was presented by Martín Marqués, Deputy Head of Support at 2ndQuadrant. Those who weren’t able to attend the live event can now view the recording here. For any questions or comments regarding PostgreSQL replication, please send an email to [email protected] (more…)

PostgreSQL with passphrase-protected SSL keys under systemd

Eisentraut's PlanetPostgreSQL
PostgreSQL supports SSL, and SSL private keys can be protected by a passphrase. Many people choose not to use passphrases with their SSL keys, and that's perhaps fine. This blog post is about what happens when you do have a passphrase. If you have SSL enabled and a key with a passphrase and you start the server, the server will stop to ask for the passphrase. This happens automatically from within the OpenSSL library. Stopping to ask for a passphrase obviously prevents automatic starts, restarts, and reboots, but we're assuming here that you have made that tradeoff consciously. When you run PostgreSQL under systemd, which is very common nowadays, there is an additional problem. Under systemd, the server process does not have terminal access, and so it cannot ask for any (more…)

Webinar: Banking on Postgres – Financial Application Considerations [Follow up]

Liaqat's PlanetPostgreSQL, PostgreSQL
The demand for PostgreSQL within the financial industry has been rapidly increasing in the recent years; mainly due to reduction in licensing costs, better choice of open source tools, and the robust enterprise features that PostgreSQL provides. 2ndQuadrant hosted the "Banking on Postgres" webinar to discuss attributes of financial databases based on Postgres, configuration processes, hardware needs, availability, backups, and more. The webinar was presented by Shaun Thomas, Principal Consultant at 2ndQuadrant. Those who weren’t able to attend the live event can now view the recording here. For any questions or comments regarding Postgres-BDR, please send an email to [email protected] (more…)

PG Phriday: Terrific Throughput Tracking

Shaun's PlanetPostgreSQL
Postgres has a lot of built-in information functions that most people don't know about. Some of these are critical components to identifying replication lag, but what if we could use them for something else, like throughput? This man's one simple trick can track actual database throughput; DBAs hate him! Everybody Knows Let's take a look at a common query we might use to track replication lag between a Postgres 11 Primary and one or more Replica nodes. SELECT client_addr, pg_wal_lsn_diff( pg_current_wal_lsn(), sent_lsn ) AS sent_lag, pg_wal_lsn_diff( pg_current_wal_lsn(), write_lsn ) AS write_lag, pg_wal_lsn_diff( pg_current_wal_lsn(), flush_lsn ) AS flush_lag (more…)

Maintaining feature branches and submitting patches with Git

Eisentraut's PlanetPostgreSQL
I have developed a particular Git workflow for maintaining PostgreSQL feature branches and submitting patches to the pgsql-hackers mailing list and commit fests. Perhaps it's also useful to others. This workflow is useful for features that take a long time to develop, will be submitted for review several times, and will require a significant amount of changes over time. In simpler cases, it's probably too much overhead. You start as usual with a new feature branch off master git checkout -b reindex-concurrently master and code away. Make as many commits as you like for every change you make. Never rebase this branch. Push it somewhere else regularly for backup. When it's time to submit your feature for the first time, first merge in the current master branch, fix any (more…)

PG Phriday: PgBouncer or Bust

Shaun's PlanetPostgreSQL
What is the role of PgBouncer in a Postgres High Availability stack? What even is PgBouncer at the end of the day? Is it a glorified traffic cop, or an integral component critical to the long-term survival of a Postgres deployment? When we talk about Postgres High Availability, a lot of terms might spring to mind. Replicas, streaming, disaster recovery, fail-over, automation; it's a ceaseless litany of architectural concepts and methodologies. The real question is: how do we get from Here to There? The Importance of Proxies It's no secret that the application stack must communicate with the database. Regardless of how many layers of decoupling, queues, and atomicity of our implementation, data must eventually be stored for reference. But where is that endpoint? Presuming that write (more…)

Sequential UUID Generators on SSD

2ndQuadrant, Tomas' PlanetPostgreSQL
After I shared the sequential UUID benchmarks a couple of weeks ago, one of the points raised in feedback was the choice of the storage space. I've intentionally used a fairly weak storage system (RAID10 on three 7.2k SATA drives) because I wanted to demonstrate the benefits. But a couple of readers suggested using SSDs might significantly reduce the difference between regular and sequential UUIDs due to SSDs handling random I/O much better than rotational storage. My hypothesis was that while using SSDs may reduce the gap, it certainly won't eliminate it entirely because the amplification (both in terms of number of I/O requests and WAL volume) is independent of the storage system. But the only way to verify this it is to repeat the tests, this time on SSDs. So here we go ... (more…)