Monday, November 12

Managing useful clusters with repmgr

To create a standby in PostgreSQL is relatively easy, and because of the hot standby feature, that alone is certainly very useful.

But creating a useful cluster that gives us real HA that’s another story. The complexities added by failure detection, STONITH rules and choosing a new master could be overwhelming. But still, it is an important part of maintaining a healthy PostgreSQL environment.

Repmgr 2.0 gives you the opportunity to configure a more streamlined failover process for when you need it.

Repmgr 2.0 was released as a beta about a month ago, and we incorporated many feature requests received from the community. If you requested something of the tool which didn’t make it into this release, keep on us, as we’ve been working hard to insure we a had release out prior to 9.2 launch.

For those of you familiar with our initial version, repmgr 2.0 contains some functionality updates to make the tool more useful.

Because of the auto failover functionality,we no longer have the need for history logging to choose the best candidate to promote as new master saving precious IO bandwidth. The script accomplishes this by auto detecting which node is most up to date and failing over to that one. We also added a node_name in repl_nodes table, so we can identify servers better, because this name is used as application_name in the replication connection to the server, when we see pg_stat_replication we can better identify every connection.

Subsequent to that there is also CLUSTER SHOW command (thanks to our friends in Italy), allowing the user to query and see what nodes are part of our cluster if it’s working (we can connect to it) and its role on the cluster: primary, standby or witness.

The CLUSTER_CLEANUP command (we heard from many users on this) is also important as it negates the task of manually cleaning up repl_monitor table, saving time and accidental deletion of a live node from the table.

My personal favorite segment of the tool update is the autofailover. It’s so flexible that we allow the user to supply a customized script to execute several commands at failover time. For example it could be combined with pgbouncer to make a simple failover setup that did not require root privileges. 

9 Comments

  • It looks like the repmgr DB tables/schemas are not created in this version though ?

    [email protected]:~$ psql -x -d repl -c “SELECT * FROM repmgr_test.repl_status”
    (No rows)
    [email protected]:~$

    It looks like there is NO sign of repmgr_test ? ->

    [email protected]:~$ psql -d repl
    psql (9.1.6)
    Type “help” for help.

    repl=# \d
    List of relations
    Schema | Name | Type | Owner
    ——–+——————+——-+———-
    public | pgbench_accounts | table | postgres
    public | pgbench_branches | table | postgres
    public | pgbench_history | table | postgres
    public | pgbench_tellers | table | postgres
    (4 rows)

    repl=#

    [email protected]:~$ psql -d repl
    psql (9.1.6)
    Type “help” for help.

    repl=# \l
    List of databases
    Name | Owner | Encoding | Collate | Ctype | Access privileges
    ———–+———-+———-+————-+————-+————————-
    postgres | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
    repl | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =Tc/postgres +
    | | | | | postgres=CTc/postgres +
    | | | | | gocardless=CTc/postgres
    template0 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres +
    | | | | | postgres=CTc/postgres
    template1 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres +
    | | | | | postgres=CTc/postgres
    (4 rows)

    [email protected]:~$ repmgr -f /var/lib/postgresql/rempgr/repmgr.conf cluster show
    Role | Connection String
    * master | host=db1repl.common.staging user=repmgr dbname=repl
    standby | host=db2repl.common.staging user=repmgr dbname=repl
    [email protected]:~$

    • The schema exists otherwise you have got an error when executing “SELECT * FROM repmgr_test.repl_status” instead of the “(No rows)” message.
      But monitoring is not longer enabled by default so repl_status is normally empty, if you want the monitoring enabled run repmgrd with the option
      –monitoring-history

  • Hi Jaime,

    You are doing an awesome job with repmgr, so congratulations about it!! I have been testing repmgr during some time but still doing everything manually (all works fine manually); however, I have many doubts about how to implement the autofailover when using repmgr in the 2.0 beta release. Unfortunately the documentation is still poor about it. I have been googling about it without any luck and even I reviewed this video http://www.youtube.com/watch?v=cv03MDdll4U, where almost at the end there are some small tips about how to do that, but still having doubts, due there are references about some configuration parameters but it is not clear where to specify them, in other words, if should be located at postgres.conf or repmrg.conf or pgbouncer.conf file.

    Would you please give me a hand about it by providing me with a more detailed procedure or steps I should follow in order to accomplish this?

    Thanks in advance!
    Best regards,
    Osmel

    PS: you can reply in english or spanish, it doesn’t matter I spoke fluently both languages ;)

  • Jaime, thanks a lot for your quick reply and for providing this information. I had not reviewed that information yet as per when googling regarding this topic I was not able to find that URL. This seems to be what I was looking for in order to continue testing repmgr autofailover capabilities. As an additional question, do you have an example of the promote_command.sh shell script? I noticed that in the referenced Youtube video in my previous comments, there is a reference about this script using pgbouncer; however, is pgbouncer necessary? or it is possible to implement the promotion steps using a different procedure in the promote_command.sh shell script?

    Thanks again for any help about it! ;)

    • i use pgbouncer just as a quick (and maybe dirty) way to do fencing. it is not necesary.
      The only important thing you *must* have in it is the command that causes the promotion, you can use whatever you want if you know how to promote a standby node or use the command “repmgr standby promote -f repmgr.conf”

  • Thank you very much! I just was needing confirmation about it, if it was possible to use a simple promote command using repmgr syntax for doing this. The only detail I noticed is that I need to modify the postgresql.conf file manually before executing this command in order repmgr restart Postgres service using the correct settings on the new master node, so now I have a good idea about how to create my own shell script for performing those steps automatically.

    Thanks again for all your help!, I really appreciated it.

  • ns

    Hi,
    I want have Master-slave postgres replication with auto failover.

    My Environment:
    repmgr version 2.0
    Postgres 9.2.6
    OS : redhat linux 6.5

    I am installed the repmgr(version 2.0) and Postgres 9.2.6. I am configured the repmgr on Master and Standby servers and when i tried to configure it on witness server as
    repmgr -d repmgr -U repmgr -h [Master-IP] -D $PGDATA -f /var/lib/pgsql/repmgr/repmgr.conf –verbose witness create

    i am getting the error
    “Connection to database failed: fe_sendauth: no password supplied”

    and my configuration file on witness server is

    cluster=media_cluster
    node=3
    node_name=node3
    conninfo=’host=[witness server-ip] dbname=repmgr user=repmgr password=repmgr’
    master_response_timeout=60
    reconnect_attempts=6
    reconnect_interval=10
    failover=automatic
    promote_command=’repmgr standby promote -f /var/lib/pgsql/repmgr/repmgr.conf’
    follow_command=’repmgr standby follow -f /var/lib/pgsql/repmgr/repmgr.conf’
    pg_bindir=’/usr/bin’
    loglevel=DEBUG

Leave a Reply

Your email address will not be published. Required fields are marked *