Upgrading Postgres/tsearch2 on CentOS 5

July 23rd, 2009

For anyone using the tsearch2 module with postgres version 8.1 included with CentOS 5, upgrading to 8.3 or later can be a little tricky and IMO, the official documentation and a number of blog posts I've read skip over the details. I documented what I did when I first upgraded one of our servers from 8.1 to 8.4, and I've posted an edited version here for anyone that might find it useful. The instructions should also be applicable to users of RHEL and Fedora.

Normally, when you upgrade postgres, you can just dump all databases in the cluster, using pg_dumpall, and then restore them once you've upgraded. Unfortunately, none of the tsearch2 stuff from pre-8.3 versions will work if you do this, so for each of the databases using tsearch2, you need to avoid restoring all that old tsearch2 stuff. A simple solution to this is to dump the databases that use tsearch2 separately, without the clean option, then when restoring, create each of these database manually and install the tsearch2 compatibility module before restoring the data from the dump file.

Doing this avoids restoring the old tsearch2 stuff, because the various tsearch functions, operators etc. will already exist, having been created by the tsearch2 compatibility module, by the time you restore from the dump. It also means you should avoid having to update your old code to work with the full text search features as they are implemented in postgres 8.3 and later.

One thing to note with this process is that it assumes that you have a lot of databases, but only a handful using tsearch2. If you have loads of databases in your cluster using the tsearch2 module, this method might be a little tedious.

I was logged in to a CentOS 5 server as root while performing the upgrade, and we allow root to connect as user postgres using ident authentication. You might have to modify the commands slightly.

  • First, download the latest yum setup rpm for your distro from http://yum.pgsqlrpms.org/reporpms/repoview/index.html
  • Install the package:
    rpm -Uvh pgdg-centos-8.4-1.noarch.rpm
  • Edit base cent os repo file (I use joe because I'm an old git, you probably use something else):
    joe /etc/yum.repos.d/CentOS-Base.repo
    Note: if you're using RedHat or Fedora, you'll need to edit the equivalent repo file (sorry, I'm not sure what it's called).
  • Add a line to exclude postgres related packages to both the base and updates sections of the file. You should end up with something like this:
    [base]
    name=CentOS-$releasever - Base
    mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os
    #baseurl=http://mirror.centos.org/centos/$releasever/os/$basearch/
    gpgcheck=1
    gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-5
    exclude=postgresql*
    
    #released updates
    [updates]
    name=CentOS-$releasever - Updates
    mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=updates
    #baseurl=http://mirror.centos.org/centos/$releasever/updates/$basearch/
    gpgcheck=1
    gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-5
    exclude=postgresql*
    
  • Dump all databases using the clean option:
    pg_dumpall -U postgres -c > /var/backup/postgres/postgres_all.sql
    Note: I'd also recommend making a separate backup that uses inserts rather than copy statements to keep in reserve in case something goes pear shaped
  • Dump each database that uses tsearch2 separately, without using the clean option! (read that again, DO NOT USE THE CLEAN OPTION):
    pg_dump -U postgres my_tsearch_db > /var/backup/postgres/my_tsearch_db.sql
  • Update server to latest version:
    yum update
  • Stop the server and move your old data directory:
    service postgresql stop
    mv /var/lib/pgsql/data /var/lib/pgsql/data_8.1
  • Init a new postgres 8.4 database cluster:
    service postgresql initdb
  • Update the configuration settings as necessary in the new postgresql.conf:
    joe /var/lib/pgsql/postgresql.conf
    ...
    log_directory = '/var/log/postgresql'
    log_line_prefix = '%t %d'
    ...
    
  • Edit the pg_ident.conf file if necessary to allow ident authentication (we allow root to connect as user postgres):
    joe /var/lib/pgsql/data/pg_ident.conf
    ...
    # MAPNAME     IDENT-USERNAME    PG-USERNAME
    localusers      root            postgres
    ...
    
  • Edit the pg_hba.conf file if necessary to allow your users connect:
    joe /var/lib/pgsql/data/pg_hba.conf
    ....
    # TYPE  DATABASE    USER        CIDR-ADDRESS          METHOD
    # "local" is for Unix domain socket connections only
    local   all         all                               ident map=localusers
    # IPv4 local connections:
    host    all         all         127.0.0.1/32          password
    ....
    
    Note that with ident authentication, the format of the options provided to each connection method has changed and now requires name value pairs. For example, if you had an ident map name of localusers in postgres 8.1, the method field in your pg_hba.conf could be set to "ident localusers". With 8.4, this needs to be set to "ident map=localusers".
  • Just to be on the safe side, make sure the permissions on the config files are correct (in our case, they should be owned by user postgres):
    chown postgres:postgres /var/lib/pgsql/data/*.conf
  • Start the database cluster:
    service postgresql start
    If this fails, try running the postgres binary manually to see what's causing the problem. Open postmaster.opts to get the full command to execute, switch to user postgres and execute it.
  • Check that the connection to the server is working (and fix your config if it isn't):
    psql -U postgres
  • Restore the dump of all databases:
    psql -U postgres < /var/backup/postgres/postgres_all.sql
  • Switch to postgres user and drop the databases that use tsearch2:
    su postgres
    dropdb my_tsearch_db
    ...
  • Create empty databases for each database that uses tsearch2:
    createdb -E UTF-8 my_tsearch_db
    ...
    exit
  • Install tsearch2 compatibility module and then restore each database that uses tsearch2:
    psql -U postgres my_tsearch_db < /usr/share/pgsql/contrib/tsearch2.sql
    psql -U postgres my_tsearch_db < /var/backup/postgres/my_tsearch_db.sql

    Note: you will see lots of errors when restoring the databases, this is expected and is due to the fact that the various functions and operators for tsearch already exist.
  • Quickly test that tsearch features are working as expected:
    psql -U postgres my_tsearch_db
    # SELECT * FROM some_table_with_a_tsvector_column WHERE that_tsvector_column @@ to_tsquery('English','test');
    
    Fingers-crossed, you won't see any errors and everything will be working perfectly.

And that's it, upgrade complete, yay!

Sorry, comments are closed for this article.