17 May, 2013

Database redundancy for your vCenter database(s)

Without a doubt, the most important database in a vSphere environment is the vCenter database. VMware therefore provides detailed instructions on how to set up and configure it, with a guide for every supported database type. Recently I ran into a situation that made me believe VMware "forgot" some details in this configuration guide, at least when your vCenter database runs on Oracle.
A customer had chosen to put their vCenter database on Oracle, as this was the database platform they knew best. They also set it up to be resilient, with an active and a standby database placed on two different database servers in separate datacenters. To me it looked like a very solid solution. On the vCenter side they modified TNSNames.ora so that it included both database server addresses as well as the parameters for connect-time failover and load balancing.
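To give an idea of what such an entry looks like, here is a minimal Python sketch that prints a TNSNames.ora entry with two addresses. The alias, host names, service name and port are hypothetical placeholders, not the customer's values, and I am assuming load balancing is switched off, which is what you would normally want with an active/standby pair.

```python
def on_off(flag):
    """Oracle Net accepts on/off for these boolean parameters."""
    return "on" if flag else "off"


def tns_entry(alias, hosts, service_name, port=1521,
              failover=True, load_balance=False):
    """Build a TNSNames.ora entry that lists several database hosts.

    With FAILOVER on, the client tries the next ADDRESS when a
    connection attempt to the current one fails (connect-time failover).
    """
    addresses = "\n".join(
        f"      (ADDRESS = (PROTOCOL = TCP)(HOST = {host})(PORT = {port}))"
        for host in hosts
    )
    return (
        f"{alias} =\n"
        f"  (DESCRIPTION =\n"
        f"    (ADDRESS_LIST =\n"
        f"      (LOAD_BALANCE = {on_off(load_balance)})\n"
        f"      (FAILOVER = {on_off(failover)})\n"
        f"{addresses}\n"
        f"    )\n"
        f"    (CONNECT_DATA = (SERVICE_NAME = {service_name}))\n"
        f"  )\n"
    )


# Hypothetical names, for illustration only.
entry = tns_entry("VCDB",
                  ["oradb1.example.local", "oradb2.example.local"],
                  "vcdb")
print(entry)
```

The order of the addresses matters when load balancing is off: the first address is tried first, so the active server should be listed first.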
By doing this they made sure that vCenter could (almost) always connect to one of the two database servers: it would simply fail over when the connection attempt timed out. In this setup the failover would not be quick enough to keep vCenter up and running, so vCenter would need a reboot (or at least a restart of its services) to reconnect. This would not affect the running VMs at all.
For maintenance on the database servers, we had to switch from the active server to the standby server. As this was a planned action, we could first gracefully stop the vCenter services and then switch to the standby database server. After the switch all vCenter services were started again and vCenter came up and ran like it is supposed to.
One issue that occurred during this database server switch was that VMware Orchestrator, which was installed on a separate server, stopped working and logged all kinds of database-related errors. With a quick look at the database configuration of Orchestrator I remembered that it could not cope with multiple database server addresses, and that it was set to connect to the database server that had now become the standby. Changing the database server and starting the Orchestrator services again solved this problem.
At least, until the next day, when I took a look at the vCenter Operations dashboard and found that the health of vCenter was 0.

When I looked in more detail at what caused this, I found the alert "VMware vCenter Storage Monitoring Service - Service initialization failed". The only thing I found that could link this alert to the database failover was the timestamp: it was recorded right at the time the failover had happened.


Not really knowing where to start investigating on the vCenter server, I first searched the VMware KB, and the first article that came up described the exact same error message. Reading KB 2016472, I quickly found confirmation that this issue was related to the database failover, although the article refers to vCenter 4.x and 5.0 with a SQL database rather than vCenter 5.1 with an Oracle database.
It appears that the vCenter Storage Monitoring Service does not use TNSNames.ora for its database connection. It has its own configuration / connection file called vcdb.properties, and this file contained only the first of the two database server addresses.
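A quick way to spot this kind of mismatch before a failover is to compare the hosts you expect with the hosts a properties file actually points to. The Python sketch below is only an illustration: I am assuming the file holds a `url` key with an Oracle JDBC connection string, and the sample contents and host names are made up, so check the KB article and your own vcdb.properties for the real layout.

```python
import re


def jdbc_hosts(properties_text):
    """Return the host names found in the 'url' line of a properties-style text."""
    for line in properties_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() != "url":
            continue
        # Descriptor-style URL: one or more (HOST = name) entries.
        hosts = re.findall(r"\(HOST\s*=\s*([^)\s]+)\s*\)", value, flags=re.I)
        if hosts:
            return hosts
        # Short form: ...@host:port... or ...@//host:port/service
        match = re.search(r"@(?://)?([^:/\s]+)", value)
        return [match.group(1)] if match else []
    return []


# Hypothetical file contents, not copied from a real vCenter server.
sample = """\
# sample only
driver = oracle.jdbc.OracleDriver
url = jdbc:oracle:thin:@oradb1.example.local:1521/vcdb
"""

expected = ["oradb1.example.local", "oradb2.example.local"]
found = jdbc_hosts(sample)
missing = [host for host in expected if host not in found]
print("hosts in file:", found)
print("not covered:", missing)
```

If the second list is not empty, that service will keep pointing at a single server after a failover and needs a manual change.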
Through the information in the KB article I knew what to change to point the connection at the standby database server, and after a restart of the vCenter Server service the vCenter Storage Monitoring Service initialized correctly and started without any error.
So my conclusion is that even when you have redundancy or failover set up at the vCenter database level, there are still some vCenter-related products and services that need manual action to keep working after a (planned) database failover.
