-->

DEVOPSZONES

    What does autodisable mean? Why did VCS autodisable my Service Group?

    VCS does not allow a Service Group to fail over or be brought online while it is autodisabled. VCS has to autodisable a Service Group when VCS on a particular node shuts down *but* the GAB heartbeat is still running. Once GAB is unloaded, e.g. when the node actually shuts down to PROM level, reboots, or powers off, VCS on the other nodes can automatically clear the autodisable flag. While a Group is autodisabled, VCS won't allow that Group to fail over or be onlined anywhere within the cluster.

    This is a safety feature to protect against "split brain", where more than one machine uses the same resources, such as the same filesystems and virtual IP, at the same time. Once a node leaves the cluster, VCS has to assume that the machine may still be under user control until it goes down: in theory, someone could log in to that machine and manually start up services. That is why VCS autodisables the Group on the remaining cluster nodes.

    But VCS does let you clear the autodisable flag yourself. Once you're sure that the node that left the cluster doesn't have any services running, you can clear the autodisable flag with this command:

    hagrp -autoenable {name of Group} -sys {name of node} 

    Repeat the command for each Group that has been autodisabled. The Groups that are autodisabled and the nodes they are autodisabled for can be found with this command: hastatus -sum 
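
    The "repeat for each Group" step can be sketched as a small shell helper. This is only a sketch under assumptions: it assumes the group rows in the hastatus -sum output start with "B" and carry the columns Group, System, Probed, AutoDisabled, State, which may differ between VCS versions. The function name autoenable_commands is made up for this example. It only prints the commands so you can review them before running anything.

```shell
#!/bin/sh
# Hypothetical helper: read "hastatus -sum" output on stdin and print one
# "hagrp -autoenable" command per autodisabled group/node pair.
# Assumption: group rows look like
#   B  <Group>  <System>  <Probed>  <AutoDisabled>  <State>
# so field 5 is "Y" when the group is autodisabled on that node.
autoenable_commands() {
    awk '$1 == "B" && $5 == "Y" {
        printf "hagrp -autoenable %s -sys %s\n", $2, $3
    }'
}

# Usage on a cluster node (prints the commands, does not run them):
#   hastatus -sum | autoenable_commands
```

    Only run the printed commands after confirming that the departed node really has no services running, for the split-brain reasons described above.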

    Most of the time VCS autodisables a Group for a short period of time and then clears the autodisable flag without you knowing it. If the node that leaves the cluster actually shuts down, the GAB module is also unloaded, and VCS running on the other nodes will assume that node has shut down. VCS will then automatically clear the autodisable flags for you. There's one catch: by default, VCS on the running cluster requires GAB to be unloaded within 60 seconds after VCS on that node is stopped. After 60 seconds, if GAB still isn't unloaded, VCS on the remaining cluster nodes will assume that node isn't shutting down, and will keep the autodisable flags until the administrator clears them. To increase the 60-second window to 120 seconds, run this:

    hasys -modify {name of node} ShutdownTimeout 120

    For large systems that take a long time to shut down, it is a good idea to increase ShutdownTimeout.
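
    Changing ShutdownTimeout is a configuration change, so the cluster configuration normally has to be opened read-write first and saved afterwards. The sketch below wraps that sequence in a function. The function name set_shutdown_timeout is made up for this example, and it assumes the usual hasys -modify form that takes the node name before the attribute. Treat it as an outline to adapt, not a tested procedure for your VCS version.

```shell
#!/bin/sh
# Hypothetical helper: set ShutdownTimeout (in seconds) for one node.
# Usage: set_shutdown_timeout <node> <seconds>
set_shutdown_timeout() {
    node=$1
    seconds=$2

    # Reject a missing node name or a non-numeric timeout before touching the cluster.
    [ -n "$node" ] || { echo "node name required" >&2; return 1; }
    case $seconds in
        ''|*[!0-9]*) echo "seconds must be a whole number" >&2; return 1 ;;
    esac

    haconf -makerw || return 1          # open the configuration for writing
    hasys -modify "$node" ShutdownTimeout "$seconds"
    rc=$?
    haconf -dump -makero || return 1    # save and return to read-only
    return $rc
}

# Example: set_shutdown_timeout sysA 120
```

    The configuration is saved and closed even if the modify step fails, so a typo in the node name does not leave the cluster configuration writable.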
