Failover cluster
Windows Failover Clustering (WFC) is a component of the Windows Server OS that allows several machines (nodes) to function together as a failover cluster. The purpose of a cluster is to administer the resources assigned to it. A cluster monitors the health of its resources and can restart them or migrate them to another node if needed. Resources can depend on other resources and can also belong to resource groups, so that all resources in a group run on the same node. WFC supports many different resource types; three of them are of particular interest to us – Generic Application, Generic Script, and IP Address.
A Generic Application is a resource backed by a customer-specified executable. WFC starts the executable on one of its nodes when the resource should go online and then tracks the process. If the process terminates or the node becomes unresponsive, the cluster takes corrective action, such as restarting the process or migrating it to another node.
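For illustration, a Generic Application role can be registered with the built-in FailoverClusters PowerShell module. This is a minimal sketch only; the role name and executable path below are placeholders, not part of the Starcounter setup described later.

```powershell
# Minimal sketch: register an arbitrary executable as a Generic Application role.
# "MyApp" and the path are placeholders for illustration only.
Import-Module FailoverClusters

Add-ClusterGenericApplicationRole -Name "MyApp" -CommandLine "C:\Apps\MyApp.exe"
```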
A Generic Script resource is a customer-provided script. The script manages some resource, and the cluster calls it for tasks such as bringing the resource online or offline and retrieving its current state.
An IP Address resource is, as its name implies, just an IP address. Whenever a node with this resource is online, it gets this address assigned to it.
WFC also provides Cluster Shared Volumes (CSV). A CSV is shared, synchronized storage that is available to all cluster nodes and is presented to each node as a regular NTFS volume. It provides all regular storage services, for example file-system locks, which effectively become distributed locks in a cluster.
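An existing clustered disk can be turned into a CSV with a single PowerShell command; the disk name below is a placeholder and will differ in your cluster.

```powershell
# Add an existing cluster disk to Cluster Shared Volumes.
# The volume then appears on every node under C:\ClusterStorage\.
Add-ClusterSharedVolume -Name "Cluster Disk 1"
```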
*Note: WebApp is just the name of a .NET Core Starcounter 3.0 web application.*
Now we've covered all the resources we need to set up a Starcounter failover cluster. First, we start with an easy setup and show how it recovers from possible faults. Then we point out the drawbacks of this setup and show how we address them with a more advanced approach that is appropriate for real deployments.
An easy setup could look like this:
Here "Starcounter" is a resource group containg three resources: WebApp
, a Starcounter application we want to make highly available, and two other resources that WebApp
depends on: an IP Address and a Database.
When starting a resource group, the cluster assigns it to some cluster node. On this node, it first starts WebApp
's dependencies, i.e. the IP Address and Database resources, and then the application.
The Database resource is a Generic Application resource. Starting it just starts the scdata process, thus making the database available to connect to. The scdata process locks the transaction log on the CSV, reads it, and then is ready to serve incoming requests. All write transactions also go to the CSV.
Once its prerequisite resources are started, the cluster starts the WebApp resource by launching WebApp.exe. WebApp.exe, being a web app, binds to all local addresses, including the IP Address resource we assigned to it.
Now the group is fully started and ready to serve requests.
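A sketch of how this basic group could be assembled with PowerShell is shown below. All resource names, paths, and the IP address are assumptions for illustration; the actual scdata command line depends on your Starcounter installation and on where the database lives on the CSV.

```powershell
# Sketch of the basic "Starcounter" group; names, paths, and addresses are assumptions.
Import-Module FailoverClusters

# Create the resource group.
Add-ClusterGroup -Name "Starcounter"

# Database: a Generic Application resource that runs scdata.
Add-ClusterResource -Name "Database" -ResourceType "Generic Application" -Group "Starcounter"
Get-ClusterResource "Database" |
    Set-ClusterParameter -Name CommandLine -Value "C:\Starcounter\scdata.exe"  # append the arguments your installation requires

# IP Address resource.
Add-ClusterResource -Name "Starcounter IP" -ResourceType "IP Address" -Group "Starcounter"
Get-ClusterResource "Starcounter IP" |
    Set-ClusterParameter -Multiple @{ Address = "10.0.0.42"; SubnetMask = "255.255.255.0" }

# WebApp: a Generic Application resource that depends on the Database and the IP Address.
Add-ClusterResource -Name "WebApp" -ResourceType "Generic Application" -Group "Starcounter"
Get-ClusterResource "WebApp" |
    Set-ClusterParameter -Name CommandLine -Value "C:\WebApp\WebApp.exe"
Add-ClusterResourceDependency -Resource "WebApp" -Provider "Database"
Add-ClusterResourceDependency -Resource "WebApp" -Provider "Starcounter IP"

# Bring the whole group online.
Start-ClusterGroup -Name "Starcounter"
```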
Let's consider possible faults and correcting actions:
| Fault | Recovery action |
| --- | --- |
| WebApp.exe crashes | The cluster detects that WebApp.exe has terminated and restarts it. The new WebApp.exe process connects to the Database and binds to the IP Address. |
| scdata.exe crashes | The cluster detects that scdata.exe has terminated. It shuts down WebApp.exe, as it is a dependent resource. Then the usual starting sequence occurs: scdata.exe locks and reads the transaction log from the CSV, and WebApp.exe binds to the IP Address. |
| Node goes offline (network failure or power outage) | The cluster detects that the node is offline and decides to move the role to another node. First it dismounts the CSV from the old node, so that all locks are released. Then it selects a new hosting node and starts all resources on it. Because the locks have been released, scdata.exe on the new node can lock the transaction log. The IP Address is also transferred. |
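Failover of the whole group can also be exercised manually, which is useful for verifying the setup end to end; the node name below is a placeholder.

```powershell
# Manually move the group to another node to verify that recovery works.
Move-ClusterGroup -Name "Starcounter" -Node "Node2"

# Check where the group is running and that all its resources are online.
Get-ClusterGroup -Name "Starcounter" | Get-ClusterResource
```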
*Note: the basic setup is not recommended for production use and is described for educational purposes only.*
scdata in standby mode
The setup described above has an important drawback. In certain cases, recovery will require a fresh start of the scdata process, which can take a significant amount of time. To overcome this, we need to run our scdata instance in a special standby mode, in which it can:
* Function without locking the transaction log.
* Periodically read and apply the transaction log.
* Switch to active mode upon request.
To keep scdata running on a non-active cluster node, we can't use cluster resources, as WFC ensures that all resources are online on a single node. Instead, we must provide an auto-started Windows service, starservice, that administers scdata.exe for us.
This is how it works:
| Event | Reaction |
| --- | --- |
| starservice starts | starservice starts scdata in standby mode and periodically sends requests to poll the transaction log |
| starservice stops | starservice stops scdata |
| starservice terminates unexpectedly | The OS kills scdata |
| scdata terminates unexpectedly | starservice detects it and stops itself |
| starservice receives a request to promote scdata to active mode | starservice passes the request on to scdata |
To properly control starservice, i.e. starting, stopping, and sending promotion requests, we use a Generic Script resource.
Now the setup looks like this:
Every cluster node has a configured instance of starservice.
We configure these cluster resources as follows:
The Database script has the following workflow:
| Cluster event | Action |
| --- | --- |
| Go Online | Start starservice ¹. Send an activate request. |
| Go Offline | Restart starservice ². |

¹ As a safety measure; the normal condition for the service is to always be running.

² As of now we can't switch scdata from active back to standby mode, so we restart the service, and thus scdata, so that it restarts in standby mode. This is not a problem: the resource most likely goes offline because we are transferring the group to another node, so there is enough time to load the database on this node. The next time the cluster decides to host the group on this node, scdata will already be prepared.
Now, instead of starting scdata when the group moves to a new node, the cluster will start the Database script resource, which in turn ensures that scdata is started and active. WFC will handle the migration of the CSV, IP Address, and WebApp.
This new setup shares one drawback with the first one: if scdata crashes, the cluster will first restart it on the same node, and that might take time. This issue can be seen as marginal, however, as scdata should never crash. A crashing scdata process is in itself a more severe problem than a slow recovery.
We plan to design scdata to allow it to serve read requests in standby mode. With this feature, every cluster node will become an eventually consistent read-only replica.
*Note: It is important to specify the database path using exactly the same value in all places where it occurs. Values such as C:\Path\To\Db and C:/Path/To/Db are treated as different.*
Using the star tool:
Using the native sccreatedb tool:
starservice on every node
Download, unzip, and copy the starservice files to all nodes. These files should have the same location on all nodes.
Create a service to start the database. The service name should be the same on all nodes.
Using the sc.exe tool:
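For example, something along these lines, assuming starservice.exe was unpacked to C:\starservice; the service name and path are assumptions, and starservice's own command-line arguments are not shown here (consult the starservice documentation for the exact command line):

```powershell
# Register starservice as an auto-started Windows service on each node.
# Path and service name are assumptions; starservice's own arguments are omitted.
sc.exe create starservice binPath= "C:\starservice\starservice.exe" start= auto
sc.exe start starservice
```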
Using the starservice.exe tool itself:
Copy the scripts folder from the previously downloaded archive to all nodes. The local path should be the same on all nodes. Don't use the CSV volume, as it will complicate resource upgrades and troubleshooting.
Then:
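The commands below are a sketch of registering the Database script as a Generic Script resource; the resource, group, and script file names are assumptions for your setup, and the private parameter name is assumed to be ScriptFilepath.

```powershell
# Sketch: register the Database Generic Script resource and point it at the script.
# Resource, group, and file names are assumptions; adjust to your setup.
Add-ClusterResource -Name "Database" -ResourceType "Generic Script" -Group "Starcounter"
Get-ClusterResource "Database" |
    Set-ClusterParameter -Name ScriptFilepath -Value "C:\scripts\database.vbs"
```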
WebApp resource
Copy the WebApp files to all nodes. The local path should be the same across nodes. Don't use the CSV volume, as it will complicate resource upgrades and troubleshooting.
Then:
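A sketch of registering WebApp as a Generic Application resource and wiring up its dependencies; the names and path are assumptions.

```powershell
# Sketch: register WebApp.exe as a Generic Application resource in the same group.
Add-ClusterResource -Name "WebApp" -ResourceType "Generic Application" -Group "Starcounter"
Get-ClusterResource "WebApp" |
    Set-ClusterParameter -Name CommandLine -Value "C:\WebApp\WebApp.exe"

# WebApp must start after the Database script and the IP Address resources.
Add-ClusterResourceDependency -Resource "WebApp" -Provider "Database"
Add-ClusterResourceDependency -Resource "WebApp" -Provider "Starcounter IP"
```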
Make sure to specify a reasonable configuration for the maximum allowed failures over time for the required resources. The default configuration is very limiting.
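As an example, the restart and failover limits can be relaxed via the cluster resource and group properties; the values below are arbitrary illustrations, not recommendations, and the units (milliseconds for the restart period, hours for the failover period) should be verified against the WFC documentation.

```powershell
# Allow more restarts of the WebApp resource within its restart period (example values).
(Get-ClusterResource "WebApp").RestartThreshold = 10
(Get-ClusterResource "WebApp").RestartPeriod = 900000    # milliseconds

# Allow the group itself to fail over more times within the failover period.
(Get-ClusterGroup "Starcounter").FailoverThreshold = 10
(Get-ClusterGroup "Starcounter").FailoverPeriod = 2      # hours
```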
Make sure to have at least three nodes in a cluster, or a file share witness to keep the cluster alive when a node goes down.
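Configuring a file share witness is a single command; the share path below is a placeholder.

```powershell
# Configure a file share witness so a two-node cluster keeps quorum when one node goes down.
Set-ClusterQuorum -FileShareWitness "\\fileserver\ClusterWitness"
```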
Starcounter 3 Release Candidate does not yet support failover on Linux operating systems out of the box. If you have a Linux production environment which requires failover, please contact us.