Failover cluster
Starcounter failover on Windows Failover Cluster
Introduction
Windows Failover Cluster (WFC) is a component of the Windows Server OS that allows several machines (called nodes) to function together as a failover cluster. The purpose of a cluster is to administer the resources assigned to it. A cluster monitors the health of its resources and can restart them or migrate them to another node if needed. Resources can depend on other resources and can also belong to resource groups, so that all resources in a group run on the same node. WFC supports many resource types; three of them are of particular interest here: Generic Application, Generic Script, and IP Address.
A Generic Application is a resource backed by a customer-specified executable. WFC starts the executable on one of its nodes when the resource should go online and then tracks the process. If the process terminates or the node becomes unresponsive, the cluster takes corrective action, such as restarting the process or migrating it to another node.
A Generic Script resource is a customer-provided WSH script. The script is used to manage some resource, and the cluster calls into it for tasks like bringing the resource online or offline and retrieving its current state.
An IP Address resource is, as its name implies, just an IP address. Whenever a node with this resource is online, it gets this address assigned to it.
WFC also provides Cluster Shared Volume (CSV) services. A CSV is shared, synchronized storage that is available to all cluster nodes and is presented to each node as a regular NTFS volume. It provides all the usual storage services, for example file-system locks, which effectively become distributed locks in a cluster.
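For orientation, the building blocks above can be inspected with the FailoverClusters PowerShell module on any cluster node. A minimal sketch, assuming the module is installed (it comes with the failover clustering feature):

```powershell
# List the three built-in resource types this article relies on.
Get-ClusterResourceType |
    Where-Object { $_.Name -in "Generic Application", "Generic Script", "IP Address" }

# List Cluster Shared Volumes; every node sees them mounted under C:\ClusterStorage\.
Get-ClusterSharedVolume
```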
Basic setup
*Note: `WebApp` is just the name of a .NET Core Starcounter 3.0 Web Application.*
Now we've covered all the resources we need to set up a Starcounter failover cluster. First, we start with an easy setup and show how it recovers from possible faults. Then we point out the drawbacks of this setup and show how we address them with a more advanced approach that is appropriate for real deployments.
An easy setup could look like this:
Here "Starcounter" is a resource group containing three resources: `WebApp`, the Starcounter application we want to make highly available, and two other resources that `WebApp` depends on: an IP Address and a Database.
When starting a resource group, the cluster assigns it to some cluster node. On this node, it first starts `WebApp`'s dependencies, i.e. the IP Address and Database resources, and then the application itself.
The Database resource is a Generic Application resource. Starting it just starts the `scdata` process, making the database available to connect to. The `scdata` process locks the transaction log on the CSV, reads it, and then gets ready to serve incoming requests. All write transactions also go to the CSV.
Once its prerequisite resources are started, the cluster starts the `WebApp` resource by launching `WebApp.exe`. `WebApp.exe`, being a web app, binds to all local addresses, including the IP Address resource we assigned to it.
Now the group is fully started and ready to serve requests.
Let's consider possible faults and the corresponding corrective actions:
| Fault | Recovery action |
| --- | --- |
| `WebApp.exe` crashes | The cluster detects that `WebApp.exe` has terminated and restarts it. The new `WebApp.exe` process connects to the Database and binds to the IP Address. |
| `scdata.exe` crashes | The cluster detects that `scdata.exe` has terminated. It shuts down `WebApp.exe`, as it is a dependent resource. Then the usual starting sequence occurs: `scdata.exe` locks and reads the transaction log from the CSV, and `WebApp.exe` binds to the IP Address. |
| Node goes offline (network failure or power outage) | The cluster detects that the node is offline and decides to move the role to another node. First it dismounts the CSV from the old node, so that all locks are released. Then it selects a new hosting node and starts all resources on it. Because the locks were released, `scdata.exe` on the new node can lock the transaction log. The IP Address is also transferred. |
*Note: the basic setup is not recommended for production use and is described for educational purposes only.*
Setup with `scdata` in standby mode
The setup described above has an important drawback. In certain cases, recovery will require a fresh start of the `scdata` process, which can take a significant amount of time. To overcome this, we need to run our `scdata` instance in a special standby mode, in which it can:
- Function without locking the transaction log.
- Periodically read and apply the transaction log.
- Switch to active mode upon request.
To keep `scdata` running on a non-active cluster node we can't use cluster resources, as WFC ensures that all resources are online on a single node. Instead we must provide an auto-started Windows service, `starservice`, that administers `scdata.exe` for us.
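On each node, the service can be checked with standard tooling. A small sketch, assuming the service is registered under the name `starservice` (the exact name is chosen during setup, see the practical steps below):

```powershell
# Confirm the service exists, starts automatically, and is currently running on this node.
Get-Service -Name starservice | Format-List Name, Status, StartType
Set-Service -Name starservice -StartupType Automatic
Start-Service -Name starservice
```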
This is how it works:
| Event | Reaction |
| --- | --- |
| `starservice` starts | `starservice` starts `scdata` in standby mode and periodically sends requests to poll the transaction log |
| `starservice` stops | `starservice` stops `scdata` |
| `starservice` terminates unexpectedly | The OS kills `scdata` |
| `scdata` terminates unexpectedly | `starservice` detects it and stops itself |
| `starservice` receives a request to promote `scdata` to active mode | `starservice` passes this request on to `scdata` |
To properly control `starservice`, i.e. starting, stopping, and sending promotion requests, we use a Generic Script resource.
Now the setup looks like this:
Every cluster node has a configured instance of `starservice`. We configure these cluster resources as follows:
The Database script has the following workflow:
| Cluster event | Action |
| --- | --- |
| Go Online | Start `starservice` ¹. Send activate request. |
| Go Offline | Restart `starservice` ². |
¹ As a safety measure; the normal condition for the service is to always be running.
² As of now we can't switch `scdata` from active to standby mode, so we restart the service, and thus `scdata`, so that it comes back up in standby mode. This is not a problem: the resource most likely goes offline because we are transferring the group to another node, so there is enough time to load the database on this node. The next time the cluster decides to host the group on this node, `scdata` will already be prepared.
Now, instead of starting `scdata` when the group moves to a new node, the cluster will start the Database script resource, which in turn ensures that `scdata` is started and active. WFC will handle the migration of the CSV, IP Address, and `WebApp`.
This new setup shares one drawback with the first one: if `scdata` crashes, the cluster will first restart it on the same node, and that might take time. This issue can be considered marginal, however, as `scdata` should never crash. A crashing `scdata` process is in and of itself a more severe problem than slow recovery.
Future directions
We plan to design `scdata` to allow it to serve read requests in standby mode. With this feature, every cluster node will become an eventually consistent read-only replica.
Practical setup steps
*Note: It is important to specify the database path using exactly the same value in all places where it occurs. Values such as `C:\Path\To\Db` and `C:/Path/To/Db` are treated as different.*
See also the article about the Database connection string.
1. Set up cluster and CSV
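A hedged PowerShell sketch of this step, assuming three nodes named node1, node2, and node3, a spare static address for the cluster itself, and a shared disk already visible to all nodes (all names and addresses are placeholders):

```powershell
# Validate the nodes and create the cluster.
Test-Cluster -Node node1, node2, node3
New-Cluster -Name sc-cluster -Node node1, node2, node3 -StaticAddress 10.0.0.40

# Add an available clustered disk and convert it to a Cluster Shared Volume.
Get-ClusterAvailableDisk | Add-ClusterDisk
Add-ClusterSharedVolume -Name "Cluster Disk 1"
# The volume now appears on every node under C:\ClusterStorage\ (for example C:\ClusterStorage\Volume1).
```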
2. Create database on CSV
Using the `star` tool:
Using the native `sccreatedb` tool:
3. Set up `starservice` on every node
Download, unzip, and copy the `starservice` files to all nodes. These files should have the same location on all nodes.
Create a service to start the database. The service name should be the same on all nodes.
Using the `sc.exe` tool:
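For illustration only, a rough `sc.exe` sketch; the service name and install path are placeholders, and the trailing database-path argument passed to `starservice.exe` is hypothetical, so consult the `starservice` documentation for the exact command line:

```powershell
# Register an auto-starting Windows service that wraps starservice.exe.
# NOTE: the database-path argument below is hypothetical; check the starservice docs.
sc.exe create starservice start= auto binPath= "C:\starservice\starservice.exe C:\ClusterStorage\Volume1\Db"

# Start it once to verify that it comes up.
sc.exe start starservice
```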
Using the `starservice.exe` tool itself:
4. Create a new cluster resource group
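With the FailoverClusters PowerShell module this can be a one-liner; the group name matches the one used throughout this article:

```powershell
# Create an empty resource group (role) named "Starcounter".
Add-ClusterGroup -Name "Starcounter"
```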
5. Create and configure the IP Address resource
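A sketch of creating the resource and setting its parameters; the address, subnet mask, and DHCP setting are placeholders for your network:

```powershell
# Create the IP Address resource inside the "Starcounter" group.
Add-ClusterResource -Name "IP Address" -Group "Starcounter" -ResourceType "IP Address"

# Assign the address the group should carry (placeholder values).
Get-ClusterResource -Name "IP Address" |
    Set-ClusterParameter -Multiple @{ Address = "10.0.0.42"; SubnetMask = "255.255.255.0"; EnableDhcp = 0 }
```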
6. Create and configure the Database script resource
Copy the `scripts` folder from the previously downloaded archive to all nodes. The local path should be the same on all nodes. Don't use the CSV volume as it will complicate resource upgrade and troubleshooting.
Then:
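For example, the resource could be registered roughly like this; the script file name under the copied `scripts` folder is a placeholder:

```powershell
# Create the Generic Script resource and point it at the locally copied script.
Add-ClusterResource -Name "Database" -Group "Starcounter" -ResourceType "Generic Script"
Get-ClusterResource -Name "Database" |
    Set-ClusterParameter -Name ScriptFilepath -Value "C:\Starcounter\scripts\database.vbs"
```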
7. Create and configure the `WebApp` resource
Copy the `WebApp` files to all nodes. The local path should be the same across nodes. Don't use the CSV volume as it will complicate resource upgrade and troubleshooting.
Then:
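Similarly, a sketch for the application resource; the executable path and working directory are placeholders:

```powershell
# Create the Generic Application resource and tell the cluster what to launch.
Add-ClusterResource -Name "WebApp" -Group "Starcounter" -ResourceType "Generic Application"
Get-ClusterResource -Name "WebApp" |
    Set-ClusterParameter -Multiple @{ CommandLine = "C:\Starcounter\WebApp\WebApp.exe"; CurrentDirectory = "C:\Starcounter\WebApp" }
```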
8. Set up resource dependencies
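Using the resource names from the previous steps, the dependencies described earlier can be expressed as:

```powershell
# WebApp must not start before the Database script and the IP Address are online.
Add-ClusterResourceDependency -Resource "WebApp" -Provider "Database"
Add-ClusterResourceDependency -Resource "WebApp" -Provider "IP Address"
```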
9. Start the group
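For example:

```powershell
# Bring the whole role online and check that all resources reach the Online state.
Start-ClusterGroup -Name "Starcounter"
Get-ClusterGroup -Name "Starcounter" | Get-ClusterResource
```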
10. Extra notes and resources
- Make sure to specify a reasonable configuration for the maximum allowed failures over time for the required resources. The default configuration is very limiting; see the sketch after this list.
- Make sure to have at least three nodes in the cluster, or a file share witness, to keep the cluster alive when a node goes down.
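A hedged sketch covering both notes; the thresholds and the witness path are placeholders to tune for your environment:

```powershell
# Allow more restart attempts per resource before the cluster gives up.
foreach ($name in "WebApp", "Database", "IP Address") {
    $resource = Get-ClusterResource -Name $name
    $resource.RestartThreshold = 10      # maximum restarts ...
    $resource.RestartPeriod    = 900000  # ... within this window, in milliseconds
}

# Allow more group failovers over time.
$group = Get-ClusterGroup -Name "Starcounter"
$group.FailoverThreshold = 10  # maximum failovers ...
$group.FailoverPeriod    = 6   # ... within this window, in hours

# With fewer than three nodes, add a file share witness so the cluster keeps quorum when a node goes down.
Set-ClusterQuorum -FileShareWitness "\\fileserver\witness"
```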
Starcounter failover on Linux
Starcounter 3 Release Candidate does not yet support failover on Linux operating systems out of the box. If you have a Linux production environment which requires failover, please contact us.