Run Starcounter in Production
Introduction
Starcounter offers replication facilities that enterprises can use to implement high-availability business scenarios (e.g., a point of sale that is disconnected from the main server continues to operate autonomously until the connection resumes). It does not, however, provide built-in support for disaster-related scenarios such as hot failover, regular backups and disaster recovery. A large variety of ready-made facilities support those scenarios out of the box today; for example, most of what is needed ships with Windows Server 2012 R2 Standard.
When it comes to running applications, developers want to perform a "full replication" of their apps: not only the data itself, but also database configurations, static files, text log files, bulk data database files, user downloads and so on. This goes beyond the scope of pure database replication for disaster recovery (like SQL Server AlwaysOn): either you try to store everything in the database so that you can rely on its failover facilities, or you run the database facilities alongside manual maintenance of the failover machine. Instead of pure database facilities, you can choose a more general facility that works at the operating system level and fits most use cases.
Starcounter does not aim to outperform those facilities with home-grown tools, preferring to focus on its core unique features. Instead, we do our best to supply our customers with the latest information on implementing these scenarios with standard OS facilities. This article aims to provide that information to community version users as well.
Installation
Starcounter can be installed in three different modes: Developer Mode, Production Mode and Silent Mode.
Developer Mode
This is the default installation mode, where the Starcounter Server Service is started on demand, typically when you start a Starcounter application.
Production Mode
In Production Mode, the installer starts the Starcounter Server Service as part of the installation process and also sets the Startup Type of the service to Automatic, so that the service is started automatically when Windows starts.
How to install
There are two different ways to install Starcounter in Production Mode.
Run <Your setup file name>.exe by double-clicking it.
Select Production on the Setup screen.
OR
Open a Command Prompt window with Administrator privileges.
Browse to the directory where the Starcounter setup file exists.
Run the following command:
Silent Mode
In Silent Mode, the Starcounter installation runs without showing any messages to the user or asking for any choices (no Yes/No popups). This is particularly useful in production environments where you want to install Starcounter unattended from commands or scripts.
How to install
Open a Command Prompt window with Administrator privileges.
Browse to the directory where the Starcounter setup file exists.
Run the following command:
Starcounter should then install without any further prompts.
Logs and Checkpoint Files
Starcounter is an in-memory technology, so it uses disks only for persistence and recovery rather than as primary storage. Starcounter streams changes in your data to disk, while reading data only from primary memory. Changes from committed transactions are written to disk by the log writer and stored in Starcounter .log files. Periodically, a disk image of the database is updated. This image, also known as a checkpoint, is stored in .sci and .sci2 files. When a checkpoint is created, log files whose records the checkpoint now incorporates become redundant; they are renamed from .log to .log.bak and must be deleted manually. The .log files that are still in use, together with the checkpoint image files, represent the current durable state of the database.
Erasing .log.bak files regularly. To clean up .log.bak files, create a task in Windows Task Scheduler (or in your preferred task scheduler for Windows) that runs the del command on your .log.bak files. Make sure that the task actually runs as expected: sometimes it will not run because the correct user privileges are not set.
Backup and Failover
Fault tolerance is the ability of a system to continue operating correctly after a software or hardware failure. It is achieved through redundancy at the physical level together with a failover strategy. Failover strategies address hard drive malfunctions (RAID, SAN), power supply problems (backup battery pack), and general hardware failure (running multiple standby machines with different levels of readiness, or running multiple active load-balanced machines).
Note that backing up data and providing a failover strategy are not mutually exclusive. You might end up with a hot failover machine that holds a corrupted database if the data is corrupted through human error (e.g., uploading the wrong application to production). Backups protect you from human error; failover protects you from hardware failure.
While failover is about disaster recovery, scaling out a system to multiple replicated nodes is done for high availability. Availability can be affected not only by a hardware disaster but also by denial of service under increased workload. You might be familiar with situations where a database server "dies" under 1,000 simultaneous connections. Starcounter is highly available by design: a single node is capable of millions of transactions per second out of the box. However, hardware issues can happen and you need to be ready for them. Different strategies for failover and disaster recovery are discussed in the checklist below.
Database Backups
Starcounter works with data files at the byte level, so some of these files are locked and other processes and users are prevented from reading them. This locking prevents users and services from making inconsistent copies of files that are currently in use.
The simplest way to back up your data is a cold backup: stop Starcounter, wait until the log writer finishes writing changes to disk, copy the data files, and start Starcounter. We do not recommend cold backups unless you have a strong reason to "stop the world". A better way, which requires no stop, is available in all workstation and server editions of the Windows operating system.
Windows Volume Shadow Copy Service (VSS) allows making consistent copies of data files (known as VSS snapshots) even when the files are in use, and hence enables hot backup of your data. A VSS snapshot is a read-only point-in-time copy of a (disk) volume. Snapshots allow the creation of consistent backups of the volume, ensuring that the contents do not change and are not locked while the backup is being made. This is indeed an optimistic snapshot isolation technique similar to how transactions work in Starcounter, applied on a disk volume level by the operating system. While VSS snapshots the whole volume, it enables extraction of separate files from a snapshot, which is a point-in-time consistent backup of these files.
Hot backup of database files using VSS. Native tools to create and access VSS snapshots are available in Windows Server (diskshadow.exe), but are not included in other Windows environments. We will use a tool that works on all modern Windows operating systems (desktop and server). The instructions below explain how to back up the most recently available VSS snapshot of all files from one location to another.
Objective: back up the location c:\Users\User\Documents\Starcounter\Personal\Data\Default\Default-20150915T100722555\ to Y:\Backup.
Start by downloading the latest version of the shadowspawn.exe utility from https://github.com/candera/shadowspawn/downloads. Put the utility on your machine in a globally available location, such as C:\Windows\System32.
Run this backup command
This will create a shadow copy of the drive, mount the shadowed copy of the source folder to Q: (the drive letter Q must be free, otherwise choose another letter), copy the contents of Q:\ to a sub-folder in Y:\Backup named after the current date and time, such as Y:\Backup\2015-09-15_16-41-48, and then unmount and destroy the shadow copy. Copying is done with the robocopy utility that ships with Windows. You can use any other command instead; for example, you can call your favorite incremental binary backup utility and store the result in a cloud.
Read more about robocopy in the robocopy documentation. You can also read this thread on GitHub for more information: https://github.com/Starcounter/Home/issues/363.
Note: creating and removing a VSS snapshot during the described backup routine may affect the performance of your Starcounter application during user activity peaks. Consider running the backup in periods when you have fewer than hundreds of thousands of simultaneously connected users.
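The backup routine described above can be sketched in Python as a function that assembles the command line. This is only an illustration, under two assumptions: that shadowspawn is invoked as shadowspawn <source> <drive:> <command> [arguments], and that robocopy's /E switch is used to copy sub-folders. The function name build_backup_command and its parameters are hypothetical.

```python
from datetime import datetime


def build_backup_command(source_dir, backup_root, drive="Q:", now=None):
    """Return the argument list for one shadowspawn + robocopy backup run.

    The destination is a sub-folder of backup_root named after the
    current date and time, e.g. 2015-09-15_16-41-48.
    """
    now = now or datetime.now()
    stamp = now.strftime("%Y-%m-%d_%H-%M-%S")
    destination = backup_root.rstrip("\\") + "\\" + stamp
    return [
        "shadowspawn", source_dir, drive,            # snapshot and mount the source
        "robocopy", drive + "\\", destination, "/E",  # copy the mounted tree
    ]
```

On Windows, the returned list could be passed to subprocess.run from an elevated prompt, since creating a shadow copy normally requires administrator rights. Building the list separately makes it easy to log or review the exact command before it runs.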
Working with VSS snapshots. A VSS snapshot is a powerful tool. You can use the native Windows tool vssadmin to manage snapshots from the command line. We recommend the GUI-based tools ShadowExplorer (free) and Z-VSSCopy (free for non-commercial use) to create, view and mount VSS snapshots.
Files to back up. The files necessary for a backup are:
<database_name>.cfg
<database_name>.*.log
<database_name>.*.optlog
The .cfg file is locked while the database is running, but since this file doesn't change during the lifetime of the database, you only have to create a backup of it once.
Starcounter does not update the Date modified attribute of the .log files for performance reasons. Thus, don't rely on Date modified to check whether there are any changes.
Follow these steps to find these files:
Go to the server repository, which is found by default at %UserProfile%\Documents\Starcounter\Personal. If the location was changed in the installer, it may reside somewhere else. In that case, find it by following the <server-dir> path in Program Files\Starcounter\configuration\Personal.xml (this location applies if the default settings were accepted in the installer).
Open Personal.server.config in this repository.
Go to the path specified by <DatabaseDirectory>. It's normally something like %UserProfile%\Documents\Starcounter\Personal\Databases.
In this directory, there will be a directory for each database, with the same name as the database.
Open the directory for the database that will be backed up.
Open the file with the name <database_name>.db.config.
Find the parameter TransactionLogDirectory and go to the path specified. There you will find the files that are necessary for backup.
Use the directory above for the VSS backup.
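The lookup steps above could be automated along the following lines. This is a sketch only: it assumes that the configuration files are XML and that DatabaseDirectory and TransactionLogDirectory appear as element names, which the article suggests but does not confirm. The helper name find_setting is hypothetical.

```python
import xml.etree.ElementTree as ET


def find_setting(config_path, element_name):
    """Return the text of the first element called element_name, or None.

    Assumption: the config file is XML and the setting is stored as
    element text. Any XML namespace prefix on the tag is ignored.
    """
    root = ET.parse(config_path).getroot()
    for element in root.iter():
        tag = element.tag.rsplit("}", 1)[-1]  # drop a "{namespace}" prefix
        if tag == element_name and element.text:
            return element.text.strip()
    return None
```

Calling find_setting on Personal.server.config with "DatabaseDirectory", and then on <database_name>.db.config with "TransactionLogDirectory", would follow the same path as the manual steps, provided the assumptions about the file format hold.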
If there is no full .log file in the directory, there will be no .optlog file.
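To select exactly the files listed above from the transaction log directory, a backup script could filter by name. The sketch below uses Python's fnmatch with the three patterns from the list; the function name files_to_back_up is illustrative.

```python
from fnmatch import fnmatch
from pathlib import Path


def files_to_back_up(log_dir, database_name):
    """Return the sorted names of the files needed for a backup.

    Matches <database_name>.cfg, <database_name>.*.log and
    <database_name>.*.optlog; everything else (such as .log.bak
    files) is skipped.
    """
    patterns = [
        database_name + ".cfg",
        database_name + ".*.log",
        database_name + ".*.optlog",
    ]
    return sorted(
        p.name
        for p in Path(log_dir).iterdir()
        if p.is_file() and any(fnmatch(p.name, pat) for pat in patterns)
    )
```

Because fnmatch matches the whole file name, a redundant file such as <database_name>.0000.log.bak is not picked up by the .log pattern.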
Restoring a Database From a Backup
A database is the set of files that you have backed up. To recover your database from one of the backups, you can create a new database in Starcounter Administrator and then, without starting it, copy all backup files into that database's data folder. After doing so, you can shut down the old database and run your application on the new database with minimal downtime. If you have a failover machine, you can switch your web front end to route requests to the failover machine. The simplest way to restore a database from a backup is to stop Starcounter, remove all database data files, copy the backup files into the data file location, and start Starcounter. Starcounter reads the log files into memory when the database starts, so you have to restart the database to pick up the data from the backup files.
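The file-handling part of the simplest restore can be sketched as follows. The sketch assumes that Starcounter has already been stopped and will be started again by the operator afterwards; the function name restore_from_backup and both directory arguments are illustrative.

```python
import shutil
from pathlib import Path


def restore_from_backup(backup_dir, data_dir):
    """Replace the files in data_dir with the files from backup_dir.

    Run this only while Starcounter is stopped. Sub-directories are
    left untouched; only plain files are removed and copied.
    """
    data_dir = Path(data_dir)
    for old in data_dir.iterdir():
        if old.is_file():
            old.unlink()  # remove the current database data file
    restored = []
    for src in sorted(Path(backup_dir).iterdir()):
        if src.is_file():
            shutil.copy2(src, data_dir / src.name)  # keeps file timestamps
            restored.append(src.name)
    return restored
```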
Starcounter in a Virtual Machine
A virtual machine is a convenient way to make a production environment easy to deploy and manage. For Starcounter apps running in a virtual machine, the simplest way to do a cold backup is to shut down the VM, copy the VM image, and turn the VM back on. To perform hot backups of VMs, use the Hyper-V facilities for shadow copying, and use Hyper-V asynchronous replication for disaster recovery. Please refer to the official Microsoft documentation on Hyper-V high availability features. Note that, since Starcounter is capable of millions of database transactions per second on a single commodity server, it has enough spare capacity to deliver very high performance even inside a VM.
SSD or HDD?
Starcounter streams changes to disk sequentially. Modern HDDs handle sequential writes efficiently, sustaining on the order of a hundred megabytes per second or more. That's why you can run your app suite with our in-memory engine on big commodity HDDs safely and with good performance. The choice of HDD, SSD or PCI flash appliance will not affect your read performance at all; in most cases it is a matter of preference. What is more, with cheap HDDs you can set up an inexpensive RAID-1 array to make your solution tolerant of hardware faults, and keep your regular backups on additional disks. If you run an application with a large share of writes (more than 30% of all database operations), consider faster storage devices to increase throughput when tens of thousands of users operate simultaneously.
Disabling Anti-Virus Scan of Database Files
For higher performance, exclude the database files in the server directory from anti-virus scanning.
Checklist: Starcounter in Production
Always have enough spare space for the database log files and images on your hard-drive: 2-3x RAM.
Install a task to clean .log.bak files in Windows Task Scheduler. Avoid filling the disk with .bak files that you don't need! Verify that the task is actually executed as expected, since a missed task setting can cause Windows to refuse to run it.
Verify that a VSS-based backup routine is set. Always perform backups on a regular basis. Ensure that backups go to a separate drive or to several independent drives. Please always remember that if you copy those files to a separate machine and use it as a failover replica, that replica is asynchronous!
For hot (no noticeable downtime) failover: consider synchronous replication (or adjustable asynchronous replication, when applicable in your business domain) with the Windows Server 2016 Datacenter Edition Storage Replica and Failover Cluster features, or with available third-party solutions.
For hot asynchronous failover: consider running Starcounter in a VM on the Hyper-V hypervisor with the Replica feature. This also enables a convenient backup scheme: the replica will be a backup of a recent point in time of the master machine.
For warm (seconds of downtime) synchronous failover: consider a shared disk setup (SAN or NAS plus a round-robin front-end machine). Make sure there is always a single master in such a scheme: only one machine at a time can be active, and several machines must not write to the same data files on a shared disk! As a commodity alternative, Microsoft Windows Server 2008 R2, 2012 and 2012 R2 provide a feature called Cluster Shared Volumes. Coupled with Hyper-V Failover Cluster
Running on-premises: consider RAID-1 as a failover tool. The failure rates of modern HDDs vary from 1% to 10%, hence running some RAID-based solution is a necessity. A cloud provider will most likely take care of protecting your data against hardware failures. However, problems sometimes occur in clouds too, so you should still back up your data in a cloud environment.
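The first checklist item (spare disk space of 2-3x RAM) can be checked periodically with a short script. This is a minimal sketch using Python's shutil.disk_usage; the function name has_enough_log_space is illustrative, and the amount of RAM is passed in by the caller because the standard library has no portable way to query it.

```python
import shutil


def has_enough_log_space(path, ram_bytes, factor=2.0):
    """Return True if the volume holding path has at least factor x RAM free.

    path      -- any directory on the volume that stores logs and images
    ram_bytes -- installed RAM in bytes, supplied by the caller
    factor    -- 2.0 to 3.0 according to the checklist
    """
    free = shutil.disk_usage(path).free
    return free >= factor * ram_bytes
```

For a server with 64 GiB of RAM, has_enough_log_space(log_dir, 64 * 2**30, factor=3.0) would apply the upper end of the recommended range.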
Please always remember that your data is one of your most important assets! Implement a proper strategy for backup, failover and disaster recovery from day one. Using the advice in this article, you can build up your data safety strategy gradually. Start with simple Shadow Copy backups. Use Dropbox for Business and asymmetric cryptography (with private and public keys) to store terabytes of backups for just hundreds of dollars per year. If you run Starcounter in a cloud like Amazon, then you most likely run Windows Server 2012 R2 Standard, which ships with the Failover Clustering role out of the box. Use facilities like Failover Clustering that come for free with your operating system. Create a simple Failover Cluster with one master machine and one hot-standby machine that is always a replica of your master machine. Store your database image files on a Cluster Shared Volume, so that the hot-standby machine continues to run on exactly the same data. It will take some seconds for the replica machine to start Starcounter (since two machines cannot open the same database at the same time), so the resulting failover will be warm. However, today you can achieve true synchronous hot failover using Windows Server 2016 Storage Replica (Datacenter edition is required).