Designing an FRS Deployment

There are a small number of golden rules that should be followed when planning an FRS deployment to ensure consistent service levels.

After you have chosen FRS as the means to replicate data, you should support that service and not supplement or override it

If replication stops for some reason, the very worst thing you can do is to copy files manually to replication partners. This will cause additional replication traffic, backlog, and possible replication conflicts.

The correct action is to find the root cause that stopped replication from progressing and resolve that. Common causes include low disk free space, poor connectivity, excessive file updates, and files that are in use and cannot be replicated.
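As an illustration of checking for one of these causes, low disk free space, the following is a minimal Python sketch. It is not part of FRS or its tools; the function name and the 10 percent threshold are assumptions chosen for the example, and the right threshold depends on your own volumes.

```python
import shutil


def free_space_ok(path, min_free_fraction=0.10):
    """Return True if the volume holding `path` has at least
    `min_free_fraction` of its capacity free.

    The 10 percent default is an illustrative assumption,
    not an FRS requirement.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free / usage.total >= min_free_fraction
```

A check like this could be run against the volumes that hold the replica tree and the staging directory, for example free_space_ok("D:\\") where the drive letter is hypothetical.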

Proactively monitor the status of replication by using Sonar or Ultrasound.

If a system problem has prevented replication from progressing, then it is important to determine this and remedy the issue in a timely manner.

If replication outages go unnoticed for long periods of time, failures start to compound. For example, a backlog of data on one computer can cause files to accumulate in its staging directory; the backlog can then spread to other computers, progressively cause network-wide congestion, and affect other replica sets on the same computers.
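One simple way to notice staging accumulation early is to measure how full a staging directory is relative to its configured limit. The sketch below is an assumption-laden illustration in Python, not a replacement for Sonar or Ultrasound: the function names are hypothetical, and both the staging path and the limit must be supplied by you from your own configuration.

```python
import os


def directory_size_bytes(path):
    """Sum the sizes of all files under `path`, skipping files that
    disappear or cannot be read while the scan is running."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                # The file was removed or is inaccessible; ignore it.
                continue
    return total


def staging_usage_fraction(staging_dir, limit_bytes):
    """Return the fraction of the staging limit currently in use.

    `limit_bytes` is whatever limit is configured on the computer;
    this sketch does not read it from the system.
    """
    return directory_size_bytes(staging_dir) / limit_bytes
```

A value that stays close to 1.0 over several checks suggests that staging files are not draining and that the cause should be investigated.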

For more information about monitoring FRS, see Monitoring FRS Using Sonar and Ultrasound Overview.

Look for unexpected sources of replication traffic and file locks

Any application can potentially alter a file. Some applications can unexpectedly alter many files on a regular basis. Examples include disk defragmentation tools, antivirus software, and file system policy templates.
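To get a rough picture of how many files are being altered, you can list the files under a replica tree that were modified within a recent time window. The following Python sketch is an illustration only; the function name and the one-hour default are assumptions. It looks only at file modification times, so changes that touch only attributes or security settings would not show up here.

```python
import os
import time


def recently_modified(root, window_seconds=3600, now=None):
    """Return paths of files under `root` whose modification time
    falls within the last `window_seconds` seconds."""
    now = time.time() if now is None else now
    cutoff = now - window_seconds
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                if os.path.getmtime(full) >= cutoff:
                    hits.append(full)
            except OSError:
                # The file was removed or is inaccessible; ignore it.
                continue
    return hits
```

An unexpectedly large result shortly after a scheduled defragmentation or antivirus scan is a hint that the tool is touching files in the replica tree.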

Some applications might hold files open for long periods of time, preventing updated files from being applied from another computer. The most common example of this is "press-a-key" prompts and message boxes in logon scripts, where the user walks away for a long period of time after logging on.

An operator in one part of a system might make changes without realizing that another operator is making conflicting changes in the same area from some other computer. This is known as "dueling admins," and in some cases a manual replication feedback loop is created as the admins keep reapplying their changes, each unaware that the other is also making changes.

Build an FRS deployment plan that handles bandwidth availability, topology definition, data quantity, data change rate, and monitoring procedures

There are no fixed limits on the amount of data or the rate of data change that FRS can support; however, as the data becomes larger and more volatile, the system designer needs to consider how to optimize the topology and replica member configuration to support the required level of replication traffic.
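A back-of-the-envelope calculation can help relate data change rate to bandwidth availability during planning. The Python sketch below is a rough estimate under stated assumptions, not an FRS formula: the function name, the parameters, and the default share of the link available to replication are all hypothetical, and the estimate ignores compression and protocol overhead. Because FRS replicates whole files, the changed-data figure should count the full size of every changed file.

```python
def replication_hours_per_day(changed_mb_per_day, link_kbps, usable_fraction=0.5):
    """Estimate the hours of link time needed each day to send one copy
    of the changed data to one replication partner.

    changed_mb_per_day: total size of changed files, in megabytes (10**6 bytes)
    link_kbps:          link speed in kilobits per second (10**3 bits)
    usable_fraction:    share of the link assumed available to replication
    """
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in the range (0, 1]")
    usable_bits_per_second = link_kbps * 1000 * usable_fraction
    changed_bits = changed_mb_per_day * 8 * 1_000_000
    return changed_bits / usable_bits_per_second / 3600
```

For example, 500 MB of changed files per day over a 512 kbps link with half of the link available works out to about 4.3 hours of transfer time per partner. If the estimate approaches or exceeds the hours in the replication schedule, the topology or schedule likely needs rethinking.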

Take the time to understand how FRS works

A significant amount of detailed FRS training material is now available. If you are using FRS in an advanced manner (a large or complex topology, large amounts of replication traffic, and so on), then nominate a staff member to undergo this training so that they are aware of potential issues and know how to troubleshoot and repair any that occur.