Server failover

YSoft SafeQ Server Failover System

As a System Administrator, I would like to configure my printing environment so that there is no single point of failure, so that I can significantly increase the availability of printing services.

System readiness (especially for third-party technologies such as NLB) must be reviewed with Y Soft before the solution is deployed at a customer site.

Failover - Overview

Note: Failover works only within high-speed and low-latency networks (preferably 1 Gb Ethernet networks). Based on experience from our existing customers, printing via slow networks is generally unacceptable for end users even for a short period of time. Failover currently works only with a SafeQ CML "cluster". Although ORS servers can also form "clusters/near roaming groups", automated failover isn't fully supported for them at the moment.

Assumptions

The SafeQ System offers application-level failover and load balancing that encompass all system components. This failover and load balancing are, however, subject to limitations imposed by components beyond the control of the SafeQ CML System supplier.

The solution is based on the YSoft SafeQ application-level clustering technology. This technology relies on inter-server communication, message exchange and metadata synchronization among the individual YSoft SafeQ CML Servers.

There are two prerequisites:

  • The CML servers must be available on a high-speed and low-latency network (preferably fiber optics).

  • All YSoft SafeQ components (CML Server Core, Terminal Server) must be installed on the same server, and every system in the cluster must have its own database instance.

To enable full failover and load balancing for MFPs connected to the YSoft SafeQ system, there are three options:

  1. Client-based failover mechanism (available for some devices and technologies)

  2. Microsoft Cluster Services (MSCS)

  3. Microsoft Windows Network Load Balancing (NLB)

Failover options

Client based server real-time failover mechanism

  • Pros:

    • doesn't require any infrastructure changes

    • devices are equally divided among the nodes

  • Cons:

    • requires application support

    • requires client software to be installed on workstations

    • for embedded (WS-based) technologies, it requires re-installation of the device and may confuse users

Server OS based real-time server failover mechanism (MSCS)

  • Pros:

    • state of the art solution

  • Cons:

    • requires complex infrastructure configuration

    • expensive initial investments

    • expensive maintenance

    • virtual IP must be used for administration (Terminal Server connected with local server only)

Server OS based real-time server failover mechanism (WNLB)

  • Pros:

    • included in Standard Windows server version

    • fully transparent to the end users

    • fast and reliable solution that covers both load balancing and failover

  • Cons:

    • requires extra infrastructure configuration

    • network multicasting among servers should be configured to get the best performance

Client based server real-time failover mechanism

A client based failover system expects a smart component to be installed on the client system to check server availability. Such a system is fully active and provides both server failover and load balancing.

images/download/attachments/21955323/Failover-CB.png
(blue line - print from workstation, green line - print to the printer, black line - authentication from terminal)

YSoft SafeQ Hardware Terminal for Printers/MFPs

  • The hardware terminal checks server availability upon every authentication request and initializes the session with a responding server.

YSoft SafeQ Workstation Client for Print Servers and User Workstations

  • The client component verifies the availability of the individual servers and re-directs the print job to an available print server.

    • If the server fails, the client automatically prints the document to the other server.

    • Print job data is not synchronized among servers. To ensure availability of stored print jobs, it is recommended to use shared SAN storage.

This option is available by default for External Terminals and for any workstation or print server that has the YSoft SafeQ Client and a print driver (system print queue) installed.
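The availability check performed by the client component can be pictured with the following minimal sketch. It is an illustration only, not the actual YSoft SafeQ client logic: the function names, the TCP probe and the idea of trying servers in a fixed order are all assumptions made for this example.

```python
import socket
from typing import Callable, Iterable, Optional


def tcp_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout.

    Hypothetical availability check; the real client may probe differently.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_server(
    servers: Iterable[str],
    is_available: Callable[[str], bool],
) -> Optional[str]:
    """Return the first server that passes the availability check, else None."""
    for server in servers:
        if is_available(server):
            return server
    return None
```

A caller would pass its list of cluster nodes together with a probe, for example `pick_server(nodes, lambda host: tcp_probe(host, port))`, where `nodes` and `port` are placeholders for site-specific values. Keeping the probe injectable makes the selection logic easy to test without a network.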

Server OS based real-time server failover mechanism

An OS based failover system assumes that there is an operating system level mechanism that can

  1. provide a single IP address / host name to external systems (so that print servers, user workstations and multi-functional printers don't need to verify server availability or switch among multiple servers), and

  2. provide a mechanism to re-direct traffic to the secondary server(s) in case of an outage of the primary server.

images/download/attachments/21955323/Failover-WNLB.png
(blue line - print from workstation, green line - print to the printer, black line - authentication from terminal)

Currently supported mechanisms are MSCS and NLB:

MSCS (Microsoft Cluster Server) - Windows Server Enterprise edition

  • Windows Shared Print Queues (print drivers + Enterprise version of YSoft SafeQ Client) and SafeQ components are deployed as clustered services.

  • In this case redundancy requires that applications be installed on multiple servers within the cluster. However, an application is online on only one node at any point in time.

There are two possible ways to install a clustered system:

  1. Clustering all SafeQ services (Active-Passive)

    • Example of Active-Passive cluster:

      images/download/attachments/21955323/MSCS-full.png
    • Caveats:

      • SafeQ can have only one active CML node

      • such environment requires independent license for each cluster node

      • in case of failure, users have to wait for the system to restart (up to 5 minutes)

      • some pull accounting logs can be lost or duplicated in case of server failure

  2. Clustering of Terminal Server service (Active-Passive) and having two SafeQ CML services running (Active-Active)

    • Example of Terminal Server cluster:

      images/download/attachments/21955323/MSCS-DS.png
    • Caveats:

      • The system can run up to 4 SafeQ CML nodes; however, only one Terminal Server can be running in the system (as Terminal Servers do not have a unique identifier for communication with CML)

    • Limitations:

      • this solution does not work with pull accounting (e.g. Xerox with an EIP version lower than 2)

Microsoft Windows Network Load Balancing (NLB)

NLB operates as a cluster which shall be formed over all servers running SafeQ Cluster nodes. All other SafeQ system operations are handled by SafeQ application-level clustering and load balancing.
The NLB technology provides a single virtual IP address. The multicast operation mode of the NLB technology is required.
The Terminal Server shall use the NLB virtual IP address for registering all sub-API applications on MFPs. This IP address (or FQDN, as the case may be) shall be specified in the SafeQ configuration settings by the system administrator.

  • Terminal Server Component is registered to a network load balancing virtual IP/Hostname

  • A new node (CML+TS) can be added to the system at any time (up to 4 CML nodes)

  • see Configuring WNLB Server Failover for more information

The SafeQ System is extended with NLB support in the following way:

  • When the SafeQ System detects a failure of a particular server node, the newly elected master node shall temporarily remove the failed node from the NLB cluster by using Microsoft-supported tools, such as PowerShell or NLB.EXE.

  • Within a reasonable time after the previously failed node is operational again, the current master node shall re-add it to the NLB cluster. A timeout mechanism shall prevent the cluster from removing and re-adding nodes in quick succession during short-term failures (such as a temporary loss of network connectivity that spans only a few seconds), which would otherwise decrease system performance due to thrashing.
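The timeout mechanism described above can be sketched as a simple debounce on node health. This is a minimal illustration under stated assumptions, not the SafeQ implementation: the class name `FlapGuard`, the two thresholds and their default values are hypothetical, and the actual removal or re-adding of a node (via PowerShell or NLB.EXE) is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class FlapGuard:
    """Debounce health changes so short outages do not change NLB membership."""

    remove_after: float = 30.0  # hypothetical: seconds a node must stay down before removal
    readd_after: float = 60.0   # hypothetical: seconds a node must stay up before re-adding
    _state_since: Dict[str, Tuple[bool, float]] = field(default_factory=dict)
    _in_cluster: Dict[str, bool] = field(default_factory=dict)

    def observe(self, node: str, healthy: bool, now: float) -> str:
        """Record one health sample and return 'remove', 'add' or 'none'."""
        previous = self._state_since.get(node)
        if previous is None or previous[0] != healthy:
            # Health state changed (or first sample): restart the timer.
            self._state_since[node] = (healthy, now)
        since = self._state_since[node][1]
        # Assumption: a node seen for the first time is already a cluster member.
        member = self._in_cluster.setdefault(node, True)
        if member and not healthy and now - since >= self.remove_after:
            self._in_cluster[node] = False
            return "remove"
        if not member and healthy and now - since >= self.readd_after:
            self._in_cluster[node] = True
            return "add"
        return "none"
```

The master node would call `observe` on every health check and act only when the result is 'remove' or 'add'. An outage of a few seconds never reaches the removal threshold, so cluster membership stays stable.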

Example of NLB cluster:
images/download/attachments/21955323/NLB.png

Limitations:

  • Y Soft supports using NLB for introducing failover and load balancing only on Microsoft Windows servers.

  • NLB needs to be configured in multicast mode to span SafeQ cluster nodes running in different data centers or connected to different networks. Network multicast needs to be supported ONLY among the servers, but the cluster virtual IP address has to be network-reachable from the MFPs or workstations.

Network configuration options:

  • Network unicast

    • Pros: No extra network configuration needed.

    • Cons: All data is transferred to all servers in the cluster, increasing load on the network.

  • Network multicast (among servers)

    • Pros: Recommended solution by Microsoft. Faster, more reliable with less data traffic.

    • Cons: Requires infrastructure support.

  • NLB shall be configured with port rules and affinity to maintain short-term affinity of MFPs with individual Terminal Server services in order to preserve user sessions. User sessions are not shared among Terminal Server services.

  • All limitations of the NLB technology apply with the following exception(s):

    • The NLB technology alone does not support application failover and provides only load balancing functionality – this limitation is lifted by adding NLB support to SafeQ for failing over to correctly running SafeQ cluster nodes.

    • Status of services is monitored internally by SafeQ and in case of shutdown the node is disconnected from NLB. SafeQ services must be set to un-register from NLB via services.msc in case of crash.

    • Manual re-balancing is needed after a failure to restore pull accounting functionality. A new re-balancing cycle is needed after failback.
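The affinity requirement in the limitations above means that requests from one MFP should keep reaching the same Terminal Server service while the cluster membership is unchanged. The sketch below illustrates that idea only; it is not the algorithm NLB uses internally, and the function name and the hash-based mapping are assumptions made for this example.

```python
import hashlib
from typing import Sequence


def node_for_client(client_ip: str, nodes: Sequence[str]) -> str:
    """Map a client IP to one node; same IP and same node list give the same node."""
    if not nodes:
        raise ValueError("no nodes available")
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    # Use the first 8 bytes of the digest as a stable integer for this client.
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]
```

Because the mapping depends on the node list, it changes when a node is removed or re-added. That is consistent with the point that user sessions are not shared among Terminal Server services and are lost when the serving node fails.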

Re-balancing

To restore full accounting functionality after a failure, the master node provides an option to re-balance devices among all running DS, either automatically or manually.
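Re-balancing can be thought of as redistributing the devices evenly over the servers that are currently running. The following sketch assumes a simple round-robin assignment; the function name, the device and server identifiers, and the round-robin strategy itself are hypothetical and are not taken from the SafeQ implementation.

```python
from typing import Dict, List, Sequence


def rebalance(devices: Sequence[str], running_servers: Sequence[str]) -> Dict[str, List[str]]:
    """Assign devices to running servers round-robin, in sorted device order."""
    if not running_servers:
        raise ValueError("no running servers to assign devices to")
    assignment: Dict[str, List[str]] = {server: [] for server in running_servers}
    for position, device in enumerate(sorted(devices)):
        server = running_servers[position % len(running_servers)]
        assignment[server].append(device)
    return assignment
```

After a failure the function would be called with the surviving servers only, and again with the full list after failback, which mirrors the two re-balancing cycles mentioned in the limitations.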

Requirements / typical architecture overview

  • The system shall support deployment of a group of servers providing a redundant solution (all servers shall be able to accept sessions from workstations and terminals).

  • The system shall operate regardless of which server in the group the printing client or terminal connects to.

  • Servers in the group shall be able to share the print job list and request print data from any other server.

  • Servers in the group shall be able to access print job data on shared SAN storage.

  • The system shall provide automated means for the system or client components to detect the outage of one server and re-direct incoming traffic to another server.

  • The system shall alert the SafeQ administrator about a server outage via email.

  • For server-originated actions (e.g. job-based accounting log collection), the system shall hand over the operation to a secondary server automatically or upon the administrator's confirmation.

  • The system shall provide a visual cluster health overview via an Administrative interface Dashboard Widget.

Dependencies / Non-functional requirements

  • Servers in a YSoft SafeQ failover cluster must be installed and available on a high-speed, low-latency LAN.

  • Load balancing is only supported if print drivers and YSoft SafeQ clients are installed locally on every workstation or on a print server.

  • The server based real-time server failover system works only within a single subnet and VLAN.

  • NLB server based real-time server failover system requires Windows server operating system

Global caveats

  • If one server fails, all operations (copy/scan) being processed at that moment will be terminated and the user must start them again (e.g. re-authenticate at the terminal).

  • Web reports are only available on the first node. If the first node fails, web reports will not be available until the first node is recovered.

  • Print job data is not synchronized among servers. To ensure availability of stored print jobs, it is recommended to use external shared SAN storage.

  • A YSoft SafeQ CML cluster can be built from up to 4 servers. In extreme cases, when more than 800 devices are connected in one location and Print roaming cannot be configured, a 5th server can be connected to the system. Please consult your Y Soft specialist.
