High Availability by Example

This example assumes the following setup with 4 servers:

2 servers, each running 3 Application servers (6 Application servers in total),
1 server running the Manager,
1 server running one Spare-manager and 3 Spare-application servers.

When everything runs properly, the situation is as shown in the picture below.

If for some reason AppServer1 and AppServer5 crash, the Manager makes sure that two of the Spare-application servers take over the tasks of the ones that went down. All users connected to the crashed Application servers are automatically reconnected to these Spare-application servers.

When AppServer1 and AppServer5 are restarted, the two Spare-application servers return to the spare state. All users connected to these Spare-application servers are re-routed to the "normal" Application servers.
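The takeover and return described above can be sketched as a small simulation. This is only an illustrative model, not the product's actual implementation: the Manager class, its on_crash and on_restart methods, the user names, and the names Spare1 to Spare3 are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Manager:
    """Toy model of the Manager's bookkeeping (hypothetical, not the product's API)."""

    spares: list                                   # idle Spare-application servers
    routes: dict = field(default_factory=dict)     # user -> server currently serving the user
    covering: dict = field(default_factory=dict)   # crashed server -> spare standing in for it

    def on_crash(self, server):
        """Activate an idle spare and reconnect the crashed server's users to it."""
        if not self.spares:
            raise RuntimeError("no idle Spare-application server left")
        spare = self.spares.pop(0)
        self.covering[server] = spare
        for user, current in self.routes.items():
            if current == server:
                self.routes[user] = spare
        return spare

    def on_restart(self, server):
        """Route users back to the restarted server and return the spare to spare state."""
        spare = self.covering.pop(server)
        for user, current in self.routes.items():
            if current == spare:
                self.routes[user] = server
        self.spares.append(spare)


manager = Manager(
    spares=["Spare1", "Spare2", "Spare3"],
    routes={"alice": "AppServer1", "bob": "AppServer5", "carol": "AppServer2"},
)

# AppServer1 and AppServer5 crash: two spares take over their users.
manager.on_crash("AppServer1")
manager.on_crash("AppServer5")
after_crash = dict(manager.routes)

# Both servers are restarted: users move back and the spares become idle again.
manager.on_restart("AppServer1")
manager.on_restart("AppServer5")
after_restart = dict(manager.routes)
```

In this sketch a user on an unaffected Application server (carol) is never moved, and after the restart all three spares are idle again.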

This mechanism also covers unexpected hardware failures of a server and Windows crashes. For example, if Server1 crashes, we get the following situation:

The Spare-manager notices that the Manager is no longer active and becomes the active Manager at that moment. The new Manager also knows that three of the six Application servers are gone, so the three Spare-application servers turn into "normal" Application servers. Clients disconnected from Server2 contact the new Manager to receive the address of an active Application server.
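The Manager takeover can be sketched in the same spirit. The text does not say how the Spare-manager notices that the Manager is gone; the sketch below assumes a heartbeat with a timeout, which is an assumption and not a documented detail. The SpareManager class, its methods, the timeout value, and the names Spare1 to Spare3 are hypothetical.

```python
class SpareManager:
    """Toy model of a Spare-manager (hypothetical; heartbeat detection is an assumption)."""

    def __init__(self, timeout, spare_app_servers):
        self.timeout = timeout                        # seconds of silence before taking over
        self.spare_app_servers = list(spare_app_servers)
        self.active_app_servers = []                  # spares promoted by this Manager
        self.last_heartbeat = 0.0
        self.is_active_manager = False

    def heartbeat(self, now):
        """Record that the Manager was still alive at time `now`."""
        self.last_heartbeat = now

    def check(self, now, lost_app_servers):
        """Take over if the heartbeat is overdue; activate one spare per lost server."""
        if self.is_active_manager or now - self.last_heartbeat <= self.timeout:
            return
        self.is_active_manager = True
        for _ in range(min(lost_app_servers, len(self.spare_app_servers))):
            self.active_app_servers.append(self.spare_app_servers.pop(0))

    def address_for_client(self, client_id):
        """Hand a reconnecting client the address of an active Application server."""
        if not self.is_active_manager:
            raise RuntimeError("not the active Manager")
        if not self.active_app_servers:
            raise RuntimeError("no active Application server")
        # Simple spreading of clients over the promoted spares (illustrative only).
        return self.active_app_servers[client_id % len(self.active_app_servers)]


spare = SpareManager(timeout=5.0, spare_app_servers=["Spare1", "Spare2", "Spare3"])
spare.heartbeat(now=10.0)

spare.check(now=12.0, lost_app_servers=0)    # heartbeat is recent: stay passive
still_passive = not spare.is_active_manager

spare.check(now=20.0, lost_app_servers=3)    # heartbeat overdue: become the Manager
address = spare.address_for_client(client_id=4)
```

For brevity the sketch only tracks the promoted Spare-application servers; a real Manager would also hand out the addresses of the Application servers that survived the crash.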

The mechanism described here for the Application server also applies to the Batch controller.