[jdhicks]: Can you help me with a load balancing issue with Isis2 V2.2.2003

Coordinator
Jun 22, 2015 at 6:58 PM
[reposted as a new thread on behalf of JD]

I have moved to testing two servers behind a load balancer. One issue I need to overcome is how multiple copies of the server software decide they are the Leader. Do I need to pin this to a specific network card? The physical servers have two NICS and when I start the second instance on another server behind the load balancer - It also think it is the Leader and both instances crash.


Thanks

J.D.
Coordinator
Jun 22, 2015 at 7:14 PM
Isis2 offers a number of options you can work with. I recommend that you decide what you actually want the system to do, and then we can figure out how to make it happen.

Execution model:
  1. Simplest: Whoever gets a read-only request just handles it. Use OrderedSend for update requests. Be cautious about locking, obviously (different threads will handle each kind of request). So here the load-balancer out in front of the cluster is really picking the leader for read operations.
  2. Fancier: Perhaps among the update requests, there are some potentially slow computational tasks. You want these to be handled fault-tolerantly with a leader and one or more backups (this used to be called the "coordinator/cohort" model, to balance the work. After doing the update computation the leader for that operation broadcasts a concise list of the updates to do ("change record xxx to value 124, change ...") and everyone applies these updates in the same order. Then whoever got the request in the first place (from the load balancer) replies to the external client.
Now for version 2, if you use the view to pick the leader (I do this often), then one process does all the crunching and the other sits idle. Not what you wanted. So you could, for example, hash the requests somehow (some cheap hash function) and then take the remainder modulo the number of group members. This guy can be the leader in the view where the OrderedSend is delivered. Should be very randomized, but not in a way sensitive to workload.

Even fancier, still for version 2. You could have the rank-zero member track loads on the group members and, as each new hard-to-compute-update request is delivered in the leader, it echoes "request 2173 from client IP 182.11.88.001 will be computed by member with rank 2". Just keep in mind that this multicast might arrive in a different view (if membership is changing), so you can't blindly assume that the view actually has 3 members when this message comes in. Do the same modulus trick to be on the safe side.

Or you could come up with a parallel way to handle a single request in K parallel chunks, computed by the group members simultaneously. Then they reply to the process who originally got the request, it combines the results, and it replies to the client.

So these are ways to load balance.

Now, you also report a bug in Isis2, or in your code, or in Linux -- hard to tell. To narrow this down quickly, I would do a hello-world load-balanced application in C#. Just make sure that both server instances print "hello" more or less in a balanced rate, test the scenario of starting the second instance when the first is active, etc. Get this completely solid. In principle there is a demo program to do this from your LB vendor, although maybe in C++ instead of C#. That would be ok too.

Once you are sure that the LB setup is right and that servers not using Isis2 can handle this properly, maybe Isis2 would just run as long as your server code has a thread (not the same one Isis2 runs on!) to handle client requests. One thing to be careful about is that (1) you don't call IsisSystem.Start() more than once, (2) the thread that calls IsisSystem.Start() doesn't terminate. Those are examples of things that cause a shutdown in Isis2. You can read the log file to see if a shutdown exception was thrown and if so, why. It would probably be something like these if Isis2 had this issue.

In our experience (lots of experience), Isis2 can co-exist with RestFUL RPC or the WCF framework RPC, as long as they run in different threads and you follow these rules. I can't think of any reason that pinning to a particular NIC would be an issue, but it is easy to do if needed (ISIS_NETWORK_INTERFACES=.... with a list of which ones to use).

So my next step in debugging would be to launch an Isis2 server that doesn't actually start Isis2 (doesn't call the system), and uses the LB, and prints hello.

Then my next step would be to enable Isis2 by calling IsisSystem.Start(), but not use any Isis2 functionality. Make sure it prints hello in the usual balanced way.

Then you can retry your fancier logic...
Jun 22, 2015 at 8:04 PM
I am starting Isis multiple times.

This is what the constructor does...
Should I also not register types and create the group again?
        IsisSystem.Start();

        Msg.RegisterType(typeof(AGSSocketPacket), 0);
        Msg.RegisterType(typeof(BaseMessageObjectUDP), 1);
        Msg.RegisterType(typeof(LGSGetDrawUDP ), 2);
        Msg.RegisterType(typeof(BallDraw), 3);


        lgsGroup = new Group("LGS");

        ...
        ...
        ...

        lgsGroup.Handlers[UPDATE] += addBallDrawRequestToList;
        lgsGroup.ViewHandlers += viewHandler; 
        ...
        ...
        ...



Coordinator
Jun 22, 2015 at 9:20 PM
Edited Jun 22, 2015 at 9:21 PM
Well, if you were to call this code more than once in a single process (the same program, still running on the same machine, process id unchanged) then it would cause an exception to be thrown the second time.

In Isis2 you only call IsisSystem.Start() once per application program instance -- once per process. In fact at one time I was playing with ways to call IsisSystem.Shutdown() and then later call IsisSystem.Start() again, but the costs were unreasonably high and it didn't look like very useful functionality.

For any given group, you call new Group once, then attach handlers, then call join. After calling join you can't attach more handlers on the group. In contrast to IsisSystem.Start() you actually can leave a group and later rejoin it, but only after Isis2 has enough time to clean up. We use this in some of our cloud management applications, but it isn't really something I view as a normal model for Isis2. Normally you join the groups you plan to join, then stay in them....

So probably your code is causing this kind of "can't call xxxx twice" exception. If you look at the "log file" for your application (if the system has a place to put it -- it won't create them if there is no "logs" folder) you would see a printed message about the reason for the exception and it would have been self-explanatory. Exceptions also print to the console and of course can be caught, although this particular kind of exception is unrecoverable.
Jun 23, 2015 at 1:35 PM

I'm running on two separate machines behind a load balancer. Both instances of the program start up and report being the leader. Occasionally I get this error and both crash:

ORACLERUNNING: (6508)

Partitioning event, this node was in the minority partition

Additionally the first server application does not get a View event when I start the second server application.

The application is identical on both servers.

J.D.




Coordinator
Jun 23, 2015 at 1:59 PM
Edited Jun 23, 2015 at 2:04 PM
Interesting. It seems to be saying that the two copies don't find each other when IsisSystem.Start() is called in the second one you launch. (Obviously, the first copy starts up and Isis2 is then running. When the second copy is launched, it broadcasts to ask "is anyone there?" and is supposed to find the first copy and joint that version rather than starting a new system instance).

Then later the second guy DOES hear from the first guy, and the second one shuts down because there shouldn't be two instances of the system running. In effect it is acting like the network was disconnected at first, and later got connected up properly.

We have seen situations similar to this inside Amazon AWS, where nodes have two IP addresses. On AWS you get an internal IP address for node-to-node communication internal and between your applications. Those are the ones to tell Isis2 to use. And then you get an external IP address for WCF or RestFul RPC. Isis2 shouldn't be told to use the external ones for a reason like yours -- during setup those basically don't work yet plus the Amazon LB can reroute those packets all over the place, delay them, etc.

Sort of like "My name is Professor Birman, but my friends call me Ken." So in this case the external name is Professor Birman, and the internal name is Ken. Exactly analogous, except we are using IP addresses for Isis2, obviously. We want the Isis2 nodes to talk to each other via internal names: Ken1 talks to Ken2 and Ken3. Trying to use the external names sort of works, but is slow and erratic and loops out through the LB and back into the cluster -- really bad routings that are slow and loss-prone.

So my guess is that when you read the LB documentation they gave you, you actually may find the identical thing -- that the LB exposes one set of IP addresses to the outside world, but that you don't want Isis2 to use them. And then a second set for internal communication.

So now that you've explained this, I can see why you asked about the ISIS_NETWORK_INTERFACES parameter. You are quite right: that is exactly the way you do this. By telling Isis2 which network interfaces to communicate on, you can easily get it focused on those internal node-to-node routing paths that don't go via the LB and work normally, and avoid it trying to talk through the LB "looped back", which doesn't seem to work well at all.

You can read the network interface names off the interfaces by using ifconfig -all or a similar command and just looking at the way your Linux nodes are set up. Once you can draw a picture of the network layout on paper, you'll see which paths you want Isis2 to be using, and quite possibly that will eliminate this whole issue. In theory once you know the IP addresses in use for the various links, you can figure out which ones go up via the LB to external clients, and which ones run sideways (Ken1 to Ken2, etc). You tell Isis2 to just ignore the ones via the LB.

Another recommendation is to not have the two copies start exactly at the same millisecond. You might consider a delay in the second one that starts up to make sure that the first guy is fully operational before you let the second one start up. If you start two programs at the same instant Isis2 is supposed to sort that out and just have one of them initiate the new system configuration, and have the other politely wait and then join. But perhaps something about your LB setup is messing up this logic (e.g. perhaps until some form of initialization finishes, maybe the LB is blocking them from communicating to each other, like with a firewall or some kind of missing routing rule).

To do this, look at the actual IP addresses inside your cluster. Usually it is easy to get a node number from this and then one of them, maybe 192.168.80.0 can be the official first one to start, and the other, maybe 192.168.80.1 can wait 30 seconds before starting. We actually usually do that on AWS too, because the initialization of the AWS internal network links is slow and flakey. Letting our head node start and be stable for a little while seems to make it way easier for the other nodes to join in. In fact on AWS we have a whole slew of backup nodes joining, maybe 100 or more, so we use the "master/worker" startup approach on that platform. But with just two nodes of course you won't need something so fancy.
Jun 23, 2015 at 2:53 PM

About the (ISIS_NETWORK_INTERFACES ) parameter - I do not see it in the help files. I want the isis applications to use the infiniband nics exclusively. How do I do this?

J.D.




Coordinator
Jun 23, 2015 at 3:00 PM
Edited Jun 23, 2015 at 3:02 PM
You'll find a discussion of it and other such parameters in the actual user manual, around page 64. You just give a list of the interfaces by name or number or IP address (each NIC has an IP address...). For example, ISIS_NETWORK_INTERFACES=ib0,ib1. As I said, you use the ifconfig (Windows: ipconfig) command to list them. You normally would set these as environment variables in Linux or as bash shell variables (via the "export" command). On Windows they can be in the registry, or there is a way to set them on the command line of the batch file.
Jun 23, 2015 at 8:59 PM

O.K. I have that working...

Now I need to know what order to call what in for the second instance of the program on a second server.

Right now I do this on both servers...

IsisSystem.Start();

Msg.RegisterType(typeof(AGSSocketPacket), 0);

Msg.RegisterType(typeof(BaseMessageObjectUDP), 1);

Msg.RegisterType(typeof(LGSGetDrawUDP), 2);

Msg.RegisterType(typeof(BallDraw), 3);

lgsGroup = new Group("LGS");

...

...

...

lgsGroup.Handlers[UPDATE] += addBallDrawRequestToList;

lgsGroup.ViewHandlers += viewHandler;

...

...

...

lgsGroup.Join();

J.D.




Coordinator
Jun 23, 2015 at 9:10 PM
Edited Jun 23, 2015 at 9:45 PM
Yes, this looks right. I assume that it eventually calls IsisSystem.WaitForever(); so that the thread issuing these calls doesn't exit? I ask because if that thread were to exit for any reason, Isis would shut down on that server.

So basically, I would launch just one server. Wait until it is up and running (IsisSystem.Start() has returned, etc). That would normally take ~90 seconds.

Then launch the second server. In principle it will join instantly (~1s).

If it pauses for 90 seconds and starts a completely different Isis2 instance, which you can see from the log file output, then the two copies are not able to talk to each other. Either (1) broadcast isn't working, in which case switch to ISIS_UNICAST_ONLY=true and set ISIS_HOSTS=xxxx,yyyy for the machine names, or (2) enable broadcast on the network interfaces. or (3) they are using the wrong NICs and are basically trying to talk to each other "through" the load balancer. In this case you use ISIS_NETWORK_INTEFACES=... to force Isis2 to use the ones you want it to run on.
Jun 26, 2015 at 1:49 PM

Good news. The software worked the first time the IT guys got the load balancer working. This is based on a single test run for about 15 minutes but it proves the software architecture is fundamentally o.k.

Now I can update the version of Isis2 that I'm using.

J.D.




Coordinator
Jun 26, 2015 at 3:37 PM
Great! And good news from my point of view for another reason too: I'm off on vacation for 2 weeks with just an iPad and didn't really want to try to debug anything from France... But I'll keep one eye on the discussion and issues sections of the isis2.codeplex.com...
Jun 26, 2015 at 3:52 PM

Enjoy your vacation and thanks again for all your help.

J.D.