Is Isis2 V2.2.1937 safe to use in safety-critical settings?

Mar 3, 2015 at 2:24 PM
Edited Mar 3, 2015 at 3:36 PM
I sometimes get emails from people wondering if they can use Isis2 in settings where they might have used the old Isis Toolkit back in the late 1990s. Some of those applications are safety-critical, such as air traffic control or automobile "convoy" control (a modern variant), health monitoring or even medical care systems, etc. Others would simply have a "critical systems" role, such as helping manage and monitor the bulk electric power grid. So the question of using Isis2 in such settings really is a broader question of whether Isis2 can be part of a professional process that might create solutions for these kinds of mission-critical use cases.

Here's my view on this, which has evolved somewhat over the past few years. When I created Isis2, I had limited resources for testing the system. In contrast, back when I ran Isis Distributed Systems Inc, in the 1990s, I spent millions on quality assurance. So my concern was that it really wouldn't be wise to just use this new academically developed software in situations like an air traffic control system, which was the kind of thing people used to do with the Isis Toolkit. I put a comment to that effect both here on the web site, and also in the source code. It basically said that the system was just a research prototype and not safe for such uses.

But since the first releases of the system, much time has passed. We've had hundreds of people use it (275 downloads of the current release alone) and years of experience with it. While this isn't the same as a professional Q/A process, it isn't bad. Furthermore, no matter how strong the guarantees of Isis2 itself might be, Isis2 plus your application would always need further testing. After all, perhaps your application is buggy. Thus even if Isis2 were perfect in every way, without more testing it would not be safe to just throw some Isis2 application into a dangerous situation. And anyhow, I'm sure that Isis2 has issues, even if I'm not aware of any right at the moment (the last bug reported to me was back in early 2014).

So my view now is that yes, one could use Isis2 in safety-critical or even life-critical settings, provided that the team developing the application undertakes the requisite quality process. This typically involves designing a fail-safe mechanism so that in the worst case, your end user isn't endangered if the system shuts down and can't restart itself (which can happen if the network partitions, for example). Design your application carefully and don't hesitate to consult with us here or via email on any questions; we help for free. Finally, you should test under a wide variety of realistic conditions. Many people make the mistake of designing unrealistic tests (like extreme overloads) and neglecting to test under the conditions actually seen in the target setting. Make your tests realistic and thorough, and then run them for a long time. (True story: Back in 1995, I remember once struggling with a bug we never ever saw during Q/A, and yet it would arise in deployment at a big VLSI fab line, and cause crashes. After much agony we finally discovered that the bug was triggered by leaving the system totally silent for five minutes at a time, then sending a single multicast at the end of each five-minute period. This slow-motion pattern was causing a memory leak, but that same leak didn't arise under heavier loads! My point being: you need to test a system in the way it will be used.)
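
To make the "slow-motion" test pattern concrete, here is a minimal sketch of a soak-test driver that stays silent for a fixed period, sends exactly one message, and samples memory after each round. It is an illustration only, not part of Isis2: the names `idle_then_send_soak` and `leak_suspected` and the `send` callback are hypothetical, and `tracemalloc` only sees Python allocations in the test process itself. For a real deployment you would presumably plug in your own multicast call and sample the memory of the system under test instead.

```python
import time
import tracemalloc
from typing import Callable, List


def idle_then_send_soak(
    send: Callable[[], None],
    cycles: int,
    idle_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> List[int]:
    """Run `cycles` rounds of "stay silent, then send once".

    Returns the traced Python heap size in bytes, sampled after each round.
    `send` is a hypothetical stand-in for whatever issues one multicast.
    `sleep` is injectable so the harness can be dry-run without waiting.
    """
    samples: List[int] = []
    tracemalloc.start()
    try:
        for _ in range(cycles):
            sleep(idle_seconds)  # leave the system totally silent
            send()               # a single message at the end of the period
            current, _peak = tracemalloc.get_traced_memory()
            samples.append(current)
    finally:
        tracemalloc.stop()
    return samples


def leak_suspected(samples: List[int], max_growth_bytes: int) -> bool:
    """Crude heuristic: flag the run if memory grew by more than
    `max_growth_bytes` between the first and the last sample."""
    if len(samples) < 2:
        return False
    return samples[-1] - samples[0] > max_growth_bytes


if __name__ == "__main__":
    # Demo with a deliberately leaky sender and a no-op sleep.
    retained = []
    samples = idle_then_send_soak(
        send=lambda: retained.append(bytearray(10_000)),
        cycles=20,
        idle_seconds=300.0,      # five minutes, as in the story above
        sleep=lambda _s: None,   # skip the real wait for the demo
    )
    print("leak suspected:", leak_suspected(samples, max_growth_bytes=50_000))
```

Dropping the `sleep` override makes the driver actually wait five minutes per round, which is the point: a leak like the one described would only show up if the test runs in the same slow rhythm as the deployment, and for long enough to see a trend.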

But if you can pass this "gauntlet" then yes, I think it is fine to deploy Isis2 into high-risk settings. I'm not aware of anything more problematic about my code than about Linux or Mono or other elements of the runtime environment. So have at it!