RELIABILITY VS ACCURACY
We reveal the results from a survey of over 100 risk practitioners
Mat Newman, Head of Product Management:
Adaptiv, SunGard Capital Markets & Investment Banking
Speed: the desire for more of it is seemingly everywhere in our modern world. Maybe something in our genes drives us towards greater speed and power, in cars and boats in our leisure time and in more powerful computers and software at work.
This intrinsically attractive proposal of going ever faster deserves scrutiny, however. For example, why was the best-loved car in the UK in 2008 the humble Skoda Octavia? Clearly it was not because of its speed; other underlying factors drive customer satisfaction.
When dealing with risk calculations, there is a similar demand for faster computers and software, and a pervasive emphasis on results being delivered fast. Sometimes, however, the desire for speed can cloud our view of what is really necessary for risk calculations. The benefits of a fast calculation are very quickly eroded if the calculation regularly fails to complete.
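As a rough illustration of how quickly failures erode a speed advantage, the sketch below compares two hypothetical engines. It assumes that every attempt fails independently with a fixed probability, that a failure is only discovered at the end of the run, and that a failed run restarts from scratch. The function names and all the figures are illustrative assumptions, not survey data.

```python
import math


def expected_hours(run_hours: float, failure_prob: float) -> float:
    """Expected total hours until the first successful run.

    Under the stated assumptions the number of attempts is geometric,
    so the expected number of attempts is 1 / (1 - failure_prob).
    """
    if not 0.0 <= failure_prob < 1.0:
        raise ValueError("failure_prob must be in [0, 1)")
    return run_hours / (1.0 - failure_prob)


def prob_done_in_window(run_hours: float, failure_prob: float,
                        window_hours: float) -> float:
    """Probability that at least one full attempt succeeds inside the window."""
    attempts = math.floor(window_hours / run_hours)
    if attempts == 0:
        return 0.0
    return 1.0 - failure_prob ** attempts


if __name__ == "__main__":
    # Hypothetical engines: a fast but fragile one and a slower, robust one.
    engines = {"fast, fragile": (2.0, 0.40), "slower, robust": (3.0, 0.02)}
    for name, (hours, p_fail) in engines.items():
        print(name,
              round(expected_hours(hours, p_fail), 2),
              round(prob_done_in_window(hours, p_fail, 8.0), 4))
```

With these made-up numbers the nominally faster engine takes longer on average to deliver a result, and is less likely to deliver one within an eight-hour window.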
So What About Reliability?
Reliability in a calculation engine is the ability of the engine to handle and recover from data errors, to be failsafe against hardware and network failures and to handle the unexpected as elegantly and robustly as possible.
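One way to picture "handling and recovering from data errors" is a batch loop that quarantines a bad record instead of aborting the whole run. The sketch below is a minimal illustration only: `price_trade`, `run_batch`, `RunReport` and the trade fields are hypothetical names invented for the example, and the pricing formula is a placeholder rather than a real risk calculation.

```python
from dataclasses import dataclass, field


@dataclass
class RunReport:
    """Outcome of a batch: computed values plus quarantined records."""
    values: dict = field(default_factory=dict)
    rejected: dict = field(default_factory=dict)


def price_trade(trade: dict) -> float:
    """Placeholder valuation (hypothetical): notional times rate."""
    notional = float(trade["notional"])  # KeyError / TypeError / ValueError on bad data
    rate = float(trade["rate"])
    if notional < 0:
        raise ValueError("negative notional")
    return notional * rate


def run_batch(trades: list) -> RunReport:
    """Value every trade; record data errors and carry on with the rest."""
    report = RunReport()
    for trade in trades:
        trade_id = trade.get("id", "<unknown>")
        try:
            report.values[trade_id] = price_trade(trade)
        except (KeyError, TypeError, ValueError) as exc:
            # Quarantine the bad record so it can be fixed and re-run later.
            report.rejected[trade_id] = f"{type(exc).__name__}: {exc}"
    return report


if __name__ == "__main__":
    trades = [
        {"id": "T1", "notional": 100.0, "rate": 0.5},
        {"id": "T2", "notional": "n/a", "rate": 0.5},  # malformed number
        {"id": "T3", "rate": 0.5},                     # missing field
    ]
    report = run_batch(trades)
    print(report.values)
    print(report.rejected)
```

The design choice here is that one malformed trade costs a single re-valuation the next morning, not a full re-run of the overnight batch.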
Let's consider the instances where speed is desirable in our risk calculations:
- Overnight runs must complete in the allotted time window
- Intra-day re-runs caused by failures or errors in the overnight run
- Intra-day runs for further analysis, new deals, what-if scenarios etc.
To determine the weight actual users put on these areas, we conducted a survey of over 100 risk practitioners, asking them a number of questions, including rating the relative priorities of reliability, speed, flexibility and Total Cost of Ownership (TCO). The results threw up some interesting findings: 57% of respondents rated reliability as their top priority, while a whopping 82% rated reliability as one of their top two priorities.
Clearly the indication is that reliability is more important to the users than speed. This is despite the estimated average runtime for a benchmark portfolio on the client system being more than four times slower than the runtime on our system. So clearly performance is not yet as good as it could be at these institutions, but still they view reliability as the key requirement of a successful system. This shows just how critical reliability continues to be.
On the other hand, only 13% rated speed as their top priority, and fewer than a third put it in their top two priorities. We gather from these results that fast intra-day calculations are still not a top priority for financial institutions. In my view, this agrees well with common sense: without reliability close to 100%, intra-day runs are close to useless, and will never even be performed, as users will be too busy recovering from last night's run failure.
Interestingly, TCO came in as the lowest priority in the survey. Although I take such ratings of TCO with a pinch of salt (things are always different when the budget is already there), I think this indicates that an ideal risk system would perform overnight calculations comfortably within the allocated time window, for a reasonable TCO, with very high reliability. To fit within the overnight window for a reasonable TCO, the key is that the system should be equipped with sufficient speed but, more importantly, should be scalable on cheap hardware. So the top priority in building a risk system should be to have an appropriate, robust architecture to hit speed and scalability targets. This architecture should prioritise reliability over unstable performance tweaks. All too often reliability is sacrificed in favour of faster calculation times, when what is really desired is a solution that is scalable to hit target time windows.
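To make the scalability point concrete, the sketch below estimates how many commodity nodes would be needed to fit a run inside the overnight window. It assumes Amdahl's law, under which a fixed fraction of the work cannot be parallelised; the workload figures and the function names are hypothetical and chosen purely for illustration.

```python
from typing import Optional


def runtime_hours(single_node_hours: float, nodes: int,
                  serial_fraction: float) -> float:
    """Estimated runtime on `nodes` machines under Amdahl's law."""
    parallel_fraction = 1.0 - serial_fraction
    return single_node_hours * (serial_fraction + parallel_fraction / nodes)


def nodes_needed(single_node_hours: float, window_hours: float,
                 serial_fraction: float, max_nodes: int = 10_000) -> Optional[int]:
    """Smallest node count that fits the window, or None if none up to max_nodes does."""
    for nodes in range(1, max_nodes + 1):
        if runtime_hours(single_node_hours, nodes, serial_fraction) <= window_hours:
            return nodes
    return None


if __name__ == "__main__":
    # Hypothetical workload: 40 single-node hours, six-hour overnight window.
    print(nodes_needed(40.0, 6.0, serial_fraction=0.05))
    print(nodes_needed(40.0, 6.0, serial_fraction=0.20))
```

In this toy model a workload that is 5% serial fits the window on ten nodes, while one that is 20% serial never fits however much hardware is added, which is why the architecture matters more than raw per-node speed.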
I like the saying "More haste, less speed". The faster you try to do something, the more likely you are to make mistakes. I think this sums up the situation in risk engines very nicely.