Previously, I posted an algorithm for UTCI (see here). This algorithm takes advantage of the polynomial nature of the UTCI formula, to simplify the expression to 28 terms (compared to 84 terms).
Here, I examine the performance of 4 variations of the UTCI algorithm. To simplify the tests, we use a generalised case which closely resembles UTCI, but simplifies the weighted coefficients in front of each term. :
In the UTCI calculation, each term is weighted by a coefficient. One of the unresolved issues with the UTCI alorithm, is how to handle these weightings. It may be possible to derive a general expression that encompasses these, but it also may not (see Algorithm #1 below). Bearing in mind that the weighted values are constant, it may also be inefficient to calculate these for each term. Instead, it would be possible to have these calculated in a pre-defined sequence that can be folded into the algorithm (see Algorithm #2 below). The performance implications of these two approaches is a side-issue that the runtime analysis will explore.
The algorithm significantly reduces the lines of code, is easier to read and understand, and therefore more efficient to maintain.
I hypothesise that it will perform better that the existing equation used by Otto & Lemke, as it has significantly less terms.
def full_algorithm(x): for n in range(1, 7): for k in range(7 - n, 0, -1): yield (n * (n + 1) / 2) * x ** n * x ** k
Taking advantage of the fact that he weighted values are constant, these may be pre-calculated.
I hypothesis that this will improve the performance, compared to Algorithm # 1.
def partial_algorithm(x): weights = [1, 3, 6, 10, 15, 21] for n in range(1, 7): for k in range(7 - n, 0, -1): yield weights[n-1] * x **n * x **k
The UTCI calculation is a 6th degree polynomial. By combining similar terms, the magnitude of the existing equation used by Otto & Lemke can be significanlty reduced.
I hypothesise that this will have a significant affect upon the performance of this algorithm.
def simplified_equation(x): return sum((1 * x * x**6, 1 * x * x**5, 1 * x * x**4, 1 * x * x**3, 1 * x * x**2, 1 * x * x**1, 3 * x**2 * x**5, 3 * x**2 * x**4, 3 * x**2 * x**3, 3 * x**2 * x**2, 3 * x**2 * x**1, 6 * x**3 * x**4, 6 * x**3 * x**3, 6 * x**3 * x**2, 6 * x**3 * x**1, 10 * x**4 * x**3, 10 * x**4 * x**2, 10 * x**4 * x**1, 15 * x**5 * x**2, 15 * x**5 * x**1, 21 * x**6 * x**1))
Currently, the UTCI calculation used by Otto & Lemke contains 84 terms. Similarly, an 84 term equation has been used as the baseline for this experiment. The Baseline Algorithm contains the same terms as Algorithm #3, simply repeated to give a total of 84 terms. In this way, the simplified test algorithms and the baseline algorithm represent the difference in magnitude of the actual UTCI algorithms.
All 3 algorithms, and the baseline, were coded in Python and run on Fedora 19 (Schroedinger’s Cat), using Python 2.7.5. The algorithms were run over an increasing number of calculations from 1 to 5000000, simulating the requirements of the Climate Chip project. The running times were recorded. The results are displayed in the plot below:
- Overall running time of the calculation is linear with respect to the number of calculations
- The choice of algorithm significantly affects the overall running time
- Algorithm #1 simplifies the problem, and results in a 2 x increase in performance compared to the baseline
- Pre-calculation of the weighted constants, Algorithm #2, yields a small (approx. 10 %) improvement over Algorithm #1. While small, the cumulative effect is not insignificant over many millions of runs. Importantly, it is highly likely that we will have to adopt this approach, as it is unlikely that we will be able to derive a general case that will also express the coefficients of the terms.
- Algorithm #3 has the best running time by quite some margin, being 2 x as fast as Algorithm #1, and 4 x as fast as the baseline.
This seems to be conclusive evidence that a straight calculation is favourable over an iterative algorithm, when they are of the same order of magnitude. However, there is a tradeoff in terms of readability and maintainability. I propose that Algorithms #1 and #2 are more maintainable, and provide greater convenience in a research environment where frequent changes to the model are likely. However, Algorithm #3 would be preferable in a production environment.