Selected Performance Indicators

In order to assess the empirical performance of IOHs, some performance indicators should be adopted in IOHprofiler, ranging from simple descriptive statistics to empirical distribution functions of performance measures.

Descriptive Statistics

Regardless of whether the fixed-target analysis or the fixed-budget one is taken, some basic descriptive statistics are considered to compare the empirical performance from different angles. Such statistics include:

  • sample mean of running time/function value given a target/budget value: $$\bar{T}(v)=\frac{1}{r}\sum_{i=1}^r \min\left\{t_i(A,f,d,v), B\right\},\quad \bar{V}(t)=\frac{1}{r}\sum_{i=1}^r v_i(A,f,d,t).$$ Note that, when a run does not succeed in reaching the target $v$ (namely $t_i = \infty$), the benchmarking budget $B$ is taken to calculate the sample mean, which is also applied hereafter.
  • sample median of running time/function value given a target/budget value.
  • sample standard deviation of running time/function value.
  • ample quantiles of running time/function value. When the distribution of $T(A,f,d,v)$ and $V(A,f,d,t)$ shows a high skewness and/or kurtosis, the sample mean can be interpreted wrongly. Therefore IOHanalyzer also computes different quantiles of these distributions. By default, the following quantiles, $Q_{2\%}, Q_{5\%},\ldots, Q_{98\%}$ are calculated.
  • empirical success rate shows among all runs, the number of independent runs that an algorithm $A$ reaches the given target $v$ on some function: $$ \widehat{p}_s = \sum_{i=1}^r\mathbb{1}(t_i(A,f,d,v) < \infty) / r, $$ where $\mathbb{1}$ is the characteristic function. The empirical success rate estimates the probability of success $p_s(A,f,d,v) := \mathbb{E}\mathbb{1}(T(A,f,d,v) < \infty)$, that is an intrinsic characteristic of IOHs.

Expected Running Time

When running the IOHs, usually a benchmark budget $B$ is set and $r$ independent runs of an algorithm are conducted for each test function. Recall that, with a budget value $t$, not all the runs will successfully hit the target. Thus, the sample mean $\bar{T}$ calculated by the above equation underestimates the “true” mean running time required to reach a target. The true running time can be restored by considering a restarting execution: given a target $v$, an algorithm will be restarted from scratch (independent from the previous restart) repeatedly until $v$ is hit. Consider $R>0$ independent restarts are performed and the running time of each restart $T^{(1)},T^{(2)},\ldots,T^{(R)}$, where $T^{(1)},\ldots,T^{(R-1)}=\infty$ and $T^{(R)} < \infty$. The true running time is: $\tilde{T} = T^{(R)} + B(R-1),$ where $R$ follows a geometric distribution with expectation $\widehat{p}_s$. It is shown in that the expectation $\mathbb{E}\tilde{T}$ can be estimated consistently using the so-called Expected Running Time (ERT):

Note that ERT can take an infinite value when all the runs are unsuccessful to reach the target value. In IOHanalyzer, ERT is an important performance indicator in the fixed-target analysis.

Cumulative Distribution Functions

For fixed-target and fixed-budget analysis, IOHanalyzer provides empirical probability density (mass) functions as well as empirical cumulative distribution functions (ECDFs). Probability density functions are obtained using Kernel Density Estimation (KDE) for $V(A,f,d,t)$, using density method from stats package.

For fixed-target analysis, the ECDF of running time $T$ is defined as $\hat{F}_T(t)=\sum_{i=1}^r \mathbb{1}(t_i \leq t) / r$. In addition, it is sometimes convenient to gain the overview on ECDFs over a range of target values, resulting in an overall performance profile for some algorithm $A$. In IOHanalyzer, two levels of "aggregations" of ECDFs are implemented:

  • The aggregation over target values is defined in the following sense: for a set of target values $\mathcal{V}$, $r$ number of independent runs on each function, algorithm $A$ on function $f$, the aggregated ECDF is: $$ \hat{F}_T(t\; ; \; A,f,d,\mathcal{V}) = \frac{1}{r|\mathcal{V}|}\sum_{v\in \mathcal{V}}\sum_{i=1}^{r} \mathbb{1}(t_i(A,f,d,v) \leq t). $$
  • Given a set of test functions $\mathcal{F}$, the ECDF can be elevated to the aggregation over test functions: $$ \hat{F}_T(t \; ; \; A,\mathcal{F},d,\mathcal{V}) = \frac{1}{r|\mathcal{V}||\mathcal{F}|}\sum_{f\in\mathcal{F}}\sum_{v\in \mathcal{V}}\sum_{i=1}^{r} \mathbb{1}(t_i(A,f,d,v) \leq t). $$

The aggregated ECDFs for function value $V(A,f,d,t)$ can be defined in the similar way, e.g., for aggregations over a set of budget values $\mathcal{T}$: