## Selected Performance Indicators

To assess the empirical performance of IOHs, **IOHprofiler** adopts a set of performance indicators, ranging from simple descriptive statistics to empirical distribution functions of performance measures.

### Descriptive Statistics

Regardless of whether a fixed-target or a fixed-budget analysis is performed, a number of basic descriptive statistics are computed to compare the empirical performance from different angles. These statistics include:

- **sample mean** of the running time/function value given a target/budget value: $$\bar{T}(v)=\frac{1}{r}\sum_{i=1}^r \min\left\{t_i(A,f,d,v), B\right\},\quad \bar{V}(t)=\frac{1}{r}\sum_{i=1}^r v_i(A,f,d,t).$$ Note that when a run does not succeed in reaching the target $v$ (i.e., $t_i = \infty$), the benchmarking budget $B$ is used to calculate the sample mean; this convention is also applied hereafter.
- **sample median** of the running time/function value given a target/budget value.
- **sample standard deviation** of the running time/function value.
- **sample quantiles** of the running time/function value. When the distribution of $T(A,f,d,v)$ or $V(A,f,d,t)$ exhibits high skewness and/or kurtosis, the sample mean can be misleading. Therefore, **IOHanalyzer** also computes different quantiles of these distributions. By default, the quantiles $Q_{2\%}, Q_{5\%},\ldots, Q_{98\%}$ are calculated.
- **empirical success rate**: the fraction of independent runs in which an algorithm $A$ reaches the given target $v$ on some function: $$ \widehat{p}_s = \sum_{i=1}^r\mathbb{1}(t_i(A,f,d,v) < \infty) / r, $$ where $\mathbb{1}$ denotes the characteristic function. The empirical success rate estimates the **probability of success** $p_s(A,f,d,v) := \mathbb{E}\,\mathbb{1}(T(A,f,d,v) < \infty)$, which is an intrinsic characteristic of IOHs.
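The statistics above can be sketched in a few lines of Python. This is an illustrative stand-in, not the **IOHanalyzer** (R) implementation; the function name and the returned keys are chosen here for readability. Unsuccessful runs are marked with `np.inf` and capped at the budget $B$, as in the definitions above.

```python
import numpy as np

def summarize_running_times(run_times, budget):
    """Descriptive statistics for the hitting times of one (A, f, d, v) tuple.

    `run_times` holds one hitting time per run; np.inf marks an
    unsuccessful run. Infinite times are capped at the budget B
    before the statistics are computed, as in the text.
    (Illustrative sketch, not the IOHanalyzer API.)
    """
    t = np.asarray(run_times, dtype=float)
    capped = np.minimum(t, budget)              # min{t_i, B}
    return {
        "mean": capped.mean(),                  # sample mean \bar{T}(v)
        "median": np.median(capped),            # sample median
        "std": capped.std(ddof=1),              # sample standard deviation
        "quantiles": np.quantile(capped, [0.02, 0.05, 0.5, 0.95, 0.98]),
        "success_rate": np.isfinite(t).mean(),  # \hat{p}_s
    }

# Four runs; one never reaches the target within B = 1000 evaluations:
stats = summarize_running_times([120, 300, np.inf, 80], budget=1000)
```

Capping at $B$ rather than discarding unsuccessful runs keeps the sample size at $r$, but (as discussed next) makes the sample mean an underestimate of the true mean running time.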

### Expected Running Time

When running the IOHs, a benchmark budget $B$ is usually set and $r$ independent runs of an algorithm are conducted on each test function. Recall that, within the budget $B$, not all runs will successfully hit the target. Thus, the sample mean $\bar{T}$ calculated by the above equation *underestimates* the "true" mean running time required to reach a target. The true running time can be recovered by considering an execution with restarts: given a target $v$, an algorithm is restarted from scratch (independently of the previous restarts) repeatedly until $v$ is hit.
Suppose $R>0$ independent restarts are performed, with running times $T^{(1)},T^{(2)},\ldots,T^{(R)}$, where $T^{(1)},\ldots,T^{(R-1)}=\infty$ and $T^{(R)} < \infty$. The true running time is then: $\tilde{T} = T^{(R)} + B(R-1),$
where $R$ follows a geometric distribution with success probability $p_s$. It is shown in https://ieeexplore.ieee.org/document/1554902 that the expectation $\mathbb{E}\tilde{T}$ can be estimated consistently using the so-called **Expected Running Time** (ERT): $$\text{ERT}(A,f,d,v) = \frac{\bar{T}(v)}{\widehat{p}_s} = \frac{\sum_{i=1}^r \min\left\{t_i(A,f,d,v), B\right\}}{\sum_{i=1}^r \mathbb{1}(t_i(A,f,d,v) < \infty)}.$$

Note that the ERT takes an infinite value when none of the runs reaches the target value. In **IOHanalyzer**, the ERT is an important performance indicator in the fixed-target analysis.
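The ERT estimator can be sketched as follows; this is an illustrative Python version, not the **IOHanalyzer** code, and the function name is chosen here for readability.

```python
import numpy as np

def expected_running_time(run_times, budget):
    """ERT estimate: total capped running time divided by the number
    of successful runs; np.inf marks an unsuccessful run.
    Returns inf when no run reached the target.
    (Illustrative sketch, not the IOHanalyzer implementation.)
    """
    t = np.asarray(run_times, dtype=float)
    successes = np.isfinite(t).sum()
    if successes == 0:
        return np.inf                        # all runs unsuccessful
    return np.minimum(t, budget).sum() / successes

# Two of four runs hit the target within B = 1000 evaluations:
ert = expected_running_time([120, np.inf, 300, np.inf], budget=1000)
# (120 + 1000 + 300 + 1000) / 2 = 1210.0
```

Note how each unsuccessful run contributes its full budget $B$ to the numerator, which is exactly the cost a restart strategy would pay before trying again.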

### Cumulative Distribution Functions

For both fixed-target and fixed-budget analyses, **IOHanalyzer** provides empirical probability density (mass) functions as well as empirical cumulative distribution functions (ECDFs). The probability density functions of $V(A,f,d,t)$ are obtained via Kernel Density Estimation (KDE), using the `density` function from the **stats** package.
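A minimal Gaussian KDE can be written by hand to illustrate the idea; the sketch below is a Python stand-in for R's `stats::density`, with a fixed bandwidth chosen by hand (R defaults to a rule-of-thumb bandwidth instead), and the sample values are made up for illustration.

```python
import numpy as np

def gaussian_kde(samples, grid, bandwidth):
    """Minimal Gaussian kernel density estimate: average of one
    Gaussian bump per sample, evaluated on `grid`.
    (Illustrative stand-in for R's stats::density; bandwidth is
    fixed by hand here, not chosen by a rule of thumb.)
    """
    samples = np.asarray(samples, dtype=float)[:, None]  # shape (r, 1)
    grid = np.asarray(grid, dtype=float)[None, :]        # shape (1, m)
    z = (grid - samples) / bandwidth
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    return kernels.mean(axis=0) / bandwidth

# Hypothetical best-so-far function values of r = 6 runs at budget t:
values = [10.2, 11.5, 9.8, 10.9, 10.4, 11.1]
grid = np.linspace(9.0, 12.5, 100)
pdf = gaussian_kde(values, grid, bandwidth=0.3)  # estimated density of V
```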

For fixed-target analysis, the ECDF of the running time $T$ is defined as $\hat{F}_T(t)=\sum_{i=1}^r \mathbb{1}(t_i \leq t) / r$. In addition, it is sometimes convenient to obtain an overview of the ECDFs over a range of target values, resulting in an overall performance profile of an algorithm $A$. In **IOHanalyzer**, two levels of "aggregation" of ECDFs are implemented:

- The aggregation over *target values* is defined as follows: for a set of target values $\mathcal{V}$ and $r$ independent runs of algorithm $A$ on function $f$, the aggregated ECDF is: $$ \hat{F}_T(t\; ; \; A,f,d,\mathcal{V}) = \frac{1}{r|\mathcal{V}|}\sum_{v\in \mathcal{V}}\sum_{i=1}^{r} \mathbb{1}(t_i(A,f,d,v) \leq t). $$
- Given a set of test functions $\mathcal{F}$, the ECDF can further be aggregated over *test functions*: $$ \hat{F}_T(t \; ; \; A,\mathcal{F},d,\mathcal{V}) = \frac{1}{r|\mathcal{V}||\mathcal{F}|}\sum_{f\in\mathcal{F}}\sum_{v\in \mathcal{V}}\sum_{i=1}^{r} \mathbb{1}(t_i(A,f,d,v) \leq t). $$
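Both aggregation levels amount to counting, over all (run, target) pairs — and, at the second level, all functions — the fraction of hitting times not exceeding $t$. A minimal Python sketch (illustrative, not the **IOHanalyzer** code; the array layout is an assumption made here):

```python
import numpy as np

def aggregated_ecdf(hitting_times, t):
    """Fraction of hitting times t_i(A, f, d, v) that are <= t.

    `hitting_times` is an array with one entry per (run, target) pair
    (add a function axis for the second aggregation level); np.inf
    marks runs that never reached the target.
    (Illustrative sketch, not the IOHanalyzer implementation.)
    """
    h = np.asarray(hitting_times, dtype=float)
    return (h <= t).mean()

# Two targets, three runs each: a 2 x 3 array of hitting times
times = np.array([[50, 120, np.inf],
                  [200, 450, np.inf]])
value = aggregated_ecdf(times, 200)  # 3 of 6 pairs hit within t = 200
```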

The aggregated ECDFs of the function value $V(A,f,d,t)$ can be defined analogously, e.g., for aggregation over a set of budget values $\mathcal{T}$: