5 LTV-SOBI Performance Measures and Simulation Studies

The accuracy and effectiveness of the LTV-SOBI algorithm principally depend on the source signals and observed mixture themselves, especially on the assumptions of uncorrelatedness and stationarity. Nevertheless, when compared with SOBI in non-time-varying structure, LTV-SOBI is expected to be embedded with extra accuracy losses due to its relatively more complex mixing mechanism and associating algorithms. In spite of mathematically negligible residuals/errors in each step of the algorithm, the inherent losses can accumulate in sample autocovariance decomposition, approximate joint diagonalization and deriving $$\boldsymbol{\mathcal E}$$ and $$\boldsymbol \Omega_0$$. Precision can be further compromised when nearPD has to enforce in case of non-positive semi-definite matrices. To compare different estimation algorithms, performance indices should be used. However, to address the time-varying characteristics of the mixing matrix, modification to existing BSS performance measures is needed.

5.1 Extension of Minimal Distance Index

The minimal distance index (MD) was initially introduced by Ilmonen (2010) to measure the ICA algorithm performance by comparing true mixing matrix $$\boldsymbol \Omega$$ and estimated unmixing matrix $$\widehat{\boldsymbol \Gamma}$$. For a non-time-varying model, MD-index is defined as,

$$$\text{MD} (\widehat{ \boldsymbol {\Gamma}}) = \frac 1 {\sqrt{p-1}} \inf\limits_{ \boldsymbol C \in \mathcal C} || \boldsymbol C \widehat{ \boldsymbol {\Gamma}} \boldsymbol \Omega - \boldsymbol I_p||, \tag{5.1}$$$

where $$\mathcal C$$ is the set of matrices that permits permutation and scale ambiguity, i.e. each row and column contain exactly one non-zero element. Even though MD-index is a well-designed measure for the majority of the BSS algorithms in addition to ICA, the time-varying case requires adaptation.

The true mixing matrix $$\boldsymbol \Omega_t$$ is varying over time $$t=1,2,\dots,T$$ in TV-SOS model, and same holds for $$\widehat{ \boldsymbol \Gamma_t}$$. Therefore, this thesis proposes a time-varying version of the MD-index as defined in equation (5.2).

$$$\text{tvMD}(\{\widehat{ \boldsymbol \Gamma}_1, \widehat{ \boldsymbol \Gamma}_2, \dots, \widehat{ \boldsymbol \Gamma}_T\}) = \frac 1 T \sum\limits_{t=1}^T \text{MD}(\widehat{ \boldsymbol \Gamma}_t) \ . \tag{5.2}$$$

Since TV-SOS model (3.1) determines all $$\widehat{\boldsymbol \Gamma_t} = \widehat{\boldsymbol \Omega_t}^{-1}$$ by the pseudo initial mixing matrix $$\boldsymbol \Omega_0$$ and the time varying factor $$\boldsymbol{\mathcal E}$$, the tvMD can be determined once LTV-SOBI algorithm is completed and the true mixing is known.

As tvMD is officially a mean value of MD-indies over a time span, it has a value between $$0$$ and $$1$$ that is identical to original MD by central limit theorem, and the smaller value suggests the better algorithm.

5.2 Extension of Signal-to-Inference Ratio

In information processing, researchers tend to decompose the restored signals into four parts: (1) target signals; (2) interference from other sources; (3) noise and (4) artifacts that are originated from separation and evaluation algorithm. The decomposition can be written as (Na & Chai, 2013; Vincent et al., 2006),

$$$\boldsymbol z = \boldsymbol s_\text{signal} + \boldsymbol s_\text{interf} + \boldsymbol e_\text{noise} + \boldsymbol e_\text{artif}\ . \tag{5.3}$$$

The target signal $$\boldsymbol s_\text{signal}$$ is not necessarily the exact source signal. Instead, carefully-selected and evaluated transformations of a source are permissible in response to the identifiability issue in BSS, and it is common to tolerate certain transformation. In particular, permutation and scaling do not usually affect signal interpretation. Some literature (e.g. Vincent et al., 2006) denotes interference as $$\boldsymbol e_\text{interf}$$ when falsely mixing of sources is regarded as an error even though it is originated from sources. For example, the restored signal series II mainly corresponds to source series IV, but also have a partial mixture from source series I. In this case, the former is undoubtful $$\boldsymbol s_\text{signal}$$, and the later should be treated as erroneous interference. For this reason, several BSS researches, especially under the information processing domain, use Signal-to-Inference Ratio (SIR) to measure the similarity between true and restored signals (Eriksson et al., 2000; Vincent et al., 2006). The ratio is defined as

$$$\text{SIR}= 10 \log_{10} \frac{|| \boldsymbol s_\text{target} ||^2}{|| \boldsymbol s_\text{interf}||^2}\ . \tag{5.4}$$$

The LTV-SOBI algorithm by nature does not involve any external noise, and even if the noise is present in source signals, it shall become a part of true signals. Further, $$\boldsymbol e_\text{artif}$$ is assumed to be $$\boldsymbol 0$$ for simplicity. Assume the restored signal to be $$\widehat{\boldsymbol z}$$ and the permutation/scaling matrix to be $$\boldsymbol C$$ as defined in (5.1); the signal decomposition of (5.3) is then

\begin{aligned} \widehat{\boldsymbol z} &= \boldsymbol {s}_\text{signal} + \boldsymbol s_\text{interf} \\ &= \boldsymbol {Cz} + (\widehat{ \boldsymbol x} - \boldsymbol {Cz}) \end{aligned}. \tag{5.5}

Without doubt, time-varying factor should be considered, and the extension can be achieved by introducing a time index, that is, $$\widehat{\boldsymbol z}_t = \boldsymbol C_t \boldsymbol z_t + (\widehat{ \boldsymbol z}_t - \boldsymbol C_t \boldsymbol z_t)$$. The SIR-index should also be slightly modified to include convolution over $$t=1,2,\dots,T$$.

The SIR-index for LTV-SOBI can be simplified by taking $$\boldsymbol C_t = \text{diag}( \boldsymbol \Omega_t \widehat{ \boldsymbol \Omega}_t^{-1})$$ after a permutation fix. In practice, the permutation is found by arranging the numerically largest values to diagonal position either row-by-row or column-or-column in $$\boldsymbol \Omega_t \widehat{ \boldsymbol \Omega}_t^{-1}$$. Finally, the time-varying SIR-index is the measure of all diagonal values against off-diagonal items, and it is formulated as,

$$$\text{tvSIR}= 10 \log_{10} \frac{ \sum\limits_{t=1}^T|| \text{diag}( \boldsymbol \Omega_t \widehat{ \boldsymbol \Omega}_t^{-1}) ||^2}{ \sum\limits_{t=1}^T || \text{off}( \boldsymbol \Omega_t \widehat{ \boldsymbol \Omega}_t^{-1})||^2} , \tag{5.6}$$$

where, $$\boldsymbol \Omega_t \widehat{ \boldsymbol \Omega}_t^{-1} = ( \boldsymbol I + t \boldsymbol{\mathcal E})\boldsymbol \Omega_0 \big[ ( \boldsymbol I + t \widehat{\boldsymbol{\mathcal E}}) \widehat{\boldsymbol \Omega}_0\big]^{-1}$$.

SIR and tvSIR do not have a direct mathematical connection to each other, but the values are comparable since they both measure the similarity between the true source and the restored one. SIR and tvSIR range from $$-\infty$$ to $$\infty$$, and the larger the better. It should also be noted that correlation-based SIR, that is, the implementation in JADE package (Miettinen et al., 2017) does not require true mixing parameters to be known, but tvSIR will always require so. This is a result of different mixing mechanism and tvSIR definition. As $$\boldsymbol{\mathcal E}$$ is assumed to be very small in value, the Pearson’s correlation between restored and original can potentially encounter extreme values and therefore, makes the correlation-based SIR unrobust.

5.3 Simulation Study

As previously discussed, the performance of LTV-SOBI is expected to be significantly influenced by signal inherited properties, dimension, length and the scale of mixing matrix. Despite the impossibility to inscribe exact factors that impair the performance, simulations have been devided into four cohorts and conducted in R. Intending to minimize potential bias, the four cohorts have similar sources of 3-dimensional signal that involve sinusoidal and electrocardiograph (ECG) time-series, together with a moving-average or auto-regressive series. The simulated signals are analogous to those in Figure 2.1 with the dissimilarity in that the simulation study has 1 less dimension of either moving-average or auto-regressive series. The reduction of dimension is due to computational efficiency and visual similarity of such two signals. For convenience, define two matrix constants to be,

$$$\boldsymbol \Omega_{\text{sim}} = \begin{bmatrix} 2 & -6 & 0.5 \\ -9 & 5 & 3 \\ -4 &6 &8 \end{bmatrix} \text{ and } \boldsymbol M = \begin{bmatrix} -3 & 6 & -6 \\ -4 &2.5 & 6\\ 9&2.1 &7 \end{bmatrix}\ . \tag{5.7}$$$

In simulation configuration, the initial mixing matrix $$\boldsymbol{\Omega}_0$$ is first arbitrarily fixed to $$\boldsymbol \Omega_{\text{sim}}$$ as in (5.7). Then, four cohorts of the simulation are parameterized with differed time-varying factors and signal length, while the difference is only in terms of the scale with details given in Table 5.1. In each cohort, the true source signal and mixture are thus fixed.

Table 5.1: Key Configuration Parameters of Simulation Cohorts
I II III IV
$$\boldsymbol{\mathcal E}$$ $$\boldsymbol M \times 10^{-5}$$ $$\boldsymbol M \times 10^{-4}$$ $$\boldsymbol M \times 10^{-5}$$ $$\boldsymbol M \times 10^{-4}$$
Simulated Total Length $$100000\ (10^5)$$ $$100000\ (10^5)$$ $$10000\ (10^4)$$ $$10000\ (10^4)$$
Sampling Frequency 1:1 - 1024:1 1:1 - 1024:1 1:1 - 512:1 1:1 - 512:1
Observed Length 100000 - 98 100000 - 98 10000 - 40 10000 - 40

In the realistic scenario, true source is unobservable while mixture can be observed only at certain sampling rates (aka. sampling frequencies). For example, a piece of sound shall include 4800k samples per second as a signal mixture, but the recording equipment can only sample at 48kHz, which means that merely 1 out of every 100 sources is sampled as an observation. Aiming to simulate in accordance with such scenario and to evaluate how the observation length could affect LTV-SOBI performance, the simulation study further generates artificial observed mixture and corresponding source upon different sampling rates based on the same source and mixture. Figure 5.1 illustrates the sampling mechanism, and Table 5.1 summarizes the sampling rates in each cohort. Consequently, multiple mixtures of different lengths (as a result of sampling rates) are simulated within each cohort. In brief, instead of generating new pseudo sources and mixtures, the simulation study considers the sampling frequencies as a more robust alternative.

In the next step, LTV-SOBI algorithms along with various alternatives are applied to each generated observed mixture, and the results are compared against the true mixing matrix using tvMD and the true signals using tvSIR. Each observed mixture has seven different algorithms applied, including two types of Yeredor’s TV-SOBI (with and without quadratic form), four types of LTV-SOBI (with and without symmetry fix, with and without quadratic form) and LTV-SOBI-alt.

Finally, over 1000 similar simulations are performed to further eliminate potential bias and outliers and enable reporting simulation results. Figure 5.2 overviews the simulation study in the manner of progress flow, and the full R code is attached in Appendix 6.3.

5.4 Simulation Study Results

The aforementioned tvSIR and tvMD, served as a time-varying application of SIR and MD correspondingly, are applied to measure the algorithm performance in terms of accuracy and capability of restoring original mixing matrices and signals. The results are reported in Figures 5.3-5.6. Each sub-graph reports a specific simulation cohort with a defined lag parameter used in the signal restoration algorithm. The curves visualize the performance influenced by sampling frequencies, which are equivalent to observed signal lengths. Different algorithms can be distinguished by the line color.

Simulations showed that minor alternatives do not significantly affect the performance of a given algorithm. For example, the symmetry correction for estimated matrices in LTV-SOBI does not affect the overall performance metric. Therefore, further results presented below are aggregated, for convenience and simplicity, only over the major algorithms, namely LTV-SOBI, LTV-SOBI-alt, and Y-TVSOBI. Consequently, a point in Figure 5.3 and 5.6 represent the mean value of simulations under same settings. Besides, there is a separate web-based interactive dashboard, which is available at http://bss.yan.fi, to enable investigation of performance metrics from various perspectives, including those that are not directly reported in the sections.

Figures 5.3 and 5.4 presumes to exhibit better performance measured by tvSIR with a larger value; the higher the points and curves the better. The missing values and other deviations will be deliberated in Section 6. As seen in the results, the new LTV-SOBI outperformed Yereodr’s original TV-SOBI in most cases, while the comparable advantage seems to be diminished with increased observed length. On the other hand, the LTV-SOBI-alt algorithm is relatively insensitive to length, though the benefits from larger observation size are also less significant.

Using tvMD, Figures 5.5 and 5.6 demonstrate better performance with a smaller value, and LTV-SOBI-alt is comparably the best algorithm in most cases except for simulation cohort III; larger sample size often leads to improved performance. The difference among the three major algorithms is essentially minor.

In conclusion, the extension of MD and SIR grants insights into algorithm performance; the simulation results suggest that time-varying blind source separation problems are yet rather challenging as the tvSIR and tvMD values are not sufficiently good. LTV-SOBI algorithm is a comparably more applicable and effective approach. Nonetheless, the metric is not robust enough under all scenarios, and Section 6 will attempt to detail underlying issues. Finally, the simulation results have to exclude the comparison with ordinary SOBI because of non-compatible metrics; SOBI would be penalized by the time-varying model while benefiting from its ignorance of possible extreme values caused by $$\boldsymbol{\mathcal E}$$. Supposing the metric is acceptable with SOBI, the results prefer SOBI under large sample sizes and LTV-SOBI when the length of the signal is relatively small. Appendix 6.4 presents a few simulation results with SOBI, and the interactive dashboard provides full SOBI results.

5.5 Performance Consistency over Singal Types and Time-Varying Rates

The above simulation includes a series of ECG-signal with doubtful stationarity. Further simulation has been performed to check the performance of LTV-SOBI when source signals are composed of three different moving-average series, and the effect of time-varying speed ($$\boldsymbol{\mathcal E}$$) is also briefly examined. The results suggest that LTV-SOBI performance stays relatively consistent under different signal types and rates/speed of time-varying. When signal length is set to be 10000, lag selected to be 3, the tvMD averages to $$4.33$$, $$4.29$$ and $$4.32$$ for $$\boldsymbol{\mathcal E} = \boldsymbol M \times 10^{-2}$$, $$\boldsymbol M \times 10^{-3}$$ and $$\boldsymbol M \times 10^{-4}$$ correspondingly using LTVI-SOBI. Performance is also tested to be insensitive to the signal length. The results are similar to those reported in Figure 5.3 and 5.4.

References

Eriksson, J., Karvanen, J., & Koivunen, V. (2000). Source distribution adaptive maximum likelihood estimation of ICA model. Proceedings of the 2nd International Conference on ICA and BSS, 227–232.

Ilmonen, P., Nordhausen, K., Oja, H., & Ollila, E. (2010). A new performance index for ICA: Properties, computation and asymptotic analysis. International Conference on Latent Variable Analysis and Signal Separation, 229–236.

Miettinen, J., Nordhausen, K., & Taskinen, S. (2017). Blind source separation based on joint diagonalization in R: The packages JADE and BSSasymp. Journal of Statistical Software, 76.

Na, Y., & Chai, B. (2013). Performance evaluation for frequency domain blind source separation algorithms. Journal of Computational Information Systems, 9(18), 7369–7379.

Vincent, E., Gribonval, R., & Févotte, C. (2006). Performance measurement in blind audio source separation. IEEE Transactions on Audio, Speech, and Language Processing, 14(4), 1462–1469.