A Practical Study on Recovering Spectra from RGB Images

Lin, Yi-Tun (2023) A Practical Study on Recovering Spectra from RGB Images. Doctoral thesis, University of East Anglia.


RGB cameras make three measurements of the light entering the camera, whereas hyperspectral imaging devices record the full spectrum of the light at each pixel. Spectral images have been shown to be more useful than RGB images in solving problems in many industrial application areas, including remote sensing and medical imaging. Spectral Reconstruction (SR) refers to a computational algorithm that recovers spectra from the RGB camera responses. This “make-the-RGBs-more-informative” process is most commonly implemented by machine learning (ML) algorithms, given matching RGB and hyperspectral data for training. The two mainstream ML approaches used in SR are regression and Deep Neural Networks (DNNs). While the former often has simple closed-form formulations for a pixel-based mapping, the latter is much more complicated: millions of parameters are used to map large image patches, in the hope that the network can utilise the spatial context in which each RGB is seen to further improve SR. It is generally accepted that regressions have long since been superseded by DNN methods. Nevertheless, few studies have actually been dedicated to comparing the two approaches.

There are three main goals of this thesis. First, we benchmark regression- and DNN-based SR algorithms on the same hyperspectral image dataset. Here we pay close attention to the role that the spectral sensitivities of a camera play and also SR performance on unseen data. Second, we seek to improve regression-based algorithms and, in effect, attempt to close their gap in performance compared to DNN counterparts. Lastly, we investigate the practical issues faced by all SR algorithms. We consider SR performance as exposure changes and SR performance in a “closed-loop” imaging framework (i.e., do the spectra that an SR algorithm recovers integrate to the same input RGBs?).

Our baseline benchmarking experiments indicate that the best DNN method delivers only a 12% accuracy improvement over the best-performing regression. Moreover, a regression method trained for one camera might actually outperform a DNN trained on another camera. Additionally, we find that the DNN’s worst-case performance (for unseen and unexpected scenes) is no better than that of the simplest regression method. These findings encourage us to investigate whether the average performance of regression methods can be improved.

We propose three new improvements for regression methods. First, we reformulate the regressions so that they minimise a loss metric that is more similar to the one used to rank and train the leading DNN methods. Second, we revisit the regularisation step of the regression implementation. Regularisation is a technique for making the outputs of regressions more stable for unseen input and is usually governed by a single regularisation parameter. Here, we adopt as many regularisation parameters as there are channels in a hyperspectral image, and this results in a significant performance improvement. Lastly, we propose a new sparse regression framework. In sparse regression, we code RGBs in terms of their neighbourhood in the RGB space (via a clustering argument). We argue that this clustering is better performed in the spectral domain (where input RGBs are first regressed to some primary estimates of spectra). Combining the upgraded formulation and the improved clustering, we develop a regression-based method that is found to work as well as the top DNN methods.
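The per-channel regularisation idea can be illustrated with a minimal sketch. It assumes a plain linear ridge regression from RGB to spectra, which may differ from the formulation used in the thesis. The function name `fit_per_channel_ridge`, the synthetic data, the random camera sensitivities and the values in `lambdas` are all hypothetical placeholders. In practice each regularisation parameter would have to be tuned on held-out data.

```python
import numpy as np

def fit_per_channel_ridge(X, Y, lambdas):
    """Fit one ridge regression per spectral channel (illustrative sketch).

    X: (n_pixels, n_features) RGB-derived features.
    Y: (n_pixels, n_channels) ground-truth spectra.
    lambdas: (n_channels,) one regularisation parameter per channel.
    Returns W with shape (n_features, n_channels).
    """
    n_features = X.shape[1]
    XtX = X.T @ X
    XtY = X.T @ Y
    W = np.empty((n_features, Y.shape[1]))
    for c, lam in enumerate(lambdas):
        # Closed-form ridge solution for channel c with its own lambda.
        W[:, c] = np.linalg.solve(XtX + lam * np.eye(n_features), XtY[:, c])
    return W

# Synthetic stand-in data (assumption: 31 spectral channels, random sensitivities).
rng = np.random.default_rng(0)
n_pixels, n_channels = 500, 31
S = rng.random((n_channels, 3))          # hypothetical camera sensitivities
Y = rng.random((n_pixels, n_channels))   # synthetic "spectra"
X = Y @ S                                # simulated RGB responses

lambdas = np.logspace(-4, 0, n_channels)  # placeholder values, not tuned
W = fit_per_channel_ridge(X, Y, lambdas)
recovered = X @ W                         # (n_pixels, n_channels) spectral estimates
```

Setting every entry of `lambdas` to the same value reduces this to the usual single-parameter ridge regression, which is the baseline the paragraph above contrasts against.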

As important as spectral accuracy is, trained SR algorithms need to work in practice, e.g., where objects and scenes can be viewed in varying exposure conditions. Unfortunately, we find that leading methods, such as non-linear regressions and DNNs, do not work well when exposure changes. Consequently, we propose new training frameworks which ensure the DNNs and regressions continue to work well under changing exposures.
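One way to see why a non-linear regression can break under exposure change is to check whether scaling the input RGB by a factor k scales the recovered spectrum by the same k. The sketch below uses synthetic data and a second-order polynomial expansion as the non-linear regression. Both are assumptions chosen for illustration, and the helper `poly2_features` is hypothetical. It does not reproduce the training frameworks proposed in the thesis.

```python
import numpy as np

def poly2_features(rgb):
    """Second-order polynomial expansion of RGB (illustrative choice)."""
    r, g, b = rgb[:, 0], rgb[:, 1], rgb[:, 2]
    return np.stack([r, g, b, r * r, g * g, b * b, r * g, r * b, g * b], axis=1)

# Synthetic stand-in data (assumption: random spectra and sensitivities).
rng = np.random.default_rng(1)
S = rng.random((31, 3))
spectra = rng.random((400, 31))
rgb = spectra @ S

# Least-squares fits: linear and polynomial regression from RGB to spectra.
W_lin = np.linalg.lstsq(rgb, spectra, rcond=None)[0]
W_poly = np.linalg.lstsq(poly2_features(rgb), spectra, rcond=None)[0]

k = 0.5  # simulated exposure change: every RGB is halved

# A linear map commutes with exposure scaling; the polynomial map does not,
# because its squared terms scale by k**2 rather than k.
lin_scales_with_exposure = np.allclose((k * rgb) @ W_lin, k * (rgb @ W_lin))
poly_scales_with_exposure = np.allclose(
    poly2_features(k * rgb) @ W_poly, k * (poly2_features(rgb) @ W_poly)
)
```

The physically correct spectrum scales linearly with exposure, so a model without this property has to learn the behaviour from its training data instead.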

Finally, we investigate the following problem: we find that both regression- and DNN-based SR algorithms recover spectra that, when integrated with the camera’s spectral sensitivities, do not induce the same RGBs as the input to the algorithm. This means that the recovered spectra can never be the correct spectra. Given this finding, we seek ways of adding physical plausibility (spectra should integrate to predict the input RGBs) to the SR algorithms. One of our proposed solutions is effectively a simple post-processing step which, provably, always improves the RMS (i.e., root-mean-square) performance of any SR algorithm.
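The abstract does not spell out the post-processing step. One standard construction with the stated property is an orthogonal projection of the recovered spectrum onto the set of all spectra that integrate to the input RGB. The sketch below shows that construction under the assumption of a noise-free linear camera model; it may differ from the method in the thesis. The function name `project_to_rgb_consistent` and the synthetic data are hypothetical.

```python
import numpy as np

def project_to_rgb_consistent(spectrum_est, S, rgb):
    """Orthogonally project an estimate onto {r : S.T @ r == rgb}.

    spectrum_est: (n_channels,) spectrum recovered by any SR algorithm.
    S: (n_channels, 3) camera spectral sensitivities.
    rgb: (3,) input camera response, assumed to equal S.T @ true_spectrum.
    """
    residual = rgb - S.T @ spectrum_est            # closed-loop RGB error
    correction = S @ np.linalg.solve(S.T @ S, residual)
    return spectrum_est + correction

# Synthetic stand-in data (assumption: random sensitivities and spectrum).
rng = np.random.default_rng(2)
S = rng.random((31, 3))
true_spectrum = rng.random(31)
rgb = S.T @ true_spectrum

# Pretend an SR algorithm returned a noisy estimate of the true spectrum.
estimate = true_spectrum + 0.1 * rng.standard_normal(31)
corrected = project_to_rgb_consistent(estimate, S, rgb)

rmse_before = np.sqrt(np.mean((estimate - true_spectrum) ** 2))
rmse_after = np.sqrt(np.mean((corrected - true_spectrum) ** 2))
```

Under this model the true spectrum lies in the set being projected onto, so the projection cannot move the estimate further from it, and `rmse_after` is never larger than `rmse_before`. The projection does not enforce non-negativity, so the corrected spectrum may contain negative values.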

Item Type: Thesis (Doctoral)
Faculty \ School: Faculty of Science > School of Computing Sciences
Depositing User: Jennifer Whitaker
Date Deposited: 20 Feb 2024 16:05
Last Modified: 20 Feb 2024 16:05
URI: https://ueaeprints.uea.ac.uk/id/eprint/94366
