Enhanced Object Detection with Hyperspectral Data

The challenges computer vision developers face  

The reliance on large datasets has driven the development of algorithms that are expensive to train, limiting the availability of generic detection models to only the most widely labeled and broad object categories.  

For the average computer vision (CV) developer, this presents significant challenges. Most developers have access to only a limited number of labelled images (often just tens, hundreds or, if they’re lucky, a few thousand), far fewer than the 11 million images and 1.1 billion masks used to train Segment Anything (SAM), or even the 1.2 million images used for AlexNet in 2012. YOLO from Ultralytics was trained on at least the 328 thousand images of MS COCO (and likely more). If you were to label that data from scratch at a competitively priced $1.33 per image, it would cost roughly $436,000.

The substantial computational resources and data required to train these models are often out of reach for developers. This is exacerbated even further in specialised domains such as AgriTech, Medicine and Security. In these domains, data needs to be labelled by experts, may require strict confidentiality control, and the accurate detection of rare classes is of the utmost importance. This further drives up labelling costs and the required data volume to achieve the necessary accuracy levels.  

Comparing the data requirements of AlexNet and SAM, we see a trend towards ever larger data volumes being needed to train deep learning systems.

Fundamentally, any CV algorithm based on RGB data alone will always face these problems, as no number of images and no amount of computational resources can make up for information that isn’t in an RGB image to begin with. Hyperspectral data provides a shortcut, thanks to the additional information held in each spectrum.

How hyperspectral data solves these problems

Hyperspectral imaging (HSI) offers an additional dimension which can be used to segment and classify spectra. The reflectance spectrum of an object contains information about its chemical make-up, showing which frequencies of light are absorbed and which are reflected.  

This rich colour information makes it easy to distinguish between object subclasses that look identical in RGB. It also reduces the reliance on shape information, which makes detection far more robust to occlusion.

The nature of hyperspectral imaging means you can reduce the effects of lighting variation by converting the raw data to reflectance. This, in turn, greatly reduces the amount of training data needed to achieve high precision.
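
As an illustration of the reflectance conversion mentioned above, here is a minimal sketch of the common white/dark reference normalisation. It is not taken from the Living Optics SDK: the function name `to_reflectance`, the band count and all of the spectra are made up for the example, and it assumes you have a white-reference and a dark-reference measurement taken under the same lighting as the scene.

```python
import numpy as np


def to_reflectance(raw, white_ref, dark_ref, eps=1e-6):
    """Convert raw sensor counts to reflectance using white and dark references.

    raw:       (..., n_bands) raw spectra from the scene
    white_ref: (n_bands,) spectrum of a white target under the same lighting
    dark_ref:  (n_bands,) sensor signal with no light reaching it
    """
    raw = np.asarray(raw, dtype=np.float64)
    white = np.asarray(white_ref, dtype=np.float64)
    dark = np.asarray(dark_ref, dtype=np.float64)
    # Guard against division by zero in bands where the illuminant is very weak.
    denom = np.maximum(white - dark, eps)
    return np.clip((raw - dark) / denom, 0.0, None)


# Synthetic demonstration: one material seen under two different illuminants.
n_bands = 10                                        # hypothetical band count
true_reflectance = np.linspace(0.2, 0.8, n_bands)   # made-up material
dark = np.full(n_bands, 5.0)                        # constant sensor offset
illum_a = np.linspace(1000.0, 400.0, n_bands)       # stronger at short wavelengths
illum_b = np.linspace(300.0, 1200.0, n_bands)       # stronger at long wavelengths

# Raw counts depend on the illuminant, so the two captures look different.
raw_a = true_reflectance * illum_a + dark
raw_b = true_reflectance * illum_b + dark

# An ideal white target reflects all of the incoming light.
white_a = illum_a + dark
white_b = illum_b + dark

refl_a = to_reflectance(raw_a, white_a, dark)
refl_b = to_reflectance(raw_b, white_b, dark)
```

In this idealised case the two raw captures differ, but both reflectance spectra recover the same underlying material curve, which is why a model trained on reflectance needs to see fewer lighting conditions.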

Additionally, the larger amount of colour information held in a spectrum makes a machine learning algorithm’s job easier. This is analogous to trying to distinguish between a red pencil and a green pencil in colour rather than in black and white.

Distinguishing between apple varieties; HSI vs. RGB

Let’s take the example of distinguishing between different varieties of apples.   

Whilst telling the difference between a Granny Smith and a Pink Lady might be trivial in RGB, distinguishing between a Royal Gala, Jazz and Pink Lady is harder.  

Accessing an RGB apple detector is reasonably easy: `apple` is one of the 80 COCO classes that YOLO comes trained on out of the box. However, if you want to distinguish between different types of apples (for example, to prevent people paying for cheap apples at a self-service checkout whilst taking expensive ones), then RGB doesn’t cut it.

Self-checkout theft is estimated to cost retailers $1.97 billion annually. Whilst computer vision could be used to address this issue, RGB-only techniques are unlikely to provide the necessary level of precision, leading to annoyed customers as the self-service checkout incessantly complains “Unexpected item in bagging area”.

Fortunately, hyperspectral data makes distinguishing between different varieties of apples (and other produce) trivial. This makes HSI-based computer vision techniques a game changer in the world of retail.

The separability of the spectra allows us to use a simple random forest classifier to distinguish between apple varieties. Combining this spectral classification with off-the-shelf RGB segmentation, we were able to train a variety detector from scratch with only 100 training images.
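
To make the idea concrete, here is a rough sketch of how spectra inside a segmentation mask could be classified with a random forest and combined into one label per apple. It is not the code from our repository: the spectra are synthetic, the band count, the helper names `synthetic_spectra` and `classify_segment`, and the majority-vote step are all assumptions made for illustration, and the RGB segmenter is taken as given rather than shown.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
N_BANDS = 96  # hypothetical number of spectral bands
VARIETIES = ["royal_gala", "jazz", "pink_lady"]


def synthetic_spectra(class_idx, n):
    """Made-up reflectance spectra: a flat curve with a class-specific dip."""
    x = np.linspace(0.0, 1.0, N_BANDS)
    dip_centre = 0.3 + 0.2 * class_idx
    base = 0.6 - 0.3 * np.exp(-((x - dip_centre) ** 2) / 0.005)
    return base + rng.normal(0.0, 0.03, size=(n, N_BANDS))


# Training set: one row per spectrum, one integer label per row.
X_train = np.vstack([synthetic_spectra(i, 200) for i in range(len(VARIETIES))])
y_train = np.repeat(np.arange(len(VARIETIES)), 200)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)


def classify_segment(spectra_in_mask):
    """Label one segmented object by majority vote over the spectra in its mask."""
    votes = clf.predict(spectra_in_mask)
    counts = np.bincount(votes, minlength=len(VARIETIES))
    return VARIETIES[int(np.argmax(counts))]


# Pretend an RGB segmenter returned a mask covering 50 spectra of a Jazz apple.
segment_spectra = synthetic_spectra(1, 50)
predicted = classify_segment(segment_spectra)
print(predicted)
```

Voting over every spectrum inside a mask means a few misclassified spectra, for example on a stalk or a specular highlight, do not change the label assigned to the whole apple.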

This example demonstrates how easy it is to integrate the Living Optics Camera into computer vision workflows and start making use of hyperspectral data.  

Want to try our hyperspectral imaging technology yourself?

You can access our GitHub under the spatial-spectral-ml project and get the training data here.

Want to learn more about our cutting-edge hyperspectral technology and how it could be incorporated into your operations? Get in touch with our sales team today.
