This is work in progress... (c) 2021 Z. Gajarska and H. Lohninger



Spectral Mixtures for Testing Classifiers

Consider the following scenario: we want to develop a classifier which recognizes red apples, tomatoes and red peppers in a collection of fruits and vegetables by analyzing spectra obtained in the visible and near infrared range of light (approx. 400 to 1000 nm). The usual approach is to create a training dataset containing spectra of apples, tomatoes and red peppers as well as spectra of all kinds of other fruits and vegetables. The classifier is trained using this dataset and then applied to an independent test set.

When exploring the properties of the new classifier you might ask yourself how sensitive it is to spectra of yellow peppers. Ideally the classifier should not respond to yellow pepper at all. However, it is quite difficult to judge where the classifier begins to change its response if you apply it to different images of yellow peppers in which some of the yellow peppers are actually more red than yellow.

To answer this question a tool of ImageLab comes in handy: the spectral mixture generator. The idea behind the spectral mixture generator is to take two reference spectra and calculate a continuous series of mixtures of the two. In addition, different levels of noise are added to the mixed spectra. In our case we take representative spectra of a yellow and a red pepper, mix them and add different levels of noise. This leads to an artificial image (a "mixture map") in which the proportion of red and yellow pepper in the mixed spectra changes from 0 to 100% along the x axis, while the amount of (heteroscedastic) noise added to the spectra increases along the y axis (Fig. 1).
Figure 1: The artificial mixture map shows various proportions of the two selected spectra along the x axis and an increasing amount of noise along the y axis.
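
The mixing step along the x axis can be sketched in a few lines of Python. This is only an illustration, not ImageLab's implementation: it assumes that the mixture is a simple linear combination of the two reference spectra, and the function name mixture_map, the number of mixing steps and the Gaussian-shaped stand-in spectra are all made up for the example.

```python
import numpy as np

def mixture_map(spec_a, spec_b, n_mix=101):
    """Return linear mixtures of two spectra, shape (n_mix, n_channels).

    Row i holds (1 - f_i) * spec_a + f_i * spec_b, where f_i runs
    linearly from 0 to 1 (assumption: linear mixing).
    """
    spec_a = np.asarray(spec_a, dtype=float)
    spec_b = np.asarray(spec_b, dtype=float)
    frac_b = np.linspace(0.0, 1.0, n_mix)[:, None]  # fraction of spec_b per row
    return (1.0 - frac_b) * spec_a + frac_b * spec_b

# Synthetic stand-ins for the two reference spectra (not real pepper spectra)
wavelengths = np.linspace(400.0, 1000.0, 61)
red = np.exp(-((wavelengths - 650.0) / 60.0) ** 2)
yellow = np.exp(-((wavelengths - 580.0) / 60.0) ** 2)

mixes = mixture_map(red, yellow, n_mix=101)
```

The first row of mixes is the pure first spectrum and the last row is the pure second spectrum; every row in between is a weighted average of the two.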

 

The added noise is normally distributed with zero mean and a standard deviation which is proportional to the signal level multiplied by the setting of the Noise Level control. Fig. 2 shows the user interface of the mixture generator: the top two traces show the spectra of red and yellow pepper, respectively. The bottom trace shows the mixture spectrum with 60% red pepper and 40% yellow pepper. The noise level is low at this point (see the position of the cursor in the mixture map at the right).

Figure 2: The user interface of the spectral mixture generator allows many options to synthesize the mixture map (see the ImageLab help file for details).
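
The noise model described above (zero-mean normal noise whose standard deviation scales with the signal) can be sketched as follows. The sketch assumes a proportionality constant of 1, i.e. a standard deviation of exactly signal times noise level; the function name add_proportional_noise and the example values are hypothetical and do not reflect ImageLab's internals.

```python
import numpy as np

def add_proportional_noise(spectra, noise_level, rng):
    """Add zero-mean normal noise with sd = |signal| * noise_level.

    Assumption: the proportionality constant is 1. Because the sd
    depends on the signal, the noise is heteroscedastic.
    """
    spectra = np.asarray(spectra, dtype=float)
    sigma = np.abs(spectra) * noise_level
    return spectra + rng.standard_normal(spectra.shape) * sigma

rng = np.random.default_rng(0)

# Repeat one small "spectrum" many times to estimate the noise per channel
signal = np.array([0.0, 1.0, 10.0])
clean = np.tile(signal, (20000, 1))
noisy = add_proportional_noise(clean, noise_level=0.1, rng=rng)

empirical_sd = noisy.std(axis=0)   # grows with the signal level
empirical_mean = noisy.mean(axis=0)
```

Channels with a higher signal receive more noise, while a channel with zero signal stays noise-free. To imitate the y axis of the mixture map, the same function could be called with a noise level that increases from row to row.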

 

This mixture map is now subjected to the classifier (actually the classifier can be seen as three separate classifiers which are selective for red peppers, red apples and tomatoes, respectively). An ideal classifier should result in a classified mixture map which indicates class 1 on one side and class 2 on the other side, up to a high level of noise. In reality the behavior of the classifier can vary considerably from case to case, depending on the quality of the spectra, the differences between the pure spectra, the descriptors used, the relative amount of noise and the quality of the training data, to mention just a few. But in principle, the typical response resembles the response surface shown in Fig. 3.

Figure 3: A good classifier will generate a response which is symmetric around the 50% line. The stability against noise in the spectra can be seen from the form of the iso-lines, indicating lines of equal response.

Figure 4 shows the response of the red pepper classifier when applied to a mixture of red and yellow pepper spectra. From left to right the proportion of yellow pepper increases while the proportion of the red pepper spectrum decreases. As you can see, the red pepper classifier does its job quite well: it correctly assigns a spectrum to the "red pepper" class as long as the mixed spectrum contains at least approx. 50% of the red pepper spectrum. The range between 50 and 70% yellow pepper is indecisive. Above 70% yellow pepper the red pepper classifier returns "No". Of course, as the noise increases the indecisive range becomes broader and broader, since the spectra cannot be assigned to one of the two classes if the amount of noise is too high.

Figure 4: The response of the red pepper classifier applied to the mixture map of red and yellow peppers.
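
How such an indecisive range might be read off a classifier response programmatically is sketched below. Everything in this sketch is an assumption made for illustration: the logistic response profile is synthetic (it is not the response shown in Fig. 4), the two decision thresholds are arbitrary, and the function indecisive_range is a hypothetical helper, not part of ImageLab.

```python
import numpy as np

def indecisive_range(frac, response, no_below, yes_above):
    """Return (first, last) mixing fraction where the response is undecided.

    A response <= no_below counts as "No", a response >= yes_above as
    "Yes"; everything in between is treated as indecisive.
    Returns None if no point is indecisive.
    """
    frac = np.asarray(frac, dtype=float)
    response = np.asarray(response, dtype=float)
    undecided = (response > no_below) & (response < yes_above)
    if not undecided.any():
        return None
    return float(frac[undecided].min()), float(frac[undecided].max())

# Synthetic response of a "red pepper" classifier along one noise-free row:
# close to 1 for mostly red pepper, close to 0 for mostly yellow pepper.
frac_yellow = np.linspace(0.0, 1.0, 101)
response = 1.0 / (1.0 + np.exp((frac_yellow - 0.6) / 0.03))

span = indecisive_range(frac_yellow, response, no_below=0.1, yes_above=0.9)
```

Applying the same helper to rows with increasing noise would show how the indecisive range widens, provided the response of the real classifier is available as an array per row.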

 

Now, how do the other classifiers (i.e. the classifiers for apples, tomatoes and other vegetables) perform? In theory their responses should be close to zero for the entire mixture image (at least for pure spectra). As you can see from Fig. 5 the classifiers do quite well:



Figure 5: The classifiers for apples, tomatoes and other vegetables applied to the mixture map of red and yellow peppers. Neither the apple nor the tomato classifier shows any positive classifications. The classifier for other vegetables also performs as expected: its response becomes positive at percentages of more than 70% yellow pepper (since yellow pepper belongs to "other vegetables").