Variety Matters When Testing Apples With NIR

October 3, 2023
George Hill
Sagitto Ltd

Innovations in chip-scale sensors and NIR LEDs are creating exciting opportunities to measure fruit quality accurately with tiny, inexpensive spectral devices. Sagitto has undertaken extensive testing of kiwifruit and apples, using new spectral sensors and NIR LEDs from ams-Osram. We have demonstrated that we can build robust predictive models for Total Soluble Solids (°Brix) and Dry Matter (DM%) for a wide range of apple varieties.

Building Predictive Models Using NIR Spectral Sensors and LEDs

As one of ams-Osram's Preferred Partners, Sagitto had the opportunity during the 2021 and 2022 Southern Hemisphere harvest periods to take NIR spectral measurements and reference data from fifteen different apple varieties - including Sunrise, Gravenstein, Royal Gala, Sweetango, and Envy.

Figure: Three Envy apples from T&G Global with the ams-Osram Demonstration Design NIR Spectrometer

ams-Osram have developed a demonstration design for a miniature NIR spectrometer, using very small but powerful NIR LEDs as the illumination source and operating over the spectral range of 750 nm to 1050 nm. Just as we can often see colour differences between apple varieties with our own eyes, varieties also differ in the way that they absorb light in the near-infrared part of the spectrum, as seen in this plot.

Figure: Absorbance in the NIR range 750 nm to 1050 nm for five different apple varieties

By combining the non-destructive NIR spectral data with destructive measurements of °Brix and Dry Matter from almost 1,000 apples, Sagitto was able to build very accurate predictive models for these two important apple quality parameters.

Figure: Predictive model for °Brix in apples built using ams-Osram NIR sensor and Sagitto machine learning
Figure: Predictive model for Dry Matter % in apples built using ams-Osram NIR sensor and Sagitto machine learning

A Wide Range Of Apple Varieties

As the following histograms show, there are considerable differences in the profile of °Brix and Dry Matter between varieties. At one extreme is Sunrise, an early season apple that is naturally tart. At the other extreme, Envy apples are picked towards the end of the harvest season and have very high fruit sugars and dry matter.

Figure: °Brix and DM% distributions for Sunrise, Gravenstein, Royal Gala, Sweetango, and Envy apple varieties

A model that is built on only one variety might be accurate for that particular variety, but probably would not be accurate for other apples - especially those with very different °Brix and Dry Matter profiles.
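One simple way to anticipate this problem, when reference measurements for the new variety are available, is to check how many of them fall outside the range the model was trained on. The Python sketch below is illustrative only: the function name `fraction_outside_training_range` and the °Brix values are invented for the example and are not Sagitto's measurements.

```python
def fraction_outside_training_range(train_values, new_values):
    """Return the share of new_values lying outside [min, max] of train_values."""
    if not train_values or not new_values:
        raise ValueError("both lists must be non-empty")
    low, high = min(train_values), max(train_values)
    outside = sum(1 for v in new_values if v < low or v > high)
    return outside / len(new_values)


# Hypothetical °Brix readings, invented for illustration only.
single_variety_training = [9.5, 10.1, 10.8, 11.2, 11.9]  # e.g. a tart, early-season variety
late_season_variety = [13.4, 14.0, 11.5, 15.2]           # e.g. a sweeter, late-season variety

share = fraction_outside_training_range(single_variety_training, late_season_variety)
print(f"{share:.0%} of the new samples fall outside the training range")
```

A high share suggests the model would be extrapolating beyond what it has seen, which is when predictions are most likely to degrade.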

Demonstrating Importance Of Variety

To demonstrate the importance of having representation from different apple varieties in our machine learning training dataset, we conducted an experiment: for each of five varieties - Royal Gala, Envy, Sunrise, Sweetango, and Gravenstein - and each of the two quality parameters, we built two regression models.

The first regression model included all five varieties in its training set, so that the variety had been 'previously seen' by the model. The second regression model was trained on all except the variety being investigated - what we've called the 'Never-Seen-Before' variety.

The results were as we expected. Regardless of whether we predicted soluble sugars (°Brix) or dry matter (DM%), for each of the five varieties the models that had previously seen examples of spectra from the target variety performed better than the models that had never seen apples from that variety.
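The structure of this comparison can be sketched in a few lines of Python. This is a toy under stated assumptions, not Sagitto's pipeline: a single made-up spectral feature stands in for a full NIR spectrum, a straight-line least-squares fit stands in for whatever regression method was actually used, the °Brix values are synthetic, and the choice to hold out every second sample of the target variety for testing is our own assumption. All function names are hypothetical.

```python
from math import sqrt


def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(model, samples):
    """Root-mean-square prediction error on (variety, x, y) samples."""
    intercept, slope = model
    squared = [(intercept + slope * x - y) ** 2 for _, x, y in samples]
    return sqrt(sum(squared) / len(squared))


def compare_seen_vs_never_seen(samples, target):
    """Return (rmse_seen, rmse_never_seen) for one target variety."""
    others = [s for s in samples if s[0] != target]
    target_samples = [s for s in samples if s[0] == target]
    # Assumption: alternate target samples are used for training and testing.
    target_train = target_samples[0::2]
    target_test = target_samples[1::2]

    def train(rows):
        return fit_line([x for _, x, _ in rows], [y for _, _, y in rows])

    seen_model = train(others + target_train)  # target variety 'previously seen'
    never_seen_model = train(others)           # target variety 'never seen before'
    return rmse(seen_model, target_test), rmse(never_seen_model, target_test)


# Synthetic toy data: one invented feature per apple and a curved feature-to-Brix
# relationship, so that a late-season variety sits outside the others' range.
feature_values = {
    "Sunrise": [0.0, 0.2, 0.4, 0.6],
    "Royal Gala": [0.6, 0.8, 1.0, 1.2],
    "Envy": [1.4, 1.6, 1.8, 2.0],
}
samples = [
    (variety, x, 9.0 + 2.0 * x ** 2)
    for variety, xs in feature_values.items()
    for x in xs
]

rmse_seen, rmse_never_seen = compare_seen_vs_never_seen(samples, "Envy")
print(f"Envy, previously seen:   RMSE = {rmse_seen:.2f}")
print(f"Envy, never seen before: RMSE = {rmse_never_seen:.2f}")
```

Both models are scored on the same held-out apples from the target variety, so any difference in error comes only from whether that variety was represented in training.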

Figure: Predictions for soluble sugars (°Brix) in Gravenstein apples, using a model that has never seen a Gravenstein apple previously
Figure: Predictions for soluble sugars (°Brix) in Gravenstein apples, using a model that includes examples of Gravenstein apples in the training set
Figure: Predictions for dry matter (DM%) in Envy apples, using a model that has never seen an Envy apple previously
Figure: Predictions for dry matter (DM%) in Envy apples, using a model that includes examples of Envy apples in the training set

Do We Need Every Apple Variety?

We have demonstrated that ams-Osram's miniature NIR spectrometer can be used to accurately measure important quality parameters in apples, across a range of apple varieties, when those varieties are present in our training data - but there are thousands of different varieties of apple around the world.

We are not suggesting that a robust and accurate NIR model for apple quality parameters needs to include examples of all varieties - if we are only interested in testing one variety of apple, then we only require examples from that one specific variety. However, we have found that combining data from a range of varieties can lead to synergies: our accuracy for each variety can be greater than that of models built on a single variety's data alone.


Having a diverse and representative training set is crucial in machine learning because it allows the model to generalize better to new data. If the training set contains only a limited number of examples or is not representative of the population, the model may be 'overfitted' - that is, it fits the training examples closely but performs poorly on unseen data.

When developing NIR models to predict apple quality, it is important to carefully select the apple varieties in the training data.

Special thanks to the team at T&G Global of BayWa for supplying some amazing Envy apples for testing.

George Hill
Sagitto Ltd
Sagitto's founder, George Hill, first started working with artificial intelligence during the 1980s, while developing 'expert systems' within Bank of America in London. On returning to New Zealand, he undertook part-time study with the University of Waikato's Machine Learning Group while working for Hill Laboratories, a well-known New Zealand commercial testing laboratory. This led to the formation of Sagitto Limited, dedicated to combining the power of artificial intelligence and machine learning with spectroscopy.
