# Presentation at Sensometrics 2016

Ingunn Berget gave a presentation entitled “Clustering data from projective mapping” at the 13th Sensometrics Conference. Coauthors of the work were P. Varela and T. Næs.

In this talk, different strategies for clustering data from projective mapping were discussed, and a new methodology, Sequential Clusterwise Rotations (SCR), was presented. This method is based on a combination of Procrustes analysis and a modified fuzzy C-means algorithm.

The conference was held in Brighton, UK, from the 27th to the 29th of July, with tutorials on the 26th. There were 104 delegates from 16 countries around the globe. For more information about the program and the presentations, visit www.sensometrics.org.

Below is a picture of the well-known Brighton Pier.

# The infamous Måge plot

Some multiblock regression methods (such as SO-PLS and PO-PLS) allow for different numbers of components in each block. There are two strategies for selecting the numbers of components for these models: sequential and global. With the sequential strategy, the number of components to use for the first block is determined before the second block is introduced, and so on. With the global strategy, all blocks are taken into account from the beginning. Models with all combinations of components from each block are tested, and the combination giving the minimum prediction error is selected. Often, several combinations have approximately equally good prediction ability, and in such cases it is important to also take the total number of components into account. The Måge plot is a valuable tool for evaluating the models and selecting the optimal numbers of components.

The Måge plot shows the prediction error for each combination of components, as a function of the total number of components. From this perspective, it is possible to decide the total dimensionality of the system and the individual dimensionalities of each block at the same time. It is also easy to identify models that are indistinguishable from a prediction point of view. In the figure below, it is obvious that the total complexity is three. The two most predictive components are found in the first block, and the predictive ability is almost equal whether the third component is taken from the second or third block (combinations “210” and “201” are almost equal).

Matlab code for making the plot can be found here: MagePlot
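Independently of the Matlab code, the idea behind the plot can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the error values are made up for three blocks, the function names `mage_points` and `select_parsimonious` are hypothetical, and the 5 % tolerance used to call two models "approximately equal" is an arbitrary choice, not a rule from the papers listed below.

```python
# Minimal sketch of the data behind a Måge plot (not the authors' Matlab code).
# Keys are component combinations (block 1, block 2, block 3); values are
# cross-validated prediction errors. All numbers are made up for illustration.
errors = {
    (0, 0, 0): 1.00,
    (1, 0, 0): 0.70,
    (0, 1, 0): 0.90,
    (0, 0, 1): 0.92,
    (2, 0, 0): 0.50,
    (1, 1, 0): 0.62,
    (1, 0, 1): 0.64,
    (2, 1, 0): 0.40,
    (2, 0, 1): 0.405,
    (1, 1, 1): 0.55,
    (2, 1, 1): 0.39,
    (2, 2, 1): 0.395,
}


def mage_points(errors):
    """Return (total components, error, label) for every combination.

    These are the coordinates of the plot: total number of components on
    the x-axis, prediction error on the y-axis, and a label such as "210".
    """
    points = []
    for combo, err in errors.items():
        label = "".join(str(a) for a in combo)
        points.append((sum(combo), err, label))
    return sorted(points)


def select_parsimonious(errors, tolerance=0.05):
    """Return the combinations with the fewest total components among those
    whose error is within `tolerance` (relative) of the global minimum."""
    best = min(errors.values())
    candidates = [c for c, e in errors.items() if e <= best * (1 + tolerance)]
    fewest = min(sum(c) for c in candidates)
    return sorted(c for c in candidates if sum(c) == fewest)


if __name__ == "__main__":
    for total, err, label in mage_points(errors):
        print(total, label, err)
    print("Most parsimonious candidates:", select_parsimonious(errors))
```

With these made-up numbers the global minimum is at “211”, but “210” and “201” are within the tolerance and use only three components in total, which mirrors the situation described above. The points returned by `mage_points` can be passed to any plotting library to draw the actual figure.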

References:

Måge, I., Mevik, B. H., & Næs, T. (2008). Regression models with process variables and parallel blocks of raw material measurements. Journal of Chemometrics, 22(8), 443–456.

Næs, T., Tomic, O., Afseth, N. K., Segtnan, V., & Måge, I. (2013). Multi-block regression based on combinations of orthogonalisation, PLS-regression and canonical correlation analysis. Chemometrics and Intelligent Laboratory Systems, 124, 32–42.

Biancolillo, A., Måge, I., & Næs, T. (2015). Combining SO-PLS and linear discriminant analysis for multi-block classification. Chemometrics and Intelligent Laboratory Systems, 141, 58–67.

# Nofima in COST action FoodMC

Kristian Liland and Ingrid Måge have joined the Management Committee of the COST Action “Mathematical and Computer Science Methods for Food Science and Industry” (FoodMC). The Action is chaired by Dr. Alberto Tonda at INRA (France), and Ingrid is leading the Working Group entitled “Modelling food products and food processes”.

## Goal

Support the food sector in facing future challenges in production and processing, adopting modelling and optimization methods from Maths and Computer Science.

## Background

The agriculture and food processing sector (agri-food) is facing sustainability challenges of growing complexity, from consumer expectations to concerns over food security, right through to environmental regulations. In such a context, innovation is becoming a decisive factor of competitiveness for companies in this field. Methodologies and tools from Maths and Computer Science (MCS) are emerging as key contributors to modernization and optimization of processes in various disciplines: the agri-food sector, however, is not a traditional domain of application for MCS, and at the moment there is no community organized around solving the issues of this field.

This COST Action brings together scientists and practitioners from MCS and agri-food domains, stimulating the emergence of new research, and structuring a new community to coordinate further investigation efforts. Exploiting approaches originating at different sub-fields of MCS, from applied mathematical models to knowledge engineering, this COST Action will cover two main topics: understanding and controlling agri-food processes; and eco-design of agri-food products.

http://www6.inra.fr/foodmc/

http://www.cost.eu/COST_Actions/ca/CA15118

# Variable selection in Multiblock regression

The focus of the present paper is to propose and discuss different procedures for performing variable selection in a multi-block regression context. In particular, the focus is on two multi-block regression methods: Multi-Block Partial Least Squares (MB-PLS) and Sequential and Orthogonalized Partial Least Squares (SO-PLS) regression. A small simulation study for regular PLS regression was conducted in order to select the most promising methods to investigate further in the multi-block context. The combinations of three variable selection methods with MB-PLS and SO-PLS are examined in detail. These methods are Variable Importance in Projection (VIP), Selectivity Ratio (SR) and forward selection. In this paper we focus on both prediction ability and interpretation. The different approaches are tested on three types of data: one sensory data set, one spectroscopic (Raman) data set and a number of simulated multi-block data sets.

## Reference

Biancolillo, A., Liland, K. H., Måge, I., Næs, T., & Bro, R. (2016). Variable selection in multi-block regression. Chemometrics and Intelligent Laboratory Systems, 156, 89–101. doi:10.1016/j.chemolab.2016.05.016

# 14th AgroStat symposium

The 2016 AgroStat symposium was held at Nestlé Research Center in Lausanne, Switzerland, on March 21-24. It was professionally organized and had many high-quality scientific contributions in chemometrics, sensometrics, big data and risk & process. Most of the presentations and all posters were in English, with a few presentations in French. The first day was reserved for a variety of workshops. There were around 120 participants, mostly French and Swiss, but also from Canada, Great Britain, Scandinavia, Portugal, Italy, USA, Germany and some other countries.

From Nofima, Kristian Hovde Liland contributed a poster describing the R package MatrixCorrelation and the newly developed Similarity of Matrices Index.

Presentations, posters and short papers are available from:
http://agrostat2016.sfds.asso.fr/e-proceedings/?lang=en

# Mini-workshop on TDS and TCATA with John Castura

Just before Easter, John Castura from Compusense Inc. (Ontario, Canada) visited Nofima for a mini-workshop on the comparison of TDS (temporal dominance of sensations) and TCATA (temporal check-all-that-apply). These are two methods for studying the dynamics of sensory perception during consumption. The methods differ in how samples are evaluated. While in TDS assessors should only select the dominant attribute at each time point, in TCATA they can select all the attributes they find relevant for describing the sample concurrently.
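The difference in how the two methods record data can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the attribute names, the four assessors, the three time points and the function names are hypothetical and are not data from the workshop. In TDS each assessor contributes exactly one attribute per time point, so the dominance rates sum to one; in TCATA each assessor contributes a set of attributes, so the citation proportions can sum to more than one.

```python
from collections import Counter

# Hypothetical attribute list, for illustration only.
ATTRIBUTES = ["sweet", "salty", "crispy"]

# TDS: one dominant attribute per assessor (rows) and time point (columns).
tds = [
    ["crispy", "crispy", "salty"],
    ["crispy", "salty", "salty"],
    ["sweet", "crispy", "salty"],
    ["crispy", "salty", "sweet"],
]

# TCATA: a set of applicable attributes per assessor and time point.
tcata = [
    [{"crispy"}, {"crispy", "salty"}, {"salty", "sweet"}],
    [{"crispy", "salty"}, {"salty"}, {"salty"}],
    [{"sweet", "crispy"}, {"crispy", "salty"}, {"salty", "sweet"}],
    [{"crispy"}, {"salty", "crispy"}, {"sweet"}],
]


def tds_dominance_rates(records, attributes):
    """Share of assessors selecting each attribute as dominant, per time point."""
    n = len(records)
    rates = []
    for t in range(len(records[0])):
        counts = Counter(assessor[t] for assessor in records)
        rates.append({a: counts[a] / n for a in attributes})
    return rates


def tcata_citation_proportions(records, attributes):
    """Share of assessors checking each attribute, per time point."""
    n = len(records)
    props = []
    for t in range(len(records[0])):
        props.append(
            {a: sum(a in assessor[t] for assessor in records) / n
             for a in attributes}
        )
    return props


if __name__ == "__main__":
    print(tds_dominance_rates(tds, ATTRIBUTES))
    print(tcata_citation_proportions(tcata, ATTRIBUTES))
```

Plotting these per-attribute curves against time gives the usual kind of temporal summary for each method, which is one simple starting point for comparing them on the same samples.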

The participants in the workshop were, in addition to John, Paula Varela, Tormod Næs, Mats Carlehög, Margrethe Hersleth and Ingunn Berget (all from Nofima). Paula first presented results from a qualitative study on how the concept of dominance is understood. Next we discussed results from both methods on bread samples, and which data analytical tools to use to extract more information about similarities and differences between the methods. An abstract has been sent to Eurosense.

TCATA trajectories

# Visiting scientist

As a spin-off from the Mini-Arctic Conference, Dr. Frans van der Kloet visited Nofima in December 2015. Frans is a postdoc in the Biosystems Data Analysis group at the University of Amsterdam, working on integration and interpretation of -omics data.

During his stay he established a collaboration on the use of data fusion methods for separating common and distinct variation across datasets. Together with Dr. Ingrid Måge he will investigate the practical use and added value of methods such as DISCO, JIVE, OnPLS and PO-PCA.

References:

Schouteden, M., Van Deun, K., Wilderjans, T. F., & Van Mechelen, I. (2014). Performing DISCO-SCA to search for distinctive and common information in linked data. Behavior Research Methods, 46(2), 576–587. doi:10.3758/s13428-013-0374-6

Van Deun, K., Smilde, A. K., Thorrez, L., Kiers, H. A. L., & Van Mechelen, I. (2013). Identifying common and distinctive processes underlying multiset data. Chemometrics and Intelligent Laboratory Systems. doi:10.1016/j.chemolab.2013.07.005

Lock, E. F., Hoadley, K. A., Marron, J. S., & Nobel, A. B. (2013). Joint and individual variation explained (JIVE) for integrated analysis of multiple data types. Annals of Applied Statistics, 7(1), 523–542. doi:10.1214/12-AOAS597

Löfstedt, T., Hoffman, D., & Trygg, J. (2013). Global, local and unique decompositions in OnPLS for multiblock data analysis. Analytica Chimica Acta, 791, 13–24. doi:10.1016/j.aca.2013.06.026

Måge, I., Menichelli, E., & Næs, T. (2012). Preference mapping by PO-PLS: Separating common and unique information in several data blocks. Food Quality and Preference, 24(1), 8–16.

Næs, T., Tomic, O., Afseth, N. K., Segtnan, V., & Måge, I. (2013). Multi-block regression based on combinations of orthogonalisation, PLS-regression and canonical correlation analysis. Chemometrics and Intelligent Laboratory Systems, 124, 32–42.