Master's candidate in Biostatistics Sungtae Kim will present:
“Application of Sparse Multi-way Distance Weighted Discrimination to Metabolism and Iron-deficiency in Infancy”
Plan B Adviser: Eric Lock
Abstract: In biomedical research and related fields, it is common to classify observations into groups using several variables. Established methods, including linear/logistic regression and support vector machines, apply only to two-way array datasets (i.e., matrices) and do not accommodate higher-order arrays. Multi-way Distance Weighted Discrimination (DWD) was developed to classify observations using data from a higher-order array. However, the existing implementation of Multi-way DWD does not allow for variable selection (i.e., sparsity). In this research, we propose adding sparsity to Multi-way DWD. Our interest was to discriminate between iron-deficient and iron-sufficient infant monkeys based on their metabolic profiles (413 metabolites) at 6 and 12 months. We applied sparsity with the Multi-way DWD method to reduce the number of variables selected. Sparse Multi-way DWD could not discriminate between the two groups well; however, the version without sparsity did show a significant difference in the metabolomic profiles. We also demonstrate why caution is needed against performing variable selection on the full dataset before cross-validation. We conclude that sparsity did not help to discriminate between iron-deficient and iron-sufficient monkeys. Thus, the evidence suggests that the distinction between iron-deficient and iron-sufficient monkeys is due to many metabolites with relatively small effects, rather than a small number of metabolites with large effects.
Refreshments will be served prior to the presentation.