(The 2nd Edition of Asia Pacific Online Seminars on Mathematics for Industry)
7 May 2021 Online
Abstract
Prof. Sael Lee (1:10-1:40 pm)
Department of Software and Computer Engineering/Artificial Intelligence, Ajou University, Korea
Interpretable Learning Methods for Biomedical Data Analysis
In this talk, I’ll introduce two types of interpretable learning methods that my lab has been developing for biomedical data analysis. The first type is an interpretable multiway genomic data analysis method based on tensors. Tensors, i.e., multi-mode arrays, are natural representations of multi-mode data such as the various omics data of cancer patients, including miRNA, methylation, gene expression, and mutation information.
Tensor decomposition methods can be used to analyze multi-mode data by examining the factor matrices, i.e., the output of tensor decomposition. Although tensor decomposition is a linear method, interpreting the factor matrices is not trivial because of the vast number of parameter values.
The proposed work generates factor matrices that are easier to interpret by using a prior classification of genes. We further generalize the approach and introduce interpretability for tensor decompositions when prior knowledge is not provided.
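As a rough illustration of what "factor matrices" means here, the following minimal sketch fits a rank-1 CP (CANDECOMP/PARAFAC) model to a tiny 3-way tensor with alternating updates. It is a generic textbook procedure, not the speaker's method: it uses no prior gene classification, and the function name `rank1_cp`, the mode labels, and the toy values are all assumptions made for this example. It also assumes the tensor is not all zeros.

```python
import math
import random


def rank1_cp(tensor, iters=50, seed=0):
    """Approximate a 3-way tensor by weight * outer(a, b, c) with unit-norm a, b, c."""
    I, J, K = len(tensor), len(tensor[0]), len(tensor[0][0])
    rng = random.Random(seed)
    # Positive random starting vectors (hypothetical initialisation choice).
    b = [rng.random() + 0.1 for _ in range(J)]
    c = [rng.random() + 0.1 for _ in range(K)]

    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v], norm

    weight = 0.0
    for _ in range(iters):
        # Update each factor in turn by contracting the tensor with the other two.
        a, _ = normalize([sum(tensor[i][j][k] * b[j] * c[k]
                              for j in range(J) for k in range(K)) for i in range(I)])
        b, _ = normalize([sum(tensor[i][j][k] * a[i] * c[k]
                              for i in range(I) for k in range(K)) for j in range(J)])
        c, weight = normalize([sum(tensor[i][j][k] * a[i] * b[j]
                                   for i in range(I) for j in range(J)) for k in range(K)])
    return weight, a, b, c


# Toy tensor with made-up values; modes could stand for patients x genes x omics types.
a0, b0, c0 = [1.0, 2.0], [1.0, 0.0, 1.0], [2.0, 1.0]
tensor = [[[x * y * z for z in c0] for y in b0] for x in a0]

weight, a, b, c = rank1_cp(tensor)
print(weight, a, b, c)
```

Each returned vector plays the role of one column of a factor matrix; with a higher rank and thousands of genes, the number of such entries grows quickly, which is the interpretation difficulty the abstract refers to.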
The second type of method is based on decision trees. We improve decision trees with the latest deep learning techniques so that they generalize well compared to traditional decision trees.
Prof. Atsushi Tero (1:50-2:20 pm)
Institute of Mathematics for Industry, Kyushu University, Japan
Mathematical modeling of behavior decision from single cell to academic society
An organism acts based on various information and situations.
Here, I will introduce mathematical models of subjects ranging from single cells to researchers, based on various examples.
With these models, we consider how each individual acts and how the group makes behavioral decisions.
Prof. Daechan Park (2:30-3:00 pm)
Department of Biological Sciences, Ajou University, Korea
PDAC prognosis prediction by subtype classification
Pancreatic ductal adenocarcinoma (PDAC) is a devastating disease with poor prognosis, and the squamous subtype is the worst type of PDAC. Here, we report statistical models that predict PDAC prognosis based on the gene expression of PDAC subtypes. First, we obtained the comprehensive gene expression profiles and subtype information of 96 patients from the International Cancer Genome Consortium. Logistic regression and Support Vector Classification with one-vs-rest (SVC-OVR) were then each applied to select the top 20 marker genes by area under the curve (AUC), comparing the squamous subtype with the non-squamous subtypes. Using the expression values of the 20 genes, multiple logistic regression and multiple SVC-OVR models were built. To validate the performance of the models, we used RNA sequencing data of 196 tumors from Seoul National University Hospital. Although subtype information was not available for these PDAC samples, we were able to classify the poor prognostic group (log-rank test, p-value < 0.001). Because measuring the expression of only 20 genes is experimentally practical, the model based on these genes should be applicable to clinical use.
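The marker-selection step described above can be sketched as a per-gene AUC ranking. This is only a simplified illustration under stated assumptions: it scores each gene by its raw expression value rather than by a fitted logistic regression or SVC model (for a single gene, any monotone score gives the same AUC up to direction), and the function names, gene labels, and expression values are hypothetical, not data from the study.

```python
def auc(scores_pos, scores_neg):
    """Probability that a random positive outscores a random negative (ties count half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def rank_genes_by_auc(expression, is_squamous, top_k):
    """expression maps gene -> per-patient values; is_squamous is a parallel list of bools."""
    ranked = []
    for gene, values in expression.items():
        pos = [v for v, s in zip(values, is_squamous) if s]
        neg = [v for v, s in zip(values, is_squamous) if not s]
        a = auc(pos, neg)
        # A gene that is consistently lower in squamous tumors separates the
        # groups just as well as one that is higher, so use max(a, 1 - a).
        ranked.append((max(a, 1.0 - a), gene))
    ranked.sort(reverse=True)
    return [gene for _, gene in ranked[:top_k]]


# Made-up toy data: three placeholder genes measured in five patients.
expression = {
    "gene_A": [5.0, 6.0, 1.0, 2.0, 1.5],
    "gene_B": [1.0, 3.0, 2.0, 4.0, 2.5],
    "gene_C": [0.5, 0.2, 3.0, 2.0, 0.4],
}
is_squamous = [True, True, False, False, False]

top_genes = rank_genes_by_auc(expression, is_squamous, top_k=2)
print(top_genes)  # ['gene_A', 'gene_C']
```

In the setting of the abstract, `top_k` would be 20 and the selected genes would then feed a multivariable classifier.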
Prof. Shizuo Kaji (3:10-3:40 pm)
Institute of Mathematics for Industry, Kyushu University, Japan
Fast computation of persistent homology of volumetric data and its application in medical image analysis
Medical images such as CT and MRI are acquired in the form of volumetric data. We introduce our easy-to-use software, CubicalRipser, which is capable of fast persistent homology computation of volumetric data. As an application, we discuss how persistent homology can be enhanced to capture both local and global topological features of images to enable interpretable medical image analysis.