Statistical Machine Learning and Inference


Figure: (a) 1,000 training samples; (b) RP-estimated JPD contour plots.

Project Description

Machine learning is a key technology in the field of big data. A machine learning algorithm that uses statistical concepts or methods is called a statistical machine learning (SML) algorithm. The central component of an SML algorithm is a joint probability distribution (JPD). Recently, the Soroush Group introduced a scalable, efficient method of estimating JPDs of continuous random variables that have arbitrary (non-monotonic or monotonic) relationships (e.g., the relationship between x1 and x2 in Figure (a)). Because the backbone of the method is a set of monotonization transformations that ‘roll out’ the relationships, the method was named the rolling pin (RP) method. The method provides a model, in the form of a joint probability distribution, that describes the relationships among the variables that a dataset represents. RP method-based JPDs are suitable for conducting probabilistic modeling and inference systematically and efficiently. This new method of inference has several advantages over Bayesian network-based inference.

SMREU students will develop data-driven models for engineering systems, validate the models, and conduct forward and backward inference based on the models. In backward inference, the most likely root cause of an observation is determined, while in forward inference, the most likely consequence of an observation is predicted.
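The difference between forward and backward inference on a JPD can be illustrated with a minimal sketch. It does not implement the RP method, whose details are not given here. Instead it assumes a plain two-dimensional histogram as a stand-in JPD estimator, synthetic data with a non-monotonic relationship (x2 = x1^2 plus noise), and hypothetical function names (estimate_jpd, forward_inference, backward_inference) chosen only for illustration. Here x1 plays the role of the cause and x2 the role of the consequence.

```python
import numpy as np

def estimate_jpd(x1, x2, bins=20):
    """Histogram estimate of the joint density p(x1, x2).

    Assumption: a simple stand-in estimator, NOT the RP method.
    Returns the density grid and the bin centers along each axis.
    """
    density, e1, e2 = np.histogram2d(x1, x2, bins=bins, density=True)
    c1 = 0.5 * (e1[:-1] + e1[1:])  # bin centers for x1
    c2 = 0.5 * (e2[:-1] + e2[1:])  # bin centers for x2
    return density, c1, c2

def forward_inference(density, c1, c2, x1_obs):
    """Most likely consequence x2 given an observed cause x1.

    For fixed x1, p(x2 | x1) is proportional to p(x1, x2), so the
    conditional mode is the argmax over the matching slice of the grid.
    """
    i = np.argmin(np.abs(c1 - x1_obs))   # x1 bin closest to the observation
    return c2[np.argmax(density[i, :])]

def backward_inference(density, c1, c2, x2_obs):
    """Most likely cause x1 given an observed consequence x2."""
    j = np.argmin(np.abs(c2 - x2_obs))   # x2 bin closest to the observation
    return c1[np.argmax(density[:, j])]

# Synthetic training data (assumed for illustration only).
rng = np.random.default_rng(0)
x1 = rng.uniform(-2.0, 2.0, size=1000)
x2 = x1**2 + rng.normal(0.0, 0.1, size=1000)

density, c1, c2 = estimate_jpd(x1, x2)

x2_hat = forward_inference(density, c1, c2, x1_obs=1.5)    # expected near 1.5**2
x1_hat = backward_inference(density, c1, c2, x2_obs=2.25)  # expected near +1.5 or -1.5

print("forward:  x1 = 1.5  -> most likely x2 =", x2_hat)
print("backward: x2 = 2.25 -> most likely x1 =", x1_hat)
```

Because the relationship in this synthetic example is non-monotonic, the backward query has two nearly equally likely answers (x1 near +1.5 and near -1.5), and the sketch returns only one of them. A fuller treatment would report the whole conditional distribution rather than a single mode.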

Research Goals

  • Build and test SML models
  • Validate SML models
  • Determine cause(s) of an observation and predict its consequence(s) based on SML models

Learning Goals

  • How to develop an SML model
  • How to determine the accuracy of an SML model
  • How to conduct inference using an SML model

Group Conducting Research

Soroush Lab: http://www.chemeng.drexel.edu/soroushresearchgroup/