Tutorial on Machine Learning and the Positive Unlabeled Learning Problem

Kristen Jaskie and Andreas Spanias, SenSIP Center, Arizona State University

Abstract: This tutorial introduces the principles and applications of machine learning algorithms in general, and Positive Unlabeled learning in particular. The tutorial begins with an introduction to the basic ideas, algorithms, and applications of machine learning. After this general introduction, we will focus on the little-known, yet important, semi-supervised learning problem known as Positive Unlabeled learning (PU learning). PU learning enables classification when only a small subset of the positive data is labeled and the remaining data is unlabeled. This becomes particularly important in situations where obtaining complete training labels is expensive or impossible. We will present several real-world scenarios, emphasizing signal processing and sensor applications. Algorithms will be presented at a high level, with an emphasis on using pre-built functionality in MATLAB when possible. We will end with a discussion of algorithm and model evaluation. The tutorial includes notes and a survey paper on PU learning.

Speaker Biographies

Kristen Jaskie is the owner and senior scientist of Data Analytics Consulting LLC. She is a Professor of Computer Science at Glendale Community College in Glendale, AZ. She is also a senior PhD student in Electrical Engineering at Arizona State University, specializing in Machine Learning and Signal Processing. She has a master’s degree in Computer Science with a focus on Machine Learning from UC San Diego. Her current research involves creating new machine learning algorithms to solve the Positive and Unlabeled learning problem (PU learning), an extremely important semi-supervised classification problem for use in image processing and other applications. Additional research involves both signal processing and machine learning for smart grid energy load analysis. Past research includes developing and applying machine learning algorithms to computer security issues, software-defined radio speaker authentication, microscopic cellular image analysis, marketing predictions based on demographic data, prediction of species presence from environmental data (one-class classification), and prediction of which proteins have certain types of permeable barriers.

Andreas Spanias is Professor in the School of Electrical, Computer, and Energy Engineering at Arizona State University (ASU). He is also the director of the Sensor Signal and Information Processing (SenSIP) center and the founder of the SenSIP industry consortium (also an NSF I/UCRC site). His research interests are in the areas of adaptive signal processing, speech processing, machine learning, and sensor systems. He and his student team developed the computer simulation software Java-DSP and its award-winning iPhone/iPad and Android versions. He is the author of two textbooks: Audio Processing and Coding (Wiley) and DSP: An Interactive Approach (2nd Ed.). He has contributed to more than 300 papers, 7 monographs, 9 full patents, 6 provisional patents, and 10 patent pre-disclosures. He served as Associate Editor of the IEEE Transactions on Signal Processing and as General Co-chair of IEEE ICASSP-99. He also served as the IEEE Signal Processing Vice-President for Conferences. Andreas Spanias is co-recipient of the 2002 IEEE Donald G. Fink paper prize award and was elected Fellow of the IEEE in 2003. He served as Distinguished Lecturer for the IEEE Signal Processing Society in 2004. He is a series editor for the Morgan and Claypool lecture series on algorithms and software. He received the 2018 IEEE Phoenix Chapter award with the citation: “For significant innovations and patents in signal processing for sensor systems.” He also received the 2018 IEEE Region 6 Educator Award (across 12 states) with the citation: “For outstanding research and education contributions in signal processing.”

Participation in the tutorial is free for everyone.