Machine Learning with Python Professional Training
Advanced Data Mining Sessions

Course Descriptions and Videos

Support Vector Machine In-Depth

Module 8. Support Vector Machine In-Depth

In this module, we describe one of the most widely used machine learning methods, the Support Vector Machine (SVM). SVM is a classification approach that was developed in the computer science community. It has been shown to perform well in a variety of settings and is often considered one of the best “out of the box” classifiers. Support Vector Regression (SVR) is a variant of SVM that performs regression-based prediction. This module focuses on the classification method, SVM.

*Pre-recorded Lesson: https://rutgers.mediaspace.kaltura.com/media/Module+8+-+Support-Vector-Machine/1_a8fu2ev2

Prediction via Evidence Combination

Module 9. Prediction via Evidence Combination

Let’s now examine a different way of looking at drawing such conclusions. We could think about the things that we know about a data instance as evidence for or against different values for the target. The things that we know about the data instance are represented as the features of the instance. If we knew the strength of the evidence given by each feature, we could apply principled methods for combining evidence probabilistically to reach a conclusion as to the value for the target. We will determine the strength of any particular piece of evidence from the training data.
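As a minimal sketch of this idea, the Python example below estimates the strength of each feature's evidence from a tiny training set and combines it with Bayes' rule, under the naive assumption that features are independent given the target. The data set, the feature names (contains_offer, known_sender), and the function names are hypothetical illustrations, and the Laplace smoothing is an added assumption to avoid zero probabilities; the module itself may present the method differently.

```python
from collections import Counter, defaultdict
import math

# Hypothetical toy training data: (features, target) pairs.
TRAIN = [
    ({"contains_offer": True,  "known_sender": False}, "spam"),
    ({"contains_offer": True,  "known_sender": False}, "spam"),
    ({"contains_offer": False, "known_sender": False}, "spam"),
    ({"contains_offer": False, "known_sender": True},  "ham"),
    ({"contains_offer": False, "known_sender": True},  "ham"),
    ({"contains_offer": True,  "known_sender": True},  "ham"),
]


def fit(train):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(target for _, target in train)
    value_counts = defaultdict(Counter)  # (target, feature) -> value counts
    for features, target in train:
        for name, value in features.items():
            value_counts[(target, name)][value] += 1
    return class_counts, value_counts


def predict_proba(features, class_counts, value_counts):
    """Combine the evidence from every feature into P(target | features)."""
    total = sum(class_counts.values())
    log_scores = {}
    for target, count in class_counts.items():
        log_score = math.log(count / total)  # prior belief in this target
        for name, value in features.items():
            seen = value_counts[(target, name)][value]
            # Laplace smoothing; assumes each feature has 2 possible values.
            log_score += math.log((seen + 1) / (count + 2))
        log_scores[target] = log_score
    # Normalize the combined scores so they sum to 1.
    peak = max(log_scores.values())
    unnormalized = {t: math.exp(s - peak) for t, s in log_scores.items()}
    norm = sum(unnormalized.values())
    return {t: v / norm for t, v in unnormalized.items()}


class_counts, value_counts = fit(TRAIN)
probs = predict_proba(
    {"contains_offer": True, "known_sender": False}, class_counts, value_counts
)
print(probs)
```

Working in log space keeps the product of many small probabilities from underflowing, which matters once an instance has more than a handful of features.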

*Pre-recorded Lesson: https://rutgers.mediaspace.kaltura.com/media/Module+9+-+Prediction+via+Evidence+Combination/1_2atp92cj

Representing and Mining Text: Text and Sentiment Analysis

Module 10. Representing and Mining Text: Text and Sentiment Analysis

Data are represented in ways natural to the problems from which they were derived. If we want to apply the many data mining tools that we have at our disposal, we must either engineer the data representation to match the tools or build new tools to match the data. Top-notch data scientists employ both strategies. It is generally simpler to first try to engineer the data to match existing tools, since they are well understood and numerous. In this module, we will focus on one sort of data that has become extremely common as the Internet has become a ubiquitous channel of communication: text data. Examining text data allows us to illustrate many real complexities of data engineering and helps us to better understand a very important type of data.
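One common way to engineer text so that it matches existing tools is to turn each document into a vector of word counts (a "bag of words"). The sketch below assumes that representation for illustration; the example documents and the helper names tokenize and bag_of_words are hypothetical, and the tokenizer is deliberately simplistic.

```python
from collections import Counter
import re

# Hypothetical example documents.
DOCS = [
    "The camera is great, great value!",
    "Terrible battery. The camera is fine.",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(docs):
    """Return a sorted vocabulary and one term-count vector per document."""
    counts = [Counter(tokenize(doc)) for doc in docs]
    vocabulary = sorted(set().union(*counts))
    vectors = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, vectors


vocabulary, vectors = bag_of_words(DOCS)
print(vocabulary)
for doc, vector in zip(DOCS, vectors):
    print(vector, "<-", doc)
```

Once every document is a fixed-length numeric vector, it can be passed to the same classifiers and clustering methods used for ordinary tabular data.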

*Pre-recorded Lesson: https://rutgers.mediaspace.kaltura.com/media/Module+10-+Representing+and+Mining+Text/1_8z39j6dg

Unsupervised Learning Algorithms

Module 11. Unsupervised Learning Algorithms

Similarity underlies many data science methods and solutions to business problems. If two things (people, companies, products) are similar in some ways, they often share other characteristics as well. Data mining procedures are often based on grouping things by similarity or searching for the “right” sort of similarity. We saw this implicitly in previous modules, where modeling procedures create boundaries for grouping together instances that have similar values for their target variables. In this module, we will look at similarity directly and show how it applies to a variety of different tasks. We may want to group similar items together into clusters, for example to see whether our customer base contains groups of similar customers and what these groups have in common. Previously we discussed supervised segmentation; this is unsupervised segmentation. After discussing the use of similarity for classification, we will discuss its use for clustering.
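To make the customer-grouping example concrete, here is a minimal sketch that measures similarity as Euclidean distance and groups customers with a basic k-means loop. The customer data, the choice of k-means, and the fixed starting centroids are assumptions made for illustration. The features are also left unscaled for brevity, which a real analysis would usually correct.

```python
import math

# Hypothetical customers described by (age, monthly_spend).
CUSTOMERS = [
    (22, 30.0), (25, 35.0), (24, 28.0),
    (58, 120.0), (61, 110.0), (60, 125.0),
]


def distance(a, b):
    """Euclidean distance: a smaller value means more similar."""
    return math.dist(a, b)


def k_means(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and re-centering."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: distance(point, centroids[i]),
            )
            clusters[nearest].append(point)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return clusters, centroids


# Assumption: start from the first and last customers as initial centroids.
clusters, centroids = k_means(CUSTOMERS, [CUSTOMERS[0], CUSTOMERS[-1]])
for cluster, centroid in zip(clusters, centroids):
    print(centroid, cluster)
```

No target variable appears anywhere in this code, which is what makes the segmentation unsupervised: the groups come only from how close the customers are to one another.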

*Pre-recorded Lesson: https://rutgers.mediaspace.kaltura.com/media/Module+11+-+Unsupervised+Data+Mining+and+Clustering/1_29zsu16i

Deep Learning

Module 12. Deep Learning

Deep learning, which is built on deep neural networks (DNNs), is a branch of machine learning that originates from artificial neural networks (ANNs). Driven by advances in computer hardware and algorithms and by the abundance of big data, deep learning has developed rapidly since the early 2000s. It brings new approaches and methodologies to bioinformatics and provides a new angle on challenging bioinformatics problems. Deep learning has many architectures, such as deep multilayer perceptrons (DMLP), deep belief networks (DBN), graph neural networks (GNN), recurrent neural networks (RNN), and convolutional neural networks (CNN). They bring state-of-the-art performance to biomedical research, clinical patient care, and electronic health record (EHR) analysis. Deep learning is sometimes regarded as a “magic bullet” for solving any challenging problem. In this module, we demystify deep learning, uncover the fundamental connection between machine learning and deep learning, and explain deep learning using the terminology introduced in the previous modules.

*Pre-recorded Lesson: https://rutgers.mediaspace.kaltura.com/media/Module+12+-+Deep-Learning.mp4/1_yd6vp7o7