RapidMiner & Data Science - Foundations & Advanced: Data Mining and Predictive Analytics

3 - 6 Dec 2018 | Kuala Lumpur
10 - 13 Dec 2018 | Singapore
11 - 14 Dec 2018 | Bangkok

Course Objectives

  • Perform all common data preparations
  • Build sophisticated predictive models
  • Evaluate model quality with respect to different criteria
  • Deploy analytical predictive models
  • Utilize more complex functionality of RapidMiner Studio
  • Apply more sophisticated analytical approaches

Description

This four-day course prepares analysts to apply what they learn to their own data mining problems and to solve them quickly and easily. The lessons are applicable to areas such as customer analytics, targeted marketing, social media analytics, fraud detection, predictive maintenance, and resource management.

Target Audience

This course is recommended for analysts and data scientists.

Training Outline

Foundations

Overview
  • Business scenario
  • Analytics Taxonomy & Hierarchy
  • CRISP-DM & Data mining in the enterprise
Basic Usage
  • User interface
  • Creating and Managing RapidMiner repositories
  • Operators and processes
  • Storing data, processes, and result sets
EDA: Exploratory Data Analysis
  • Loading Data
  • Quick Summary Statistics
  • Visualizing Data & Basic Charting
Data Preparation
  • Basic Data ETL (Extract, Transform, and Load)
  • Data Types & Transformations of value types
  • Handling missing values
  • Handling attribute roles
  • Normalization and standardization
  • Filtering examples and attributes
Building Better Processes
  • Organizing
  • Renaming
  • Relative Path
  • Sub-processes
  • Building Blocks
  • Breakpoints
Predictive Model Algorithms
  • K-Nearest Neighbour
  • Correlations
  • Naive Bayes
  • Linear Regression
  • Rules
  • Decision Trees
Model Construction and Evaluation
  • Machine Learning Theory: Bias, Variance, Overfitting & Underfitting
  • Split and Cross Validation
  • Applying models
  • Optimization and Parameter Tuning
  • Splitting data
  • Evaluation methods & Performance criteria
Additional Workshops
  • Outlier Detection
  • Random Forests
  • Ensemble Modeling

Advanced

Overview
  • Business case
  • Intro course review
  • Loading new data
EDA: Exploratory Data Analysis
  • Multiple sources
  • Joins & Set Theory
  • Understanding new attributes
Data Preparation
  • Advanced Data ETL (Extract, Transform, and Load)
  • Aggregation & Multi-level aggregation
  • Pivot & De-Pivot
  • Calculated values
  • Regular Expressions
  • Changing value types
  • Feature Generation and Feature Engineering
  • Loops
  • Macros
Predictive Model Algorithms
  • Support Vector Machines
  • K-Means Clustering
  • Neural Networks
  • Logistic Regression
Model Construction and Evaluation
  • Advanced performance criteria
  • ROC plots
  • Comparison between models
  • Sampling
  • Weighting
  • Feature Selection: Forward Selection
  • Feature Selection: Backward Elimination
  • Validation of preprocessing and preprocessing models
  • Optimization & Logging results
Additional Workshops
  • Principal Components Analysis
  • Logistic Regression
  • Performance (Cost) Model Optimization

Prerequisite

Basic knowledge of computer programs and mathematics.