Book Your Seat Today!

Please send us your company details, and one of our consultants will contact you as soon as possible.

Course Objectives

  • Understand Hadoop infrastructure
  • Connect to a Hadoop cluster
  • Explore large data stores
  • Perform data extraction and loading tasks
  • Integrate in-cluster analyses into RapidMiner processes

Description

This one-day course is designed to help participants leverage large data collections by converting raw data into valuable information using RapidMiner Radoop. RapidMiner Radoop provides ETL, analytics, and visualization in a single package and integrates seamlessly into new and existing RapidMiner processes to bring analytics into your Hadoop cluster. After completing this course, participants will have a solid understanding of how RapidMiner Radoop integrates with Hadoop. They will be able to connect to a Hadoop cluster; explore, extract, and load data; and integrate in-cluster analyses into RapidMiner processes.

Target Audience

Advanced Analysts, Data Scientists.

Training Outline

Introduction to Hadoop
  • Distributions
  • General Infrastructure
Data Exploration
  • Browsing Tables
  • Viewing Statistics and High-Level Information
Data Extraction and Loading
  • Formulation of Queries
  • Pushing Data into Hadoop
Integration of In-cluster Analyses into RapidMiner Processes
  • Modeling Algorithms
  • Natural Aggregation
  • In-memory Training, in-Hadoop Scoring
Beyond Natural Aggregation
  • Chunking
  • Voting
  • In-Hadoop Modeling
  • Clustering

Prerequisites

Basic knowledge of computer programs and mathematics. Completion of the following courses: RapidMiner & DataScience: Foundations; RapidMiner & DataScience: Advanced; and RapidMiner Server: Deployment and Web Apps.