
Big Data Mastery with Hadoop Bundle

Add to Cart - $39
$453
91% off
Courses
8
Lessons
289
Enrolled
227

What's Included

Product Details

Access
Lifetime
Content
5 hours
Lessons
56

Taming Big Data with MapReduce & Hadoop

Analyze Large Amounts of Data with Today's Top Big Data Technologies

By Sundog Software | in Online Courses

Big data is hot, and data management and analytics skills are your ticket to a fast-growing, lucrative career. This course will quickly teach you two technologies fundamental to big data: MapReduce and Hadoop. Learn and master the art of framing data analysis problems as MapReduce problems with over 10 hands-on examples. Write, analyze, and run real code along with the instructor, both on your own system and in the cloud using Amazon's Elastic MapReduce service. By course's end, you'll have a solid grasp of data management concepts.

  • Learn the concepts of MapReduce to analyze big sets of data w/ 56 lectures & 5.5 hours of content
  • Run MapReduce jobs quickly using Python & MRJob
  • Translate complex analysis problems into multi-stage MapReduce jobs
  • Scale up to larger data sets using Amazon's Elastic MapReduce service
  • Understand how Hadoop distributes MapReduce across computing clusters
  • Complete projects to get hands-on experience: analyze social media data, movie ratings & more
  • Learn about other Hadoop technologies, like Hive, Pig & Spark
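
As a taste of the framing described above, here is a minimal plain-Python simulation of the map/shuffle/reduce phases for the word-frequency example. This is an illustrative sketch, not the course's MRJob code; all names are made up:

```python
from collections import defaultdict

def mapper(line):
    # Map phase: emit (word, 1) for every word in a line of text.
    for word in line.lower().split():
        yield word, 1

def reducer(word, counts):
    # Reduce phase: sum all counts emitted for a single word.
    return word, sum(counts)

def run_job(lines):
    # Shuffle/sort: group mapper output by key, then reduce each group.
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)
    return dict(reducer(k, v) for k, v in groups.items())

print(run_job(["to be or not to be"]))  # {'to': 2, 'be': 2, 'or': 1, 'not': 1}
```

MRJob wraps this same idea in a class with mapper() and reducer() methods and handles the shuffle for you, whether you run locally or on Elastic MapReduce.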
Frank Kane spent 9 years at Amazon and IMDb, developing and managing the technology that automatically delivers product and movie recommendations to hundreds of millions of customers around the clock. Frank holds 17 issued patents in the fields of distributed computing, data mining, and machine learning. In 2012, Frank left to start his own successful company, Sundog Software, which focuses on virtual reality environment technology and on teaching others about big data analysis. For more details on this course and instructor, click here. This course is hosted by StackSkills, the premier eLearning destination for discovering top-shelf courses on everything from coding to business to fitness, and beyond!

Details & Requirements

  • Length of time users can access this course: lifetime
  • Access options: web streaming, mobile streaming
  • Certification of completion not included
  • Redemption deadline: redeem your code within 30 days of purchase
  • Experience level required: all levels

Compatibility

  • Internet required

Course Outline

  • Introduction
    • Introduction
  • Getting Started
    • Installing Enthought Canopy
    • Installing MRJob
    • Downloading the MovieLens Data Set
    • Run Your First MapReduce Job
  • Understanding MapReduce
    • MapReduce Basic Concepts
    • Walkthrough of Rating Histogram Code
    • Understanding How MapReduce Scales / Distributed Computing
    • Average Friends by Age Example: Part 1 (3:04)
    • Average Friends by Age Example: Part 2
    • Minimum Temperature By Location Example
    • Maximum Temperature By Location Example
    • Word Frequency in a Book Example
    • Making the Word Frequency Mapper Better with Regular Expressions
    • Sorting the Word Frequency Results Using Multi-Stage MapReduce Jobs
    • Activity: Design a Mapper and Reducer for Total Spent by Customer (2:54)
    • Activity: Write Code for Total Spent by Customer
    • Compare Your Code to Mine
    • Activity: Sort Results by Amount Spent
    • Compare Your Code to Mine for Sorted Results
    • Combiners
  • Advanced MapReduce Examples
    • Example: Most Popular Movie
    • Including Ancillary Lookup Data in the Example
    • Example: Most Popular Superhero, Part 1 (4:22)
    • Example: Most Popular Superhero, Part 2 (6:31)
    • Example: Degrees of Separation: Concepts
    • Degrees of Separation: Preprocessing the Data
    • Degrees of Separation: Code Walkthrough
    • Degrees of Separation: Running and Analyzing the Results
    • Example: Similar Movies Based on Ratings: Concepts
    • Similar Movies: Code Walkthrough
    • Similar Movies: Running and Analyzing the Results
    • Learning Activity: Improving our Movie Similarities MapReduce Job
  • Using Hadoop and Elastic MapReduce
    • Fundamental Concepts of Hadoop
    • The Hadoop Distributed File System (HDFS)
    • Apache YARN (4:20)
    • Hadoop Streaming: How Hadoop Runs your Python Code
    • Setting Up Your Amazon Elastic MapReduce Account
    • Linking Your EMR Account with MRJob (3:40)
    • Exercise: Run Movie Recommendations on Elastic MapReduce
    • Analyze the Results of Your EMR Job
  • Advanced Hadoop and EMR
    • Distributed Computing Fundamentals
    • Activity: Running Movie Similarities on Four Machines
    • Analyzing the Results of the 4-Machine Job
    • Troubleshooting Hadoop Jobs with EMR and MRJob, Part 1
    • Troubleshooting Hadoop Jobs, Part 2
    • Analyzing One Million Movie Ratings Across 16 Machines, Part 1
    • Analyzing One Million Movie Ratings Across 16 Machines, Part 2
  • Other Hadoop Technologies
    • Introducing Apache Hive
    • Introducing Apache Pig
    • Apache Spark: Concepts
    • Spark Example: Part 1
    • Spark Example: Part 2
    • Congratulations! (0:41)
  • Where to Go from Here
    • New Lecture



Access
Lifetime
Content
10 hours
Lessons
43

Projects in Hadoop and Big Data: Learn by Building Apps

Master One of the Most Important Big Data Technologies by Building Real Projects

By Eduonix Technologies | in Online Courses

Hadoop is perhaps the most important big data framework in existence, used by major data-driven companies around the globe. Hadoop and its associated technologies allow companies to manage huge amounts of data and make business decisions based on analytics surrounding that data. This course will take you from big data zero to hero, teaching you how to build Hadoop solutions that solve real-world problems and qualify you for high-paying jobs.

  • Access 43 lectures & 10 hours of content 24/7
  • Learn how technologies like MapReduce apply to clustering problems
  • Parse a Twitter stream w/ Python, extract keywords w/ Apache Pig, visualize data w/ NodeJS, & more
  • Set up a Kafka stream w/ Java code for producers & consumers
  • Explore real-world applications by building a relational schema for a health care data dictionary used by the US Department of Veterans Affairs
  • Log collections & analytics w/ the Hadoop distributed file system using Apache Flume & Apache HCatalog
Eduonix creates and distributes high-quality technology training content. Their team of industry professionals has been training professionals for more than a decade. They aim to teach technology the way it is used in industry and the professional world. They have a professional team of trainers for technologies spanning Mobility, Web and Enterprise, and Database and Server Administration.

Details & Requirements

  • Length of time users can access this course: lifetime
  • Access options: web streaming, mobile streaming
  • Certification of completion not included
  • Redemption deadline: redeem your code within 30 days of purchase
  • Experience level required: intermediate

Compatibility

  • Internet required

Course Outline

  • Introduction
    • Introduction
  • Add Value to Existing Data with Mapreduce
    • Introduction to the Project
    • Build and Run the Basic Code
    • Understanding the Code
    • Dependencies and packages
  • Hadoop Analytics and NoSQL
    • Introduction to Hadoop Analytics
    • Introduction to NoSQL Database
    • Solution Architecture
    • Installing the Solution
  • Kafka Streaming with Yarn and Zookeeper
    • Introduction to Kafka Yarn and Zookeeper
    • Code Structure
    • Creating Kafka Streams
    • Yarn Job with Samza
  • Real Time Stream processing with Apache Kafka and Apache Storm
    • Real Time Streaming
    • Hortonbox Virtual Machine
    • Running in Cluster Mode
    • Submitting the Storm Jar
  • Big Data Applications for the Healthcare Industry with Apache Sqoop and Apache Solr
    • Introduction to the Project
    • Introduction to HDDAccess
    • Sqoop, Hive and Solr
    • Hive Usage
  • Log collection and analytics with the Hadoop Distributed File System using Apache Flume and Apache HCatalog
    • Apache Flume and HCatalog
    • Install and Configure Apache Flume
    • Visualisation of the Data
    • Embedded Pig Scripts
  • Data Science with Hadoop Predictive Analytics
    • Introduction to Data Science
    • Source Code Review
    • Setting Up the Machine
    • Project Review
  • Visual Analytics with Apache Spark on Yarn
    • Project Setup
    • Setting Up Java Dependencies
    • Spark Analytics with PySpark
    • Bringing it all together
  • Customer 360-degree view: Big Data Analytics for e-commerce
    • Ecommerce and Big Data
    • Installing Datameer
    • Analytics and Visualizations
    • Demonstration
  • Putting it all together: Big Data with Amazon Elastic Map Reduce
    • Introduction to the Project
    • Configuration
    • Setting Up Cluster on EMR
    • Dedicated Task Cluster on EMR
  • Summary
    • Summary



Access
Lifetime
Content
15.5 hours
Lessons
76

Learn Hadoop, MapReduce and Big Data from Scratch

Master Big Data Ecosystems & Implementation to Further Your IT Professional Dream

By Eduonix Learning Solutions | in Online Courses

Have you ever wondered how major companies, universities, and organizations manage and process all the data they've collected over time? Well, the answer is Big Data, and people who can work with it are in huge demand. In this course, you'll cover the MapReduce algorithm and its most popular implementation, Apache Hadoop. Throughout this comprehensive course, you'll learn essential Big Data terminology, MapReduce concepts, advanced Hadoop development, and gain a complete understanding of the Hadoop ecosystem so you can become a big-time IT professional.

  • Access 76 lectures & 15.5 hours of content 24/7
  • Learn how to set up single-node Hadoop pseudo clusters
  • Understand & work w/ the architecture of clusters
  • Run multi-node clusters on Amazon's Elastic Map Reduce (EMR)
  • Master distributed file systems & operations, including running Hadoop on Hortonworks Sandbox & Cloudera
  • Use MapReduce w/ Hive & Pig
  • Discover data mining & filtering
  • Learn the differences between the Hadoop Distributed File System & the Google File System
Eduonix creates and distributes high-quality technology training content. Their team of industry professionals has been training professionals for more than a decade. They aim to teach technology the way it is used in industry and the professional world. They have a professional team of trainers for technologies spanning Mobility, Web and Enterprise, and Database and Server Administration.

Website - www.eduonix.com

For more details on this course and instructor, click here. This course is hosted by StackSkills, the premier eLearning destination for discovering top-shelf courses on everything from coding to business to fitness, and beyond!

Details & Requirements

  • Length of time users can access this course: lifetime
  • Access options: web streaming
  • Certification of completion not included
  • Redemption deadline: redeem your code within 30 days of purchase
  • Experience level required: beginner

Compatibility

  • Internet required

Course Outline

  • Introduction to Big Data
    • Introduction to the Course (4:55)
    • Why Hadoop, Big Data and Map Reduce Part - A
    • Why Hadoop, Big Data and Map Reduce Part - B
    • Why Hadoop, Big Data and Map Reduce Part - C
    • Architecture of Clusters
    • Virtual Machine (VM), Provisioning a VM with Vagrant and Puppet
  • Hadoop Architecture
    • Set up a single Node Hadoop pseudo cluster Part - A
    • Set up a single Node Hadoop pseudo cluster Part - B
    • Set up a single Node Hadoop pseudo cluster Part - C
    • Clusters and Nodes, Hadoop Cluster Part - A
    • Clusters and Nodes, Hadoop Cluster Part - B
    • NameNode, Secondary Name Node, Data Nodes Part - A
    • NameNode, Secondary Name Node, Data Nodes Part - B
    • Running Multi-node clusters on Amazon's EMR Part - A
    • Running Multi-node clusters on Amazon's EMR Part - B
    • Running Multi-node clusters on Amazon's EMR Part - C
    • Running Multi-node clusters on Amazon's EMR Part - D
    • Running Multi-node clusters on Amazon's EMR Part - E
  • Distributed file systems
    • HDFS vs GFS: a comparison
    • Run Hadoop on Cloudera, Web Administration
    • Run Hadoop on Hortonworks Sandbox
    • File system operations with the HDFS shell Part - A
    • File system operations with the HDFS shell Part - B
    • Advanced Hadoop development with Apache Bigtop Part - A
    • Advanced Hadoop development with Apache Bigtop Part - B
  • Mapreduce Version 1
    • MapReduce Concepts in detail Part - A (13:12)
    • MapReduce Concepts in detail Part - B (13:12)
    • Jobs definition, Job configuration, submission, execution and monitoring Part - A (13:12)
    • Jobs definition, Job configuration, submission, execution and monitoring Part - B (13:12)
    • Jobs definition, Job configuration, submission, execution and monitoring Part - C (13:12)
    • Hadoop Data Types, Paths, FileSystem, Splitters, Readers and Writers Part A (13:12)
    • Hadoop Data Types, Paths, FileSystem, Splitters, Readers and Writers Part B (10:39)
    • Hadoop Data Types, Paths, FileSystem, Splitters, Readers and Writers Part C (10:39)
    • The ETL class, Definition, Extract, Transform, and Load Part - A (10:39)
    • The ETL class, Definition, Extract, Transform, and Load Part - B (10:39)
    • The UDF class, Definition, User Defined Functions Part - A (12:19)
    • The UDF class, Definition, User Defined Functions Part - B (12:19)
  • Mapreduce with Hive ( Data warehousing )
    • Schema design for a Data warehouse Part - A
    • Schema design for a Data warehouse Part - B
    • Hive Configuration
    • Hive Query Patterns Part - A
    • Hive Query Patterns Part - B
    • Hive Query Patterns Part - C
    • Hive Query Patterns Part - D
    • Example Hive ETL class Part - A
    • Example Hive ETL class Part - B
    • Example Hive ETL class Part - D
  • Mapreduce with Pig (Parallel processing)
    • Introduction to Apache Pig Part - A (14:02)
    • Introduction to Apache Pig Part - B (14:02)
    • Introduction to Apache Pig Part - C (9:08)
    • Introduction to Apache Pig Part - D (9:08)
    • Pig LoadFunc and EvalFunc classes (9:08)
    • Example Pig ETL class Part - A (9:08)
    • Example Pig ETL class Part - B (9:08)
  • The Hadoop Ecosystem
    • Introduction to Crunch Part - A (9:08)
    • Introduction to Crunch Part - B (12:53)
    • Introduction to Avro (12:53)
    • Introduction to Mahout Part - A (12:53)
    • Introduction to Mahout Part - B (12:53)
    • Introduction to Mahout Part - C (13:32)
  • Mapreduce Version 2
    • Apache Hadoop 2 and YARN Part - A (13:32)
    • Apache Hadoop 2 and YARN Part - B (13:32)
    • Yarn Examples (13:32)
  • Putting it all together
    • Amazon EMR example Part - A (13:32)
    • Amazon EMR example Part - B (13:32)
    • Amazon EMR example Part - C (13:32)
    • Amazon EMR example Part - D (13:32)
    • Apache Bigtop example Part - A (12:46)
    • Apache Bigtop example Part - B (12:46)
    • Apache Bigtop example Part - C (12:46)
    • Apache Bigtop example Part - D (12:46)
    • Apache Bigtop example Part - E (12:46)
    • Apache Bigtop example Part - F (12:46)
    • Course Summary (12:46)
    • References
    • Example Hive ETL class Part C



Access
Lifetime
Content
5 hours
Lessons
30

Introduction to Hadoop

Get Familiar with One of the Top Big Data Frameworks In the World

By Loony Corn | in Online Courses

Hadoop is one of the most commonly used Big Data frameworks, supporting the processing of large data sets in a distributed computing environment. This tool is becoming more and more essential to big business as the world becomes more data-driven. In this introduction, you'll cover the individual components of Hadoop in detail and get a higher level picture of how they interact with one another. It's an excellent first step towards mastering Big Data processes.

  • Access 30 lectures & 5 hours of content 24/7
  • Install Hadoop in Standalone, Pseudo-Distributed, & Fully Distributed mode
  • Set up a Hadoop cluster using Linux VMs
  • Build a cloud Hadoop cluster on AWS w/ Cloudera Manager
  • Understand HDFS, MapReduce, & YARN & their interactions
Loonycorn comprises four individuals--Janani Ravi, Vitthal Srinivasan, Swetha Kolalapudi and Navdeep Singh--who honed their tech expertise at Google and Flipkart. The team believes it has distilled complicated tech concepts into fun, practical, engaging courses, and is excited to share its content with eager students.

Details & Requirements

  • Length of time users can access this course: lifetime
  • Access options: web streaming
  • Certification of completion not included
  • Redemption deadline: redeem your code within 30 days of purchase
  • Experience level required: beginner
  • IDE like IntelliJ or Eclipse required (free to download)

Compatibility

  • Internet required

Course Outline

  • Introduction
    • You, this course and Us (1:17)
  • Why is Big Data a Big Deal
    • The Big Data Paradigm (14:20)
    • Serial vs Distributed Computing (8:37)
    • What is Hadoop? (7:25)
    • HDFS or the Hadoop Distributed File System (11:01)
    • MapReduce Introduced (11:39)
    • YARN or Yet Another Resource Negotiator (4:00)
  • Installing Hadoop in a Local Environment
    • Hadoop Install Modes (8:32)
    • Setup a Virtual Linux Instance (For Windows users) (15:31)
    • Hadoop Standalone mode Install (9:33)
    • Hadoop Pseudo-Distributed mode Install (14:25)
  • The MapReduce "Hello World"
    • The basic philosophy underlying MapReduce (8:49)
    • MapReduce - Visualized And Explained (9:03)
    • MapReduce - Digging a little deeper at every step (10:21)
    • "Hello World" in MapReduce (10:29)
    • The Mapper (9:48)
    • The Reducer (7:46)
    • The Job (12:28)
  • Run a MapReduce Job
    • Get comfortable with HDFS (10:59)
    • Run your first MapReduce Job (14:30)
  • HDFS and Yarn
    • HDFS - Protecting against data loss using replication (15:32)
    • HDFS - Name nodes and why they're critical (6:48)
    • HDFS - Checkpointing to backup name node information (11:10)
    • Yarn - Basic components (8:33)
    • Yarn - Submitting a job to Yarn (13:10)
    • Yarn - Plug in scheduling policies (14:21)
    • Yarn - Configure the scheduler (12:26)
  • Setting up a Hadoop Cluster
    • Manually configuring a Hadoop cluster (Linux VMs) (13:50)
    • Getting started with Amazon Web Services (6:25)
    • Start a Hadoop Cluster with Cloudera Manager on AWS (13:04)



Access
Lifetime
Content
4.5 hours
Lessons
24

Advanced MapReduce in Hadoop

Perform Advanced Big Data Functions with Hadoop

By Loonycorn | in Online Courses

Take your Hadoop skills to a whole new level by exploring its features for controlling and customizing MapReduce to a very granular level. Covering advanced topics like building inverted indexes for search engines, generating bigrams, combining multiple jobs, and much more, this course will push your skills towards a professional level.

  • Access 24 lectures & 4.5 hours of content 24/7
  • Cover advanced MapReduce topics like mapper, reducer, sort/merge, partitioning, & more
  • Use MapReduce to build an inverted index for search engines & generate bigrams from text
  • Chain multiple MapReduce jobs together
  • Write your own customized partitioner
  • Sort a large amount of data by sampling input files
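
For a flavor of the bigram exercise mentioned above, a bigram count follows the same mapper/reducer shape as a word count, just with a composite key. Here is a plain-Python sketch (illustrative names, no Hadoop involved):

```python
from collections import Counter

def bigram_mapper(line):
    # Map phase: emit each adjacent word pair as a composite key with count 1.
    words = line.lower().split()
    for pair in zip(words, words[1:]):
        yield pair, 1

def count_bigrams(lines):
    # Reduce phase: sum the counts emitted for each bigram key.
    counts = Counter()
    for line in lines:
        for pair, one in bigram_mapper(line):
            counts[pair] += one
    return counts

counts = count_bigrams(["the quick the quick"])
```

In real Hadoop that composite key would be a custom WritableComparable, which is exactly what the course's bigram lessons walk through.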
Loonycorn comprises four individuals--Janani Ravi, Vitthal Srinivasan, Swetha Kolalapudi and Navdeep Singh--who honed their tech expertise at Google and Flipkart. The team believes it has distilled complicated tech concepts into fun, practical, engaging courses, and is excited to share its content with eager students.

Details & Requirements

  • Length of time users can access this course: lifetime
  • Access options: web streaming, mobile streaming
  • Certification of completion not included
  • Redemption deadline: redeem your code within 30 days of purchase
  • Experience level required: all levels
  • IDE like IntelliJ or Eclipse required (free to download)

Compatibility

  • Internet required

Course Outline

  • Introduction
    • You, this course and Us (1:20)
  • Juicing your MapReduce - Combiners, Shuffle and Sort and The Streaming API
    • Parallelize the reduce phase - use the Combiner (14:40)
    • Not all Reducers are Combiners (14:31)
    • How many mappers and reducers does your MapReduce have? (8:23)
    • Parallelizing reduce using Shuffle And Sort (14:55)
    • MapReduce is not limited to the Java language - Introducing the Streaming API (5:05)
    • Python for MapReduce (12:19)
  • MapReduce Customizations For Finer Grained Control
    • Setting up your MapReduce to accept command line arguments (13:47)
    • The Tool, ToolRunner and GenericOptionsParser (12:36)
    • Configuring properties of the Job object (10:41)
    • Customizing the Partitioner, Sort Comparator, and Group Comparator (15:16)
  • The Inverted Index, Custom Data Types for Keys, Bigram Counts and Unit Tests!
    • The heart of search engines - The Inverted Index (14:41)
    • Generating the inverted index using MapReduce (10:25)
    • Custom data types for keys - The Writable Interface (10:23)
    • Represent a Bigram using a WritableComparable (13:13)
    • MapReduce to count the Bigrams in input text (8:26)
    • Test your MapReduce job using MRUnit (13:41)
  • Input and Output Formats and Customized Partitioning
    • Introducing the File Input Format (12:48)
    • Text And Sequence File Formats (10:21)
    • Data partitioning using a custom partitioner (7:11)
    • Make the custom partitioner real in code (10:25)
    • Total Order Partitioning (10:10)
    • Input Sampling, Distribution, Partitioning and configuring these (9:04)
    • Secondary Sort (14:34)



Access
Lifetime
Content
1.5 hours
Lessons
49

Database Operations via Hadoop and MapReduce

Analyze Data More Efficiently by Learning MapReduce's Parallels to SQL

By Loonycorn | in Online Courses

Analyzing data is essential to making informed business decisions, and most data analysts use SQL queries to get the answers they're looking for. In this course, you'll learn how to map constructs in SQL to corresponding design patterns for MapReduce jobs, allowing you to understand how the two approaches can be leveraged together to simplify data problems.

  • Access 49 lectures & 1.5 hours of content 24/7
  • Master the art of "thinking parallel" to break tasks into MapReduce transformations
  • Use Hadoop & MapReduce to implement SQL query-like operations
  • Work through SQL constructs such as select, where, group by, & more w/ their corresponding MapReduce jobs in Hadoop
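
To illustrate the mapping described above, a query like SELECT dept, SUM(salary) FROM employees WHERE salary >= 70 GROUP BY dept can be simulated in plain Python as a single MapReduce pass. This is a sketch with a made-up table, not code from the course:

```python
from collections import defaultdict

# A made-up "employees" table; in Hadoop each row would be a line in HDFS.
rows = [
    {"dept": "eng", "salary": 100},
    {"dept": "eng", "salary": 80},
    {"dept": "sales", "salary": 60},
]

def mapper(row):
    # WHERE salary >= 70 runs in the map phase;
    # GROUP BY dept is expressed by choosing dept as the key.
    if row["salary"] >= 70:
        yield row["dept"], row["salary"]

def reducer(dept, salaries):
    # SUM(salary) runs in the reduce phase.
    return dept, sum(salaries)

groups = defaultdict(list)
for row in rows:
    for key, value in mapper(row):
        groups[key].append(value)
result = dict(reducer(k, v) for k, v in groups.items())
print(result)  # {'eng': 180}
```

The same pattern extends to the join operations in the outline, where the join key becomes the map output key.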
Loonycorn comprises four individuals--Janani Ravi, Vitthal Srinivasan, Swetha Kolalapudi and Navdeep Singh--who honed their tech expertise at Google and Flipkart. The team believes it has distilled complicated tech concepts into fun, practical, engaging courses, and is excited to share its content with eager students.

Details & Requirements

  • Length of time users can access this course: lifetime
  • Access options: web streaming, mobile streaming
  • Certification of completion not included
  • Redemption deadline: redeem your code within 30 days of purchase
  • Experience level required: all levels
  • IDE like IntelliJ or Eclipse required (free to download)

Compatibility

  • Internet required

Course Outline

  • Introduction
    • You, this course and Us (1:11)
  • Hadoop as a Database
    • Structured data in Hadoop (14:08)
    • Running an SQL Select with MapReduce (15:31)
    • Running an SQL Group By with MapReduce (14:02)
    • A MapReduce Join - The Map Side (14:20)
    • A MapReduce Join - The Reduce Side (13:08)
    • A MapReduce Join - Sorting and Partitioning (8:49)
    • A MapReduce Join - Putting it all together (13:46)



Access
Lifetime
Content
1 hour
Lessons
4

Recommendation Systems Via Hadoop And MapReduce

Build a Social Network Friend Recommendation Service from Scratch

By Loonycorn | in Online Courses

You see recommendation algorithms all the time, whether you realize it or not. Whether it's Amazon recommending a product, Facebook recommending a friend, or Netflix recommending a new TV show, recommendation systems are a big part of internet life. This is done with collaborative filtering, something you can perform through MapReduce with data collected in Hadoop. In this course, you'll learn how to do it.

  • Access 4 lectures & 1 hour of content 24/7
  • Master the art of "thinking parallel" to break tasks into MapReduce transformations
  • Use Hadoop & MapReduce to implement a recommendations algorithm
  • Recommend friends on a social networking site using a MapReduce collaborative filtering algorithm
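
The core trick behind the friend-recommendation lessons can be sketched in plain Python: every pair of users in one person's friend list shares that person as a common friend. This uses an illustrative toy graph, not the course's chained-MR implementation:

```python
from collections import defaultdict
from itertools import combinations

# Toy adjacency lists (hypothetical users): user -> set of friends.
friends = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice", "dave"},
    "dave": {"bob", "carol"},
}

def mapper(user, user_friends):
    # Every pair of this user's friends shares `user` as a common friend.
    for pair in combinations(sorted(user_friends), 2):
        yield pair, user

def recommend(graph):
    # Shuffle/reduce: group common friends by candidate pair, then count,
    # keeping only pairs who are not already friends.
    common = defaultdict(set)
    for user, fs in graph.items():
        for pair, via in mapper(user, fs):
            common[pair].add(via)
    return {(a, b): len(via) for (a, b), via in common.items()
            if b not in graph[a]}

print(recommend(friends))  # {('bob', 'carol'): 2, ('alice', 'dave'): 2}
```

The course chains two MapReduce jobs for this: one to count common friends per pair, a second to pick the top 10 recommendations per user.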
Loonycorn comprises four individuals--Janani Ravi, Vitthal Srinivasan, Swetha Kolalapudi and Navdeep Singh--who honed their tech expertise at Google and Flipkart. The team believes it has distilled complicated tech concepts into fun, practical, engaging courses, and is excited to share its content with eager students.

Details & Requirements

  • Length of time users can access this course: lifetime
  • Access options: web streaming, mobile streaming
  • Certification of completion not included
  • Redemption deadline: redeem your code within 30 days of purchase
  • Experience level required: all levels
  • IDE like IntelliJ or Eclipse required (free to download)

Compatibility

  • Internet required

Course Outline

  • Introduction
    • You, this course and Us (1:11)
  • Recommendation Systems using Collaborative Filtering
    • Introduction to Collaborative Filtering (7:25)
    • Friend recommendations using chained MR jobs (17:15)
    • Get common friends for every pair of users - the first MapReduce (14:50)
    • Top 10 friend recommendation for every user - the second MapReduce (13:46)



Access
Lifetime
Content
1.5 hours
Lessons
7

K-Means Clustering via Hadoop And MapReduce

Decipher Big Data Sets Through a Prominent Machine Learning Algorithm

By Loonycorn | in Online Courses

Data, especially in the enterprise, often grows at a rapid rate. Hadoop excels at compiling and organizing this data; however, to do anything meaningful with it, you may need to run machine learning algorithms to decipher patterns. In this course, you'll learn one such algorithm, the K-Means clustering algorithm, and how to use MapReduce to implement it in Hadoop.

  • Access 7 lectures & 1.5 hours of content 24/7
  • Master the art of "thinking parallel" to break tasks into MapReduce transformations
  • Use Hadoop & MapReduce to implement the K-Means clustering algorithm
  • Convert algorithms into MapReduce patterns
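
One iteration of K-Means maps each point to its nearest centroid, then reduces each cluster to its mean. A minimal one-dimensional plain-Python sketch of that map/reduce split (illustrative only, not the course's Hadoop code):

```python
from collections import defaultdict

def nearest(point, centroids):
    # Map step: index of the centroid closest to this point.
    return min(range(len(centroids)),
               key=lambda i: (point - centroids[i]) ** 2)

def kmeans_step(points, centroids):
    # One MapReduce-style iteration: assign points to clusters (map),
    # then recompute each centroid as the mean of its cluster (reduce).
    clusters = defaultdict(list)
    for p in points:
        clusters[nearest(p, centroids)].append(p)
    return [sum(ps) / len(ps) for _, ps in sorted(clusters.items())]

print(kmeans_step([1.0, 2.0, 9.0, 10.0], [0.0, 5.0]))  # [1.5, 9.5]
```

A full implementation repeats this step until the centroids stop moving, which on Hadoop means chaining MapReduce jobs, as the "Iterative MapReduce Job" lesson covers.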
Loonycorn comprises four individuals--Janani Ravi, Vitthal Srinivasan, Swetha Kolalapudi and Navdeep Singh--who honed their tech expertise at Google and Flipkart. The team believes it has distilled complicated tech concepts into fun, practical, engaging courses, and is excited to share its content with eager students.

Details & Requirements

  • Length of time users can access this course: lifetime
  • Access options: web streaming, mobile streaming
  • Certification of completion not included
  • Redemption deadline: redeem your code within 30 days of purchase
  • Experience level required: all levels
  • IDE like IntelliJ or Eclipse required (free to download)

Compatibility

  • Internet required

Course Outline

  • Introduction
    • You, this course and Us (1:11)
  • K-Means Clustering
    • What is K-Means Clustering? (14:04)
    • A MapReduce job for K-Means Clustering (16:33)
    • K-Means Clustering - Measuring the distance between points (13:52)
    • K-Means Clustering - Custom Writables for Input/Output (8:26)
    • K-Means Clustering - Configuring the Job (10:50)
    • K-Means Clustering - The Mapper and Reducer (11:23)
    • K-Means Clustering - The Iterative MapReduce Job (3:40)




Terms

  • Instant digital redemption

15-Day Satisfaction Guarantee

We want you to be happy with every course you purchase! If you're unsatisfied for any reason, we will issue a store credit refund within 15 days of purchase.