
Big Data & Analytics Bundle

Price: $45 ($723 value; 93% off)

What's Included

Product Details

Access: Lifetime
Content: 5 hours
Lessons: 30

Introduction to Hadoop

Get Familiar with One of the Top Big Data Frameworks In the World

By Loonycorn | in Online Courses

Hadoop is one of the most commonly used Big Data frameworks, supporting the processing of large data sets in a distributed computing environment. This tool is becoming essential to big business as the world grows more data-driven. In this introduction, you'll cover the individual components of Hadoop in detail and get a higher-level picture of how they interact with one another. It's an excellent first step towards mastering Big Data processes.

  • Access 30 lectures & 5 hours of content 24/7
  • Install Hadoop in Standalone, Pseudo-Distributed, & Fully Distributed mode
  • Set up a Hadoop cluster using Linux VMs
  • Build a cloud Hadoop cluster on AWS w/ Cloudera Manager
  • Understand HDFS, MapReduce, & YARN & their interactions
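The MapReduce "Hello World" in this course's outline is the classic word count. As a rough illustration of the idea, here is a single-process Python simulation of the map, shuffle, and reduce phases; the function names are illustrative, not actual course code:

```python
from itertools import groupby
from operator import itemgetter

def mapper(line):
    # Map phase: emit a (word, 1) pair for every word in the input line.
    for word in line.lower().split():
        yield word, 1

def reducer(word, counts):
    # Reduce phase: sum all the counts emitted for one word.
    return word, sum(counts)

def run_job(lines):
    # Simulate the shuffle/sort step Hadoop performs between map and reduce:
    # collect all pairs, sort by key, then group each key's values.
    pairs = sorted(kv for line in lines for kv in mapper(line))
    return dict(reducer(word, (c for _, c in group))
                for word, group in groupby(pairs, key=itemgetter(0)))

counts = run_job(["big data is big", "hadoop handles big data"])
# counts["big"] == 3, counts["data"] == 2
```

On a real cluster, Hadoop would run many mapper and reducer tasks in parallel across nodes, with HDFS holding the input and YARN scheduling the work, but the per-record logic is the same.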
Loonycorn comprises four individuals—Janani Ravi, Vitthal Srinivasan, Swetha Kolalapudi, and Navdeep Singh—who have honed their tech expertise at Google and Flipkart. The team believes it has distilled the instruction of complicated tech concepts into funny, practical, engaging courses, and is excited to share its content with eager students.

Details & Requirements

  • Length of time users can access this course: lifetime
  • Access options: web streaming
  • Certification of completion not included
  • Redemption deadline: redeem your code within 30 days of purchase
  • Experience level required: beginner
  • IDE like IntelliJ or Eclipse required (free to download)

Compatibility

  • Internet required

Course Outline

  • Introduction
    • You, this course and Us (1:17)
  • Why is Big Data a Big Deal
    • The Big Data Paradigm (14:20)
    • Serial vs Distributed Computing (8:37)
    • What is Hadoop? (7:25)
    • HDFS or the Hadoop Distributed File System (11:01)
    • MapReduce Introduced (11:39)
    • YARN or Yet Another Resource Negotiator (4:00)
  • Installing Hadoop in a Local Environment
    • Hadoop Install Modes (8:32)
    • Setup a Virtual Linux Instance (For Windows users) (15:31)
    • Hadoop Standalone mode Install (9:33)
    • Hadoop Pseudo-Distributed mode Install (14:25)
  • The MapReduce "Hello World"
    • The basic philosophy underlying MapReduce (8:49)
    • MapReduce - Visualized And Explained (9:03)
    • MapReduce - Digging a little deeper at every step (10:21)
    • "Hello World" in MapReduce (10:29)
    • The Mapper (9:48)
    • The Reducer (7:46)
    • The Job (12:28)
  • Run a MapReduce Job
    • Get comfortable with HDFS (10:59)
    • Run your first MapReduce Job (14:30)
  • HDFS and Yarn
    • HDFS - Protecting against data loss using replication (15:32)
    • HDFS - Name nodes and why they're critical (6:48)
    • HDFS - Checkpointing to backup name node information (11:10)
    • Yarn - Basic components (8:33)
    • Yarn - Submitting a job to Yarn (13:10)
    • Yarn - Plug in scheduling policies (14:21)
    • Yarn - Configure the scheduler (12:26)
  • Setting up a Hadoop Cluster
    • Manually configuring a Hadoop cluster (Linux VMs) (13:50)
    • Getting started with Amazon Web Services (6:25)
    • Start a Hadoop Cluster with Cloudera Manager on AWS (13:04)

View Full Curriculum


Access: Lifetime
Content: 1 hour
Lessons: 4

Recommendation Systems Via Hadoop And MapReduce

Build a Social Network Friend Recommendation Service from Scratch

By Loonycorn | in Online Courses

You see recommendation algorithms all the time, whether you realize it or not. Whether it's Amazon recommending a product, Facebook recommending a friend, or Netflix recommending a new TV show, recommendation systems are a big part of internet life. They rely on collaborative filtering, a technique you can implement with MapReduce on data stored in Hadoop. In this course, you'll learn how to do it.

  • Access 4 lectures & 1 hour of content 24/7
  • Master the art of "thinking parallel" to break tasks into MapReduce transformations
  • Use Hadoop & MapReduce to implement a recommendations algorithm
  • Recommend friends on a social networking site using a MapReduce collaborative filtering algorithm
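The two chained MapReduce jobs described in the outline can be approximated in a single process. A minimal sketch of the collaborative-filtering idea, with toy data and illustrative names (not course code):

```python
from collections import Counter
from itertools import combinations

# Toy friendship graph: user -> set of direct friends (symmetric).
friends = {
    "alice": {"bob", "carol"},
    "bob":   {"alice", "dave"},
    "carol": {"alice", "dave"},
    "dave":  {"bob", "carol"},
}

def recommend(friends):
    # "First MapReduce": each user emits every pair of their own friends,
    # since those two people share this user as a common friend.
    common = Counter()
    for user, flist in friends.items():
        for a, b in combinations(sorted(flist), 2):
            if b not in friends[a]:          # skip pairs who are already friends
                common[(a, b)] += 1
    # "Second MapReduce": rank candidate pairs by common-friend count.
    return common

recs = recommend(friends)
# alice and dave share two common friends (bob and carol), so they are
# each other's top recommendation
```

On Hadoop, each of the two stages would be a separate MapReduce job, with the first job's output feeding the second.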
Loonycorn comprises four individuals—Janani Ravi, Vitthal Srinivasan, Swetha Kolalapudi, and Navdeep Singh—who have honed their tech expertise at Google and Flipkart. The team believes it has distilled the instruction of complicated tech concepts into funny, practical, engaging courses, and is excited to share its content with eager students.

Details & Requirements

  • Length of time users can access this course: lifetime
  • Access options: web streaming, mobile streaming
  • Certification of completion not included
  • Redemption deadline: redeem your code within 30 days of purchase
  • Experience level required: all levels
  • IDE like IntelliJ or Eclipse required (free to download)

Compatibility

  • Internet required

Course Outline

  • Introduction
    • You, this course and Us (1:11)
  • Recommendation Systems using Collaborative Filtering
    • Introduction to Collaborative Filtering (7:25)
    • Friend recommendations using chained MR jobs (17:15)
    • Get common friends for every pair of users - the first MapReduce (14:50)
    • Top 10 friend recommendation for every user - the second MapReduce (13:46)

View Full Curriculum


Access: Lifetime
Content: 4.5 hours
Lessons: 41

Learn by Example: HBase - The Hadoop Database

Create More Flexible Databases by Mastering HBase

By Loonycorn | in Online Courses

For Big Data engineers and data analysts, HBase is an extremely effective database for organizing and managing massive data sets. HBase allows an increased level of flexibility, providing column-oriented storage, no fixed schema, and low latency to accommodate the dynamically changing needs of applications. With the 25 examples contained in this course, you'll get a complete grasp of HBase that you can leverage in interviews for Big Data positions.

  • Access 41 lectures & 4.5 hours of content 24/7
  • Set up a database for your application using HBase
  • Integrate HBase w/ MapReduce for data processing tasks
  • Create tables, insert, read & delete data from HBase
  • Get a complete understanding of HBase & its role in the Hadoop ecosystem
  • Explore CRUD operations in the shell, & with the Java API
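To make the column-oriented, schema-less data model concrete, here is a toy Python model of how HBase stores data: a map from (row key, column, timestamp) to value, with versioned cells. This is an illustration only, not the HBase API, and the class and column names are made up:

```python
import time

class TinyHBaseModel:
    """Toy illustration of HBase's data model: versioned cells addressed
    by row key and column (family:qualifier), newest version wins."""

    def __init__(self):
        self.cells = {}   # (row, column) -> list of (timestamp, value)

    def put(self, row, column, value, ts=None):
        # Columns need not be declared up front: any family:qualifier works.
        self.cells.setdefault((row, column), []).append(
            (ts if ts is not None else time.time(), value))

    def get(self, row, column):
        versions = self.cells.get((row, column), [])
        # Like HBase, return the newest version of the cell by default.
        return max(versions)[1] if versions else None

t = TinyHBaseModel()
t.put("user1", "notifications:type", "friend_request", ts=1)
t.put("user1", "notifications:type", "message", ts=2)
# t.get("user1", "notifications:type") -> "message" (latest version wins)
```

Real HBase clients (the shell and the Java API covered below) expose put/get/scan/delete against tables with declared column families, but the versioned-cell model is the same.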
Loonycorn comprises four individuals—Janani Ravi, Vitthal Srinivasan, Swetha Kolalapudi, and Navdeep Singh—who have honed their tech expertise at Google and Flipkart. The team believes it has distilled the instruction of complicated tech concepts into funny, practical, engaging courses, and is excited to share its content with eager students.

Details & Requirements

  • Length of time users can access this course: lifetime
  • Access options: web streaming, mobile streaming
  • Certification of completion not included
  • Redemption deadline: redeem your code within 30 days of purchase
  • Experience level required: all levels, but some knowledge of Java is assumed

Compatibility

  • Internet required

Course Outline

  • You, This Course and Us
    • You, This Course and Us (1:50)
  • Introduction to HBase
    • The problem with distributed computing (7:17)
    • Installing HBase (10:57)
    • The Hadoop ecosystem (8:01)
    • The role of HBase in the Hadoop ecosystem (9:42)
    • How is HBase different from RDBMS? (3:10)
    • HBase Data Model (10:44)
    • Introducing CRUD operations (8:32)
    • HBase is different from Hive (4:48)
  • CRUD operations using the HBase Shell
    • Example1 - Creating a table for User Notifications (5:24)
    • Example 2 - Inserting a row (19:52)
    • Example 3 - Updating a row (19:15)
    • Example 4 - Retrieving a row (20:25)
    • Example 5 - Retrieving a range of rows (3:48)
    • Example 6 - Deleting a row (2:11)
    • Example 7 - Deleting a table (2:17)
  • CRUD operations using the Java API
    • Example 8 - Creating a table with HBaseAdmin (6:36)
    • Example 9 - Inserting a row using a Put object (8:33)
    • Example 10 - Inserting a list of Puts (3:30)
    • Example 11 - Retrieving data - Get and Result objects (10:55)
    • Example 12 - A list of Gets (3:34)
    • Example 13 - Deleting a row (2:25)
    • Example 14 - A list of Deletes (2:36)
    • Example 15 - Mix and match with batch operations (6:02)
    • Example 16 - Scanning a range of rows (8:06)
    • Example 17 - Deleting a table (3:51)
  • HBase Architecture
    • HBase Architecture (9:20)
  • Advanced operations - Filters and Counters
    • Example 18 - Filter by Row id - RowFilter (8:56)
    • Example 19 - Filter by column value - SingleColumnValueFilter (5:13)
    • Example 20 - Apply multiple conditions - Filterlist (4:31)
    • Example 21 - Retrieve rows within a time range (2:11)
    • Example 22 - Atomically incrementing a value with Counters (7:31)
  • MapReduce with HBase
    • Example 23: A MapReduce task to count Notifications by Type (10:24)
    • Example 23 continued: Implementing the MapReduce in Java (13:35)
    • Demo: Running a MapReduce task (2:21)
  • Build a Notification Service
    • Example 24 : Implement a Notification Hierarchy (13:30)
    • Example 25: Implement a Notifications Manager (12:05)
  • Installing Hadoop in a Local Environment
    • Hadoop Install Modes (8:32)
    • Setup a Virtual Linux Instance (For Windows users) (15:31)
    • Hadoop Standalone mode Install (9:33)
    • Hadoop Pseudo-Distributed mode Install (14:25)

View Full Curriculum


Access: Lifetime
Content: 6.5 hours
Lessons: 67

Learn By Example: Scala

Master This Highly Scalable General-Purpose Language with 65 Examples

By Loonycorn | in Online Courses

The best way to learn is by example, and in this course you'll get the lowdown on Scala with 65 comprehensive, hands-on examples. Scala is a general-purpose programming language that is highly scalable, making it incredibly useful in building programs. Over this immersive course, you'll explore just how Scala can help your programming skill set, and how you can set yourself apart from other programmers by knowing this efficient tool.

  • Access 67 lectures & 6.5 hours of content 24/7
  • Use Scala w/ an intermediate level of proficiency
  • Read & understand Scala programs, including those w/ highly functional forms
  • Identify the similarities & differences between Java & Scala to use each to their advantages
Loonycorn comprises two individuals—Janani Ravi and Vitthal Srinivasan—who have honed their respective tech expertise at Google and Flipkart. The duo graduated from Stanford University and believes it has distilled the instruction of complicated tech concepts into funny, practical, engaging courses, and is excited to share its content with eager students.

Important Details

  • Length of time users can access this course: lifetime
  • Access options: web streaming, mobile streaming
  • Certification of completion not included
  • Redemption deadline: redeem your code within 30 days of purchase
  • Experience level required: intermediate

Compatibility

  • Internet required

Course Outline

  • You, This Course and Us
    • You, This Course and Us
  • Introducing Scala
    • Introducing Scala: Java's Cool Cousin
    • Installing Scala
    • Examples 1 and 2 - Hello world
    • Example 3 - Mutable and Immutable ‘variables’
    • Example 4 - Type Inference
    • Example 5 - String Operations
    • Example 6 - A Unified Type System
    • Example 7 - Emptiness in Scala
    • Example 8 - Type Operations
  • Expressions or Statements?
    • Example 9 - Statements v Expressions
    • Example 10 - Defining Values and Variables via Expressions
    • Example 11 - Nested Scopes in Expression Blocks
    • Example 12 - If/Else expression blocks
    • Example 13 - match expressions
    • Example 14 - match expressions: Pattern guards & OR-ed expressions
    • Example 15 - match expressions: catch-all to match-all
    • Example 16 - match expressions: down casting with Pattern Variables
    • Example 17 - for loops can be expressions OR statements
    • Example 18 - for loops: 2 types of iterators
    • Example 19 - for loops with if conditions: Pattern Guards
    • Example 21 - while/do-while Loops: Pure Statements
  • First Class Functions
    • First Class Functions
    • Example 22 - Functions are named, reusable expressions
    • Example 23 - Procedures are named, reusable statements
    • Example 24 - Functions with No Inputs
    • Example 25 - Invoking Functions with Expression Blocks
    • Example 26 - Nested Functions
    • Example 27 - Named Function Parameters
    • Example 28 - Parameter Default Values
    • Example 29 - Type Parameters: Parametric Polymorphism (5:11)
    • Example 30 - Vararg Parameters
    • Example 31 - Assigning Functions to Values
    • Example 32 - Higher Order Functions
    • Example 33 - Anonymous Functions (aka Function Literals)
    • Example 34 - Placeholder Syntax
    • Example 35 - Partially Applied Functions
    • Example 36 - Currying
    • Example 37 - By-Name Parameters
    • Example 38 - Closures
  • Collections
    • Example 39 - Tuples
    • Example 40 - Lists: Creating Them
    • Example 41 - Lists: Using Them
    • Example 42 - Lists: Higher Order Functions
    • Example 43 - Scan, ScanLeft, ScanRight
    • Example 44 - Fold, FoldLeft, FoldRight
    • Example 45 - Reduce, ReduceLeft, ReduceRight
    • Example 46 - Other, Simpler Reduce Operations
    • Example 47 - Sets and Maps
    • Example 48 - Mutable Collections, and Arrays
    • Example 49 - Option Collections
    • Example 50 - Error handling with util.Try
  • Classes and Objects
    • Example 51 - Classes
    • Example 52 - Primary v Auxiliary Constructors
    • Example 53 - Inheritance from Classes
    • Example 54 - Abstract Classes
    • Example 55 - Anonymous Classes
    • Example 56 - Type Parameters
    • Example 57 - Lazy Values
    • Example 58 - Default Methods with apply
    • Example 59 - Operators
    • Example 60 - Access Modifiers
    • Example 61 - Singleton Objects
    • Example 62 - Companion Objects
    • Example 63 - Traits
    • Example 64 - Case Classes
    • Example 65 - Self Types

View Full Curriculum


Access: Lifetime
Content: 8.5 hours
Lessons: 51

Scalable Programming with Scala and Spark

Get Rich Using Scala & Spark for Data Analysis, Machine Learning & Analytics

By Loonycorn | in Online Courses

The functional programming nature and the availability of a REPL environment make Scala particularly well suited for a distributed computing framework like Spark. Using these two technologies in tandem can allow you to effectively analyze and explore data in an interactive environment with extremely fast feedback. This course will teach you how to best combine Spark and Scala, making it perfect for aspiring data analysts and Big Data engineers.

  • Access 51 lectures & 8.5 hours of content 24/7
  • Use Spark for a variety of analytics & machine learning tasks
  • Understand functional programming constructs in Scala
  • Implement complex algorithms like PageRank & Music Recommendations
  • Work w/ a variety of datasets from airline delays to Twitter, web graphs, & Product Ratings
  • Use the different features & libraries of Spark, like RDDs, Dataframes, Spark SQL, MLlib, Spark Streaming, & GraphX
  • Write code in Scala REPL environments & build Scala applications w/ an IDE
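As a taste of the PageRank lectures, the algorithm itself fits in a few lines of plain Python; Spark's RDD version distributes the same per-iteration contribution step across partitions. The graph and function names here are illustrative, not course code:

```python
def pagerank(links, damping=0.85, iters=20):
    # links: page -> list of outbound links. Standard iterative PageRank:
    # each page splits its rank evenly among its outbound links, and every
    # page keeps a (1 - damping) baseline share.
    pages = set(links) | {p for out in links.values() for p in out}
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iters):
        contrib = {p: 0.0 for p in pages}
        for page, out in links.items():
            for dest in out:
                contrib[dest] += rank[page] / len(out)
        rank = {p: (1 - damping) / len(pages) + damping * c
                for p, c in contrib.items()}
    return rank

graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(graph)
# "c" collects links from both "a" and "b", so it ends up ranked highest
```

In Spark, `links` and `rank` would be pair RDDs joined each iteration, which is why the course's custom-partitioning lecture matters: co-partitioning the two RDDs avoids a shuffle per iteration.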
Loonycorn comprises two individuals—Janani Ravi and Vitthal Srinivasan—who have honed their respective tech expertise at Google and Flipkart. The duo graduated from Stanford University and believes it has distilled the instruction of complicated tech concepts into funny, practical, engaging courses, and is excited to share its content with eager students.

Important Details

  • Length of time users can access this course: lifetime
  • Access options: web streaming, mobile streaming
  • Certification of completion not included
  • Redemption deadline: redeem your code within 30 days of purchase
  • Experience level required: all levels

Requirements

  • Internet required
  • Some knowledge of Java or C++ is assumed

Course Outline

  • You, This Course and Us
    • You, This Course and Us (2:16)
    • Installing Scala and Hello World (9:43)
  • Introduction to Spark
    • What does Donald Rumsfeld have to do with data analysis? (8:45)
    • Why is Spark so cool? (12:23)
    • An introduction to RDDs - Resilient Distributed Datasets (9:39)
    • Built-in libraries for Spark (15:37)
    • Installing Spark (11:44)
    • The Spark Shell (6:55)
    • See it in Action : Munging Airlines Data with Spark (3:44)
    • Transformations and Actions (17:06)
  • Resilient Distributed Datasets
    • RDD Characteristics: Partitions and Immutability (12:35)
    • RDD Characteristics: Lineage, RDDs know where they came from (6:06)
    • What can you do with RDDs? (11:09)
    • Create your first RDD from a file (14:54)
    • Average distance travelled by a flight using map() and reduce() operations (6:59)
    • Get delayed flights using filter(), cache data using persist() (6:11)
    • Average flight delay in one-step using aggregate() (12:21)
    • Frequency histogram of delays using countByValue() (2:10)
  • Advanced RDDs: Pair Resilient Distributed Datasets
    • Special Transformations and Actions (14:45)
    • Average delay per airport, use reduceByKey(), mapValues() and join() (13:35)
    • Average delay per airport in one step using combineByKey() (8:23)
    • Get the top airports by delay using sortBy() (2:51)
    • Lookup airport descriptions using lookup(), collectAsMap(), broadcast() (10:57)
  • Advanced Spark: Accumulators, Spark Submit, MapReduce, Behind The Scenes
    • Get information from individual processing nodes using accumulators (9:25)
    • Long running programs using spark-submit (7:11)
    • Spark-Submit with Scala - A demo (6:10)
    • Behind the scenes: What happens when a Spark script runs? (14:30)
    • Running MapReduce operations (10:53)
  • PageRank: Ranking Search Results
    • What is PageRank? (16:44)
    • The PageRank algorithm (6:15)
    • Implement PageRank in Spark (9:45)
    • Join optimization in PageRank using Custom Partitioning (6:28)
  • Spark SQL
    • Dataframes: RDDs + Tables (15:48)
  • MLlib in Spark: Build a recommendations engine
    • Collaborative filtering algorithms (12:19)
    • Latent Factor Analysis with the Alternating Least Squares method (11:39)
    • Music recommendations using the Audioscrobbler dataset (5:38)
    • Implement code in Spark using MLlib (14:45)
  • Spark Streaming
    • Introduction to streaming (9:55)
    • Implement stream processing in Spark using Dstreams (9:19)
    • Stateful transformations using sliding windows (8:17)
  • Graph Libraries
    • The Marvel social network using Graphs (14:30)
  • Scala Language Primer
    • Scala - A "better Java"? (10:13)
    • How do Classes work in Scala? (11:02)
    • Classes in Scala - continued (15:50)
    • Functions are different from Methods (7:31)
    • Collections in Scala (10:12)
    • Map, Flatmap - The Functional way of looping (11:36)
    • First Class Functions revisited (8:46)
    • Partially Applied Functions (7:31)
    • Closures (8:07)
    • Currying (10:34)
  • Supplementary Installs
    • Installing Intellij (12:43)
    • Installing Anaconda (9:00)
    • [For Linux/Mac OS Shell Newbies] Path and other Environment Variables (8:25)

View Full Curriculum


Access: Lifetime
Content: 5 hours
Lessons: 40

Connect the Dots: Linear and Logistic Regression in Excel, Python and R

Build Robust Linear Models in Excel, R, & Python

By Loonycorn | in Online Courses

Linear Regression is a powerful method for quantifying the cause and effect relationships that affect different phenomena in the world around us. This course will teach you how to build robust linear models that will stand up to scrutiny when you apply them to real world situations. You'll even put what you've learnt into practice by leveraging Excel, R, and Python to build a model for stock returns.

  • Access 40 lectures & 5 hours of content 24/7
  • Cover method of least squares, explaining variance, & forecasting an outcome
  • Explore residuals & assumptions about residuals
  • Implement simple & multiple regression in Excel, R, & Python
  • Interpret regression results & avoid common pitfalls
  • Introduce a categorical variable
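The method of least squares covered above reduces, for simple regression, to two closed-form expressions: slope = cov(x, y) / var(x) and intercept = mean(y) - slope * mean(x). A small Python sketch with toy data (names are illustrative, not course code):

```python
def simple_regression(xs, ys):
    # Ordinary least squares for y = a + b*x:
    # b = cov(x, y) / var(x), a = mean(y) - b * mean(x).
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    b = cov / var
    a = my - b * mx
    return a, b

# Perfectly linear toy data generated from y = 2x + 1.
a, b = simple_regression([1, 2, 3, 4], [3, 5, 7, 9])
# a == 1.0, b == 2.0
```

Excel's LINEST, R's `lm()`, and Python's statsmodels all compute these same quantities (plus the standard errors and t-statistics the course interprets).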
Loonycorn comprises four individuals—Janani Ravi, Vitthal Srinivasan, Swetha Kolalapudi, and Navdeep Singh—who have honed their tech expertise at Google and Flipkart. The team believes it has distilled the instruction of complicated tech concepts into funny, practical, engaging courses, and is excited to share its content with eager students.

Details & Requirements

  • Length of time users can access this course: lifetime
  • Access options: web streaming, mobile streaming
  • Certification of completion not included
  • Redemption deadline: redeem your code within 30 days of purchase
  • Experience level required: all levels

Compatibility

  • Internet required

Course Outline

  • Introduction
    • You, This Course and Us (1:54)
  • Connect the Dots with Linear Regression
    • Using Linear Regression to Connect the Dots (9:04)
    • Two Common Applications of Regression (5:24)
    • Extending Linear Regression to Fit Non-linear Relationships (2:36)
  • Basic Statistics Used for Regression
    • Understanding Mean and Variance (6:03)
    • Understanding Random Variables (16:54)
    • The Normal Distribution (9:31)
  • Simple Regression
    • Setting up a Regression Problem (11:36)
    • Using Simple regression to Explain Cause-Effect Relationships (4:57)
    • Using Simple regression for Explaining Variance (8:07)
    • Using Simple regression for Prediction (4:04)
    • Interpreting the results of a Regression (7:25)
    • Mitigating Risks in Simple Regression (7:56)
  • Applying Simple Regression Using Excel
    • Applying Simple Regression in Excel (11:57)
    • Applying Simple Regression in R (11:14)
    • Applying Simple Regression in Python (6:05)
  • Multiple Regression
    • Introducing Multiple Regression (7:03)
    • Some Risks inherent to Multiple Regression (10:06)
    • Benefits of Multiple Regression (3:48)
    • Introducing Categorical Variables (6:58)
    • Interpreting Regression results - Adjusted R-squared (7:02)
    • Interpreting Regression results - Standard Errors of Co-efficients (8:12)
    • Interpreting Regression results - t-statistics and p-values (5:32)
    • Interpreting Regression results - F-Statistic (2:52)
  • Applying Multiple Regression using Excel
    • Implementing Multiple Regression in Excel (8:54)
    • Implementing Multiple Regression in R (6:26)
    • Implementing Multiple Regression in Python (4:21)
  • Logistic Regression for Categorical Dependent Variables
    • Understanding the need for Logistic Regression (9:24)
    • Setting up a Logistic Regression problem (6:02)
    • Applications of Logistic Regression (9:55)
    • The link between Linear and Logistic Regression (8:13)
    • The link between Logistic Regression and Machine Learning (4:16)
  • Solving Logistic Regression
    • Understanding the intuition behind Logistic Regression and the S-curve (6:21)
    • Solving Logistic Regression using Maximum Likelihood Estimation (10:02)
    • Solving Logistic Regression using Linear Regression (5:32)
    • Binomial vs Multinomial Logistic Regression (4:04)
  • Applying Logistic Regression
    • Predict Stock Price movements using Logistic Regression in Excel (9:52)
    • Predict Stock Price movements using Logistic Regression in R (8:00)
    • Predict Stock Price movements using Rule-based and Linear Regression (6:44)
    • Predict Stock Price movements using Logistic Regression in Python (4:49)

View Full Curriculum


Access: Lifetime
Content: 1.5 hours
Lessons: 19

Connect the Dots: Factor Analysis in Excel, Python and R

Learn Factor Extraction Using PCA in Excel, R, & Python

By Loonycorn | in Online Courses

Factor analysis helps to cut through the clutter when you have a lot of correlated variables explaining a single effect. This course will help you understand factor analysis and its link to linear regression. You'll explore how Principal Components Analysis (PCA) is a cookie-cutter technique for factor extraction, and how it relates to machine learning.

  • Access 19 lectures & 1.5 hours of content 24/7
  • Understand principal components
  • Discuss eigenvalues & eigenvectors
  • Perform Eigenvalue decomposition
  • Use principal components for dimensionality reduction & exploratory factor analysis
  • Apply PCA to explain the returns of a technology stock like Apple
  • Find the principal components & use them to build a regression model
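For two variables, the factor extraction step can be worked by hand: the principal components are the eigenvectors of the 2x2 sample covariance matrix, and its eigenvalues follow directly from the quadratic formula. A minimal Python sketch with toy data (assumed names, not course code):

```python
import math

def pca_2x2(xs, ys):
    # Eigenvalues of the 2x2 sample covariance matrix
    # [[sxx, sxy], [sxy, syy]], computed analytically; the larger one is
    # the variance explained by the first principal component.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    m = (sxx + syy) / 2
    d = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    return m + d, m - d   # (largest, smallest) eigenvalue

lam1, lam2 = pca_2x2([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8])
# the two variables are nearly collinear, so lam1 carries almost all the
# total variance and one principal component suffices
```

With more variables the same decomposition applies to a larger covariance matrix, which is where Excel VBA, R's `prcomp`, and Python packages (as used in the course) come in.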
Loonycorn comprises four individuals—Janani Ravi, Vitthal Srinivasan, Swetha Kolalapudi, and Navdeep Singh—who have honed their tech expertise at Google and Flipkart. The team believes it has distilled the instruction of complicated tech concepts into funny, practical, engaging courses, and is excited to share its content with eager students.

Details & Requirements

  • Length of time users can access this course: lifetime
  • Access options: web streaming, mobile streaming
  • Certification of completion not included
  • Redemption deadline: redeem your code within 30 days of purchase
  • Experience level required: all levels

Compatibility

  • Internet required

Course Outline

  • Introduction
    • You, This Course and Us (1:45)
  • Factor Analysis and PCA
    • Factor Analysis and the Link to Regression (8:03)
    • Factor Analysis and PCA (7:00)
  • Basic Statistics Required for PCA
    • Mean and Variance (6:03)
    • Covariance and Covariance Matrices (11:45)
    • Covariance vs Correlation (3:19)
  • Diving into Principal Components Analysis
    • The Intuition Behind Principal Components (5:16)
    • Finding Principal Components (7:10)
    • Understanding the Results of PCA - Eigen Values (4:05)
    • Using Eigen Vectors to find Principal Components (2:31)
    • When not to use PCA (2:25)
  • PCA in Excel
    • Setting up the data (6:52)
    • Computing Correlation and Covariance Matrices (3:27)
    • PCA using Excel and VBA (5:51)
    • PCA and Regression (2:56)
  • PCA in R
    • Setting up the data (2:56)
    • PCA and Regression using Eigen Decomposition (3:58)
    • PCA in R using packages (1:56)
  • PCA in Python
    • PCA and Regression in Python (6:42)

View Full Curriculum


Access: Lifetime
Content: 5 hours
Lessons: 56

Taming Big Data with MapReduce & Hadoop

Analyze Large Amounts of Data with Today's Top Big Data Technologies

By Sundog Software | in Online Courses

Big data is hot, and data management and analytics skills are your ticket to a fast-growing, lucrative career. This course will quickly teach you two technologies fundamental to big data: MapReduce and Hadoop. Learn and master the art of framing data analysis problems as MapReduce problems with over 10 hands-on examples. Write, analyze, and run real code along with the instructor, both on your own system and in the cloud using Amazon's Elastic MapReduce service. By course's end, you'll have a solid grasp of data management concepts.

  • Learn the concepts of MapReduce to analyze big sets of data w/ 56 lectures & 5.5 hours of content
  • Run MapReduce jobs quickly using Python & MRJob
  • Translate complex analysis problems into multi-stage MapReduce jobs
  • Scale up to larger data sets using Amazon's Elastic MapReduce service
  • Understand how Hadoop distributes MapReduce across computing clusters
  • Complete projects to get hands-on experience: analyze social media data, movie ratings & more
  • Learn about other Hadoop technologies, like Hive, Pig & Spark
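One of the course activities is computing total spent by customer and then sorting the results, which on a cluster takes two chained MapReduce jobs. As a rough in-process sketch of that logic (toy data, illustrative names, not the course's MRJob code):

```python
from collections import defaultdict

# Toy order records: (customer id, amount spent).
orders = [("c1", 20.0), ("c2", 5.5), ("c1", 4.5), ("c3", 10.0)]

def total_spent(orders):
    # Stage 1: the mapper would emit (customer, amount); the reducer sums
    # the amounts per customer key.
    totals = defaultdict(float)
    for customer, amount in orders:
        totals[customer] += amount
    # Stage 2: a second job re-keys by total and sorts, here simulated
    # with a local descending sort.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

ranking = total_spent(orders)
# [("c1", 24.5), ("c3", 10.0), ("c2", 5.5)]
```

With MRJob, each stage would be an `MRStep` with its own mapper and reducer, and the framework chains them for you.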
Frank Kane spent 9 years at Amazon and IMDb, developing and managing the technology that automatically delivers product and movie recommendations to hundreds of millions of customers. Frank holds 17 issued patents in the fields of distributed computing, data mining, and machine learning. In 2012, he left to start his own successful company, Sundog Software, which focuses on virtual reality environment technology and on teaching others about big data analysis. This course is hosted by StackSkills, the premier eLearning destination for discovering top-shelf courses on everything from coding to business to fitness, and beyond!

Details & Requirements

  • Length of time users can access this course: lifetime
  • Access options: web streaming, mobile streaming
  • Certification of completion not included
  • Redemption deadline: redeem your code within 30 days of purchase
  • Experience level required: all levels

Compatibility

  • Internet required

Course Outline

  • Introduction
    • Introduction
  • Getting Started
    • Installing Enthought Canopy
    • Installing MRJob
    • Downloading the MovieLens Data Set
    • Run Your First MapReduce Job
  • Understanding MapReduce
    • MapReduce Basic Concepts
    • Walkthrough of Rating Histogram Code
    • Understanding How MapReduce Scales / Distributed Computing
    • Average Friends by Age Example: Part 1 (3:04)
    • Average Friends by Age Example: Part 2
    • Minimum Temperature By Location Example
    • Maximum Temperature By Location Example
    • Word Frequency in a Book Example
    • Making the Word Frequency Mapper Better with Regular Expressions
    • Sorting the Word Frequency Results Using Multi-Stage MapReduce Jobs
    • Activity: Design a Mapper and Reducer for Total Spent by Customer (2:54)
    • Activity: Write Code for Total Spent by Customer
    • Compare Your Code to Mine. Activity: Sort Results by Amount Spent
    • Compare your Code to Mine for Sorted Results.
    • Combiners
  • Advanced MapReduce Examples
    • Example: Most Popular Movie
    • Including Ancillary Lookup Data in the Example
    • Example: Most Popular Superhero, Part 1 (4:22)
    • Example: Most Popular Superhero, Part 2 (6:31)
    • Example: Degrees of Separation: Concepts
    • Degrees of Separation: Preprocessing the Data
    • Degrees of Separation: Code Walkthrough
    • Degrees of Separation: Running and Analyzing the Results
    • Example: Similar Movies Based on Ratings: Concepts
    • Similar Movies: Code Walkthrough
    • Similar Movies: Running and Analyzing the Results
    • Learning Activity: Improving our Movie Similarities MapReduce Job
  • Using Hadoop and Elastic MapReduce
    • Fundamental Concepts of Hadoop
    • The Hadoop Distributed File System (HDFS)
    • Apache YARN (4:20)
    • Hadoop Streaming: How Hadoop Runs your Python Code
    • Setting Up Your Amazon Elastic MapReduce Account
    • Linking Your EMR Account with MRJob (3:40)
    • Exercise: Run Movie Recommendations on Elastic MapReduce
    • Analyze the Results of Your EMR Job
  • Advanced Hadoop and EMR
    • Distributed Computing Fundamentals
    • Activity: Running Movie Similarities on Four Machines
    • Analyzing the Results of the 4-Machine Job
    • Troubleshooting Hadoop Jobs with EMR and MRJob, Part 1
    • Troubleshooting Hadoop Jobs, Part 2
    • Analyzing One Million Movie Ratings Across 16 Machines, Part 1
    • Analyzing One Million Movie Ratings Across 16 Machines, Part 2
  • Other Hadoop Technologies
    • Introducing Apache Hive
    • Introducing Apache Pig
    • Apache Spark: Concepts
    • Spark Example: Part 1
    • Spark Example: Part 2
    • Congratulations! (0:41)
  • Where to Go from Here
    • New Lecture

View Full Curriculum


Access: Lifetime
Content: 10 hours
Lessons: 43

Projects in Hadoop and Big Data: Learn by Building Apps

Master One of the Most Important Big Data Technologies by Building Real Projects

By Eduonix Learning Solutions | in Online Courses

Hadoop is perhaps the most important big data framework in existence, used by major data-driven companies around the globe. Hadoop and its associated technologies allow companies to manage huge amounts of data and make business decisions based on analytics surrounding that data. This course will take you from big data zero to hero, teaching you how to build Hadoop solutions that will solve real world problems - and qualify you for many high-paying jobs.

  • Access 43 lectures & 10 hours of content 24/7
  • Learn how technologies like Mapreduce apply to clustering problems
  • Parse a Twitter stream w/ Python, extract keywords w/ Apache Pig, visualize data w/ NodeJS, & more
  • Set up a Kafka stream w/ Java code for producers & consumers
  • Explore real-world applications by building a relational schema for a health care data dictionary used by the US Department of Veterans Affairs
  • Log collection & analytics w/ the Hadoop Distributed File System using Apache Flume & Apache HCatalog
Eduonix creates and distributes high-quality technology training content. Their team of industry professionals has been training manpower for more than a decade. They aim to teach technology the way it is used in the industry and professional world. They have a professional team of trainers for technologies ranging from Mobility, Web and Enterprise, and Database and Server Administration.

Details & Requirements

  • Length of time users can access this course: lifetime
  • Access options: web streaming, mobile streaming
  • Certification of completion not included
  • Redemption deadline: redeem your code within 30 days of purchase
  • Experience level required: intermediate

Compatibility

  • Internet required

Course Outline

  • Introduction
    • Introduction
  • Add Value to Existing Data with Mapreduce
    • Introduction to the Project
    • Build and Run the Basic Code
    • Understanding the Code
    • Dependencies and packages
  • Hadoop Analytics and NoSQL
    • Introduction to Hadoop Analytics
    • Introduction to NoSQL Database
    • Solution Architecture
    • Installing the Solution
  • Kafka Streaming with Yarn and Zookeeper
    • Introduction to Kafka Yarn and Zookeeper
    • Code Structure
    • Creating Kafka Streams
    • Yarn Job with Samza
  • Real Time Stream processing with Apache Kafka and Apache Storm
    • Real Time Streaming
    • Hortonbox Virtual Machine
    • Running in Cluster Mode
    • Submitting the Storm Jar
  • Big Data Applications for the Healthcare Industry with Apache Sqoop and Apache Solr
    • Introduction to the Project
    • Introduction to HDDAccess
    • Sqoop, Hive and Solr
    • Hive Usage
  • Log collection and analytics with the Hadoop Distributed File System using Apache Flume and Apache HCatalog
    • Apache Flume and HCatalog
    • Install and Configure Apache Flume
    • Visualisation of the Data
    • Embedded Pig Scripts
  • Data Science with Hadoop Predictive Analytics
    • Introduction to Data Science
    • Source Code Review
    • Setting Up the Machine
    • Project Review
  • Visual Analytics with Apache Spark on Yarn
    • Project Setup
    • Setting Up Java Dependencies
    • Spark Analytics with PySpark
    • Bringing it all together
  • Customer 360 degree view, Big Data Analytics for e-commerce
    • Ecommerce and Big Data
    • Installing Datameer
    • Analytics and Visualizations
    • Demonstration
  • Putting it all together Big Data with Amazon Elastic Map Reduce
    • Introduction to the Project
    • Configuration
    • Setting Up Cluster on EMR
    • Dedicated Task Cluster on EMR
  • Summary
    • Summary

View Full Curriculum



Terms

  • Unredeemed licenses can be returned for store credit within 15 days of purchase. Once your license is redeemed, all sales are final.