Big Data Analytics Syllabus | IndianTechnoEra

B. Tech. (5th Sem) (Common for CSE, CSE with Specialization in Data Science) 

BCSE-514 (Big Data & Analytics) 



Course Objectives:

  • 1. To acquire knowledge about Big Data and its characteristics.
  • 2. To learn about Hadoop and its installation.
  • 3. To acquire knowledge about the data streaming process.
  • 4. To learn how Map Reduce can be used for parallel programming.
  • 5. To learn about various data processing tools such as Pig, Hive, HBase, and Sqoop.


Unit-1 | Introduction to Big Data


Unit-2 | Big Data Hadoop

Hadoop: Introduction, key advantages of Apache Hadoop, Hadoop vs. RDBMS, Hadoop Architecture, Hadoop components, HDFS design and goals, anatomy of file read and write in HDFS, replica placement strategy, working with HDFS commands, Hadoop file system interfaces, Hadoop 1.0 vs. Hadoop 2.0, Hadoop Ecosystem. Data Streaming: data streaming, data flow, models, Flume (features, architecture).


Unit-3 | Map Reduce

Map Reduce: Anatomy of a Map Reduce Job Run, Failures, Job Scheduling, Shuffle and Sort, Task Execution, Map Reduce Types and Formats, Map Reduce Features, SQL vs. Map Reduce, Stream Data Model and Architecture.
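The shuffle-and-sort step is what separates Map Reduce from an ordinary parallel loop: the framework groups every value emitted by the mappers under its key before any reducer runs. A minimal word-count sketch in plain Python (standing in for a real Hadoop job, with toy input invented for illustration) shows the three phases:

```python
from itertools import groupby
from operator import itemgetter

def map_phase(lines):
    # Mapper: emit a (word, 1) pair for every word in every input line.
    for line in lines:
        for word in line.lower().split():
            yield (word, 1)

def shuffle_sort(pairs):
    # Shuffle & sort: collect all values emitted under the same key,
    # as the framework does between the map and reduce phases.
    ordered = sorted(pairs, key=itemgetter(0))
    for key, group in groupby(ordered, key=itemgetter(0)):
        yield (key, [value for _, value in group])

def reduce_phase(grouped):
    # Reducer: sum the counts for each word.
    for word, counts in grouped:
        yield (word, sum(counts))

lines = ["big data big analytics", "hadoop data"]
result = dict(reduce_phase(shuffle_sort(map_phase(lines))))
print(result)  # {'analytics': 1, 'big': 2, 'data': 2, 'hadoop': 1}
```

In a real job the map and reduce functions run on different cluster nodes, and the shuffle moves data over the network; the data flow, however, is exactly this.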


Unit-4 | Pig, Hive, HBase, and Sqoop

Pig: Introduction, Execution Modes of Pig, Comparison of Pig with Databases, Grunt, Pig Latin, User Defined Functions, Data Processing operators.

Hive: Hive Shell, Hive Services, Hive Metastore, Comparison with Traditional Databases, HiveQL, Tables, Querying Data and User Defined Functions.

HBase: HBasics, Concepts, Clients, Example, Zookeeper, HBase vs. RDBMS, Big SQL.

Sqoop: Sqoop Architecture, Installation, connectors and drivers, importing and exporting data between HDFS, Hive, and HBase.


Text Books:

1. Tom White, “Hadoop: The Definitive Guide”, 3rd ed., O’Reilly Media, 2012.

2. Chris Eaton, Dirk DeRoos, Tom Deutsch, George Lapis, Paul Zikopoulos, “Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data”, McGraw-Hill, 2012.

3. Bill Franks, “Taming the Big Data Tidal Wave: Finding Opportunities in Huge Data Streams with Advanced Analytics”, John Wiley & Sons.

Reference Books:

1. Glenn J. Myatt, “Making Sense of Data”, John Wiley & Sons.

2. Pete Warden, “Big Data Glossary”, O’Reilly Media, 2011.


Download Big Data Analytics Notes



Big Data & Analytics Lab


Course Objectives:

  • 1. To acquire knowledge about Big Data and its characteristics.
  • 2. To learn about Hadoop and its installation.
  • 3. To learn about file streaming techniques.
  • 4. To learn about various data processing tools such as Pig, Hive, HBase, and Sqoop.


List of Practical

1. Installation of Java on a Unix/Linux machine.

2. Installation of a Hadoop VM on a single node.

3. Implement the following file management tasks in Hadoop:

  • Adding files and directories.
  • Retrieving files from HDFS to the local file system.
  • Deleting files from HDFS.

4. Run a Java program based on parallel programming to implement the concept of the Map Reduce paradigm.

5. Implement the following file management tasks in Sqoop over Hadoop:

  • Create a database.
  • Create a table.
  • Insert some records.

6. Install and run Hive; then use Hive to create, alter, and drop databases, tables, views, functions, and indexes.

7. Install and run Pig; then write Pig Latin scripts to sort, group, join, project, and filter the data.
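For orientation before writing the scripts: the Pig Latin operators named in this practical (FILTER, ORDER BY, GROUP BY, JOIN, and FOREACH ... GENERATE for projection) have direct analogues that can be sketched in plain Python. The employee/department relations below are invented for illustration:

```python
# Toy relations: (name, dept, salary) and (dept, location).
employees = [("asha", "it", 50000), ("ravi", "hr", 40000),
             ("meena", "it", 60000), ("john", "hr", 35000)]
departments = [("it", "delhi"), ("hr", "mumbai")]

# FILTER: keep rows where salary > 38000.
filtered = [e for e in employees if e[2] > 38000]

# ORDER BY: sort the filtered rows by salary, descending.
ordered = sorted(filtered, key=lambda e: e[2], reverse=True)

# GROUP BY dept: collect employee names per department.
groups = {}
for name, dept, salary in filtered:
    groups.setdefault(dept, []).append(name)

# JOIN with departments on dept to attach a location to each row.
locations = dict(departments)
joined = [(name, dept, locations[dept]) for name, dept, _ in filtered]

# PROJECT (FOREACH ... GENERATE): keep only the name and salary columns.
projected = [(name, salary) for name, _, salary in ordered]
print(projected)  # [('meena', 60000), ('asha', 50000), ('ravi', 40000)]
```

The Pig Latin versions of these operations run the same logic as Map Reduce jobs over HDFS data instead of in-memory lists.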

8. Write a program to analyze web server log stream data using the Apache Flume framework.
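Flume itself only transports the events; the analysis half of this practical boils down to parsing each log line as it arrives. A small Python sketch, using hypothetical lines in the Apache common log format in place of a live Flume channel, counts requests per HTTP status code:

```python
import re
from collections import Counter

# Hypothetical log lines, invented for illustration; in the actual
# practical these events would arrive through a Flume channel.
log_stream = [
    '10.0.0.1 - - [01/Jan/2024:10:00:01 +0000] "GET /index.html HTTP/1.1" 200 1043',
    '10.0.0.2 - - [01/Jan/2024:10:00:03 +0000] "GET /missing HTTP/1.1" 404 321',
    '10.0.0.1 - - [01/Jan/2024:10:00:05 +0000] "POST /login HTTP/1.1" 200 512',
]

# The status code is the three-digit field right after the quoted request.
pattern = re.compile(r'" (\d{3}) ')

status_counts = Counter()
for line in log_stream:
    match = pattern.search(line)
    if match:
        status_counts[match.group(1)] += 1

print(dict(status_counts))  # {'200': 2, '404': 1}
```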

9. Implement the following file management tasks of the HBase NoSQL database over Hadoop:

  • Operate on a table.
  • Insert some records.
  • Display table data.

10. Implement the following tasks in Spark over a Hadoop environment:

  • RDD using an in-memory data set.
  • RDD using a file.
  • Using the map function.
  • Using the reduce function.
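The RDD tasks above follow a map-then-reduce pattern. A minimal local sketch with Python built-ins (standing in for PySpark's rdd.map and rdd.reduce on an in-memory data set) could look like:

```python
from functools import reduce

# In-memory dataset, standing in for sc.parallelize([1, 2, 3, 4, 5])
# on a real Spark cluster.
data = [1, 2, 3, 4, 5]

# Map: square each element (the analogue of rdd.map(lambda x: x * x)).
squared = list(map(lambda x: x * x, data))

# Reduce: sum the squares (the analogue of rdd.reduce(lambda a, b: a + b)).
total = reduce(lambda a, b: a + b, squared)

print(squared)  # [1, 4, 9, 16, 25]
print(total)    # 55
```

On a real cluster the same two calls distribute the partitions across executors; the local version is useful for checking the logic before submitting a job.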


Project: Recruitment for Big Data job profiles

Recruitment is a challenging job responsibility of the HR department of any company. Here, we’ll create a Big Data project that can analyze vast amounts of data gathered from real-world job posts published online. The project involves three steps:

  • Identify four Big Data job families in the given dataset.
  • Identify nine homogeneous groups of Big Data skills that are highly valued by companies.
  • Characterize each Big Data job family according to the level of competence required for each Big Data skill set.

The goal of this project is to help the HR department find better recruits for Big Data job roles.
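The third step can be prototyped on a small scale: once each post carries a job-family label and a skill list, characterizing a family reduces to a per-family frequency count. A toy Python sketch (the posts and skills below are invented for illustration):

```python
from collections import Counter, defaultdict

# Toy labeled job posts: (job family, skills mentioned in the post).
posts = [
    ("data engineer", ("hadoop", "spark", "sql")),
    ("data engineer", ("hadoop", "hive", "sql")),
    ("data scientist", ("python", "statistics", "spark")),
    ("data scientist", ("python", "sql")),
]

# For each job family, count how often each skill is demanded.
skills_by_family = defaultdict(Counter)
for family, skills in posts:
    skills_by_family[family].update(skills)

# Characterize each family by its most demanded skills.
for family, counts in sorted(skills_by_family.items()):
    print(family, counts.most_common(2))
```

The real project replaces the hand-labeled tuples with families and skill groups discovered in steps one and two, but the characterization logic is the same aggregation.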

Course Outcomes:

  • i) Understand the installation of VMware.
  • ii) Understand the Map Reduce paradigm.
  • iii) Understand the installation of Pig.
  • iv) Apply Hive to create, alter, and drop databases, tables, views, functions, and indexes.

Download Big Data Analytics Practicals

