B. Tech. (5th Sem) (Common for CSE, CSE with Specialization in Data Science)
BCSE-514 (Big Data & Analytics)
Big Data Analytics Syllabus | IndianTechnoEra
Course Objectives:
1. To acquire knowledge about Big Data and its characteristics.
2. To learn about Hadoop and its installation.
3. To acquire knowledge about the data streaming process.
4. To learn how MapReduce can be used for parallel programming.
5. To learn about data processing tools such as Pig, Hive, HBase, and Sqoop.
Unit 1 | Introduction to Big Data
- Big Data: Introduction to the Big Data Platform, Challenges of Conventional Systems,
- Data Types (Structured, Semi-Structured and Unstructured),
- Traditional BI vs. Big Data Environment, Big Data Analytics (Descriptive, Predictive and Prescriptive),
- Big Data Technology Landscape (SQL, NoSQL, NoSQL Databases, NewSQL),
- CAP Theorem, Hadoop Installation (Standalone Mode and Fully Distributed Mode).
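Standalone mode needs no configuration files at all, while distributed operation requires pointing Hadoop at a NameNode. A minimal sketch of the key setting, assuming a placeholder hostname `master` (adjust for your own cluster):

```xml
<!-- etc/hadoop/core-site.xml: tells clients where the NameNode runs.
     "master:9000" is an assumed hostname/port, not a fixed default. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>
```

A fully distributed setup additionally needs `hdfs-site.xml`, `yarn-site.xml`, and a workers file listing the DataNode hosts.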
Unit 2 | Big Data Hadoop
Unit 3 | MapReduce
Unit 4 | Pig, Hive, HBase, and Sqoop
HBase: HBasics, Concepts, Clients, Example, ZooKeeper, HBase vs. RDBMS, Big SQL.
Text Books:
1. Tom White, “Hadoop: The Definitive Guide”, 3rd ed., O’Reilly Media, 2012.
2. Chris Eaton, Dirk DeRoos, Tom Deutsch, George Lapis, Paul Zikopoulos, “Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data”, McGraw-Hill, 2012.
3. Bill Franks, “Taming the Big Data Tidal Wave: Finding Opportunities in Huge Data Streams with Advanced Analytics”, John Wiley & Sons.
Reference Books:
1. Glenn J. Myatt, “Making Sense of Data”, John Wiley & Sons.
2. Pete Warden, “Big Data Glossary”, O’Reilly, 2011.
Big Data & Analytics Lab
Course Objectives:
1. To acquire knowledge about Big Data and its characteristics.
2. To learn about Hadoop and its installation.
3. To learn about file streaming techniques.
4. To learn about data processing tools such as Pig, Hive, HBase, and Sqoop.
List of Practicals
1. Installation of Java on a Unix/Linux machine.
2. Installation of a Hadoop VM on a single node.
3. Implement the following file management tasks in Hadoop:
   - Adding files and directories.
   - Retrieving files from HDFS to the local file system.
   - Deleting files from HDFS.
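The three HDFS tasks map directly onto the `hdfs dfs` command-line interface. A sketch, assuming a running HDFS and illustrative path names:

```shell
# Add a directory and copy a local file into it (paths are placeholders).
hdfs dfs -mkdir -p /user/hadoop/input
hdfs dfs -put localfile.txt /user/hadoop/input/

# Retrieve a file from HDFS back to the local file system.
hdfs dfs -get /user/hadoop/input/localfile.txt ./copy.txt

# Delete a file (and -rm -r for directories) from HDFS.
hdfs dfs -rm /user/hadoop/input/localfile.txt
```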
4. Run a Java program based on parallel programming to implement the MapReduce paradigm.
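The lab asks for a Java MapReduce job; the same map → shuffle → reduce flow can be sketched in a few lines of plain Python, which makes the paradigm easy to see before writing the Hadoop version (word count is the classic illustrative example):

```python
from collections import defaultdict

def map_phase(lines):
    # Mapper: emit a (word, 1) pair for every word in every input line.
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle/sort: group all emitted values by key,
    # as Hadoop does between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reducer: sum the counts collected for each word.
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["Big Data", "big data analytics"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts)  # {'big': 2, 'data': 2, 'analytics': 1}
```

In the real Hadoop job, `map_phase` and `reduce_phase` become `Mapper` and `Reducer` subclasses, and the framework performs the shuffle across the cluster.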
5. Implement the following database tasks with Sqoop over Hadoop:
   - Create a database.
   - Create a table.
   - Insert some records.
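Sqoop transfers data between a relational database and HDFS, so the records are first created on the RDBMS side. A sketch, assuming MySQL as the source and placeholder database, table, and credential names:

```shell
# 1) Create the source database, table, and records in MySQL
#    ("college" and "students" are hypothetical names).
mysql -u root -p -e "
  CREATE DATABASE college;
  CREATE TABLE college.students (id INT, name VARCHAR(50));
  INSERT INTO college.students VALUES (1, 'Asha'), (2, 'Ravi');"

# 2) Import the table into HDFS with Sqoop.
sqoop import \
  --connect jdbc:mysql://localhost/college \
  --username root -P \
  --table students \
  --target-dir /user/hadoop/students
```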
6. Install and run Hive, then use Hive to create, alter, and drop databases, tables, views, functions, and indexes.
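The Hive DDL for these tasks mirrors SQL. A sketch of the create/alter/drop cycle, with hypothetical database and table names:

```sql
-- Database and table names below are illustrative only.
CREATE DATABASE IF NOT EXISTS college;
USE college;

CREATE TABLE students (id INT, name STRING, marks FLOAT)
  ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

ALTER TABLE students RENAME TO learners;

CREATE VIEW toppers AS SELECT * FROM learners WHERE marks > 75;

DROP VIEW toppers;
DROP TABLE learners;
DROP DATABASE college;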
7. Install and run Pig, then write Pig Latin scripts to sort, group, join, project, and filter data.
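Each of these operations is a single Pig Latin operator. A sketch, assuming a hypothetical comma-separated file `students.csv` with the schema shown:

```pig
-- 'students.csv' and its schema are assumed for illustration.
data    = LOAD 'students.csv' USING PigStorage(',')
          AS (id:int, name:chararray, marks:float);
passed  = FILTER data BY marks >= 40.0F;          -- filter
names   = FOREACH passed GENERATE name, marks;    -- project
bymarks = ORDER names BY marks DESC;              -- sort
grouped = GROUP data BY marks;                    -- group
DUMP bymarks;
```

A `JOIN` works the same way against a second relation loaded from another file.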
8. Write a program to analyze web-server log stream data using the Apache Flume framework.
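A Flume agent is defined entirely in a properties file: a source that tails the web-server log, a channel, and a sink that lands events in HDFS. A minimal sketch, with placeholder log path and HDFS URL:

```properties
# Hypothetical agent "a1": tail an Apache access log into HDFS.
a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/apache2/access.log
a1.sources.r1.channels = c1

a1.channels.c1.type = memory

a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://localhost:9000/flume/weblogs
a1.sinks.k1.channel = c1
```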
9. Implement the following table operations in HBase (NoSQL) over Hadoop:
   - Insert some records.
   - Display table data.
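These operations map onto a short HBase shell session. A sketch, with an illustrative table and column-family name:

```
# HBase shell sketch; 'students' and 'info' are placeholder names.
create 'students', 'info'
put 'students', 'row1', 'info:name', 'Asha'
put 'students', 'row1', 'info:marks', '81'
scan 'students'            # display all table data
get 'students', 'row1'     # display one row
disable 'students'
drop 'students'
```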
10. Implement the following tasks in Spark over a Hadoop environment:
    - RDD from an in-memory data set.
    - RDD from a file.
    - Using the map function.
    - Using the reduce function.
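The four tasks above fit in a few lines of PySpark. A sketch, assuming a local Spark installation and a placeholder input file `data.txt`:

```python
# PySpark sketch; 'data.txt' is an assumed input file.
from pyspark import SparkContext

sc = SparkContext("local", "rdd-demo")

nums = sc.parallelize([1, 2, 3, 4, 5])      # RDD from an in-memory data set
lines = sc.textFile("data.txt")             # RDD from a file

squares = nums.map(lambda x: x * x)         # map function
total   = nums.reduce(lambda a, b: a + b)   # reduce function

print(squares.collect(), total)
sc.stop()
```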
Project: Recruitment for Big Data job profiles
Recruitment is a challenging responsibility for any company’s HR department. In this project, we build a Big Data application that analyzes large volumes of data gathered from real-world job posts published online. The project involves three steps:
- Identify four Big Data job families in the given dataset.
- Identify nine homogeneous groups of Big Data skills that are highly valued by companies.
- Characterize each Big Data job family according to the level of competence required for each Big Data skill set. The goal of this project is to help the HR department make better hires for Big Data job roles.
Course Outcomes:
- i) Understand the installation of VMware.
- ii) Understand the MapReduce paradigm.
- iii) Understand the installation of Pig.
- iv) Apply Hive to create, alter, and drop databases, tables, views, functions, and indexes.
