The Premier Organization for Data Professionals

Saeed Rahimi: NoSQL 101: Big Data Analytics – Scrub with Pig, Store in HBase, And Analyze with Hive

  • Wednesday, December 18, 2013
  • 8:30 AM - 11:30 AM
  • State of Minnesota


Abstract

 
Performing analytics on Big Data to extract business value is a goal of nearly every organization. Most tools used to perform such analysis require programming in Java or Python, and not everyone who needs to draw value from collected data knows how to program. Pig and Hive, two tools from the Apache Software Foundation, are designed to be used by non-programmers to address this need. HBase is designed to store vast amounts of data and provide scalable, incremental access to it.
   
Users who know SQL prefer HiveQL (HQL) to Pig Latin. Unfortunately, Big Data is not always structured in a way that lends itself to SQL analytics. Using Pig Latin to scrub the data before applying HQL has become the industry-standard approach. To apply this approach, users need to understand the nuances of how these tools work and how they interface with each other.
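The scrub-then-query pattern described above can be illustrated with a small stand-in sketch. This is not Pig, HBase, or Hive code: plain Python plays the scrubbing role, an in-memory SQLite table stands in for the store, and an ordinary SQL query stands in for HQL. The sample records, the field names, and the `scrub` and `analyze` helpers are all hypothetical, chosen only to show why cleaning comes before SQL-style analysis.

```python
import sqlite3

# Hypothetical raw records (username, action, amount); several are malformed.
RAW_LINES = [
    "alice,purchase,30",
    "bob,purchase,abc",          # non-numeric amount, cannot be summed
    "  Carol , purchase , 12 ",  # stray whitespace and mixed case
    "broken line",               # wrong number of fields
    "alice,purchase,5",
]


def scrub(lines):
    """Scrubbing step (the role Pig plays in the talk's workflow).

    Drops rows that do not have exactly three fields or whose amount
    is not a whole number, and normalizes whitespace and case.
    """
    for line in lines:
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 3:
            continue
        username, action, amount = parts
        if not amount.isdigit():
            continue
        yield username.lower(), action.lower(), int(amount)


def analyze(rows):
    """Store-and-query step (SQLite standing in for HBase plus Hive)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (username TEXT, action TEXT, amount INTEGER)"
    )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT username, SUM(amount) FROM events "
        "GROUP BY username ORDER BY username"
    ).fetchall()
    conn.close()
    return result


totals = analyze(scrub(RAW_LINES))
print(totals)  # [('alice', 35), ('carol', 12)]
```

Without the scrubbing step, the malformed rows would either fail to load or distort the aggregate, which is the motivation for cleaning the data before handing it to a SQL-style tool.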
 
 
This talk covers the following topics, which address the above needs.

Learning Objectives
  • Hadoop, HDFS, and MapReduce
  • Hive and the Hive Query Language (HQL)
  • Pig and Pig Latin
  • Interfacing Pig and Hive with HBase


   
Speaker Biography
 
Saeed Rahimi is a professor of software engineering at the University of St. Thomas. Dr. Rahimi has over 33 years of experience in academia and industry, focusing on Database Management, Big Data, NoSQL, Database Administration, and Distributed Systems. He is the coauthor of two books, has spoken at many national and international conferences, and has published many articles in management and scientific journals. He has taught graduate-level systems courses at the University of St. Thomas and the University of Minnesota since 1981. His current research focus includes Big Data, NoSQL, distributed systems, and web databases. His previous professional experience includes a number of management and research positions within Fortune 100 companies. Dr. Rahimi co-founded InfoSpan and SRDBSoft, which specialize in database management system design and development and provide consulting services. Dr. Rahimi has a B.S., an M.S., and a Ph.D. in Computer Sciences from the University of Minnesota.
   
 

Presentation

Agenda

8:30 Registration & Networking
9:00 Opening Remarks
9:15 Presentation 

 


© DAMA-MN