Participants in this course will learn the theory and practice of Big Data. Upon completion, they will be able to create data-driven applications that make the modern enterprise tick. The course balances theory and practice: each theoretical topic is illustrated with hands-on examples built with industry-leading tools.

Audience: This course is intended for data scientists, IT managers, project leads and software developers.
Course Duration: 1 day

Attendees should have basic computer literacy (Windows, Mac or Linux) and a desire to learn and ask questions. Command-line skills are optional.

Hardware and Software Requirements:

There is no need to install any software on students’ machines. Working labs will be provided.

Course Outline:
  • Current State of Big Data
  • Big Data Tools
    • Hadoop and Ecosystem
    • Spark
    • NoSQL
    • Storm
    • Search
  • Data
    • Cleaning and Validation
    • Security
    • Privacy
    • Messiness
  • Extracting Actionable Information
    • Data Exploration
    • Machine Learning
    • Data Modeling and Prediction