Analyzing Big Data with Microsoft R (#20773)

The main purpose of the course is to give students the ability to use Microsoft R Server to create and run an analysis on a large dataset, and to show how to use it in big data environments such as a Hadoop or Spark cluster or a SQL Server database.

Outcomes & Objectives

After completing this course, students will be able to:

  • Explain how Microsoft R Server and Microsoft R Client work
  • Use R Client with R Server to explore big data held in different data stores
  • Visualize data by using graphs and plots
  • Transform and clean big data sets
  • Implement options for splitting analysis jobs into parallel tasks
  • Build and evaluate regression models generated from big data
  • Create, score, and deploy partitioning models generated from big data
  • Use R in the SQL Server and Hadoop environments
  • Duration

    • 3 Days (08:30 - 16:00), In-Class, myWay Mentored Learning
    • 90 Days Access Per Course, Online, myWay On Demand Distance Learning
  • Course Prerequisites

    In addition to their professional experience, students who attend this course should have:

    • Programming experience using R, and familiarity with common R packages.
    • Knowledge of common statistical methods and data analysis best practices.
    • Basic knowledge of the Microsoft Windows operating system and its core functionality.
    • Working knowledge of relational databases.

    Learners with no relational database knowledge can attend our MTA Database Fundamentals course.

  • Who Should Attend

    The primary audience for this course is people who wish to analyze large datasets within a big data environment.

    The secondary audience is developers who need to integrate R analyses into their solutions.

Our Delivery Methods

Our innovative "myWay" learning methodology is built around each student's individual learning requirements, allowing students to learn in the style that best suits their skill set, knowledge, and schedule.

Online Mentored Learning

Complete a course at your own pace via our "myWay Online Mentored Learning", which combines self-study with supported interactive online video lectures, an online course mentor, extra resources, questionnaires, and more, all supported via our Online Student Portal.

Part Time Mentored Learning

Designed for the working professional, our part-time programmes provide you with the flexibility and benefit of our myWay Blended Learning, with at-home exercises and assignments and mentored or in-class lectures at a manageable schedule and pace.

Our Hybrid Delivery Methods

myWay Hybrid Learning is a technology-mediated delivery method that extends the benefit of flexibility and technology to all students. Each Hybrid delivery method is described below.

#AnywhereAnytime

Have all your classes ready to be downloaded and watched, anytime, anywhere.

#NoStudentLeftBehind

Never miss a class because of health, traffic, or transport issues.

#Flexibility

A personalized class schedule: attend class on campus, virtually, or both.

 

In Class or Virtual Class Based Learning

A technology-mediated delivery method allowing campus-based or virtual class attendance, or a combination of both. Classes can be lecture-based or mentor-based.

 

Mentored Online Learning

A technology-mediated, self-paced online delivery method with personal mentorship.

What you get

This course will help you prepare for Exam 70-773.
This course contributes towards earning your MCSA: Machine Learning.

Important Notes

  • Students are to be at the training venue by 08h00 in preparation for a 08h30 start time.
  • Learnfast retains the right to change this calendar without any notification.
  • Bookings are only confirmed upon receipt of the proof of payment or an official company purchase order for the full amount of the training.
  • For full-day courses, Learnfast will supply you with the relevant training material. A desktop computer to use for the training (where applicable), tea/coffee, and a full lunch are provided only for full-day InClass training hosted at Learnfast. Catering is not included for OnSite training, and a laptop is available for hire at an additional cost if required.
  • Cancellation or rescheduling requests must be in writing and reach us via fax or email at least 5 (five) working days prior to the course commencement date. Full course fees may be retained for no-shows or for requests made within 5 working days prior to commencement.
  • Although we go to great lengths to ensure that all training proceeds as scheduled, Learnfast reserves the right to cancel or postpone dates if required, and undertakes to inform clients of these changes in writing and telephonically.
  • Learnfast suggests that clients wait until a course has been confirmed to go ahead as scheduled, a week prior to course commencement, before booking flights and accommodation. Learnfast is NOT responsible for costs associated with the cancellation of classes, such as clients' flights and accommodation.

Module 1: Microsoft R Server and R Client

Explain how Microsoft R Server and Microsoft R Client work.

Lessons

  • What is Microsoft R Server
  • Using Microsoft R Client
  • The ScaleR functions

Lab : Exploring Microsoft R Server and Microsoft R Client

  • Using R Client in RTVS and RStudio
  • Exploring ScaleR functions
  • Connecting to a remote server

After completing this module, students will be able to:

  • Explain the purpose of R Server.
  • Connect to R Server from R Client.
  • Explain the purpose of the ScaleR functions.

 

Module 2: Exploring Big Data

At the end of this module the student will be able to use R Client with R Server to explore big data held in different data stores.

Lessons

  • Understanding ScaleR data sources
  • Reading data into an XDF object
  • Summarizing data in an XDF object

Lab : Exploring Big Data

  • Reading a local CSV file into an XDF file
  • Transforming data on input
  • Reading data from SQL Server into an XDF file
  • Generating summaries over the XDF data

After completing this module, students will be able to:

  • Explain ScaleR data sources
  • Describe how to import XDF data
  • Describe how to summarize data held in XDF format

 

Module 3: Visualizing Big Data

Explain how to visualize data by using graphs and plots.

Lessons

  • Visualizing in-memory data
  • Visualizing big data

Lab : Visualizing data

  • Using ggplot to create a faceted plot with overlays
  • Using rxLinePlot and rxHistogram

After completing this module, students will be able to:

  • Use ggplot2 to visualize in-memory data
  • Use rxLinePlot and rxHistogram to visualize big data

 

Module 4: Processing Big Data

Explain how to transform and clean big data sets.

Lessons

  • Transforming Big Data
  • Managing datasets

Lab : Processing big data

  • Transforming big data
  • Sorting and merging big data
  • Connecting to a remote server

After completing this module, students will be able to:

  • Transform big data using rxDataStep
  • Perform sort and merge operations over big data sets

 

Module 5: Parallelizing Analysis Operations

Explain how to implement options for splitting analysis jobs into parallel tasks.

Lessons

  • Using the RxLocalParallel compute context with rxExec
  • Using the RevoPemaR package

Lab : Using rxExec and RevoPemaR to parallelize operations

  • Using rxExec to maximize resource use
  • Creating and using a PEMA class

After completing this module, students will be able to:

  • Use the RxLocalParallel compute context with rxExec
  • Use the RevoPemaR package to write customized scalable and distributable analytics.

 

Module 6: Creating and Evaluating Regression Models

Explain how to build and evaluate regression models generated from big data.

Lessons

  • Clustering Big Data
  • Generating regression models and making predictions

Lab : Creating a linear regression model

  • Creating a cluster
  • Creating a regression model
  • Generating data for making predictions
  • Using the models to make predictions and comparing the results

After completing this module, students will be able to:

  • Cluster big data to reduce the size of a dataset.
  • Create linear and logit regression models and use them to make predictions.

 

Module 7: Creating and Evaluating Partitioning Models

Explain how to create and score partitioning models generated from big data.

Lessons

  • Creating partitioning models based on decision trees
  • Testing partitioning models by making and comparing predictions

Lab : Creating and evaluating partitioning models

  • Splitting the dataset
  • Building models
  • Running predictions and testing the results
  • Comparing results

After completing this module, students will be able to:

  • Create partitioning models using the rxDTree, rxDForest, and rxBTrees algorithms.
  • Test partitioning models by making and comparing predictions.

 

Module 8: Processing Big Data in SQL Server and Hadoop

Explain how to use R in the SQL Server and Hadoop environments.

Lessons

  • Using R in SQL Server
  • Using Hadoop Map/Reduce
  • Using Hadoop Spark

Lab : Processing big data in SQL Server and Hadoop

  • Creating a model and predicting outcomes in SQL Server
  • Performing an analysis and plotting the results using Hadoop Map/Reduce
  • Integrating a sparklyr script into a ScaleR workflow

After completing this module, students will be able to:

  • Use R in the SQL Server and Hadoop environments.
  • Use ScaleR functions with Hadoop on a Map/Reduce cluster to analyze big data.

 

    No dates have been specified for this course.
    Please contact The CAD Corporation for more information and dates on this course.

When you complete the online booking below, a booking confirmation will be sent out and an invoice will be generated. A place will be reserved on this course and you are expected to attend. If you require a quote first, please contact the Learnfast offices and speak to a sales consultant.

  1. When you book this course, an invoice will be generated and you will be liable for its payment. If you require a quote, please contact The CAD Corporation offices.
  2. After the generation of the invoice a training confirmation will be emailed using the details provided above.
  3. The CAD Corporation retains the right to change this calendar without any notification.
  4. Tea/coffee and a light lunch will be provided.
  5. All university students will receive a 10% discount for cash payments.
  6. The minimum notice of cancellation is 5 (five) working days prior to the course commencement date. If you fail to give this notice, the full amount is payable.
  7. Students are to be at the training venue by 08h00 in preparation for a 08h30 start time.
