Detailed Course Outline
Module 00 - Introduction
Module 01 - Introduction to Data Integration and Cloud Data Fusion
- Data integration: what, why, challenges
- Data integration tools used in the industry
- User personas
- Introduction to Cloud Data Fusion
- Critical capabilities for data integration
- Cloud Data Fusion UI components
Module 02 - Building Pipelines
- Cloud Data Fusion architecture
- Core concepts
- Data pipelines and directed acyclic graphs (DAG)
- Pipeline lifecycle
- Designing pipelines in Pipeline Studio
Module 03 - Designing Complex Pipelines
- Branching, merging, and joining
- Actions and notifications
- Error handling and macros
- Pipeline configurations, scheduling, import and export
Module 04 - Pipeline Execution Environment
- Scheduling and triggers
- Execution environment: compute profiles and provisioners
- Monitoring pipelines
Module 05 - Building Transformations and Preparing Data with Wrangler
- Wrangler
- Directives
- User-defined directives
Module 06 - Connectors and Streaming Pipelines
- Understand the data integration architecture.
- List various connectors.
- Use the Cloud Data Loss Prevention (DLP) API.
- Understand the reference architecture of streaming pipelines.
- Build and execute a streaming pipeline.
Module 07 - Metadata and Data Lineage
- Metadata
- Data lineage
Module 08 - Summary
- Course summary