Cloudera Meetup
Intelligence, Data, Experiences
We now face a world of connected data
[Figure: timeline from 1985 to 2020 spanning three eras: Analog, Digital, Connected. Labels along the timeline: complex implementations, spreadmarts, siloed data, transactional systems, enterprise data warehouse, OLAP, ETL, Hadoop, interactive dashboards, ad hoc analysis, operational reporting, machine learning, any data, in-memory.]
From data to decisions and actions
What should I do?
What will happen?
Why did it happen?
What happened?
Interactive Dashboards
Recommendations & Automation
Predictive Models
Reports
Insight
Cloudera Altus
One of Europe’s busiest air hubs using Microsoft and Open Source to become the world’s most efficient airport
Integration of Azure, Hadoop, and other Open Source software (RStudio, Python, Impala, Spark)
“As a technologist, I am highly impressed with the openness of the Azure platform.”
- Rob Dielemans, Co-Founder and Managing Director, GoDataDriven
Tactics
• Project team of local Big Data specialist GoDataDriven and Microsoft tech partner Xpirit
• Azure beat Amazon and other cloud rivals as target cloud Open Source
Results
• New ‘Data Innovation Lab’ up and running
• Team of data scientists already building models of both asset (carousels, hardware) and internal (staff, HR) processes to get better BI insight
Objectives
• Better collect and draw insight from airport infrastructure and passenger traffic data
• Build a data science facility for management to gain total visibility into all processes
Cortana Intelligence
Action: People, Automated Systems, Apps (Web, Mobile, Bots)
Intelligence: Cortana, Bot Framework, Cognitive Services
Dashboards & Visualizations: Power BI
Information Management: Event Hubs, Data Catalog, Data Factory
Machine Learning and Analytics: HDInsight (Hadoop and Spark), Stream Analytics, Data Lake Analytics, Machine Learning
Big Data Stores: SQL Data Warehouse, Data Lake Store
Data Sources: Apps, Sensors and devices, Data
Cosmos DB
Platform Services
Infrastructure Services
Web Apps
Mobile Apps
API Apps
Notification Hubs
Hybrid Cloud
Backup
StorSimple
Azure Site Recovery
Import/Export
SQL Database
DocumentDB
Redis Cache
Azure Search
Storage Tables
SQL Data Warehouse
Azure AD Health Monitoring
AD Privileged Identity Management
Operational Analytics
Cloud Services
Batch
RemoteApp
Service Fabric
Visual Studio
Application Insights
VS Team Services
Domain Services
HDInsight
Machine Learning
Stream Analytics
Data Factory
Event Hubs
Data Lake Analytics Service
IoT Hub
Data Catalog
Security & Management
Azure Active Directory
Multi-Factor Authentication
Automation
Portal
Key Vault
Store/Marketplace
VM Image Gallery & VM Depot
Azure AD B2C
Scheduler
Xamarin
HockeyApp
Power BI Embedded
SQL Server Stretch Database
Mobile Engagement
Functions
Cognitive Services
Bot Framework
Cortana
Security Center
Container Service
VM Scale Sets
Data Lake Store
BizTalk Services
Service Bus
Logic Apps
API Management
Content Delivery Network
Media Services
Media Analytics
Azure Data Lake Store
A hyper-scale repository for Big Data analytics workloads
Hadoop File System (HDFS) for the cloud
No limits to scale
Store any data in its native format
Enterprise-grade access control, encryption at rest
Optimized for analytic workload performance
Azure Data Lake Analytics
A new distributed analytics service
Distributed analytics service built on Apache YARN
Elastic scale per query lets users focus on business goals, not on configuring hardware
Includes U-SQL, a language that unifies the benefits of SQL with the expressive power of C#
Integrates with Visual Studio to develop, debug, and tune code faster
Federated query across Azure data sources
Enterprise-grade role-based access control
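The U-SQL idea above (declarative SQL that can call user-written code) can be illustrated with a rough analogy. This is not U-SQL and not the Data Lake Analytics API: it is a minimal sketch in Python that uses the standard-library sqlite3 module as a stand-in. The table `visits`, the helper `domain_of`, and the function `top_domains` are hypothetical names invented for the illustration.

```python
import sqlite3


def domain_of(url: str) -> str:
    """Hypothetical user-defined function: return the host part of a URL."""
    without_scheme = url.split("://", 1)[-1]
    return without_scheme.split("/", 1)[0].lower()


def top_domains(rows):
    """Aggregate (url, duration_s) rows per domain using SQL plus custom code."""
    conn = sqlite3.connect(":memory:")
    # Register the Python function so the SQL text can call it by name.
    conn.create_function("domain_of", 1, domain_of)
    conn.execute("CREATE TABLE visits (url TEXT, duration_s REAL)")
    conn.executemany("INSERT INTO visits VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT domain_of(url) AS domain, COUNT(*) AS hits, "
        "SUM(duration_s) AS total_s "
        "FROM visits GROUP BY domain_of(url) "
        "ORDER BY hits DESC, domain ASC"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    sample = [
        ("https://a.example/x", 1.0),
        ("http://A.example/y", 2.0),
        ("https://b.example/", 5.0),
    ]
    for row in top_domains(sample):
        print(row)
```

The query stays declarative (grouping, counting, ordering), while the string handling that is awkward in SQL lives in ordinary code. In U-SQL the embedded language is C# rather than Python.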
Azure Data Lake
YARN
HDInsight
Hive
R Server
HDFS
Store
Store and analyze data of any kind and size
Develop faster, debug and optimize smarter
Interactively explore patterns in your data
No learning curve
Managed and supported
Dynamically scales to match your business priorities
Enterprise-grade security
Built on YARN, designed for the cloud
3rd party
Analytics
U-SQL
Azure Data Factory
Fully managed ETL in the cloud.
A globally deployed data movement service in the cloud.
Use it to ingest data from multiple on-premises and cloud services easily. Schedule, orchestrate, and manage the data transformation and analysis process.
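The orchestration idea above (ingest from several sources, then transform, then analyze, each step waiting for its inputs) can be sketched in a few lines. This is not the Data Factory API; it is a minimal illustration in plain Python (3.9 or later, for `graphlib`). The activity names, `ACTIVITIES`, `DEPENDENCIES`, and `run_pipeline` are all hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical activities: each takes the results so far and returns its output.
ACTIVITIES = {
    "ingest_onprem": lambda ctx: [3, 1],   # stand-in for an on-premises source
    "ingest_cloud": lambda ctx: [2],       # stand-in for a cloud source
    "transform": lambda ctx: sorted(ctx["ingest_onprem"] + ctx["ingest_cloud"]),
    "analyze": lambda ctx: sum(ctx["transform"]),
}

# Each activity maps to the set of activities that must finish before it runs.
DEPENDENCIES = {
    "ingest_onprem": set(),
    "ingest_cloud": set(),
    "transform": {"ingest_onprem", "ingest_cloud"},
    "analyze": {"transform"},
}


def run_pipeline(activities, dependencies):
    """Run every activity once, in an order that respects the dependencies."""
    order = list(TopologicalSorter(dependencies).static_order())
    context = {}
    for name in order:
        context[name] = activities[name](context)
    return order, context


if __name__ == "__main__":
    order, results = run_pipeline(ACTIVITIES, DEPENDENCIES)
    print("execution order:", order)
    print("analysis result:", results["analyze"])
```

A managed service adds what this sketch leaves out: time-based scheduling, retries, monitoring, and connectors to real data stores.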
Azure Event Hubs & Stream Analytics
Streaming data and processing in the cloud.
Event Hubs is a large-scale pub/sub messaging hub, based on open standards.
Stream Analytics is a stream processing engine that uses SQL as its query language.
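The two concepts above can be sketched without any cloud dependency. The following is an in-memory toy, not the Event Hubs or Stream Analytics API: `MiniEventHub` mimics a partitioned, append-only pub/sub log that consumers read from an offset, and `tumbling_window_counts` mimics the kind of fixed-window aggregation a SQL-based stream processor expresses with a GROUP BY over a time window. All names and the event shape (`device`, `ts`) are assumptions made for the example.

```python
import zlib
from collections import defaultdict


class MiniEventHub:
    """Toy partitioned log: publishers append, consumers read from an offset."""

    def __init__(self, partitions=2):
        self._partitions = [[] for _ in range(partitions)]

    def publish(self, key, event):
        # A stable hash keeps all events with the same key in one partition,
        # which preserves their relative order.
        index = zlib.crc32(key.encode("utf-8")) % len(self._partitions)
        self._partitions[index].append(event)
        return index

    def read(self, partition, offset=0):
        # Each consumer tracks its own offset; reading does not remove events.
        return self._partitions[partition][offset:]

    @property
    def partition_count(self):
        return len(self._partitions)


def tumbling_window_counts(events, window_s):
    """Count events per (device, window start) over fixed, non-overlapping windows."""
    counts = defaultdict(int)
    for event in events:
        window_start = (event["ts"] // window_s) * window_s
        counts[(event["device"], window_start)] += 1
    return dict(counts)


if __name__ == "__main__":
    hub = MiniEventHub(partitions=2)
    for device, ts in [("belt-1", 1), ("belt-1", 4), ("belt-2", 9), ("belt-1", 12)]:
        hub.publish(device, {"device": device, "ts": ts})
    all_events = [e for p in range(hub.partition_count) for e in hub.read(p)]
    print(tumbling_window_counts(all_events, window_s=10))
```

Because the log is not consumed on read, several independent consumers can process the same stream at their own pace, which is the main difference from a classic queue.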
Language
Speech
Search
Machine Learning
Knowledge
Vision
Spell check
Speech API
Entity linking
Recommendation API
Bing autosuggest
Computer vision
Emotion
Forecasting
Text to speech
Thumbnail generation
Anomaly detection
Custom recognition (CRIS)
Bing image search
Web language model
Customer feedback analysis
Academic knowledge
OCR, tagging, captioning
Sentiment scoring
Bing news search
Bing web search
Text analytics
Cognitive Services
AI for workplace safety
Azure
Breadth of the Cloud
42 Azure regions
NEWLY ANNOUNCED:
France: France Central and France South
Korea: Korea Central and Korea South
DoD East and Central
Africa: South Africa
Global Scale of Azure
Azure
AI Supercomputer
Microsoft AI Portfolio
Agent: Cortana
Applications: Office 365, Dynamics 365, SwiftKey, Pix, Customer Service and Support, Skype, Calendar.help
Services: Cortana Intelligence, Cognitive Services, Bot Framework, Cortana Devices SDK, Cognitive Toolkit, Azure Machine Learning
Infrastructure: Azure N Series, FPGA
People
How do I get started?
microsoft.com/cortanaintelligence
gallery.cortanaintelligence.com
learnanalytics.microsoft.com
technet.microsoft.com/en-us/virtuallabs
blogs.technet.com/b/machinelearning
nathan.bijnens@microsoft.com
Cloudera, Azure and Big Data at Cloudera Meetup '17
