Data Science Recap
Mark Tabladillo Ph.D.
May 21, 2020
Founder, PASS Data Science Virtual Chapter
2020
Recap of Main News
from Microsoft
Build 2020
Cloud Solution Architect
Microsoft United States
Connect on LinkedIn
Twitter @marktabnet
Topics
• Azure Synapse Link
• Responsible AI
• Project Bonsai & Project Moab
• AI Models at Scale
What if I want to run analytics in near real-time on my operational data at scale?
Azure Synapse Link
• Microsoft is announcing Azure Synapse Link, a cloud-native implementation of HTAP (hybrid transactional/analytical processing), an architecture for enabling analytics on live operational data. With Azure Synapse Link, Azure is the first cloud service to deliver on the promises of HTAP without the costs, complexities and trade-offs associated with on-premises implementations.
• Azure Synapse Link is now available with Azure Cosmos DB and will soon be available with Azure SQL, Azure Database for PostgreSQL and Azure Database for MySQL.
Azure Synapse Link:
Building real-time HTAP solutions with Azure
Cosmos DB & Azure Synapse Analytics
https://azure.microsoft.com/en-us/blog/azure-analytics-clarity-in-an-instant/
• Azure Cosmos DB is optimized for operational workloads with single-digit millisecond read and write latency
• 99.999% high availability, guaranteed throughput and consistency
• Turnkey global data replication across all Azure regions
Fast NoSQL database with open APIs for any scale
What is Azure Cosmos DB
[Diagram labels: Real-time Applications & Services; Azure Cosmos DB]
• If you have large amounts of data, analytical queries will take a long time to run and will be resource intensive
• HUGE performance impact on the OLTP workloads
Running OLTP and OLAP workloads on the same database
[Diagram labels: Real-time Applications & Services; Azure Cosmos DB; Reporting & Dashboards; Azure Cosmos DB Spark connector]
[Diagram labels: User Applications; Azure Cosmos DB; Data Lake; Extract (Pipelines); Transform; Enrich; Orchestrate; Power BI; Serve; Apache Spark for Synapse; Synapse SQL]
Ingest data periodically from Azure Cosmos DB to Data Lake
Manage data formats and storage layer to optimize for analytics
Separating OLTP & OLAP
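The periodic-ingest pattern on this slide can be illustrated with a small sketch. It is a toy model only: `OperationalStore`, `DataLake` and the watermark field are hypothetical names invented here, not Azure APIs. It shows why analytics on a scheduled copy lag behind the operational data until the next pipeline run.

```python
from dataclasses import dataclass, field


@dataclass
class OperationalStore:
    """Stand-in for the operational database (hypothetical, not a Cosmos DB API)."""
    rows: list = field(default_factory=list)

    def write(self, ts, order_id, amount):
        self.rows.append({"ts": ts, "order_id": order_id, "amount": amount})


@dataclass
class DataLake:
    """Stand-in for the analytics copy filled by a scheduled pipeline."""
    rows: list = field(default_factory=list)
    watermark: int = -1  # timestamp of the newest row already copied

    def ingest(self, source):
        # Extract only rows newer than the last run, then load them.
        new_rows = [r for r in source.rows if r["ts"] > self.watermark]
        self.rows.extend(new_rows)
        if new_rows:
            self.watermark = max(r["ts"] for r in new_rows)
        return len(new_rows)

    def total_amount(self):
        return sum(r["amount"] for r in self.rows)


store = OperationalStore()
lake = DataLake()

store.write(ts=1, order_id="a", amount=10.0)
store.write(ts=2, order_id="b", amount=5.0)
copied_first = lake.ingest(store)            # scheduled pipeline run
store.write(ts=3, order_id="c", amount=7.5)  # arrives after the run

stale_total = lake.total_amount()            # order "c" is not visible yet
copied_second = lake.ingest(store)           # next scheduled run
fresh_total = lake.total_amount()
```

Between the two runs the analytics total misses the newest order, which is the staleness that the ETL approach trades for isolating OLTP from OLAP.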
Analytical Store: column store optimized for analytical queries
Transactional Store: row store optimized for transactional operations
[Diagram labels: Azure Cosmos DB; Azure Synapse Analytics; Container; Cloud-Native HTAP; Azure Synapse Link; SQL; Auto-Sync; Machine learning; Big data analytics; BI Dashboards; Operational Data]
Generate near real-time insights on your operational data
Azure Synapse Link: How does it work?
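The two-store idea on this slide can be sketched in a few lines. This is an illustration only: the `Container` class and its methods are hypothetical and are not the Azure Cosmos DB SDK. The sync here also runs synchronously on every insert to keep the example short, which is a simplification of a background auto-sync. The point is the layout: point reads hit a row store keyed by id, while aggregates scan a single column.

```python
from collections import defaultdict


class Container:
    """Toy container holding two layouts of the same data (hypothetical)."""

    def __init__(self):
        self.transactional = {}              # row store: id -> full record
        self.analytical = defaultdict(list)  # column store: field -> values

    def insert(self, record):
        # Transactional write: store the whole record under its id.
        self.transactional[record["id"]] = record
        self._auto_sync(record)

    def _auto_sync(self, record):
        # Copy each field into its own column (simplified, synchronous).
        for name, value in record.items():
            self.analytical[name].append(value)

    def point_read(self, record_id):
        # OLTP-style access: one lookup returns one full row.
        return self.transactional[record_id]

    def column_sum(self, name):
        # OLAP-style access: scan one column, ignore all other fields.
        return sum(self.analytical[name])


container = Container()
container.insert({"id": "o1", "region": "US", "amount": 20.0})
container.insert({"id": "o2", "region": "EU", "amount": 30.0})

order = container.point_read("o2")
total = container.column_sum("amount")
```

Because the aggregate reads only the `amount` column, it never touches the row store that serves the application's point reads.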
[Diagram labels: Business Understanding; Data Acquisition & Understanding; Modeling; Deployment & Ops]
How can we approach responsibility?
Responsible AI
Responsible AI in Three Areas
Understand: New model interpretability and fairness assessment capabilities enable the development of more accurate and fair models.
Protect: New differential privacy computing capabilities enable customers to build machine learning models using sensitive data while safeguarding the privacy of individuals. This is a result of the partnership between Microsoft and Harvard's Institute for Quantitative Social Science, which was announced last September. Additionally, new confidential machine learning capabilities provide a secure and trusted environment for machine learning.
Control: New capabilities for fine-grained traceability, lineage, and access control of data, models and experiments enable organizations to meet strict regulatory requirements. Additionally, new workflow documentation capabilities to enforce accountability in the machine learning process will be made available to customers shortly after the Build conference.
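To make the differential privacy idea under "Protect" concrete, here is a minimal sketch of the textbook Laplace mechanism applied to a counting query. It is not the Microsoft or Harvard implementation; the function names, the sample data and the choice of epsilon are assumptions made for illustration.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two exponential draws with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(values, predicate, epsilon, rng):
    """Counting query released with Laplace noise of scale sensitivity/epsilon."""
    true_count = sum(1 for v in values if predicate(v))
    sensitivity = 1.0  # adding or removing one person changes a count by at most 1
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)                 # seeded only to make the sketch repeatable
ages = [23, 35, 41, 52, 67, 29, 48]    # made-up sample data
noisy = private_count(ages, lambda a: a >= 40, epsilon=1.0, rng=rng)
```

A smaller epsilon adds more noise and gives stronger privacy; a larger epsilon keeps the answer closer to the true count of 4 at the cost of weaker protection.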
Understand
Protect
Control
Project Bonsai Public Preview
Create and optimize intelligence for
industrial control systems with simulations
and machine teaching
Project Moab
Open-source machine teaching robotics
hardware kit
New Technical Demos and Customer Stories
featuring SCG and partner simulations using
Project Bonsai
AI models at scale
Massive, multi-purpose
AI models
Infrastructure at scale
The AI Supercomputer
Development at scale
Empowering every
developer
▪ Microsoft Turing: Largest AI model ever built (17B parameters)
▪ Changing how AI is developed: from narrow, custom models to multi-purpose, customized, massive models
▪ Turing language: Most powerful model for multi-task natural language processing
▪ The road of generalization: Multi-modality text / images / video
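As a rough sense of scale for a 17B-parameter model, the arithmetic below estimates the memory needed just to hold the weights. The storage precisions (4 bytes for fp32, 2 bytes for fp16) are assumptions for illustration; the slides do not say how the model is stored.

```python
PARAMETERS = 17_000_000_000  # 17B parameters, as stated on the slide

# Assumed storage precisions; not taken from the slides.
BYTES_PER_PARAMETER = {"fp32": 4, "fp16": 2}


def weight_memory_gb(parameters, precision):
    """Memory for the weights alone, in gigabytes (10**9 bytes)."""
    return parameters * BYTES_PER_PARAMETER[precision] / 1e9


fp32_gb = weight_memory_gb(PARAMETERS, "fp32")
fp16_gb = weight_memory_gb(PARAMETERS, "fp16")
```

Under these assumptions the weights alone take 68 GB in fp32 or 34 GB in fp16. Training also has to hold gradients and optimizer state, so the real footprint is larger, which is one reason models of this size call for infrastructure at scale.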
AI models and
development at scale
▪ Announcing Open Source frameworks & optimizers for massive model training
▪ Future release of Microsoft Turing language model
AI computing at scale
▪ Announcing one of the top five publicly disclosed supercomputers in the world
Outlook Meeting Insights
Word Document Summary
Bing Q&A
Dynamics 365 Seller Suggestions
• Lowering the barrier for state-of-the-art AI for every developer
• Enabling AI development at scale @ Microsoft
Microsoft Build 2020: Data Science Recap