© 2016 MapR Technologies
Today’s Presenters
Rafael Godinho
Technical Evangelist
Tim Morgan
Managing Director
Agenda
• Big Data & the Cloud
• Customer Use Case
• Azure Overview & Demo of MapR on Azure
Data Gravity
Data tends to stay where it is generated
Applications and services are attracted to the data
Flexible processing where
change is the norm
Distributed processing across clusters, data
centers and public and private cloud
environments
Supports global apps that
can scale arbitrarily
Key to Real-time at Scale: Global Cloud
Processing
MapR Converged Data Platform
Open Source Engines & Tools | Commercial Engines & Applications | Custom Apps | Cloud and Managed Services
Data Processing | Search and Others
Unified Management and Monitoring
Enterprise-Grade Platform Services
• Web-Scale Storage: MapR-FS
• Database: MapR-DB
• Event Streaming: MapR Streams
High Availability, Real Time, Unified Security, Multi-tenancy, Disaster Recovery, Global Namespace
APIs: HDFS API; POSIX, NFS; HBase API; OJAI API; Kafka API
MapR on Microsoft Azure Marketplace
MapR and Microsoft enable enterprise grade big data applications in the Azure cloud
Simplified Deployment
Azure Marketplace’s automated deployment capabilities make big data easy
Unlimited Scale
Azure’s infrastructure can scale up to match any requirement and scale down for value
Seamless Interoperability
MapR integrates with other Azure services to enable customers to analyze any type of data to unlock the biggest insights
Product Alignment
About Sullexis
• Sullexis is a professional services firm that specializes in helping its clients to
create, manage, and enhance data to accelerate and improve decision making
across the enterprise. We bring data and technology together to make our clients
measurably more effective
• With industry experience ranging from energy and manufacturing to finance and
high tech, Sullexis brings the technology, processes, and strategies together to
make you more effective in what you do
• Founded in 2006, Sullexis is headquartered in Houston, TX and has a delivery
center in Monterrey, MX.
• Our consultants have implemented solutions across the US, Caribbean, Europe
and Latin America.
Client Background
• Our client is one of North America’s largest Oilfield Services companies
providing well construction, completion and operating services to exploration
and production companies.
• A significant number of acquisitions over the last 10 years resulted in 18
different ERP applications running on 5 different platforms. To enable
future, scalable growth, they embarked on an ERP standardization project.
The goal was to put the entire company on one technology stack with a common
process.
• Having decided to consolidate on a single ERP, the client still needed to
determine how best to handle compliance, regulatory and operational needs
associated with the legacy systems.
• Migrating transaction data to the new ERP would be cost prohibitive and
risky; and market ready data archiving solutions were costly and unable to
meet the defined business needs.
• This left retaining the legacy systems themselves, which would be very
costly, or finding a new approach that was cost effective, reliable and could
meet the business needs.
18 to 1
Key Requirements
Preserve and provide easy access to ALL data
• Preserve all structured and unstructured data (approx. 12 TB)
• Ability to run legacy reports to meet compliance, regulatory and ongoing business needs
• Easy for a business person to use directly to minimize IT resource dependency
• Ability to provide consolidated views across disparate data sets
Be cost effective
• Flexible and scalable compute/data storage options (e.g. use of cold storage)
• Provide access through existing BI and reporting tools (e.g. Hyperion, MS Power BI, SAP Lumira)
to eliminate new purchases and training
• Enable 100% decommissioning of legacy systems
Enable the future
• Establish processes and tools that support future company acquisitions
• Provide a platform to enable new and innovative data applications and solutions
Solution Selection Process
Initial Analysis
• Market Research
• Vendor presentations
Two-week POC ‘bake-off’ to demonstrate:
• Rapid integration of different data sources both structured and unstructured
• Connectivity to SAP ECC and Oracle EBS
• Reporting capabilities re-using SAP Lumira
Winning POC Solution
• A MapR Converged Data Platform cluster installed in MapR’s private cloud
• Predefined adapters for Oracle used to extract and load structured data to MapR (<100 GB)
• Unstructured data (CSV, PDF and TXT files) loaded and made viewable through Elasticsearch
• Apache Drill and a local install of SAP Lumira connected to the MapR cluster to demonstrate
reporting capabilities
Solution Architecture
Project Considerations
Technology Factors
• Reliability and speed of connection to cloud
• Count and category of machines in cloud
(CPU, RAM, Storage)
• Volume of data (row size and count)
• Ongoing transaction use of source system
• Variable needs for data (frequency,
response, volume)
Project Factors
• Timeliness of and accessibility to various
parties
• Cataloging of all data
• Evaluation of transactional status of existing
data sets, and how to address moving
targets (blackout periods, iterative loads,
journaling)
• Sample extracts from every table
• Ability to validate data loads (row counts,
samples)
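The "ability to validate data loads" factor can be sketched as a row-count comparison between source and target. This is a minimal illustration only, not the project's actual tooling: the table names and counts are made up, and `compare_row_counts` is a hypothetical helper.

```python
from typing import Dict, List, Tuple


def compare_row_counts(
    source: Dict[str, int], target: Dict[str, int]
) -> List[Tuple[str, int, int]]:
    """Return (table, source_count, target_count) for every mismatch.

    A table that is missing on the target side is reported with a count of -1.
    """
    mismatches = []
    for table, src_count in sorted(source.items()):
        tgt_count = target.get(table, -1)
        if src_count != tgt_count:
            mismatches.append((table, src_count, tgt_count))
    return mismatches


# Hypothetical counts; in practice these would come from COUNT(*) queries
# against the legacy system and against the loaded data set.
source_counts = {"GL_LINES": 1200, "AP_INVOICES": 530, "VENDORS": 42}
target_counts = {"GL_LINES": 1200, "AP_INVOICES": 529}

for table, src, tgt in compare_row_counts(source_counts, target_counts):
    print(f"{table}: source={src} target={tgt}")
```

Row counts catch dropped or duplicated rows but not corrupted values, which is presumably why the slide pairs them with sample extracts.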
Solution Architecture
Data sources: Oracle, Navision, SysPro, MS Excel, Great Plains; PDF, CSV, XLS files (NFS)
Enterprise Grade Platform
• Web-Scale Storage: MapR-FS
• Database: MapR-DB
• Event Streaming: MapR Streams
High Availability, Real Time, Unified Security, Multi-tenancy, Disaster Recovery, Global Namespace
Files: PDF, TIFF, CSV
Why Azure
• Sullexis and client both experienced with Azure and MSFT
• MapR Quick Start on Azure made it easy and fast to get started
• MapR was already running well on Azure (see blog)
• Client’s enterprise MSFT account made it simple to procure and administer
• Connectivity to Azure via ExpressRoute mitigated some of the reliability and latency
concerns of the connection
Apache Drill - Flexible & Fast
Access to any data type, any data source
• Relational
• Nested data
• Schema-less
Rapid time to insights
• Query data in-situ
• No schemas required
• Easy to get started
Integration with existing tools
• ANSI SQL
• BI tool integration
Scale in all dimensions
• TB to PB of scale
• 1000s of users
• 1000s of nodes
Granular security
• Authentication
• Row/column level controls
• Decentralized
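As an illustration of "query data in-situ" with ANSI SQL, the sketch below builds a request for Drill's REST endpoint (`/query.json`). It assumes a Drillbit on `localhost` with the default web port 8047 and a `dfs` storage plugin; the archive path and column names are hypothetical, and a secured cluster would also need authentication. The request is built but not sent.

```python
import json
import urllib.request


def build_drill_request(
    sql: str, host: str = "localhost", port: int = 8047
) -> urllib.request.Request:
    """Build (but do not send) a POST to Drill's REST endpoint /query.json."""
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host}:{port}/query.json",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical archive path: Drill reads the files where they sit,
# without a table definition being created beforehand.
sql = (
    "SELECT vendor_id, SUM(amount) AS total "
    "FROM dfs.`/archive/erp/ap_invoices` "
    "GROUP BY vendor_id"
)
request = build_drill_request(sql)
print(request.full_url)
print(request.data.decode("utf-8"))
```

Passing `request` to `urllib.request.urlopen` would submit the query; BI tools would normally use the ODBC or JDBC drivers instead.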
Sqoop – Easy & Efficient
Leveraging a Sullexis-developed direct-connect extract tool based on Sqoop was
seen as meeting all the technology and project factors:
• Addresses all source data
• Support for both Oracle and SQL Server
• Import direct to Parquet
• Supports type mapping
• Supports incremental imports and merges
• Enables validation via row count matches
• Provides for parallel imports for enhanced speed (but also allows for throttling)
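Several of these capabilities map onto stock `sqoop import` options. The sketch below only assembles a command line and is not the Sullexis tool itself: the JDBC URL, table, column and directory are hypothetical, credentials are omitted, and whether a given Sqoop version supports this exact combination of options should be checked against its documentation.

```python
import shlex
from typing import List


def sqoop_import_args(
    jdbc_url: str,
    table: str,
    target_dir: str,
    check_column: str,
    last_value: str,
    mappers: int = 4,
) -> List[str]:
    """Assemble a `sqoop import` argument list for an incremental Parquet load."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--target-dir", target_dir,
        "--as-parquetfile",             # import direct to Parquet
        "--incremental", "append",      # only rows beyond the last imported value
        "--check-column", check_column,
        "--last-value", last_value,
        "--num-mappers", str(mappers),  # parallelism; a lower value throttles the load
    ]


# Hypothetical source system and table.
args = sqoop_import_args(
    jdbc_url="jdbc:oracle:thin:@//legacy-db.example.com:1521/ERP",
    table="AP_INVOICES",
    target_dir="/archive/erp/ap_invoices",
    check_column="INVOICE_ID",
    last_value="0",
    mappers=2,
)
print(" ".join(shlex.quote(a) for a in args))
```

Keeping the arguments as a list makes it straightforward to hand them to `subprocess.run` once per table.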
Elasticsearch – Simple & Transparent
Clients: Reporting Client, Browser
Protocols: ODBC or JDBC, HTTP(S)
Web UI
Cluster: edge, node 0, node 1, node 2
POSIX Client
PDF, TIFF, CSV files on MapR-FS
Highlights
• Quick and easy startup
• Primary technical concerns around latency to the cloud can be successfully mitigated (e.g. client’s
cluster enabled transfer rates of 100-140 million records per hour)
• While early, the base business case will result in a payback within a few months, and
business users have suggested that data access is easier now than it was
in the legacy system
• This ERP legacy system decommissioning approach can be executed in as little as 2 months
for a complete data archive, or up to 6 months with robust operational reporting
• Provides repeatable tools and processes for future system decommissioning needs
• The client is already experimenting with the platform for use as an IoT sensor data
historian. So far the results have been encouraging
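To put the quoted transfer rates in perspective, the arithmetic below converts 100-140 million records per hour into records per second and estimates a load window. The 2.1 billion row table is a hypothetical figure chosen for illustration; it is not a number from the project.

```python
SECONDS_PER_HOUR = 3600


def records_per_second(records_per_hour: float) -> float:
    """Convert an hourly transfer rate into a per-second rate."""
    return records_per_hour / SECONDS_PER_HOUR


def hours_to_load(total_records: float, records_per_hour: float) -> float:
    """Estimate how many hours a load takes at a steady transfer rate."""
    return total_records / records_per_hour


low_rate, high_rate = 100e6, 140e6  # records per hour, as quoted on the slide

print(round(records_per_second(low_rate)))   # 27778
print(round(records_per_second(high_rate)))  # 38889

# Hypothetical table size, used only to show the calculation.
table_rows = 2.1e9
print(hours_to_load(table_rows, high_rate), hours_to_load(table_rows, low_rate))
```

At these rates a table of that size would load in roughly 15 to 21 hours, assuming the rate holds steady for the whole transfer.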
About Us
We are attuned to the challenges facing organizations in a variety of industries and understand the constant pressure to improve
business processes and make better decisions. But beyond that, we have a passion for technology. Using that passion, we help our
clients use proven technology coupled with our real-world knowledge to accelerate and improve the flow of data and information and
improve productivity. The technical improvements we provide equip our customers to make the best business decisions possible.
Helping our clients unleash the power of their data is our focus.
MapR on Azure: Getting Value from Big Data in the Cloud
Nearly 50 million Office
Online users
Office for iOS has been
downloaded over 80M times
Analyst
reports
Platform Services | Infrastructure Services
Web Apps, Mobile Apps, API Apps, Notification Hubs
Hybrid Cloud: Backup, StorSimple, Azure Site Recovery, Import/Export
SQL Database, DocumentDB, Redis Cache, Azure Search, Storage Tables, SQL Data Warehouse
Azure AD Health Monitoring, AD Privileged Identity Management, Operational Analytics
Cloud Services, Batch, RemoteApp, Service Fabric
Visual Studio, Application Insights, VS Team Services
Domain Services
HDInsight, Machine Learning, Stream Analytics, Data Factory, Event Hubs, Data Lake Analytics Service, IoT Hub, Data Catalog
Security & Management: Azure Active Directory, Multi-Factor Authentication, Automation, Portal, Key Vault, Store/Marketplace, VM Image Gallery & VM Depot
Azure AD B2C, Scheduler, Xamarin, HockeyApp, Power BI Embedded, SQL Server Stretch Database, Mobile Engagement, Functions
Cognitive Services, Bot Framework, Cortana
Security Center, Container Service, VM Scale Sets, Data Lake Store
BizTalk Services, Service Bus, Logic Apps, API Management
Content Delivery Network, Media Services, Media Analytics
Architecture of MapR on Azure
MapR Converged Data Platform
Demo
Windows, NFS, Map/Reduce, Hive, Drill, Excel/Power BI
Leading optical retail chain
Digital transformation for better customer experience
Deliver self-service insights across the business
OBJECTIVES
• Modernize analytics & improve speed of marketing campaigns
• Reduce cost of existing systems
CHALLENGES
• Existing technologies prohibiting effective & timely reporting and analysis
• Very long time to extract value from the data leading to lots of Excel
SOLUTION
• MapR platform on the Azure cloud to modernize their infrastructure and
sunset legacy systems.
• Faster exploration of data with Apache Drill mitigating need for schema
development.
• Support for use cases such as customer 360, supply chain & image
analysis
LARGE COMMERCIAL REAL ESTATE MANAGEMENT COMPANY
New Analytical Insights to Real Estate Tenants
Optimize tenants’ experience and drive additional revenue
OBJECTIVES
• Identify maintenance and utilization trends and enable cost savings via
predictive maintenance
• Modernize data infrastructure with a new analytics platform
CHALLENGES
• M&A activity resulting in hundreds of siloed databases
• Inability to handle new data types such as IoT sensor data and provide
new insights
SOLUTION
• MapR on the Azure cloud helps analyze more data types for faster
insights
• Analysts query and search work orders to identify maintenance and
utilization trends, enabling cost savings.
• Optimization of tenants’ experience to capture additional rental revenue.
MapR Customers in All Major Industries
• FINANCIAL SERVICES
• RETAIL & CPG
• SECURITY
• ONLINE SERVICES & SOFTWARE
• MEDIA & ENTERTAINMENT
• MANUFACTURING, UTILITIES, OIL & GAS
• ADVERTISING
• HEALTH
• COMMUNICATIONS
• GOVERNMENT
United
Healthcare
Azure and MapR Resources – 3 steps to get started
• Azure Overview
https://www.mapr.com/partners/partner/microsoft-azure-microsofts-cloud-computing-platform-moving-faster-achieving-more
• 7 Steps to Deploy the MapR Sandbox on Azure
https://www.mapr.com/blog/7-steps-deploy-mapr-sandbox-microsoft-azure
• Azure Test Drive
http://mapr.testdrivelabs.com/ (subject to change)
Q&A
@mapr
@mapr.com
Engage with us!
maprtech
mapr-technologies
https://www.mapr.com/get-started-with-mapr
https://www.mapr.com/training
https://www.mapr.com/ebooks/big-data-all-stars/
