Exploring and Using the Python
Ecosystem
Adam J. Cook, Chair of SME Chapter 112
FOR PUBLIC RELEASE
About the Presenter
▪ Adam Cook
▪ B.S. in Mechanical Engineering from Purdue
University West Lafayette.
▪ Chief Technical Officer of Alliedstrand in
Hammond, Indiana.
▪ Chair of SME Chapter 112 (Northwest Indiana
and South Chicago).
▪ Embedded systems engineering, custom
automation systems, industrial software.
▪ Lives in Chicago.
▪ Contact me at adam.j.cook@alliedstrand.com.
Chapter “Digital Initiative”
▪ Overviews of Digital Engineering and Manufacturing Topics
▪ Applied Programming Workshops and Webinars
▪ Chapter Hackathons
▪ Chapter Office Hours (In-person and online)
▪ Slack #python channel
Slides and code:
http://bit.ly/2uzCQqR
What is Python?
▪ High-level programming language.
▪ Free and open-source.
▪ Cross-platform.
▪ Extensive standard library.
▪ Designed to be highly readable, explicit and productive.
▪ Proven to be quite versatile (and popular).
Why use Python in Manufacturing?
▪ Python is fast becoming one of the most popular languages in data analytics and machine
learning. Coincidentally, manufacturing processes are producing more valuable data than ever!
Source:
https://www.ibm.com/developerworks/community/blogs/jfp/entry/What_Language_Is_Best_For_Machine_Learning_And_Data_Science?lang=en
Today’s Agenda
▪ Look at PyPI (the Python Package Index) and pip, the tool that installs packages from it, and how they can help you.
▪ Brief overview of the Anaconda Python distribution and why you
might want to use it (for data science, you should really just use
it).
▪ Super high-level overview of data science/analytics. This is
important. Data can be tricky and deceptive.
▪ Small recap of where we are in these Python webinars.
Caveats and Warnings
▪ This event assumes you are a novice. If you have prior experience, keep in mind that we will
be simplifying quite a bit.
▪ Programming and data analytics are challenging – the following
presentation will not make you into an expert. Practice and read code.
▪ For data analytics and machine learning applications, in particular,
knowing Python is not enough.
▪ We are starting to get advanced now. Application architecture patterns are
difficult. Data problems are very deep and a very active area of research.
The industry is extremely fluid. Do not try to memorize everything!
▪ We are going to talk today at a high level. Let us know if you want to
break things down into separate webinars.
▪ Think about what kind of actual applications you want to build and let
us know. After a couple of projects, things will start clicking together.
Demonstration
Let’s take a look at pip!
(we will use the code from http://bit.ly/2w62Sk4)
> pip install <package name>
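Once a package has been installed with pip, the result can be checked from Python itself. The following is a minimal sketch, assuming Python 3.8 or newer (for `importlib.metadata`). The helper name `installed_version` is made up for this illustration, and "requests" is only an example package name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "requests" is only an example name; substitute any package you installed.
    for name in ("pip", "requests"):
        found = installed_version(name)
        print(f"{name}: {found if found else 'not installed'}")
```

If a name prints "not installed", running `pip install <package name>` and then re-running the script should show its version.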
Other Resources
There is another great resource called Awesome
Python.
Python vs. Anaconda
Python
Anaconda
(think of it as “Python Plus”)
Contains the following out-of-the-box:
▪ SciPy
▪ Jupyter
▪ Other Continuum tools
SciPy, NumPy, Matplotlib, Jupyter…and bears, oh my!
▪ SciPy – both a scientific computing library and the name of the wider
ecosystem that includes NumPy, Matplotlib and SymPy.
▪ NumPy – provides sophisticated N-dimensional array handling
▪ Matplotlib – provides powerful 2D plotting functionality for data and
result visualizations
▪ SymPy – provides symbolic mathematics functionality (computer
algebra system)
▪ Jupyter – interactive, web browser-based “notebook” which
allows you to share Python code, run experiments and capture
results.
Demonstration
Let’s take a look at Jupyter!
(we will use the code from http://bit.ly/2wlz56p)
Word Soup
Data Science (baseline)
▪ Calculus
▪ Statistics
▪ SQL
▪ Unstructured data
▪ Machine learning
Data Analytics (applied)
▪ Python
▪ PostgreSQL
▪ Algorithm design
▪ Data visualization
▪ Data wrangling
Big Data Analytics (applied)
▪ Java/C#/C++/JavaScript
▪ Hadoop
▪ Computational parallelism (Python is not good here)
▪ MapReduce
▪ Distributed systems
Examples:
▪ Digital twin
▪ Autonomous vehicles
▪ Large mfg. operation (> 5 TB data sets)
Big Data is hard (really hard)! Make sure you need it!
SQL
Structured Query Language
Relational database (RDBMS)
For example, PostgreSQL
Query and manage data
This is a query:
SELECT *
FROM Machines
WHERE oee < 0.90
ORDER BY machine_id;
Results (relational data):
Machine_ID  oee
----------  ----
Haas1       0.75
Mazak1      0.88
Okuma4      0.80
Okuma7      0.74
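The query on this slide can be tried from Python without installing a database server. This is a minimal sketch that uses the standard-library `sqlite3` module with an in-memory database as a stand-in for PostgreSQL (an assumption made for convenience). The `Haas9` row is hypothetical and is only there so the `WHERE` clause has something to exclude.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Machines (machine_id TEXT PRIMARY KEY, oee REAL)")
conn.executemany(
    "INSERT INTO Machines (machine_id, oee) VALUES (?, ?)",
    [
        ("Okuma7", 0.74),
        ("Haas1", 0.75),
        ("Mazak1", 0.88),
        ("Okuma4", 0.80),
        ("Haas9", 0.95),  # hypothetical row, above the 0.90 threshold
    ],
)

# Same query as on the slide, with the threshold passed as a parameter.
rows = conn.execute(
    "SELECT * FROM Machines WHERE oee < ? ORDER BY machine_id",
    (0.90,),
).fetchall()

for machine_id, oee in rows:
    print(f"{machine_id:<10} {oee:.2f}")

conn.close()
```

Passing the threshold as a `?` parameter, rather than pasting it into the SQL string, is the usual way to keep values and query text separate.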
Python “hiding” SQL
Python
Data model:
>>> class Machine:
...     machine_id = "Haas2"
...     oee = 0.74
Object-relational mapper (ORM)
Examples: SQLAlchemy, Django ORM
This is a query (written with a Django-style field lookup):
>>> machines = Machine.objects.filter(oee__lt=0.90)
ORM Layer
Query and manage data
SQL
Database Layer (blue box from the previous slide)
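To make the idea of an ORM layer concrete, here is a toy sketch of Python "hiding" SQL behind ordinary objects and methods. It is not SQLAlchemy or the Django ORM. The `MachineTable` class and its `oee_below` method are invented for this illustration, and SQLite (from the standard library) stands in for the database layer.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Machine:
    """Data model: one row of the Machines table as a Python object."""
    machine_id: str
    oee: float


class MachineTable:
    """Toy mapper: callers use Python methods, the SQL stays in here."""

    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS Machines "
            "(machine_id TEXT PRIMARY KEY, oee REAL)"
        )

    def add(self, machine):
        self.conn.execute(
            "INSERT INTO Machines (machine_id, oee) VALUES (?, ?)",
            (machine.machine_id, machine.oee),
        )

    def oee_below(self, threshold):
        cursor = self.conn.execute(
            "SELECT machine_id, oee FROM Machines "
            "WHERE oee < ? ORDER BY machine_id",
            (threshold,),
        )
        # Turn each database row back into a Machine object.
        return [Machine(*row) for row in cursor]


table = MachineTable(sqlite3.connect(":memory:"))
table.add(Machine("Haas2", 0.74))
table.add(Machine("Haas9", 0.95))  # hypothetical machine above the threshold
machines = table.oee_below(0.90)
print(machines)
```

The calling code at the bottom never mentions SQL. A real ORM generalizes this so that you do not write a method per query.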
Python “hiding” everything
Blue box from the previous slide
Custom Python application or Jupyter
(has maybe a nice user interface)
Python
Key word: Abstractions!
MapReduce
Source: http://datascienceguide.github.io/
Hadoop
Hadoop consists of two (2)
parts:
1. Hadoop Distributed File
System (HDFS)
2. Processing Part
(MapReduce)
Source: http://ubm.io/2vipYqj
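The MapReduce processing model can be sketched in plain Python to show the shape of the idea. This is only a single-process illustration: in Hadoop, the map and reduce steps run in parallel across many machines on data stored in HDFS. The records and function names below are hypothetical.

```python
from collections import defaultdict

# Hypothetical log records: (machine_id, parts_produced)
records = [
    ("Haas1", 12),
    ("Mazak1", 7),
    ("Haas1", 5),
    ("Okuma4", 9),
    ("Mazak1", 3),
]


def map_phase(record):
    """Map: emit a (key, value) pair for one input record."""
    machine_id, parts = record
    return (machine_id, parts)


def shuffle_phase(pairs):
    """Shuffle: group all values that share a key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return (key, sum(values))


mapped = [map_phase(r) for r in records]
grouped = shuffle_phase(mapped)
totals = dict(reduce_phase(k, v) for k, v in grouped.items())
print(totals)
```

Because each record is mapped independently and each key is reduced independently, both phases can be spread over many workers, which is what makes the model attractive for very large data sets.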
Big Data and Python
Hadoop infrastructure
(but this generally is more complex
architecturally and built with languages other
than Python)
Custom Python application
(has maybe a nice user interface)
Python
Big Data
If you are not sure, then you do not need Big Data.
(just use PostgreSQL)
Big Picture
What does this all have to do with Python?
Data Sanity
▪ Data can (and it will, at times) lie to you.
▪ Think about data delivery – particularly if it is arriving from
human sources.
▪ Data anomalies will occur. How do you address them?
▪ Are you collecting the right data and, more importantly, enough
relevant data?
▪ Be careful of biases (e.g., confirmation bias). Be scientific!
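One simple way to address anomalies is to set aside implausible values for review rather than silently averaging them in. The sketch below assumes OEE is recorded as a ratio between 0.0 and 1.0. The function name `split_valid_oee` and the sample readings are hypothetical.

```python
def split_valid_oee(readings):
    """Separate plausible OEE readings from ones that need a closer look.

    OEE is treated as a ratio, so values outside 0.0-1.0 (or non-numeric
    entries) are set aside instead of being dropped or averaged in.
    """
    valid, suspect = [], []
    for machine_id, value in readings:
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if is_number and 0.0 <= value <= 1.0:
            valid.append((machine_id, value))
        else:
            suspect.append((machine_id, value))
    return valid, suspect


# Hypothetical readings, including a typo (88 instead of 0.88) and a blank entry.
readings = [("Haas1", 0.75), ("Mazak1", 88), ("Okuma4", 0.80), ("Okuma7", None)]
valid, suspect = split_valid_oee(readings)
print("valid:  ", valid)
print("suspect:", suspect)
```

Keeping the suspect entries, instead of deleting them, leaves a trail for finding out where the bad data came from.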
Resources
Books
▪ Raschka, S. (2015). Python machine learning: unlock deeper insights into machine learning with this vital
guide to cutting-edge predictive analytics. Birmingham (U.K.): Packt Publishing.
▪ VanderPlas, J. (2017). Python data science handbook: Essential tools for working with data. Sebastopol,
CA: O'Reilly.
▪ Klein, P. N. (2013). Coding the matrix: linear algebra through applications to computer science. Newton,
MA: Newtonian Press.
Videos
▪ Sarah Guido - Hands-on Data Analysis with Python - PyCon 2015
▪ Jake VanderPlas - Machine Learning with Scikit-Learn (I) - PyCon 2015
▪ Olivier Grisel - Machine Learning with Scikit-Learn (II) - PyCon 2015
Online Course
http://bit.ly/2danP4n
(Applied Data Science with Python
Specialization – University of Michigan)
Next Webinar
Machine Learning with scikit-learn
(mostly)
Where can I get this slide deck and code?
http://bit.ly/2uzCQqR
(actually, go ahead and bookmark this
link – this web page will be updated
constantly with new content)
Python-ish Feedback Received So Far
Regular Expressions
Computational Geometry
Data Analytics
IIoT
Machine Vision
Deep Learning
Machine Learning
Embedded Systems
Robotics
Big Data
Linear Algebra
Statistics
CAE
Cloud Computing
Siemens NX Python API
Data Visualization
Realtime (Streaming) Data
M2M
Deeper Look at Machine-to-Machine (M2M)
M2M
▪ MTConnect – CNC machines, general manufacturing equipment
▪ MQTT – IoT, realtime sensor networks, many-to-many
▪ OPC-UA – general manufacturing equipment
Another Example of a Basic Learning Path
Python
Data Analytics
Statistics
Visualizations
Linear Algebra
Calculus
Call for Feedback
Please provide us with feedback!
Want to Keep the Conversation Going?
We have a Slack channel! Send me an
invite request at my e-mail address.
(we are working on an automatic invite
link)
Thank you!
Adam J. Cook
Chief Technical Officer of Alliedstrand
Chair of SME Chapter 112
adam.j.cook@alliedstrand.com
https://linkedin.com/in/adam-j-cook
https://github.com/adamjcook
SME
www.sme.org
https://facebook.com/SMEmfg
https://twitter.com/SME_MFG
https://linkedin.com/company/sme
SME Chapter 112
Serving Northwest Indiana and Chicagoland
https://facebook.com/sme112
https://linkedin.com/company/sme112
https://github.com/sme112
Thanks for attending!
Special thanks to our hosting partner – GreenCow Coworking. Check them out at
greencow.space!
Suggestions? Feedback? Comments? Complaints? Contact us below!
