The ELK Stack
ElasticSearch, LogStash, and Kibana
What is ElasticSearch
● Schema-less
● Distributed
● REST-ful, Document-oriented, and speaks
JSON
● For searching and analytics
● and more...
Architecture
● Built on top of Apache Lucene
● Runs on the JVM
● Distributed in nature - cluster can have
data, master or load balancing nodes
● Highly available and fault tolerant
Legend: Shard, Index, Node (an instance of the ElasticSearch process)
ElasticSearch - document-oriented and schema-less
Movies example…
● index a document (PUT and POST)
● check for existence of a document
● retrieve fields
● delete
● update
● update with optimistic concurrency
● update partial
● upsert
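The operations above can be sketched as plain REST calls. This is a minimal sketch, not the demo from the talk: the index name `movies`, the type name `movie`, the sample document, and the `request` helper are all assumptions made for illustration, and the paths follow the typed document API of the releases current when this deck was written. Nothing is sent over the network; each call is only described as a (method, URL, body) tuple.

```python
import json

BASE = "http://localhost:9200"   # assumption: a local node on the default port
INDEX, TYPE = "movies", "movie"  # hypothetical index and type names


def request(method, path, body=None, **params):
    """Describe a REST call as (method, url, json_body) without sending it."""
    query = "&".join(f"{key}={value}" for key, value in sorted(params.items()))
    url = f"{BASE}/{path}" + (f"?{query}" if query else "")
    return method, url, json.dumps(body) if body is not None else None


doc = {"title": "The Godfather", "year": 1972}  # sample document
doc_path = f"{INDEX}/{TYPE}/1"

calls = [
    request("PUT", doc_path, doc),                      # index with a chosen id
    request("POST", f"{INDEX}/{TYPE}", doc),            # index, id generated for us
    request("HEAD", doc_path),                          # check for existence
    request("GET", doc_path, _source="title"),          # retrieve selected fields
    request("PUT", doc_path, doc, version=1),           # optimistic concurrency
    request("POST", f"{doc_path}/_update",
            {"doc": {"rating": 9}}),                    # partial update
    request("POST", f"{doc_path}/_update",
            {"doc": {"rating": 9}, "doc_as_upsert": True}),  # upsert
    request("DELETE", doc_path),                        # delete
]

for method, url, body in calls:
    print(method, url, body or "")
```

The `version` parameter is what makes the update optimistic: the write is rejected if the stored document has moved past the version the client last saw.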
ElasticSearch - Distributed
● Start a node - open Marvel
o Data is allocated within the node
● Start another node
o Highlight data being redistributed to the new node
● Discovery mechanisms - multicast vs.
unicast
● Master election, sharding
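A toy model of the sharding and redistribution steps above, under stated assumptions. A document is routed to a shard by hashing its routing value (its id by default) modulo the number of primary shards; the hash here is `zlib.crc32`, a stand-in and not the function ElasticSearch uses. The round-robin `allocate` helper is likewise a simplification of the real allocator, and the node names are hypothetical.

```python
import zlib


def shard_for(doc_id: str, num_primary_shards: int) -> int:
    """Pick a shard as hash(routing) % number_of_primary_shards.

    zlib.crc32 is a stand-in hash chosen for this sketch only.
    """
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards


def allocate(num_shards: int, nodes: list) -> dict:
    """Spread shards over nodes round-robin (toy model of rebalancing)."""
    layout = {node: [] for node in nodes}
    for shard in range(num_shards):
        layout[nodes[shard % len(nodes)]].append(shard)
    return layout


# One node holds every shard; starting a second node moves some across.
print(allocate(5, ["node-1"]))
# → {'node-1': [0, 1, 2, 3, 4]}
print(allocate(5, ["node-1", "node-2"]))
# → {'node-1': [0, 2, 4], 'node-2': [1, 3]}
print(shard_for("1", 5))
```

Because the shard is derived from the number of primary shards, that number is fixed when an index is created; changing it would send existing ids to different shards.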
ElasticSearch - RESTful
● Get index stats - number of shards (partitions of data), replicas, state and
size
● Get cluster health - overall health status, number of shards and nodes
● Get cluster state - settings and mappings of all indices, info on all
shards in all indices, and some metrics
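The three calls above map onto three endpoints. A minimal sketch, assuming a node on `localhost:9200`; the `endpoint`, `fetch`, and `summarize_health` helpers are names invented here, and the sample health values are illustrative rather than real output. `fetch` needs a running node, so it is defined but not called.

```python
import json
from urllib.request import urlopen

BASE = "http://localhost:9200"  # assumption: a local node on the default port


def endpoint(kind, index=None):
    """Build the URL for one of the three calls listed above."""
    paths = {
        "index_stats": f"{index}/_stats",
        "cluster_health": "_cluster/health",
        "cluster_state": "_cluster/state",
    }
    return f"{BASE}/{paths[kind]}"


def fetch(url):
    """GET a URL and decode the JSON reply (requires a running node)."""
    with urlopen(url) as reply:
        return json.load(reply)


def summarize_health(health):
    """Condense a cluster health reply into one line."""
    return "{status}: {number_of_nodes} node(s), {active_shards} active shard(s)".format(**health)


# Illustrative values only, shaped like a cluster health reply.
sample = {"status": "green", "number_of_nodes": 2, "active_shards": 10}

print(endpoint("index_stats", "movies"))
print(endpoint("cluster_health"))
print(endpoint("cluster_state"))
print(summarize_health(sample))
```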
ElasticSearch - Concepts
● Index - the top-level container for documents, backed by physical storage (one or more shards)
● Type
Relational DB ⇒ Databases ⇒ Tables ⇒ Rows ⇒ Columns
Elasticsearch ⇒ Indices ⇒ Types ⇒ Documents ⇒ Fields
● Mapping - the definition of a type (think schema) and how ElasticSearch should
analyze, parse and store the fields of this type
● Analysis:
o first, tokenizing a block of text into individual terms suitable for use in an inverted index,
o then normalizing these terms into a standard form to improve their “searchability” or recall.
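The two analysis steps can be sketched in a few lines. This is a simplified model, not ElasticSearch's analyzer chain: tokenizing is a regular expression over word characters, normalizing is lowercasing only (no stemming), and the `edge_ngrams` helper is a hypothetical illustration of what an edge n-gram filter produces.

```python
import re


def tokenize(text):
    """Step 1: split a block of text into individual terms."""
    return re.findall(r"\w+", text)


def normalize(tokens):
    """Step 2: bring terms into a standard form (here: lowercase only)."""
    return [token.lower() for token in tokens]


def analyze(text):
    """Tokenize, then normalize."""
    return normalize(tokenize(text))


def edge_ngrams(term, min_len, max_len):
    """Prefixes of a term, as an edge n-gram filter would emit them."""
    return [term[:n] for n in range(min_len, min(max_len, len(term)) + 1)]


print(analyze("Quick, the fox, was lazy."))
# → ['quick', 'the', 'fox', 'was', 'lazy']
print(edge_ngrams("quick", 1, 3))
# → ['q', 'qu', 'qui']
```

Lowercasing is what lets a search for "quick" match a document that contains "Quick"; edge n-grams are a common basis for search-as-you-type.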
ElasticSearch - search and analytics
● Search
o Structured search - working with exact values, between date ranges,
numbers, enumerated strings, etc...
o Full-text search - natural language and other text; relevance is usually
the concern here rather than exact matches
ElasticSearch - search in depth
● Analyzes all documents and keeps an
inverted index data structure for fast
matching
Inverted index example
Document: “The quick brown fox jumped over the lazy dog.”
Term | Doc 1 |
--------------------------
The | x |
quick | x |
brown | x |
fox | x |
jumped | x |
over | x |
the | x |
lazy | x |
dog | x |
Inverted index example
Document: “The quick brown fox jumped over the lazy dog.”
Document: “Quick, the fox, was lazy.”
Term | Doc 1 | Doc 2 |
--------------------------
The | x | |
quick | x | |
brown | x | |
fox | x | x |
jumped | x | |
over | x | |
the | x | x |
lazy | x | x |
dog | x | |
Quick | | x |
was | | x |
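The table above can be rebuilt with a small sketch. It is a simplified in-memory model, not how Lucene stores its index: a dictionary maps each term to the set of document ids containing it. The `build_index` helper and the two analyzers are names made up for this example. Without normalization, "The"/"the" and "Quick"/"quick" stay separate terms, exactly as in the table; lowercasing merges them.

```python
import re
from collections import defaultdict


def build_index(docs, analyzer):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyzer(text):
            index[term].add(doc_id)
    return index


def split_only(text):
    """Tokenize without normalizing (matches the table above)."""
    return re.findall(r"\w+", text)


def split_and_lowercase(text):
    """Tokenize, then lowercase each term."""
    return [word.lower() for word in split_only(text)]


docs = {
    1: "The quick brown fox jumped over the lazy dog.",
    2: "Quick, the fox, was lazy.",
}

raw = build_index(docs, split_only)
lowered = build_index(docs, split_and_lowercase)

print(sorted(raw["quick"]), sorted(raw["Quick"]))
# → [1] [2]
print(sorted(lowered["quick"]))
# → [1, 2]
```

A search is then a dictionary lookup per query term followed by set operations on the results, which is why matching is fast.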
ElasticSearch - Query DSL
● Simple search
● Compound search
● Query vs. Filters
● Range filter
● Aggregations
● Significant terms (‘the uncommonly
common’)
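A hedged sketch of what query and filter bodies look like as JSON. The field names (`description`, `ydline`, `off`) are hypothetical and chosen only to fit the NFL examples that follow. The `filtered` wrapper reflects the Query DSL of the releases current when this deck was written; later releases replaced it with a `filter` clause inside `bool`. The bodies are only built and printed, not sent.

```python
import json


def filtered_search(text, low, high):
    """Full-text query (scored) narrowed by a range filter (yes/no, unscored)."""
    return {
        "query": {
            "filtered": {
                "query": {"match": {"description": text}},
                "filter": {"range": {"ydline": {"gte": low, "lte": high}}},
            }
        }
    }


def bool_search(team, text, excluded):
    """Compound search: every 'must' clause matches, no 'must_not' clause does."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"term": {"off": team}},
                    {"match": {"description": text}},
                ],
                "must_not": [{"match": {"description": excluded}}],
            }
        }
    }


print(json.dumps(filtered_search("pass deep", 1, 20), indent=2))
print(json.dumps(bool_search("IND", "pass", "incomplete"), indent=2))
```

The split matters for performance as well as meaning: queries compute a relevance score per document, while filters only answer yes or no and can be cached.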
ElasticSearch - search examples
● NFL data - fuzzy description, more like this
● NFL data - bool query
● NFL data - all IND offense
● NFL data - aggregations - average down and distance, 2nd half yard to go
ElasticSearch - search examples
● NFL 2013 data - get touchdowns by quarter
● NFL 2013 data - get significant terms in description by teams
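The aggregation examples could look roughly like the request bodies below. This is a sketch under assumptions: the field names (`description`, `qtr`, `off`, `togo`) are hypothetical stand-ins for whatever the NFL data set actually uses, and the aggregation names (`by_quarter`, `teams`, `notable_terms`, `avg_to_go`) are arbitrary labels.

```python
import json


def touchdowns_by_quarter():
    """Count plays mentioning a touchdown, bucketed by quarter."""
    return {
        "size": 0,  # only the aggregation is wanted, not the hits
        "query": {"match": {"description": "touchdown"}},
        "aggs": {"by_quarter": {"terms": {"field": "qtr"}}},
    }


def significant_terms_by_team():
    """Per offensive team, the terms unusually common in its play descriptions."""
    return {
        "size": 0,
        "aggs": {
            "teams": {
                "terms": {"field": "off"},
                "aggs": {
                    "notable_terms": {
                        "significant_terms": {"field": "description"}
                    }
                },
            }
        },
    }


def average_yards_to_go():
    """A single-value metric aggregation over a numeric field."""
    return {"size": 0, "aggs": {"avg_to_go": {"avg": {"field": "togo"}}}}


for body in (touchdowns_by_quarter(), significant_terms_by_team(), average_yards_to_go()):
    print(json.dumps(body, indent=2))
```

Nesting `significant_terms` under a `terms` bucket is what produces a separate result per team: each bucket is compared against the index as a whole.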
ElasticSearch - Demo Charting App
NFL Viz: https://github.com/mradamlacey/nfl-viz
What is LogStash
● Data import/export tool for time series and
log data
● Design inspired by Unix utilities which pipe
in/out to each other
What problem does it solve?
● How to parse and analyze log data from
many sources?
Mar 12 12:00:08 server2 rcd[308]: Loaded 12 packages in 'ximian-red-carpet|351'
(0.01878 seconds)
[2014-05-06 08:04:00.333] [ERROR] - core - bad thing happened
[Wed Oct 11 14:32:52 2000] [error] [client 127.0.0.1] client denied by server
configuration: /export/home/live/ap/htdocs/test
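To make the problem concrete, here is a sketch that turns the three sample lines above into structured events. LogStash does this with its own pattern filters; plain Python regular expressions are swapped in here as a stand-in. The format labels (`syslog`, `app`, `apache_error`) and the field names are choices made for this example.

```python
import re

# One pattern per log format seen in the samples above.
PATTERNS = {
    "syslog": re.compile(
        r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) (?P<host>\S+) "
        r"(?P<program>[^\[\s]+)\[(?P<pid>\d+)\]: (?P<message>.*)$"
    ),
    "app": re.compile(
        r"^\[(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\] "
        r"\[(?P<level>\w+)\] - (?P<component>\S+) - (?P<message>.*)$"
    ),
    "apache_error": re.compile(
        r"^\[(?P<timestamp>\w{3} \w{3} \d{1,2} \d{2}:\d{2}:\d{2} \d{4})\] "
        r"\[(?P<level>\w+)\] \[client (?P<client>[\d.]+)\] (?P<message>.*)$"
    ),
}


def parse(line):
    """Return a dict of named fields, or None if no pattern matches."""
    for name, pattern in PATTERNS.items():
        match = pattern.match(line)
        if match:
            return dict(match.groupdict(), format=name)
    return None


lines = [
    "Mar 12 12:00:08 server2 rcd[308]: Loaded 12 packages in "
    "'ximian-red-carpet|351' (0.01878 seconds)",
    "[2014-05-06 08:04:00.333] [ERROR] - core - bad thing happened",
    "[Wed Oct 11 14:32:52 2000] [error] [client 127.0.0.1] client denied by "
    "server configuration: /export/home/live/ap/htdocs/test",
]

for line in lines:
    print(parse(line))
```

Each source has its own timestamp style and field layout, so every new format needs its own pattern. That per-format work is what a shared parsing pipeline centralizes.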
LogStash - example use cases
● Import of JSON to ES by dropping files into a folder
● Parse webserver access files across multiple servers,
calculate response times and chart
● Parse application logs and send emails when an error
occurs
● Stream application log data across many servers to a
single log dashboard
● Drop a file into a folder to be ingested and aggregated
into centralized log database
LogStash - example configuration
● input
● filter
● output
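The three stages chain together like a Unix pipe. The sketch below models that shape in Python; it is not LogStash configuration syntax, which is a language of its own. The stage functions, the `level` rule, and the list used as a sink are all assumptions made for illustration.

```python
import json


def input_lines(lines):
    """Input stage: wrap each raw line in an event."""
    for line in lines:
        yield {"message": line.rstrip("\n")}


def filter_level(events):
    """Filter stage: enrich each event with a derived field."""
    for event in events:
        event["level"] = "ERROR" if "[ERROR]" in event["message"] else "INFO"
        yield event


def output_json(events, sink):
    """Output stage: serialize each event and hand it to a sink."""
    for event in events:
        sink.append(json.dumps(event, sort_keys=True))


sink = []  # stands in for ElasticSearch, a CSV file, email, etc.
raw = ["[ERROR] - core - bad thing happened\n", "started\n"]

# input | filter | output
output_json(filter_level(input_lines(raw)), sink)
print("\n".join(sink))
```

Because the stages are generators, events flow through one at a time, so the same shape works for a file dropped into a folder or for an endless stream.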
LogStash - demo
● output CPU load to CSV (load-avg.conf)
● Stream tweets to ElasticSearch
(twitter.conf)
● Parse NodeJS server logs to ElasticSearch
(mp.conf)
What is Kibana
● Dashboard tool for data in ElasticSearch
● Highly configurable/customizable, build
panels with user defined charts, tables,
etc...
● Built on AngularJS
Kibana - demos
● NFL stats dashboard
● Tweets dashboard
● ElasticSearch Marvel
https://github.com/mradamlacey/elk-stack-presentation
Sources
ElasticSearch - The Definitive Guide:
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/
LMAO If you don’t LogStash: http://tech.paulcz.net/ACUG-Logstash
Quick ELK Demos: https://github.com/kurtado/quick-elk

The ELK Stack - Launch and Learn presentation


Editor's Notes

  • #1 resource: - https://speakerdeck.com/elasticsearch/introduction-to-elasticsearch-logstash-and-kibana - http://www.elasticsearch.org/blog/significant-terms-aggregation/ - http://www.elasticsearch.org/videos/using-elasticsearch-logstash-kibana-techologies-centralized-viewing-logs-bloomberg/
  • #6 - every field in a document is indexed in elasticsearch by default (inverted index created for fast lookup)
  • #9 analysis - things like lowercasing, stemming, etc.; tokenizing - could use n-gram and edge n-gram filters
  • #10-#17 http://www.elasticsearch.org/blog/significant-terms-aggregation/
  • #18-#22 http://tech.paulcz.net/ACUG-Logstash/
  • #27-#28 Putin feeding a baby elk