The document discusses Hadoop security today and tomorrow. It describes the four pillars of Hadoop security as authentication, authorization, accountability, and data protection. It outlines the current security capabilities in Hadoop like Kerberos authentication and access controls, and future plans to improve security, such as encryption of data at rest and in motion. It also discusses the Apache Knox gateway for perimeter security and provides a demo of using Knox to submit a MapReduce job.
#19 Background: a Hortonworks-led initiative. Useful for connecting to Hadoop from outside the cluster, and when more client-language flexibility is required (i.e., when the Java binding is not an option). Not intended for RPC calls. Call it a REST API gateway for Hadoop. Don't call it a firewall: firewalls operate at the network layer. Don't call it perimeter security: perimeter security is getting discredited as an incomplete security solution.
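To illustrate the "REST API gateway" idea from a client's point of view, here is a minimal sketch in Python that builds (but does not send) a WebHDFS request routed through a Knox gateway. The host name, port, topology name ("sandbox"), credentials, and the helper name knox_webhdfs_request are assumptions made for illustration. The URL layout follows the commonly documented Knox pattern of gateway path, then topology, then service path; check it against the deployment's own configuration.

```python
import base64
from urllib.parse import quote, urlencode
from urllib.request import Request


def knox_webhdfs_request(gateway, topology, hdfs_path, op, user, password):
    """Build (but do not send) a WebHDFS REST request that goes through a gateway.

    Hypothetical helper: `gateway` is the externally visible base URL and
    `topology` is the cluster name configured on the gateway (both assumed).
    """
    query = urlencode({"op": op})
    url = f"{gateway}/gateway/{topology}/webhdfs/v1{quote(hdfs_path)}?{query}"
    # HTTP Basic credentials; assumes the gateway is configured to accept them.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return Request(url, headers={"Authorization": f"Basic {token}"})


req = knox_webhdfs_request(
    "https://knox.example.com:8443",  # assumed gateway address
    "sandbox",                        # assumed topology name
    "/tmp",
    "LISTSTATUS",
    "guest",
    "guest-password",
)
print(req.full_url)
```

Because the call is plain HTTPS, the same request could be issued from any language with an HTTP client, which is the flexibility the slide refers to. The client only ever sees the gateway address, never a NameNode or DataNode host.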
#21 Note: the arrows to the Hadoop cluster are simplifications. In practice there will be multiple arrows, one per port open between Knox and each Hadoop service it supports (WebHDFS, WebHCat, HiveServer2, HBase, Oozie), with more to come in the future.
#22 Functions as an HTTP reverse proxy. Rewrites URLs to protect the internal network topology. The Knox Gateway embeds a Jetty container. Reads and writes HTTP.
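The URL rewriting performed by a reverse proxy can be sketched as follows. This is not Knox's actual implementation (Knox drives rewriting from its own configured rules); it is a minimal Python illustration of the idea. The gateway address, the internal host names and ports, and the function names to_internal and to_external are all assumptions.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed externally visible gateway base URL (including topology name).
GATEWAY = "https://knox.example.com:8443/gateway/sandbox"

# Assumed mapping from service name to its internal base URL.
# One entry per service, i.e. one open port between gateway and cluster.
INTERNAL = {
    "webhdfs": "http://namenode.internal:50070/webhdfs",
    "templeton": "http://webhcat.internal:50111/templeton",
}


def to_internal(external_url):
    """Map an incoming gateway URL to the internal service URL."""
    parts = urlsplit(external_url)
    prefix = urlsplit(GATEWAY).path + "/"
    if not parts.path.startswith(prefix):
        raise ValueError("not a gateway URL")
    service, _, rest = parts.path[len(prefix):].partition("/")
    if service not in INTERNAL:
        raise ValueError(f"unknown service: {service}")
    base = urlsplit(INTERNAL[service])
    path = f"{base.path}/{rest}" if rest else base.path
    return urlunsplit((base.scheme, base.netloc, path, parts.query, ""))


def to_external(internal_url):
    """Rewrite an internal URL (for example one found in a redirect
    Location header) so the client never sees an internal host name."""
    for service, base in INTERNAL.items():
        if internal_url.startswith(base):
            return f"{GATEWAY}/{service}{internal_url[len(base):]}"
    raise ValueError("URL does not belong to a known internal service")


inbound = f"{GATEWAY}/webhdfs/v1/tmp?op=LISTSTATUS"
print(to_internal(inbound))
print(to_external(to_internal(inbound)))
```

Rewriting in both directions matters: requests are mapped inward to the real service, and any internal addresses in responses are mapped back outward, so the internal topology stays hidden from clients.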