With the growth of data traffic and data volumetric analysis needs, "Big Data" has become one of the most popular fields in IT and many companies are currently working on this topic, by deploying Hadoop clusters, which is the current most popular Big Data framework. As every new domain in computer science, Hadoop comes (by default) with truly no security. During the past years we dug into Hadoop and tried to understand Hadoop infrastructure and security.
This talks aims to present in a simple way Hadoop security issues or rather its "concepts", as well as to show the multiples vectors to attack a cluster. By vectors we mean practical vectors or to sum it up: how can you access the holy "datalake" after plugging your laptop onto the target network.
Moreover, you will learn how Hadoop (in)security model was designed explaining the different security mechanisms implemented in core Hadoop services. You will also discover tools, techniques and procedures we created and consolidated to make your way to the so-called "new black gold": data. Through different examples, you will be enlightened on how these tools and methods can be easily used to get access to data, but also to get a remote system access on cluster members.
Eventually and as Hadoop is the gathering of several services and projects, you will apprehend that patch management in this field is often complicated and known vulnerabilities often stay actionable for a while.