This talk will show a new method for password cracking called UNHash. As a tool, UNHash uses rulefiles that sit somewhere between a DSL (domain-specific language) and a Python script to describe the password cracking process. The talk will show how to mix web service abuse, knowledge of human nature and data mining to enable far better attacks against passwords. We will focus on a few features: cracking default passwords on network systems with minimal effort, testing for embedded backdoors, and offline attacks built on data mining and modeling about 33 million user accounts to gain insight into how users choose their passwords and how we can use that knowledge to speed up password cracking, for a 20% gain on non-pseudorandom passwords.
UNHash rulefiles let us describe complex password cracking strategies that combine dictionaries, rules, bruteforcing, joining, combining and other patterns in a language that is easy for humans to read and easy to extend. To avoid reinventing the wheel, UNHash generates candidate passwords for John the Ripper, hashcat, or any other tool that can read candidates from stdin.
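As a rough illustration of the "generate candidates, let another tool do the hashing" idea, here is a minimal Python sketch. It does not use the UNHash rulefile syntax, which is not shown in this abstract; the word list, the suffixes and the function names are all assumptions made up for the example.

```python
import itertools
import sys

# Hypothetical rule: <word, lowercase or capitalised> + <numeric suffix>.
# The word list and suffixes are illustrative, not taken from UNHash.
WORDS = ["password", "dragon", "monkey"]
SUFFIXES = ["1", "123", "2014"]


def case_variants(word):
    """Yield the word as-is and with its first letter capitalised."""
    yield word
    yield word.capitalize()


def generate_candidates(words, suffixes):
    """Yield every case variant of every word joined with every suffix."""
    for word, suffix in itertools.product(words, suffixes):
        for variant in case_variants(word):
            yield variant + suffix


if __name__ == "__main__":
    # One candidate per line on stdout, so a cracker can read them from stdin.
    for candidate in generate_candidates(WORDS, SUFFIXES):
        sys.stdout.write(candidate + "\n")
```

Saved under a hypothetical name such as candidates.py, the output could be piped into a cracker, for example `python candidates.py | john --stdin hashes.txt`, where hashes.txt stands for whatever hash file is being attacked.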
"Slow" hashes like bcrypt and scrypt force us to try fewer candidate passwords, but to target them in more detail. The concept behind UNHash is to enable such attacks against modern slow hashes, and more generally to enable better targeting while being faster and easier than traditional methods.
To make use of the new "language", we need a set of rules. To generate rulesets, we will show a new machine learning algorithm that can analyze plaintext passwords and generate rules for UNHash. The algorithm is built around a classifier network heuristic, which we call the sieve algorithm, that classifies passwords and shows how users construct them. Training the classifier on more than 30 million unique passwords can yield interesting rules that describe how users pick their passwords.
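To give a feel for what "classifying passwords" can mean, the sketch below reduces each password to a coarse structure (word, digits, symbols) and counts how often each structure occurs. This is a deliberately simple stand-in and not the sieve algorithm itself, whose details are not given here; the class labels and function names are assumptions.

```python
import re
from collections import Counter

# Ordered (label, regex) pairs; each regex matches a run of one character class.
# Together the three classes cover every possible character.
TOKEN_CLASSES = [
    ("word", re.compile(r"[A-Za-z]+")),
    ("digits", re.compile(r"[0-9]+")),
    ("symbols", re.compile(r"[^A-Za-z0-9]+")),
]


def classify(password):
    """Return the password's structure as a tuple of class labels."""
    labels = []
    pos = 0
    while pos < len(password):
        for label, pattern in TOKEN_CLASSES:
            match = pattern.match(password, pos)
            if match:
                labels.append(label)
                pos = match.end()
                break
    return tuple(labels)


def rank_patterns(passwords):
    """Count how often each structure occurs, most common first."""
    return Counter(classify(p) for p in passwords).most_common()


if __name__ == "__main__":
    sample = ["dragon123", "monkey1", "Summer2014!", "123456"]
    for pattern, count in rank_patterns(sample):
        print(count, "+".join(pattern))
```

On a large corpus, the most frequent structures would be natural candidates to turn into cracking rules, which is the general idea the paragraph above describes.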
Since we have already classified the passwords, why not use that effort to collect all password elements, such as words (and the languages they belong to), strings, numbers and mutations, so we can use them as a cornerstone for a new set of dictionaries? Because we want to identify words and their languages, we needed to create a linguistic dictionary for use in the classifier algorithm. We will show how to create custom dictionaries for various languages or for a specific domain by parsing Wikipedia database backups or by abusing really popular web services.
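A minimal sketch of the dictionary-building step might look like the following. It assumes the text has already been extracted from a dump into plain lines, and it does not parse the Wikipedia backup format itself; the function name and the thresholds are hypothetical.

```python
import re
from collections import Counter

# Runs of Unicode letters (word characters that are neither digits nor "_").
WORD_RE = re.compile(r"[^\W\d_]+")


def build_dictionary(lines, min_count=2, min_length=3):
    """Return lowercase words seen at least min_count times, most frequent first."""
    counts = Counter()
    for line in lines:
        for word in WORD_RE.findall(line.lower()):
            if len(word) >= min_length:
                counts[word] += 1
    return [word for word, count in counts.most_common() if count >= min_count]


if __name__ == "__main__":
    # Stand-in text; in practice the lines would come from a language-specific corpus.
    corpus = [
        "Zagreb is the capital of Croatia.",
        "Zagreb lies on the Sava river.",
        "The Sava flows into the Danube.",
    ]
    for word in build_dictionary(corpus):
        print(word)
```

Running the same function over corpora in different languages would give one frequency-ordered word list per language, which is one simple way to tell which language a password element belongs to.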
A small portion of the talk will show why it is useful to scrape password dumps, or to obtain passwords via low-interaction honeypots, in order to collect known backdoor passwords.
We will skip the science and get to the practical part: how you can use UNHash for better password cracking, and how to implement more classifiers so we can have better models of how users create their passwords.