0% found this document useful (0 votes)

37 views16 pages

Password Cracking Using Probabilistic Context-Free

The paper presents a novel method for password cracking using probabilistic context-free grammars, which generates password structures based on previously disclosed passwords. This approach significantly improves the effectiveness of cracking attempts, achieving 28% to 129% higher success rates compared to traditional methods like John the Ripper. The authors emphasize the importance of selecting appropriate word-mangling rules to optimize the number of guesses required in password recovery efforts.

Uploaded by

Hâm Minh

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

37 views16 pages

Password Cracking Using Probabilistic Context-Free

Uploaded by

Hâm Minh

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 16

See discussions, stats, and author profiles for this publication at: https://www.researchgate.

net/publication/220713709

Password Cracking Using Probabilistic Context-Free Grammars

Conference Paper · May 2009

DOI: 10.1109/SP.2009.8 · Source: DBLP

CITATIONS READS
524 11,567

4 authors, including:

Breno de Medeiros
Google Inc.
49 PUBLICATIONS 2,959 CITATIONS

SEE PROFILE

All content following this page was uploaded by Breno de Medeiros on 22 May 2014.

The user has requested enhancement of the downloaded file.

2009 30th IEEE Symposium on Security and Privacy

Password Cracking Using Probabilistic Context-Free Grammars

Matt Weir, Sudhir Aggarwal, Breno de Medeiros, Bill Glodek
Department of Computer Science,
Florida State University, Tallahassee, Florida 32306, USA
weir@cs.fsu.edu, sudhir@cs.fsu.edu, breno.demedeiros@gmail.com, wjglodek@gmail.com

Abstract — Choosing the most effective word-mangling rules To estimate the risk of password-guessing attacks, it has
to use when performing a dictionary-based password cracking been proposed that administrators pro-actively attempt to
attack can be a difficult task. In this paper we discuss a new crack passwords in their systems [4]. Clearly, the accuracy
method that generates password structures in highest of such estimates depends on being able to approximate the
probability order. We first automatically create a probabilistic
context-free grammar based upon a training set of previously
most efficient tools available to adversaries. Therefore, it is
disclosed passwords. This grammar then allows us to generate an established practice among security researchers to
word-mangling rules, and from them, password guesses to be investigate and communicate advances in password-
used in password cracking. We will also show that this breaking: If the most efficient attack is indeed publicly
approach seems to provide a more effective way to crack known, then at least legitimate system operators will not
passwords as compared to traditional methods by testing our underestimate the risk of password compromise. Moreover,
tools and techniques on real password sets. In one series of password breaking mechanisms may also be used for data
experiments, training on a set of disclosed passwords, our recovery purposes. This often becomes necessary when
approach was able to crack 28% to 129% more passwords than
important data is stored in encrypted form under a password-
John the Ripper, a publicly available standard password
cracking program.
wrapped key and the password is forgotten or otherwise
unavailable. In this paper we describe novel advancements
Index Terms — Computer security, Data security, Computer in password-breaking attacks.
crime Some improvements in password retrieval are achieved by
increasing the speed with which the attacker can make
1. INTRODUCTION guesses, often by utilizing specialty hardware or distributed
computing [5, 6]. While increasing the speed at which you
Human-memorable passwords remain a common form of can make guesses is important, our focus is to try and reduce
access control to data and computational resources. This is the number of guesses required to crack a password, and
largely driven by the fact that human memorable passwords thus to optimize the time to find a password given whatever
do not require additional hardware, be it smartcards, key resources are available.
fobs, or storage to hold private/public key pairs. Our approach is probabilistic, and incorporates available
Trends that increase password resilience, in particular information about the probability distribution of user
against off-line attacks, include current or proposed passwords. This information is used to generate password
password hashes that involve salting or similar techniques patterns (which we call structures) in order of decreasing
[1]. Additionally, users are often made to comply with probability. These structures can be either password guesses
stringent password creation policies. While user education themselves or, effectively, word-mangling templates that can
efforts can improve the chances that users will choose safer later be filled in using dictionary words. As far as we are
and more memorable passwords [2], systems that allow aware, our work is the first that utilizes large lists of actual
users to choose their own passwords are typically vulnerable passwords as training data to automatically derive these
to space-reduction attacks that can break passwords structures.
considerably more easily than through a brute-force attack We use probabilistic context-free grammars to model the
(for a survey, see [3]). derivation of these structures from a training set of
Manuscript received November 10, 2008. passwords. In one series of experiments, we first trained our
This work was supported in part by the U.S. National Institute of Justice password cracker on a training set of disclosed passwords.
under Grant 2006-DN-BX-K007. We then tested our approach on a different test set of
M. Weir is a PhD student in the Computer Science Department at Florida
State University; weir@cs.fsu.edu.
disclosed passwords and compared our results with John the
S. Aggarwal is Professor of Computer Science at Florida State University; Ripper [11], a publicly available password cracking
sudhir@cs.fsu.edu. program. Using several different dictionaries, and allowing
B. de Medeiros works for Google and is a courtesy professor of Computer the same number of guesses, our approach was able to crack
Science at Florida State University; breno.demedeiros@gmail.com.
B. Glodek graduated with an M.S. degree in Computer Science from
28% to 129% more passwords than John the Ripper. Other
Florida State University; wjglodek@gmail.com. experiments also showed similar results.

1081-6011/09 $25.00 © 2009 IEEE 391

DOI 10.1109/SP.2009.8

Authorized licensed use limited to: Florida State University. Downloaded on August 26, 2009 at 13:27 from IEEE Xplore. Restrictions apply.
By training our attacks on known passwords, this approach success while limiting the number of guesses required per
also provides us a great deal of flexibility in tailoring our dictionary word.
attacks since we automatically generate probability-valued Choosing the right word-mangling rules is crucial as the
structures from training data. For instance, we can train our application of each rule results in a large number of guesses.
password cracker on known Finnish passwords if our target This is especially true when the rules are used in
is a native Finnish speaker. combination. For example, adding a two-digit number to the
end of a dictionary word for a dictionary size of 800,000
words [9] would result in 80,000,000 guesses. Changing the
2. BACKGROUND AND PREVIOUS WORK
first letter to be both uppercase and lowercase would double
In off-line password recovery, the attacker typically this figure. Furthermore, in a typical password retrieval
possesses only a hash of the original password. To crack it, attempt it is necessary to try many different mangling rules.
the attacker makes a guess as to the value of the original The crucial question then becomes, which word-mangling
password. The attacker then hashes that guess using the rules should one try and in which order?
appropriate password-hashing algorithm and compares the Narayanan and Shmatikov use Markov models to generate
two hashes. If the two hashes match, the attacker has probable passwords that are phonetically similar to words
discovered the original password, or in the case of a poor and that thus may be candidates for guesses [10]. They
password hashing algorithm, they at least have a password further couple the Markov model with a finite state
that will grant them access. automaton to reduce the search space and eliminate low
The two most commonly used methods to make these probability strings. The goal of their work, however, is to
guesses are brute-force and dictionary attacks. With brute- support rainbow-based pre-computation (and, subsequently
force, the attacker attempts to try all possible password very fast hash inversion) by quickly finding passwords from
combinations. While this attack is guaranteed to recover the dictionaries that only include linguistically likely passwords.
password if the attacker manages to brute-force the entire They thus do not consider standard dictionary attacks.
password space, it often is not feasible due to time and Our approach can be viewed as an improvement to the
equipment constraints. If no salting is used, brute-force standard dictionary-based attack by using existing corpuses
attacks can be dramatically improved through the use of pre- of leaked passwords to automatically derive word-mangling
computation and powerful time-memory trade-off techniques rules and then using these rules and the corpus to further
[7, 8]. derive password guesses in probability order. We are also
The second main technique is a dictionary attack. The able to derive more complex word-mangling rules without
dictionary itself may be a collection of word lists that are being overwhelmed by large dictionaries due to the
believed to be common sources for users to choose assignments of probabilities to the structures.
mnemonic passwords [9]. However, users rarely use
unmodified elements from such lists (for instance, because
3. PROBABILISTIC PASSWORD CRACKING
password creation policies prevent it), and instead modify
the words in such a way that they can still recall them easily. Our starting assumption is that not all guesses have the
In a dictionary attack, the attacker tries to reproduce this same probability of cracking a password. For example, the
frequent approach to password choice, processing words guess “password12” may be more probable than the guess
from an input dictionary and systematically producing “P@$$W0rd!23” depending on the password creation policy
variants through the application of pre-selected mangling and user creativity. Our goal is thus to generate guesses in
rules. For example, a word-mangling rule that adds the decreasing order of probability to maximize the likelihood
number “9” at the end of a dictionary word would create the of cracking the target passwords within a limited number of
guess, “password9”, from the dictionary word “password”. guesses.
For a dictionary attack to be successful, it requires the The question then becomes, “How should we calculate
original word to be in the attacker’s input dictionary, and for these probabilities?” To this end, we have been analyzing
the attacker to use the correct word-mangling rule. While a disclosed password lists. These lists contain real user
dictionary based attack is often faster than brute-force on passwords that were accidentally disclosed to the public.
average, attackers are still limited by the amount of word- Even though these passwords are publicly available, we
mangling rules they can take advantage of due to time realize they contain personal information and thus treat them
constraints. Such constraints become more acute as the sizes as confidential.
of the input dictionaries grow. In this case, it becomes For our experiments we needed to divide the password
important to select rules that provide a high degree of lists up into two parts, a training corpus and a test corpus. If
a password appears in our training corpus, we will not use it

392

Authorized licensed use limited to: Florida State University. Downloaded on August 26, 2009 at 13:27 from IEEE Xplore. Restrictions apply.
in the test corpus. In the case of password lists that were For example, the base structure S1L8D3 might have
disclosed in plain-text format, (i.e. prior to any hashing), we occurred with probability 0.1 in the training set. We decided
can choose to use the passwords in either the training or the to use the base structure directly in our grammars rather than
test corpuses. If a list of password hashes was instead the simple structure since the derivation of the base structure
disclosed, we used the entire list in the test corpus. This is from the simple structure was unlikely to be context-free.
because we have to crack the password hashes before we can The second type of information that we obtained from the
know what the plain text words were that created them. By training set was the probability of digit strings and of special
separating the training and test corpuses we can then strings appearing in the training set. To see an example of
compare the effectiveness of our probabilistic password this please refer to Table 3.1.3 and Table 3.1.4.
cracking with other publicly available password cracking
attacks, notably John the Ripper [11], by comparing their TABLE 3.1.3
results on the test sets. Probabilities of one-digit numbers
1 Digit Number of Occurrences Percentage of Total
3.1 PREPROCESSING 1 12788 50.7803
2 2789 11.0749
In the preprocessing phase, we measure the frequencies of
3 2094 8.32308
certain patterns associated to the password strings. First we
define some terminology that is used in the rest of the paper. 4 1708 6.78235
Let an alpha string be a sequence of alphabet symbols. 7 1245 4.94381
Also let a digit string be a sequence of digits, and a special 5 1039 4.1258
string be a sequence of non-alpha and non-digit symbols. 0 1009 4.00667
When parsing the training set, we denote alpha strings as L, 6 899 3.56987
digit strings as D, and special strings as S. For example the 8 898 3.5659
password “$password123” would define the simple structure 9 712 2.8273
SLD. The base structure is defined similarly but also
captures the length of the observed substrings. In the
example this would be S1L8D3. See Table 3.1.1 and Table TABLE 3.1.4
3.1.2. Note that the character set for alpha strings can be Probabilities of top 10 two-digit numbers
language dependent and that we currently do not make a 2 Digits Number of Occurrences Percentage of Total
distinction between upper case and lower case. 12 1084 5.99425
The first preprocessing step is to automatically derive all 13 771 4.26344
the observed base structures of all the passwords in the 11 747 4.13072
training set and their associated probabilities of occurrence. 69 734 4.05884
06 595 3.2902
TABLE 3.1.1 22 567 3.13537
Listing of different string types 21 538 2.97501
Data Type Symbols Examples 23 533 3.94736
Alpha String abcdefghijklmnopqrstuvwxyzäö cat 14 481 2.65981
Digit String 0123456789 432 10 467 2.58239
Special String !@#$%^&*()-_=+[]{};’:”,./<>? !!

We choose to calculate the probabilities only for digit

TABLE 3.1.2 strings and special strings since we knew that the corpus of
Listing of different grammar structures words, (aka alpha strings), that users may use in password
Structure Example generation was much larger than what we could accurately
Simple SLD learn from the training set. Note that the calculation of the
digit string and special string probabilities is gathered
Base S1L8D3 independently from the base structures in which they appear.
Pre-Terminal $L8123 Please also note that all the information that we capture of
Terminal (Guess) $wordpass123
both types is done automatically from an input file of
training passwords, using a program that we developed.

393

Authorized licensed use limited to: Florida State University. Downloaded on August 26, 2009 at 13:27 from IEEE Xplore. Restrictions apply.
Referring to Table 3.1.2 again, the pre-terminal structure with associated probability of 0.0975. The idea is that pre-
fills in specific values for the D and S parts of the base terminal structures define mangling rules that can be directly
structure. Finally, the terminal structure fills in a specific set used in a distributed password cracking trial. For example, a
of alphabet letters for the L parts of the pre-terminal control server could compute the pre-terminal structures in
structure. Deriving these structures is discussed next. order of decreasing probability and pass them to a
distributed system to fill in the dictionary words and hash
the guesses. The ability to distribute the work is a major
3.2 USING PROBABILISTIC GRAMMARS
requirement if the proposed method is to be competitive
Context-free grammars have long been used in the study with existing alternatives. Note that we only need to store
of natural languages [12, 13, 14], where they are used to the probabilistic context-free grammar and that we can
generate (or parse) strings with particular structures. We derive the pre-terminal structures as needed. Furthermore,
show in the following that the same approach is useful in the note that fairly complex base structures might occur in the
automatic generation of password guesses that resemble training data and would eventually be used in guesses, but
human-created passwords. the number of base structures is unlikely to be
A context-free grammar is a defined as G = (V, Σ, S, P), overwhelming.
where: V is a finite set of variables (or non-terminals), Σ is a
finite set of terminals, S is the start variable, and P is a TABLE 3.2.1
finite set of productions of the form (1): Example probabilistic context-free grammar
α→β (1) LHS RHS Probability
where α is a single variable and β is a string consisting of S → D1L3 S2D1 0.75
variables or terminals. The language of the grammar is the S → L3D1S1 0.25
set of strings consisting of all terminals derivable from the D1 → 4 0.60
start symbol. D1 → 5 0.20
Probabilistic context-free grammars simply have D1 → 6 0.20
probabilities associated with each production such that for a S1 → ! 0.65
specific left-hand side (LHS) variable all the associated
S1 → % 0.30
productions add up to 1. From our training set, we first
derive a set of productions that generate the base structures S1 → # 0.05
and another set of productions that derive terminals S2 → $$ 0.70
consisting of digits and special characters. In our grammars, S2 → ** 0.30
in addition to the start symbol, we only use variables of the
form Ln,Dn, and Sn, for specified values of n. We call these The order in which pre-terminal structures are derived is
variables alpha variables, digit variables and special discussed in Section 3.3. Given a pre-terminal structure, a
variables respectively. Note that rewriting of alpha variables dictionary is used to derive a terminal structure which is the
is done using an input dictionary similar to that used in a password guess. Thus if you had a dictionary that contained
traditional dictionary attack. {cat, hat, stuff, monkey} the previous pre-terminal structure
A string derived from the start symbol is called a L34! would generate the following two guesses (the terminal
sentential form (it may contain variables and terminals). The structures), {cat4!, hat4!}, since those are the only dictionary
probability of a sentential form is simply the product of the words of length three.
probabilities of the productions used in its derivation. In our There are many approaches that could be followed when
production rules, we do not have any rules that rewrite alpha substituting the dictionary words in the pre-terminal
variables; thus we can “maximally” derive sentential forms structures. Note that each pre-terminal structure has an
and their probabilities that consist of terminal digits, special associated probability.
characters and alpha variables. These sentential forms are One approach to generating the terminal structures is to
the pre-terminal structures. simply fill in all relevant dictionary words for the highest
In our preprocessing phase, we automatically derive a probability pre-terminal structure, and then choose the next
probabilistic context-free grammar from the training set. An highest pre-terminal structure, etc. This approach does not
example of such a grammar is shown in Table 3.2.1. Given further assign probabilities to the dictionary words. The
this grammar, we can furthermore derive, for example, the naturalness of considering this approach is that we are
pre-terminal structure: leaning only lengths of alpha strings but not specific
S → L3D1S1 → L34S1 → L34! (2) replacements from the training set. This approach thus
always uses pre-terminal structures in highest probability

394

Authorized licensed use limited to: Florida State University. Downloaded on August 26, 2009 at 13:27 from IEEE Xplore. Restrictions apply.
order regardless of the input dictionary used. We call this the starting pre-terminal structures includes the probabilities
approach pre-terminal probability order. of the alpha variables. In Section 3.4, we outline the proof
Another approach is to assign probabilities to alpha of correctness of the algorithm.
strings in various ways. Without more information on the First note that it is trivial to generate the most probable
likelihood of individual words, the most obvious technique guess. One simply replaces all the base structures with their
is to assign the alpha strings a probability based on how highest probability terminals and then selects the pre-
many words of that length appear in the dictionary. If there terminal structure with the highest probability. Note that for
are 10 words of length 3, then the probabilities of each of terminal probability order, the alpha strings in the base
those words would be 0.10. We call this approach terminal structure are also assigned a probability. For example, using
probability order. Note that in this case each terminal the data in Table 3.2.1, the highest probable pre-terminal
structure (password guess) has a well-defined probability. structure would be 4L3$$4. Since there are only 1589 base
The probability however is based in part on the input structures generated by our largest training set, this is not
dictionary which was not learned during the training phase. difficult. However, a more structured approach is needed to
We also considered other approaches for assigning generate guesses of a rank other than the first.
probabilities to alpha strings. For instance it is possible to To optimize the total running time of the algorithm, it is
assign probabilities to words in the dictionary based on other useful if it can operate in an online mode, i.e. it calculates
criteria such as observed use, frequency of appearance in the the current best pre-terminal structure and outputs it to the
language, or knowledge about the target. underlying (also distributable) password cracker. On the
An approach related to pre-terminal probability order is to other hand, also for performance reasons, at any particular
use the probability of the pre-terminals to sample a pre- stage the algorithm should only calculate those pre-terminal
terminal structure and then fill in appropriate dictionary structures that might be the current most probable structure
words for the alpha strings. Notice that in this latter case, remaining, taking into consideration the last output value.
we would not use a pre-terminal necessarily in highest Referring to Fig. 3.3.1, we would like to generate the pre-
probability order, but the frequency of generating terminals terminal structures L35! and L34% (nodes 7 and 6) only after
over time would match this pre-terminal probability. We L34! (node 2) has been generated.
call this approach pre-terminal sampled order.
In this paper, we will only consider results using pre-
terminal probability order and terminal probability order.
We remark that the terminal order uses the joint probability
determined by treating the probabilities of pre-terminal
structures and of the dictionary words that are substituted in
as independent.
It should be noted that we use probabilistic context-free
grammars for modeling convenience only; since our
production rules derived from the training set do not have
any recursion, they could also be viewed as regular
grammars. In fact, this allows us to develop an efficient
algorithm to find an indexing function for the pre-terminal
structures, as discussed in the next section. The grammars
that we currently automatically generate are unambiguous
context-free grammars.
Fig. 3.3.1. Generating the “Next” Pre-terminal Structures for the
3.3 EFFICIENTLY GENERATING A “NEXT” FUNCTION Base Structures in Table 3.2.1 (partial tree shown).
In this section we consider the problem of generating
guesses in order of decreasing (or equal) probability and One approach that is simple to describe and implement is
describe the algorithm. For pre-terminal probability order, to output all possible pre-terminal structures, evaluate the
this means in decreasing order of the pre-terminal structures. probability of each, and then sort the result. Unfortunately
For terminal probability order, this is the probability of the this pre-computation step is not parallelizable with the
terminal structures. However, the “next” function algorithm password cracking step that follows (i.e., it is not an online
is the same in both cases except that for the terminal algorithm).
probability order, the initial assignment of probabilities to Originally when we were still trying to see if using
probabilistic grammars was worth further investigation, we

395

Authorized licensed use limited to: Florida State University. Downloaded on August 26, 2009 at 13:27 from IEEE Xplore. Restrictions apply.
created a proof of concept program that took this approach. value of the variable that was substituted. See Fig. 3.3.1 and
Unfortunately in addition to the problems described above, Table 3.3.3 to see the result after popping the top queue
it also resulted in over a hundred gigabytes of data that we entry. Also see Appendix 1.
had to generate and then sort before we could make our first TABLE 3.3.3
password guess. As you can imagine, this does not lend itself Priority queue after the first entry was popped
to a real world application.
Base Struct Pre-Terminal Probability Pivot Value
Our actual solution adopts as its main data structure a
L3D1S1 L34! 0.097 0
standard priority queue, where the top entry contains the
most probable pre-terminal structure. In the following, we D1L3S2D1 4L3**4 0.081 2
denote by the index of a variable in a base structure to mean D1L3S2D1 5L3$$4 0.063 0
the position in which the variable appears. For example, in D1L3S2D1 4L3$$5 0.063 3
the base structure L3D1S1 the variable L3 would be assigned
an index of 0, D1 an index of 1, and S1 an index of 2. Next, In this instance, since the popped pivot value was 0, all
we order all terminal values, (such as the numbers 4, and 5 index variables could be substituted. L3 was not incremented
for D1) in priority order for their respective class. That way since there were no values to fill in for it, as the alpha strings
we can quickly find the next most probable terminal value. are handled by the password cracker in a later stage. Both of
The structure of entries in the priority queue can be seen the D1 structures and S2 were replaced, resulting in three new
in Table 3.3.2. They contain a base structure, a pre-terminal pre-terminal structures being inserted into the queue with
structure, and a pivot value. This pivot value is checked pivot values of 0, 2 and 3. Notice that when the priority
when a pre-terminal structure is popped from the priority queue entry corresponding to the 2rd row of Table 3.3.3 is
queue. The pivot value helps determine which new pre- popped, it will not cause a new entry to be inserted into the
terminal structures may be inserted into the priority queue priority queue for its first D1 or its S2 structure. This is
next. The goal of using pivot values is to ensure that all because 4L3**4’s pivot value is equal to 2, which means that
possible pre-terminal structures corresponding to a base it cannot replace the first D1 structure with an index value of
structure are put into the priority queue without duplication. 0. As for the S2 structure, since ‘**’ is the least probable
More precisely, the pivot value indicates that the pre- terminal variable, there is no next-highest replacement rule
terminal structures to be next created from the original base and this entry will simply be consumed.
structure are to be obtained by replacing variables with an Observe that the algorithm is guaranteed to terminate
index value equal to or greater than the popped pivot value. because it processes existing entries by removing them and
Let’s look at an example based on the data in Table 3.2.1. replacing them with new ones that either (a) have a higher
Initially all the highest probability pre-terminals from every value for the pivot or (b) replace the base structure variable
base structure will be inserted into the priority queue with a in the position indicated by the pivot by a terminal that has
pivot value of 0. See Figure 3.3.1 and Table 3.3.2. lower probability than the current terminal in that position. It
can moreover be easily ascertained that the pre-terminal
TABLE 3.3.2 structures in the popped entries are assigned non-increasing
Initial Priority Queue probabilities and therefore the algorithm can output these
Base Struct Pre-Terminal Probability Pivot Value structures for immediate use as a mangling rule for the
D1L3S2D1 4L3$$4 0.188 0 underlying distributed password cracker.
L3D1S1 L34! 0.097 0 This process continues until no new pre-terminal
structures remain in the priority queue, or the password has
been cracked. Note that we do not have to store pre-
Next, the top entry in the priority queue will be popped. terminal structures once they are popped from the queue,
The pivot value will be consulted, and child pre-terminal which has the effect of limiting the size of the data structures
structures will be inserted as part of new entries for the used by the algorithm. In section 4.5, we discuss the space
priority queue. These pre-terminal structures are generated complexity of our algorithm in detail in the context of our
by substituting variables in the popped base structure by experimental results.
values with next-highest probability. Note that only one The running time for the current implementation of our
variable is replaced to create each new candidate entry. next algorithm for generating guesses is extremely
Moreover, this replacement is performed (as described competitive with existing password cracking techniques. On
above) for each variable with index equal to or greater than one of our lab computers, (MaxOSX 2.2GHz Intel Core 2
the popped pivot value. The new pivot value assigned to Duo) it took on average 33 seconds to generate 37781538
each inserted pre-terminal structure is equal to the index unhashed guesses using our method. Comparatively, the

396

Authorized licensed use limited to: Florida State University. Downloaded on August 26, 2009 at 13:27 from IEEE Xplore. Restrictions apply.
popular password cracking tool John the Ripper [11] Consequently, when z was output, it did not violate this
operating in wordlist mode took 28 seconds to make the property, and since:
same number of guesses. If we expand the number of
guesses to 300 million, our technique took on average 3 Prob(z) >= Prob(x) > Prob(y),
minutes and 23 seconds to complete, while John the Ripper
operating in incremental (brute-force) mode took 2 minutes it follows that z must have been output (and processed)
and 55 seconds. Note that the vast majority of time (often before y. That means that x was inserted in the priority queue
weeks) taken in cracking passwords is spent in generating prior to y's removal, again in violation of the priority queue
the hashes from those guesses and not in the generation of property. This final contradiction concludes the proof.
the actual guesses themselves. Because of this, even an extra Note that by meeting the following conditions we can
minute or two spent generating guesses would be minor, and fully prove the required correctness of the next function:
thus the running times of these two methods are essentially • No duplicate pre-terminal structures are entered into the
identical. priority queue.
• All possible pre-terminal structures resulting from base
3.4 PROOF OF CORRECTNESS OF THE NEXT FUNCTION structures are eventually entered into the priority queue.

Property P1: pre-terminal structures are output in non- Due to space requirements we do not include a proof of
increasing probability order. these conditions but it follows from our use of the pivot
values.
Proof that P1 holds:
1. Remember that the priority queue is initialized with one
entry per base structure, and that the entry contains the 4. EXPERIMENTS AND RESULTS
pre-terminal structure with maximum probability for 4.1 DESCRIPTION OF PASSWORD LISTS
that base structure. These entries can be easily
constructed by simply replacing the highest likelihood For the research in this paper we obtained three password
terminal values for all the non-alpha variables in each lists to try different cracking techniques against. All three
base structure. lists represent real user passwords, which were compromised
2. Remember that the processing of an entry in the priority by hackers and subsequently publicly disclosed on the
queue results in its removal and output, and (possibly) Internet. As stated before, we realize that while publicly
in the insertion of new entries. For convenience of available, these lists contain private data; therefore we treat
description, we call these new entries “the children” and all password lists as confidential. If you wish a copy of the
the removed entry “the parent”. Recall that children list please contact the authors directly. Due to the moral and
never contain pre-terminal structures of strictly higher legal issues with distributing real user information, we will
probability than the pre-terminal structure contained in only provide the lists to legitimate researchers who agree to
the parent. abide by accepted ethical standards.
The first list, hereafter referred to as the “MySpace List”,
For the sake of contradiction, assume that P1 does not was originally published in October 2006. The passwords
hold, i.e., that at some step of processing, an entry x is were compromised by an attacker who created a fake
output of strictly higher probability than a previously output MySpace login page and then performed a standard phishing
entry y. That is: attack against the users. The attacker did not secure the
server they were collecting passwords upon which allowed
Prob(x) > Prob(y) and y is removed and output before x. independent security researchers to obtain copies of the
passwords. One of these researchers, (not affiliated with any
First let's argue that x had a parent entry z. Indeed, if x has university), subsequently posted his copy of the list on the
no parent, then it was inserted in the priority queue during Full-Disclosure mailing list [15]. While multiple versions of
the algorithm initialization (when the highest probability the MySpace list exist, owing to the fact that different
pre-terminal structure for each base structure was inserted). researchers downloaded the list at different times, we choose
But that means that x was in the priority queue at the step to use the version posted on Full-Disclosure which contained
where y was output, in violation of the priority queue 67042 plain text passwords. Please note that not all of these
property. This contradiction implies that x had a parent z. passwords represent actual user passwords. This is because
Without loss of generality, we can also assume that x is some users recognized that it was a phishing attack and
the first value produced by the algorithm that violates P1. entered fake, (and often vulgar), data. For our test we did not

397

Authorized licensed use limited to: Florida State University. Downloaded on August 26, 2009 at 13:27 from IEEE Xplore. Restrictions apply.
attempt to purge these fake passwords due to the difficulties check for an exact match, and record how many guesses
in distinguishing between fake and real passwords. were necessary before the first match could be found. For
The second list will be referred to as the SilentWhisper password lists that are hashed, such as the Finnish list, we
list. This list contains 7480 plain text passwords and was piped the guesses generated by our program into the popular
originally from the website www.silentwhisper.net. A hacker password cracking program John the Ripper [11].
compromised the site via SQL injection, and due to a feud Essentially this allows us to use our program’s word-
with the site owner, subsequently posted the list to bittorrent. mangling rules without having to code our own hash
As a special note, these later passwords were extremely evaluator.
basic. Only 3.28% of the passwords contained both letters As a comparison against our probabilistic password
and numbers, and only 1.20% of them contained both letters cracking technique, we decided to use John the Ripper’s
and special characters. A grand total of two of the passwords default word-mangling rules. These word-mangling rules
contained letters, numbers and special characters. We are as close to an industry standard as we could find, and
included this list though as it does represent the passwords represent the approach most people would take when
many users choose. performing a dictionary-based password cracking attack. At
The final list will be referred to as the “Finnish List”. This its core, both our probabilistic password cracking guess
list was obtained by a hacker group via SQL injection generator and John the Ripper operating in wordlist mode
attacks and the results were subsequently posted on the are dictionary based attacks. When comparing the two
internet [16]. This list actually contains the passwords from methods, we ensure both programs use the same input
many different sites that were compromised; most of them dictionaries when trying to crack a given password set. In a
based in Finland, hence the name. This list contains 15699 dictionary-based attack, the number of guesses generated is
passwords in plain text and an additional 22733 unique finite, and determined by the size of the dictionary and the
MD5 password hashes. It is important to note that the plain type of word-mangling rules used. To reflect this, unless
text passwords and the hashed passwords represent different otherwise specified, we limited the number of guesses our
user bases as they came from separate compromised sites. In probabilistic password generator was allowed to create
fact, it appears that each set, (both the MD5 and plaintext based on the number of guesses generated by the default
lists), are composed of several lists from distinct websites John the Ripper rule set. This is because our program can
that were broken into. generate many more rules than what is included in the
default John the Ripper configuration and thus would create
more guesses given the chance. By ensuring both methods
4.2 EXPERIMENT SETUP AND PROCEDURES
are only allowed the same number of guesses, we feel we
In the current implementation of our probabilistic can fairly compare the two approaches.
password cracking guess generator, (written in C), our To use our method, we have to train our password cracker
program is trained on an existing password list. Then once it on real passwords. To do this, we needed to separate our
is given an input dictionary it can generate password guesses password lists into training lists and test lists. As a special
based on either the pre-terminal probability or the terminal note, if a password was used to train our method we made
probability of the password structures. It is important to note sure we did not include it in any of our test lists. We created
that the training need only be done once to generate the two different training lists to train our probabilistic password
grammar that will be used. This means that any group can cracker. The first list was created from the MySpace
create many different targeted grammars and then distribute password list. We divided the MySpace password list into
them to the end users of the password cracking program. two parts, a training list and a test list. The MySpace training
The end user would use input dictionaries of their choosing list contained a total of 33561 passwords. For the second
to crack passwords. Note that the storage requirement of a training list, we used all of the plaintext passwords from the
grammar is likely to be significantly less than the storage Finnish list. This contained a total of 15699 passwords. We
requirements of a typical input dictionary. Section 4.5 used all the Finnish plaintext passwords since we used the
discusses space requirements in greater detail. This Finnish hashed passwords for the test set. We did not create
distinction between training and operation, and the small a training list from the SilentWhisper set due to its small size
size of the base grammar generated means that our method is and the fact that we would need to have passwords left over
highly portable. to test against.
Our program currently outputs these guesses to stdout. We then designated all passwords not in the training set as
This gives us the flexibility to use our guesses as input to the test set. These passwords are never trained against, and
various other password cracking programs. For instance, to are used solely to gauge the effectiveness of both our
test against a test set of plaintext passwords, we can simply probabilistic password cracker and John the Ripper on real

398

Authorized licensed use limited to: Florida State University. Downloaded on August 26, 2009 at 13:27 from IEEE Xplore. Restrictions apply.
world passwords. Just as in standard machine learning
research, our goal is by keeping these two groups, (training
and testing), separate so we can avoid overtraining our
method and provide a more accurate estimation of its
potential. In summary, the three test lists we used were the
MySpace test list containing 33481 plaintext passwords, the
SilentWhisper list which contained 7480 plaintext passwords
and the Finnish test list which contained 22733 unique MD5
password hashes.
One final note; a password was considered ‘cracked’ if
the program generated a guess that matched the password in
the test list.

4.3 DESCRIPTION OF INPUT DICTIONARIES Fig. 4.3.2. Number of Password Guesses Generated by JtR

Due to the fact that both our password cracker and John
the Ripper in wordlist mode operate as a dictionary attack, 4.4 PASSWORD CRACKING RESULTS
they both require an input dictionary to function. We choose
a total of six publicly available input dictionaries to use in Our first test, pictured in Fig. 4.4.1, shows the results of
our tests. Four of them, “English_lower”, “Finnish_lower”, training our Probabilistic Password Cracker on the MySpace
“Swedish_lower” and “Common_Passwords” were obtained training list. Three different cracking techniques are used on
from John the Ripper’s public web site [11]. As a side note, the MySpace test list. The first is the default rule set for John
the word “lower” refers to the fact that the dictionary words the Ripper. The second technique is our Probabilistic
are stored as all lower case. Additionally we used the input Password Cracker using the pre-terminal probabilities of its
dictionary “dic-0294” which we obtained from a popular structures. Once again, the pre-terminal probabilities do not
password-cracking site [9]. This list was chosen due to the assign a probability value to the dictionary words. The third
fact that we have found it very effective when used in technique is our Probabilistic Password Cracker using the
traditional password crackers. Finally, we created our own probabilities of the terminals (guesses). Recall that in this
wordlist “English_Wiki” which is based on the English case, we assign probabilities to dictionary words and extend
words gathered off of www.wiktionary.org. This is a sister our probabilistic context-free grammar to terminal strings.
project of Wikipedia, and it provides user updated Once again, the number of guesses allowed to each run is
dictionaries in various languages. shown in Fig. 4.3.2.
Each dictionary contained a different number of
dictionary words as seen in Table 4.3.1. Due to this, the
number of guesses generated by each input dictionary when
used with John the Ripper’s default mangling rules also
varied as can be seen by Fig. 4.3.2.

Table 4.3.1
Size of Input Dictionaries
Dictionary Name Number of Dictionary Words
Dic-0294 869228
English_Lower 444678
Common_Passwords 816
English_Wiki 68611 Fig. 4.4.1. Number of Passwords Cracked. Trained on the
MySpace Training List. Tested on the MySpace Test List
Swedish_Lower 14555
Finnish_Lower 358963
As the data shows, our password cracking operating in
terminal probability order performed the best. Using it, we
achieved an improvement over John the Ripper ranging from

399

Authorized licensed use limited to: Florida State University. Downloaded on August 26, 2009 at 13:27 from IEEE Xplore. Restrictions apply.
28% to 129% more passwords cracked given the same probability order performed competitively with John the
number of guesses. Additionally, when we used the pre- Ripper.
terminal order, in all cases but one we also achieved better
results than John the Ripper, though less then what we
achieved using terminal probability order.
The next question would be how does our probabilistic
method work when trained on a different data set? The same
test as above was run on the MySpace test list, but this time
we used the Finnish training list to train our password
cracker. The results are shown in Fig. 4.4.2.

Fig. 4.4.3. Number of Passwords Cracked. Trained on the

MySpace Training List. Tested on the SilentWhisper Test List

To round things out, we then evaluated our probabilistic

method by training it on the Finnish training set and then
attacking the Finnish test set. Please note that the Finnish
training set and the Finnish test set were gathered from
separate websites. Thus for this experiment, even though the
Fig. 4.4.2. Number of Passwords Cracked. Trained on the Finnish
users share a common language, we trained and then tested
Training List. Tested on the MySpace Test List
our password cracker against different user bases. Due to the
time it takes to audit these passwords since they are hashed,
As the results show, the terminal probability order once we only performed this test with John the Ripper’s default
again performed the best, though not as well as it did when it rule set and our method operating in terminal probability
was trained on the MySpace data. This time the order. The results can be seen in Fig. 4.4.4.
improvement ranged from 11% to 96% more passwords
cracked compared to John the Ripper. A surprising result to
us was that when we used Pre-Terminal Probability Order, it
did not result in a noticeable improvement over John the
Ripper’s default rule set. In fact, in two of the test cases it
actually performed worse.
Next we ran the same tests by training our Probabilistic
Password Cracker on the MySpace training list, and then
running it against the SilentWhisper test list. The results can
be seen in Fig. 4.4.3. As expected, in this case the default
John the Ripper word-mangling rules performed slightly
better. This is due to the relative simplicity of the
SilentWhisper test set. Since our probabilistic method had
been trained on more complex passwords, it spent much of
Fig. 4.4.4. Number of Passwords Cracked. Trained on the Finnish
its time generating guesses using advanced mangling rules, Training List. Tested on the Finnish Test List
vs. John the Ripper which exhausted the simple mangling
rules, (such as just use the dictionary word), first. This does
show a limitation of our probabilistic method as it does need While the results were not as dramatic as compared to
to be trained on passwords of similar complexity as the cracking the MySpace list, we still see an improvement
passwords it is trying to crack. That being said, in all of the ranging from 17% to 29% over John the Ripper in all but
test runs with the exception of the one using the one of the test cases. Considering that we had no previous
English_Lower dictionary, our method operating in terminal

400

Authorized licensed use limited to: Florida State University. Downloaded on August 26, 2009 at 13:27 from IEEE Xplore. Restrictions apply.
knowledge of how the passwords in the test set compared in Method performed only slightly worse than when it was
complexity to the passwords in the training set, this is still a trained on 33,561 passwords. What was more surprising was
fairly significant improvement. What this means is that we that our password cracker performed comparable to John the
were able to train our password cracker on one user base and Ripper even when it was trained on only 100 input
then use is successfully against another group which we passwords. We expect that given a longer run (aka allowing
knew nothing about except for their native language. our password cracker to generate more guesses), the effect
Looking back through the previous tests as shown in Fig. of having a larger training set will become more pronounced
4.4.1 through Fig. 4.4.4, one thing we noticed was that our as it will generally provide the password cracker more base
probabilistic method performed significantly worse when it structures as well as digit and symbol strings to draw upon.
used the English_Lower dictionary compared to the results it Also, we expect that the larger training set would better
obtained using the other input dictionaries. For example, reflect more accurate probabilities of the underlying base
let’s consider the test, Fig 4.4.1, where we trained our attack structures and replacement values.
on the MySpace training set, and tested it against the
MySpace test set. If we exclude the run that used the
English_Lower dictionary, the average improvement of our
method using terminal probability order compared to John
the Ripper was 90%. The improvement on the run which
used the English_Lower dictionary was 28%. The other tests
show similar results. We are still investigating why our
attacks do not perform as well with this dictionary. Despite
its name, the English_Lower dictionary seems to be
comprised mostly of “made up” words, such as ‘zoblotnick’,
and ‘bohrh’. Our current assumption is that the presence of a
large number of nonsense words throws off our method in
two different ways. First our program wastes time trying
these nonsense words. Second, when operating in terminal
probability order, a large number of essentially “junk” words
Fig. 4.4.5. Number of Passwords Cracked. Trained on different
can make what should be a highly probable structure have a sized MySpace Training Lists. Tested on the MySpace Test List
lower probability, and thus not be tried as soon. We still using Terminal Order Probability
need to investigate this more thoroughly.
The next test we ran was to evaluate how the size of the
In all the previous tests we limited our probabilistic
training list affected our probabilistic password cracker. To
method to the number of guesses generated by the default
investigate this we used training lists of various sizes
rule set of John the Ripper. One last test we wanted to run
selected from the original MySpace training list. The size of
was to see how our probabilistic method performed if we let
these lists is denoted by the number after them, aka the
it continue to run over an extended time. The following Fig.
MySpace20K list contains twenty thousand training
4.4.6 shows the number of passwords cracked over time
passwords. For reference, the MySpaceFull list contains all
using our probabilistic method operating in terminal
33,561 training passwords from the MySpace training list.
probability order. Please note, while John the Ripper exited
We were concerned about sampling bias as the lists became
after making 37,781,538 guesses, we continued to let our
shorter, (such as containing only 100 or 500 passwords). To
program operate until it made 300,000,000 guesses. Also
address this, for all training sets containing less than one
note that our Probabilistic Password Cracker was still
thousand passwords we trained and then ran each test twenty
creating guesses when we stopped it. We choose
five times with a different random sample of passwords
300,000,000 just as a benchmark number. The results are
included in the training list each time. We then averaged the
shown in Fig. 4.4.6.
results of the 25 different runs. All the tests to measure the
These results match with the previous test on this data set,
effect of the training list size used Terminal Probability
as seen in Fig. 4.4.1, in that given the same number of
Order and were run against the MySpace Test List. The
guesses our password cracker operating in terminal
results can be seen in Fig. 4.4.5. For comparison, John the
probability order cracks 68% more passwords than John the
Ripper’s performance is the left-most value, and training sets
Ripper. As you can see in Fig. 4.4.6 though, the rate at
increase in size from left to right for each input dictionary.
which our method cracks passwords does slow down as
It was surprising that even when our password cracker
more guesses are made. This is to be expected as it tries
was trained on only 10,000 passwords, our Probabilistic
lower and lower probability password guesses.

401

Authorized licensed use limited to: Florida State University. Downloaded on August 26, 2009 at 13:27 from IEEE Xplore. Restrictions apply.
probabilistic attack. This would allow us to quickly crack
any short passwords our method may have missed. After a
period of time though, brute-force becomes completely
infeasible due to the length of the passwords and the size of
the keyspace. We expect that even the low probability
guesses generated by our cracker are better than a
completely random guess which would result from a pure
brute-force approach against a large keyspace. Therefore,
the more passwords you can crack before having to solely
rely upon brute-force attacks, the more advantageous it is.
Because of this, the large number of rules, (possibly
billions), that our method automatically generates is another
major advantage of our approach.

Fig. 4.4.6. Number of Passwords Cracked Over Time. Trained on

the MySpace Training List. Tested on the MySpace Test List 4.5 SPACE COMPLEXITY RESULTS
In this section we focus on the space complexity related to
We decided to run the test in Fig. 4.4.6 again, but this generating our password guesses. We first discuss the space
time have John the Ripper switch to brute-force after complexity of storing the grammar as this is what would be
exhausting all of its word-mangling rules. We feel this distributed to the end user once the password cracker has
would best simulate an actual password cracking session, been trained.
(aka exhaust a dictionary attack and then resort to brute- Since the grammar is generated from the training set, the
force) using John the Ripper. Please note that John the size of the grammar is dependent on the size of this set. To
Ripper uses Markov models in its brute-force attack to first distribute this grammar we need to save the set of S-
try passwords phonetically similar to human generated productions (grammar rules with the start symbol S on the
words. It creates the conditional probability based not only left hand side) that give rise to the base structures and their
on letters, but also on symbols and numbers as well. As a associated probabilities. See Figure 3.2.1. Consider a
third experiment, we also ran a pure brute-force attack training set of j passwords each of maximal length k. At
without using John the Ripper’s rules first. The results of worst each password could result in a unique base structure
these tests are shown in Fig. 4.4.7. resulting in O (j) S-productions. Similarly the number of Di-
productions and Si-productions depend on the number of
unique digit strings and special strings, respectively, in the
training set. This could result in a maximum of O(jk) unique
productions. Finally, the number of L-productions (rewriting
an alpha string using a dictionary word) depend on the input
dictionary chosen. For a dictionary of size m, the maximum
number of L-productions is simply O(m). In practice, we
expect the grammars to be highly portable with many fewer
production rules than the worst case. See Table 4.5.1.

Table 4.5.1
Size of the Stored Grammar
Training Set # of Base Number of Number of
& Size Structures Si-productions Di-productions
MySpace10k 820 79 2405
MySpace20k 1216 108 3377
Figure 4.4.7. Number of Passwords Cracked Over Time. Trained
on the MySpace Training List. Tested on the MySpace Test List MySpace 1589 144 4410
(33,561)
One thing we learned from this data is that it may be Finnish 736 49 1223
effective to pause our probabilistic method after around 100
(15,699)
million guesses and switch to a brute-force attack using a
small keyspace for a limited time before resuming our

402

Authorized licensed use limited to: Florida State University. Downloaded on August 26, 2009 at 13:27 from IEEE Xplore. Restrictions apply.
We next consider the space complexity of an actual All tests were run using terminal probability order and
password cracking session. Using the grammar, for each using the dictionary Dic-0294. Note that in terminal
base structure, we generate pre-terminal structures, using the probability order while the specific L-production is not
“next” function that are pushed and popped from the priority expanded in the priority queue, its probability is taken into
queue as described in Section 3.3. The space complexity of account when pushing and popping the pre-terminal
this algorithm is the maximum size of the priority queue. It structures. The input dictionary thus can cause differences
should be clear that this is worst case O(n) where there are n in how the priority queue grows.
possible pre-terminals generated. We do not expect that the We finally consider the maximum number of pre-
worst case is actually sublinear. In practice, the maximum terminals and password guesses that could possibly be
size of the priority queue has not been an issue in our generated by our grammar. Consider as an example a base
experiments to date. Table 4.5.2 shows the total number of structure that takes the form S1L8D3. A pre-terminal value
pre-terminals generated to create a specified number of might take the form $L8123, and a final guess, (terminal
password guesses. The space requirement is shown by the value), might take the form $password123. To find the total
maximume size of the priority queue. Figure 4.5.3 shows number of possible pre-terminal values for this base
the size of the priority queue as a function of the passwords structure, one simply needs to examine the total possible
generated when trained on the MySpace training sets. replacements for each string variable in the base structure.
Using this example, and assuming there are 10 S1-production
Table 4.5.2 rules and 50 D2-production rules, then the total number of
Space Cost using Dic-0294 pre-terminals that may be generated by S1L8D3 is 500.
Training Set Total Pre- Maximum Password To find the total number of password guesses we simply
Terminals Size of Guesses expand this calculation by factoring in the number of
Generated Queue (millons) dictionary words that can replace the alpha string. In the
MySpace10k 28,457 1,274 50 above example, if we assume there are 1,000 dictionary
MySpaceFull 14,661 1,688 50 words of length 8, then the total number of guesses would be
500,000. See Table 4.5.4 for the total search space
Finnish 19,550 4,753 50
generated by each given training set and input dictionary.
MySpace10k 174,165 4,642 500
MySpaceFull 109,453 3,691 500 Table 4.5.4
Total Search Space
Finnish 1,567,911 138,187 500
Training Input Dictionary Pre- Password
MySpace10k 470,949 9,946 1,000 Set Terminals Guesses
MySpaceFull 193,963 6,682 1,000 (millions) (trillions)
MySpaceFull dic-0294 34,794,330 >100,000,000
Finnish 4,324,913 299,933 1,000
MySpaceFull English-Wiki 34,794,330 >100,000,000
MySpaceFull Common_Passwords 34,785,870 36,000
Finnish dic-0294 578 >100,000,000
Finnish English-Wiki 578 10,359,023
Finnish Common_Passwords 506 6

To explain the results of Table 4.5.4 further, note the

number of pre-terminals generated can be dependent on the
input dictionary, since if a Li-production exists where no
dictionary word matches it, (for example the dicionary does
not contain any words of length 9), then the base structure
containing the Li-production is discarded for that password
cracking run. Also, we found that the total number of pre-
terminals were mostly driven by a few base structures that
contained a large number of Di and Si-productions, for
Figure 4.5.3 Size of the Priority Queue over Time, using Dic-0294 example S1D3S2D3S1D4. Likewise the number of terminals,
as the Input Dictionary (final password guesses), was dominated by a few base
structures that contained many Li-productions such as:

403

Authorized licensed use limited to: Florida State University. Downloaded on August 26, 2009 at 13:27 from IEEE Xplore. Restrictions apply.
L1S1L1S1L1S1L1S1L1S1L1S1L1S1. This was made worse by would allow our generator to create password guesses of a
the fact that in our code we did not remove duplicate structure or containing a terminal value that was not present
dictionary words. For example we would have 52 L1 in the training set. For example, currently if the number ‘23’
replacements from the input dictionary “dic-0294” even does not appear in the training set, our method will never use
though we lowercased all input words before using them. it. Ideally we would like it to try this terminal value, but at a
This is because by not removing duplicates we had two reduced probability compared to values found in the training
instances of every single letter of length 1. set. The ultimate goal would be to allow our method to
That being said, the advantage of our method is that these automatically switch between dictionary based attacks and
highly complex base structures will generally not be utilized targeted brute-force attacks based upon their relative
until late in the password cracking session due to their probability of cracking a password. For example, it might try
corresponding low probabilities. Therefore, we would not some word-mangling rules, then brute-force all words of
expand them in our priority queue until all the more length four, before returning back to trying additional word-
probable guesses have been generated first. mangling rules.
There also exists more research to be performed on
verifying the performance of this method if it is trained and
5. FUTURE RESEARCH
tested against password lists from different sources.
There are several areas that we feel are open for
improvement in our approach with using probabilistic
6. CONCLUSION
grammars for password cracking. As stated earlier in section
3.2, we are currently looking into different ways to do Our experiments show that using a probabilistic context
insertion of dictionary words into the final guess that take free grammar to aid in the creation of word-mangling rules
into account the size of the input dictionary. As can been through training over known password sets is a promising
seen in Figures 4.4.1 – 4.4.5, there was a definite advantage approach. It also allows us to quickly create a ruleset to
to using terminal probability order vs. pre-terminal generate password guesses for use in cracking unknown
probability order with our probabilistic password cracker. passwords. When compared against the default ruleset used
Currently we determine the probability of dictionary words in John the Ripper, our method managed to outperform it by
of length n by assigning a probability of 1/k if there are k cracking 28% - 129% more passwords, given the same
words of length n in the input dictionary. There are however number of guesses, based on training and testing on the
many other approaches we could take. Currently the most MySpace password set. Our method also did very well
promising approach seems to be the use of several input when trained on the Finnish training set and tested on the
dictionaries with different associated probabilities. This way MySpace test set. Our approach is expected to be most
one might have a small highly probable dictionary, (aka effective when tailoring one's attack against different sources
common passwords), and a much larger dictionary based on by training it on passwords of a relevant structure. For
words that are less common. example, if it is known that the target password was
Another point where we have identified room for future generated to satisfy a strong password policy (such as
improvement is modifying the base structures to more requiring it to be 8 characters long and containing numbers
accurately portray how people actually create passwords. and special characters) the algorithm could be trained only
For example, we could add another category, ‘U’, to on passwords meeting those requirements. We have also
represent uppercase letters as currently our method only shown that we can quickly and manageably generate
deals with lowercase letters. Also we could add another password guesses in highest probability order which allows
transformation to the base structure that would deal with us to test a very high number of rulesets effectively.
letter replacement, such as “replace every ‘a’ in the We feel that our method might successfully help forensic
dictionary word with an ‘@’.” Since we are using a context- investigators by doing better than existing techniques in
free grammar, this would be fairly straightforward. All we many practical situations. Our work can also provide a more
need to do is create a new production rule that deals with realistic picture of the real security (or lack of the same)
letter replacement. The harder part would be identifying provided by passwords. We expect that our approach can be
those transformations during the training phase. We are an invaluable addition to the existing techniques in password
currently looking into several ways to efficiently identify cracking.
those transformations such as checking the edit distance
between known passwords and a dictionary file.
It may also be useful to add probability smoothing or
switch to a Bayesian method in the training stage. This

404

Authorized licensed use limited to: Florida State University. Downloaded on August 26, 2009 at 13:27 from IEEE Xplore. Restrictions apply.
REFERENCES Trade-Off. Proceedings of Advances in Cryptology (CRYPTO
2003), Lecture Notes in Computer Science, Volume 2729, pages
[1] U. Manber. A simple scheme to make passwords based on 617-630, 2003. Springer.
one-way functions much harder to crack. Computers & Security
Journal, Volume 15, Issue 2, 1996, Pages 171-176. Elsevier. [9] A list of popular password cracking wordlists, 2005, [Online
Document] [cited 2008 Oct 07] Available HTTP
[2] J. Yan, A. Blackwell, R. Anderson, and A. Grant. Password http://www.outpost9.com/files/WordLists.html
Memorability and Security: Empirical Results. IEEE Security and
Privacy Magazine, Volume 2, Number 5, pages 25-31, 2004. [10] A. Narayanan and V. Shmatikov, Fast Dictionary Attacks on
Passwords Using Time-Space Tradeoff, CCS’05, November 7–11,
[3] R. V. Yampolskiy. Analyzing User Password Selection 2005, Alexandria, Virginia
Behavior for Reduction of Password Space. Proceedings of the
IEEE International Carnahan Conferences on Security Technology, [11] John the Ripper password cracker, [Online Document] [cited
pp.109-115, 2006. 2008 Oct 07] Available HTTP http://www.openwall.com
[4] M. Bishop and D. V. Klein. Improving system security via [12] J.E. Hopcroft and J.D. Ullman, Introduction to Automata
proactive password checking. Computers & Security Journal, Theory, Languages, and Computation, Addison Wesley, 1979.
Volume 14, Issue 3, 1995, Pages 233-249. Elsevier. [13] L. R. Rabiner, A Tutorial on Hidden Markov Models and
[5] G. Kedem and Y. Ishihara. Brute Force Attack on UNIX Selected Applications in Speech Recognition, Proceedings of the
Passwords with SIMD Computer. Proceedings of the 3rd USENIX IEEE, Volume 77, No. 2, February 1989
Windows NT Symposium, 1999. [14] N. Chomsky. Three models for the description of language.
[6] N. Mentens, L. Batina, B. Preneel, I. Verbauwhede. Time- Information Theory, IEEE Transactions on, 2(3):113–124, Sep
Memory Trade-Off Attack on FPGA Platforms: UNIX Password 1956.
Cracking. Proceedings of the International Workshop on [15] Robert McMillan, Phishing attack targets Myspace users,
Reconfigurable Computing: Architectures and Applications. 2006, [Online Document] [cited 2008 Oct 07] Available HTTP
Lecture Notes in Computer Science, Volume 3985, pages 323-334, http://www.infoworld.com/infoworld/article/06/10/27/HNphishing
Springer, 2006. myspace 1.html.
[7] M. Hellman. A cryptanalytic time-memory trade-off. IEEE [16] Bulletin Board Announcement of the Finnish Password List,
Transactions on Information Theory, Volume 26, Issue 4, pages October 2007, [Online Document] [cited 2008 Oct 07] Available
401-406, 1980. HTTP http://www.bat.org/news/view_post?postid=40546&page=1
[8] P. Oechslin. Making a Faster Cryptanalytic Time-Memory &group.

APPENDIX 1
PSEUDO CODE FOR THE NEXT FUNCTION
//The probability calculation depends on if pre-terminal or terminal probability is used
//New nodes will be inserted into the queue with the probability of the pre-terminal structure acting as the priority value
For (all base structures) { //first populate the priority queue with the most probable values for each base structure
working_value.structure = most probable pre-terminal value for the base structure
working_value.pivot_value = 0
working_value.num_strings = total number of L/S/D strings in the corresponding base structure
working_value.probability = calculate_probability(working_value.structure)
insert_into_priority_queue(priority_queue, working_value) //higher probability == greater priority
}
working_value = Pop(priority_queue) //Now generate password guesses
while (working_value!=NULL) {
Print out all guesses for the popped value by filling in all combinations of the appropriate alpha strings.
For (i=working_value.pivot_value; i<working_value.num_strings;i++) {
insert_value.structure=decrement(working_value.structure,i); //get next lower probability S or D structure at pivot value ‘i’
if (insert_value.structure!=NULL) {
insert_value.probability = calculate_probability(insert_value.structure);
insert_value.pivot_value = i
insert_value.num_strings = working_value.num_strings
insert_into_priority_queue(priority_queue,insert_value)
}
}
working_value = Pop(priority_queue)
}

405

Authorized
View publication stats licensed use limited to: Florida State University. Downloaded on August 26, 2009 at 13:27 from IEEE Xplore. Restrictions apply.

Visual Studio 2019
No ratings yet
Visual Studio 2019
2,170 pages
First Crack Plaintext
No ratings yet
First Crack Plaintext
976 pages
4an Implementation and Evaluation of PDF Password Cracking Using John The Ripper and Crunch
No ratings yet
4an Implementation and Evaluation of PDF Password Cracking Using John The Ripper and Crunch
4 pages
Authentication by Encrypted Negative Password PDF
100% (1)
Authentication by Encrypted Negative Password PDF
15 pages
Uncrackable Passwords
100% (2)
Uncrackable Passwords
6 pages
Password Strengthening: Using Multi-Lingual Passwords
No ratings yet
Password Strengthening: Using Multi-Lingual Passwords
4 pages
Omen - Fast Password Guessing
No ratings yet
Omen - Fast Password Guessing
15 pages
Hospital Management System (SRS)
100% (8)
Hospital Management System (SRS)
16 pages
Memento: How To Reconstruct Your Secrets From A Single Password in A Hostile Environment
No ratings yet
Memento: How To Reconstruct Your Secrets From A Single Password in A Hostile Environment
28 pages
top10k-vn-passwords
No ratings yet
top10k-vn-passwords
170 pages
A Novel Dictionary Generation Methodology For Contextual-Based Password Cracking
No ratings yet
A Novel Dictionary Generation Methodology For Contextual-Based Password Cracking
11 pages
Guess Again (And Again and Again) : Measuring Password Strength by Simulating Password-Cracking Algorithms
No ratings yet
Guess Again (And Again and Again) : Measuring Password Strength by Simulating Password-Cracking Algorithms
15 pages
Passwordpaper Springer SCIproof
No ratings yet
Passwordpaper Springer SCIproof
23 pages
A Study of Personal Information in Human-Chosen
No ratings yet
A Study of Personal Information in Human-Chosen
9 pages
Pico: No More Passwords!: 1 Why Users Are Right To Be Fed Up
No ratings yet
Pico: No More Passwords!: 1 Why Users Are Right To Be Fed Up
34 pages
plaintext 15K COMBOLIST Vietnam BY LORD Ali
No ratings yet
plaintext 15K COMBOLIST Vietnam BY LORD Ali
261 pages
Pfleeger 9780134093093 Ch02-1
No ratings yet
Pfleeger 9780134093093 Ch02-1
22 pages
Application Inspection
No ratings yet
Application Inspection
15 pages
Text Passwords
100% (1)
Text Passwords
44 pages
Passgan: A Deep Learning Approach For Password Guessing
No ratings yet
Passgan: A Deep Learning Approach For Password Guessing
20 pages
Authentication
No ratings yet
Authentication
90 pages
CCS PDF
No ratings yet
CCS PDF
11 pages
Measuring Password Guessability For An Entire University
No ratings yet
Measuring Password Guessability For An Entire University
14 pages
Module 012 Password Cracking
No ratings yet
Module 012 Password Cracking
16 pages
Protection by Stevo
No ratings yet
Protection by Stevo
19 pages
Odern Password Cracking: A Hands-On Approach To Creating An Optimised and Versatile Attack.
No ratings yet
Odern Password Cracking: A Hands-On Approach To Creating An Optimised and Versatile Attack.
65 pages
Metoda Rainbow D Tradus
No ratings yet
Metoda Rainbow D Tradus
13 pages
Defcon 16 Weir
No ratings yet
Defcon 16 Weir
39 pages
Britannia 23042019235153 Bsensenoticeintimation 013
No ratings yet
Britannia 23042019235153 Bsensenoticeintimation 013
58 pages
Cracking More Password Hashes With Patterns
No ratings yet
Cracking More Password Hashes With Patterns
69 pages
Lecture3 AccessControl
No ratings yet
Lecture3 AccessControl
56 pages
Password Strength Tester Ijariie20152
No ratings yet
Password Strength Tester Ijariie20152
12 pages
Van20Heerden_2009
No ratings yet
Van20Heerden_2009
20 pages
AutoPass an Automatic Password Generator
No ratings yet
AutoPass an Automatic Password Generator
23 pages
4779 Brute Forceanddictionaryattackonhashedreal Worldpasswords
No ratings yet
4779 Brute Forceanddictionaryattackonhashedreal Worldpasswords
7 pages
B-157 Poster
No ratings yet
B-157 Poster
1 page
Publishable Humanly Usable Secure Password Creation Schemas
No ratings yet
Publishable Humanly Usable Secure Password Creation Schemas
10 pages
Week 3
No ratings yet
Week 3
33 pages
Lecture 4
No ratings yet
Lecture 4
34 pages
Grade 7 Mathematics Syllabus for USA
No ratings yet
Grade 7 Mathematics Syllabus for USA
30 pages
Password Cracking
No ratings yet
Password Cracking
8 pages
Penetration Testing of Password Protected Documents
No ratings yet
Penetration Testing of Password Protected Documents
22 pages
2022-V13I7150
No ratings yet
2022-V13I7150
7 pages
Ijcnis V8 N7 4
No ratings yet
Ijcnis V8 N7 4
8 pages
Impact of Entertainment Applications To The Academic Performance of Senior High School Students
No ratings yet
Impact of Entertainment Applications To The Academic Performance of Senior High School Students
31 pages
1.1what Is A Password Manager?
No ratings yet
1.1what Is A Password Manager?
16 pages
Password Strength - Wikipedia
No ratings yet
Password Strength - Wikipedia
84 pages
Crypto 7
No ratings yet
Crypto 7
22 pages
Se Final Report1 PDF
100% (1)
Se Final Report1 PDF
26 pages
A Comparison of Three Random Password Generators
No ratings yet
A Comparison of Three Random Password Generators
16 pages
Keszthelyi 44
No ratings yet
Keszthelyi 44
21 pages
week 3
No ratings yet
week 3
23 pages
Factory Physics 3rd Edition PDF Download
0% (1)
Factory Physics 3rd Edition PDF Download
2 pages
Truma Heating CP Plus Operating Instructions EN
No ratings yet
Truma Heating CP Plus Operating Instructions EN
21 pages
PASSWORD CRACKING - JJRPT
100% (1)
PASSWORD CRACKING - JJRPT
16 pages
Passwordpaper Springer SCIproof
No ratings yet
Passwordpaper Springer SCIproof
30 pages
Ethics Lect7 System Hacking
No ratings yet
Ethics Lect7 System Hacking
14 pages
Server Security-To Detect and Prevent Brute Force and Dictionary Attack
No ratings yet
Server Security-To Detect and Prevent Brute Force and Dictionary Attack
20 pages
Introduction to AutoCAD 2020: A Modern Perspective (eBook PDF) download pdf
100% (2)
Introduction to AutoCAD 2020: A Modern Perspective (eBook PDF) download pdf
34 pages
dk-2 Operating Manual 2-8
No ratings yet
dk-2 Operating Manual 2-8
54 pages
Unhashing Passwords: Asst. Prof. Rajesh Dhakad Rhythum Tamra
No ratings yet
Unhashing Passwords: Asst. Prof. Rajesh Dhakad Rhythum Tamra
4 pages
wordlists-vn1k
No ratings yet
wordlists-vn1k
17 pages
Data Warehosing and Data Mining
No ratings yet
Data Warehosing and Data Mining
15 pages
Securing Database Passwords Using A Combination of Hashing and Salting Techniques
No ratings yet
Securing Database Passwords Using A Combination of Hashing and Salting Techniques
7 pages
Detailed Lesson Plan in Science Grade 10
No ratings yet
Detailed Lesson Plan in Science Grade 10
9 pages
CSE 3118Y Week 02 About Password by Keszthelyi
No ratings yet
CSE 3118Y Week 02 About Password by Keszthelyi
20 pages
Password Cracker Using MPI: Project Report
No ratings yet
Password Cracker Using MPI: Project Report
33 pages
Adi Dac SG
No ratings yet
Adi Dac SG
12 pages
Analyzing Password Decryption Techniques Using Dictionary Attack
No ratings yet
Analyzing Password Decryption Techniques Using Dictionary Attack
9 pages
PasswordCracker - Me - Password Cracking How To
0% (1)
PasswordCracker - Me - Password Cracking How To
6 pages
01 Personalized Digital Storybooks for Kids
No ratings yet
01 Personalized Digital Storybooks for Kids
16 pages
A Compact Circular Patch Antenna For Wireless Network Applications
No ratings yet
A Compact Circular Patch Antenna For Wireless Network Applications
4 pages
Password Cracking Technique: A Survey: Himanshu Kumar, Neelam Mathur
No ratings yet
Password Cracking Technique: A Survey: Himanshu Kumar, Neelam Mathur
3 pages
log2
No ratings yet
log2
7 pages
Class Leading Performance With The Kyocera V4 KX Driver
No ratings yet
Class Leading Performance With The Kyocera V4 KX Driver
5 pages
Syllabus-MIS 315
No ratings yet
Syllabus-MIS 315
5 pages
Part 7 - DCIM-For-Dummies - 3rd-Edition
No ratings yet
Part 7 - DCIM-For-Dummies - 3rd-Edition
5 pages
Deposit Bonus Program - Test Scenario Matrix - V1.0
No ratings yet
Deposit Bonus Program - Test Scenario Matrix - V1.0
4 pages
Sketch Up Vray 3.4
No ratings yet
Sketch Up Vray 3.4
37 pages
55c20ce4cf7f39a28234910a4251fa56
No ratings yet
55c20ce4cf7f39a28234910a4251fa56
81 pages
IP 2025 Sample paper
No ratings yet
IP 2025 Sample paper
7 pages
G6 - Second Periodical Test - English-6
No ratings yet
G6 - Second Periodical Test - English-6
7 pages
Unit4 IOT DEVELOPMENTY EXAMPLES
No ratings yet
Unit4 IOT DEVELOPMENTY EXAMPLES
17 pages
A Convenient Method For Securely Managing Passwords: J. Alex Halderman Brent Waters Edward W. Felten
No ratings yet
A Convenient Method For Securely Managing Passwords: J. Alex Halderman Brent Waters Edward W. Felten
9 pages
wordlists-vn100
No ratings yet
wordlists-vn100
2 pages
Shading A Sphere: Read The Description!How To Draw A Realistic Sphere With ..
No ratings yet
Shading A Sphere: Read The Description!How To Draw A Realistic Sphere With ..
1 page
Rate Definition
No ratings yet
Rate Definition
8 pages
Lab3-Lê Quang Vũ-SE160967-IA1702
No ratings yet
Lab3-Lê Quang Vũ-SE160967-IA1702
3 pages
Media Kontra Droga Ad Campaign
No ratings yet
Media Kontra Droga Ad Campaign
3 pages
Cheat Sheet Final Final
No ratings yet
Cheat Sheet Final Final
2 pages
Synopsis of Password Generator GUI
No ratings yet
Synopsis of Password Generator GUI
12 pages
Pentesting Azure Applications: The Definitive Guide to Testing and Securing Deployments
From Everand
Pentesting Azure Applications: The Definitive Guide to Testing and Securing Deployments
Matt Burrough
No ratings yet
Cybersecurity Fundamentals Explained
From Everand
Cybersecurity Fundamentals Explained
Brian Mackay
No ratings yet
Your System's Sweetspots: CEO's Advice on Basic Cyber Security: CEO's Advice on Computer Science
From Everand
Your System's Sweetspots: CEO's Advice on Basic Cyber Security: CEO's Advice on Computer Science
Warren H. Lau
No ratings yet
The Browser Hacker's Handbook
From Everand
The Browser Hacker's Handbook
Wade Alcorn
No ratings yet

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.

Password Cracking Using Probabilistic Context-Free

Uploaded by

Password Cracking Using Probabilistic Context-Free

Uploaded by

See discussions, stats, and author profiles for this publication at: https://www.researchgate.

Password Cracking Using Probabilistic Context-Free Grammars

Conference Paper · May 2009

The user has requested enhancement of the downloaded file.

Password Cracking Using Probabilistic Context-Free Grammars

1081-6011/09 $25.00 © 2009 IEEE 391

We choose to calculate the probabilities only for digit

Fig. 4.4.3. Number of Passwords Cracked. Trained on the

To round things out, we then evaluated our probabilistic

Fig. 4.4.6. Number of Passwords Cracked Over Time. Trained on

To explain the results of Table 4.5.4 further, note the

You might also like

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.