In this article, we present a thorough evaluation of semantic password grammars. We report multifactorial experiments that test the impact of sample size, probability smoothing, and linguistic information on password cracking. The semantic grammars are compared with state-of-the-art probabilistic context-free grammar (PCFG) and neural network models, and tested in cross-validation and A vs. B scenarios. We present results that reveal the contributions of part-of-speech (syntactic) and semantic patterns, and suggest that the former are more consequential to the security of passwords. Our results show that in many cases PCFGs remain competitive with their latest neural network counterparts. In addition, we show that there is little performance gain in training PCFGs with more than 1 million passwords. We present qualitative analyses of four password leaks (Mate1, 000webhost, Comcast, and RockYou) based on trained semantic grammars, and derive graphical models that capture high-level dependencies between token classes. Finally, we confirm the similarity inferences from our qualitative analysis by examining the effectiveness of grammars trained and tested on all pairs of leaks.