CCUPP: Chinese Common User Passwords Profiler

Benchmark Evaluation Report

Comparative analysis against CUPP and academic baselines

Abstract. We evaluate CCUPP, a rule-based targeted password guessing tool for Chinese users, against CUPP (the de facto standard) and published results from six academic papers. Using five standard profiles and a 200-record synthetic PII-password paired dataset (modeled after real Chinese password patterns from Li et al. 2014), we measure generation performance, PII embedding rate, dataset hit rate, and Success Rate @ N (SR@N). CCUPP reaches 84% coverage within 10,000 guesses and SR@1000 = 45.5% (comparable to the 45.4% published for TarGuess-III), generates passwords about 9x faster than CUPP, and requires no training data.

1. Key Results

SR@1000: 45.5%
Coverage (200 records): 84.0%
Generation speed: 2.4M/s
Faster than CUPP: 9x

2. Generation Performance

Password generation statistics across 5 standard benchmark profiles (3 Chinese, 2 English).

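| Profile | Tool | Passwords | Time (s) | Passwords/s |
|---|---|---|---|---|
| zh_full | CCUPP | 12,335 | 0.006 | 2,216,275 |
| zh_full | CUPP | 28,527 | 0.103 | 277,964 |
| zh_minimal | CCUPP | 4,244 | 0.002 | 2,582,046 |
| zh_minimal | CUPP | 5,230 | 0.018 | 292,689 |
| zh_medium | CCUPP | 9,007 | 0.004 | 2,349,384 |
| zh_medium | CUPP | 18,594 | 0.043 | 430,639 |
| en_full | CCUPP | 4,432 | 0.002 | 2,915,031 |
| en_full | CUPP | 22,304 | 0.045 | 496,836 |
| en_minimal | CCUPP | 2,076 | 0.001 | 2,913,166 |
| en_minimal | CUPP | 9,546 | 0.022 | 431,200 |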
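A throughput figure of this kind can be obtained by timing one full pass over a tool's output. The following is a minimal sketch, not the harness used for this report: `measure_throughput` and `toy_generator` are hypothetical names, and the toy generator merely stands in for a real tool's candidate stream.

```python
import time
from typing import Callable, Iterable


def measure_throughput(generate: Callable[[], Iterable[str]]) -> tuple[int, float, float]:
    """Run a generator once; return (count, seconds, passwords per second)."""
    start = time.perf_counter()
    count = sum(1 for _ in generate())
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on extremely fast runs.
    rate = count / elapsed if elapsed > 0 else float("inf")
    return count, elapsed, rate


def toy_generator() -> Iterable[str]:
    # Hypothetical stand-in for a real tool's candidate generator.
    for name in ("zhang", "wei"):
        for suffix in ("123", "1990", "520"):
            yield name + suffix


count, seconds, rate = measure_throughput(toy_generator)
print(count, f"{seconds:.6f}s", f"{rate:,.0f}/s")
```

Runs this short are dominated by timer noise, so in practice one would likely repeat the measurement and report a median or mean.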

3. PII Embedding Rate

Fraction of generated passwords containing personal information fragments. Higher rates indicate more targeted generation. Personal-PCFG (USENIX Security 2014) found 60.1% of Chinese users embed PII in passwords.

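| Profile | Tool | Name | Date | Phone | Account | Overall |
|---|---|---|---|---|---|---|
| zh_full | CCUPP | 22.4% | 20.8% | 8.9% | 5.1% | 48.2% |
| zh_full | CUPP | 27.1% | 4.6% | 0.0% | 7.3% | 33.2% |
| zh_minimal | CCUPP | 46.4% | 33.6% | 14.5% | 0.0% | 73.7% |
| zh_minimal | CUPP | 38.3% | 4.9% | 0.0% | 0.0% | 41.6% |
| zh_medium | CCUPP | 28.3% | 22.2% | 11.1% | 4.3% | 54.3% |
| zh_medium | CUPP | 30.4% | 4.8% | 0.0% | 3.6% | 35.2% |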
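One plausible way to compute such rates is case-insensitive substring matching against per-type PII fragments. The sketch below assumes that approach; the report does not specify its exact matching rules, and `pii_embedding_rate` and the sample fragments are illustrative. A password that matches several PII types counts once toward the overall rate, which is why the overall figure can be smaller than the sum of the per-type columns.

```python
def pii_embedding_rate(passwords, fragments_by_type):
    """Return the per-type and overall fraction of passwords containing a PII fragment."""
    total = len(passwords)
    hits = {pii_type: 0 for pii_type in fragments_by_type}
    overall = 0
    for pw in passwords:
        lowered = pw.lower()
        matched_any = False
        for pii_type, fragments in fragments_by_type.items():
            # Assumption: a case-insensitive substring match counts as "embedding".
            if any(frag.lower() in lowered for frag in fragments):
                hits[pii_type] += 1
                matched_any = True
        overall += matched_any
    rates = {t: (n / total if total else 0.0) for t, n in hits.items()}
    rates["overall"] = overall / total if total else 0.0
    return rates


# Hypothetical profile fragments and candidate passwords.
fragments = {
    "name": ["zhangwei", "zhang", "wei"],
    "date": ["1990", "0315"],
    "phone": ["13800138000"],
}
candidates = ["zhangwei1990", "wei123456", "abc123", "13800138000zw"]
rates = pii_embedding_rate(candidates, fragments)
print(rates)
```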

4. Academic Comparison: Success Rate @ N

Success Rate @ N (SR@N) is the primary metric in the targeted password guessing literature: given a PII-password paired dataset (here, 200 synthetic records modeled after real Chinese password patterns), it is the fraction of target passwords found within the first N guesses. All CCUPP and CUPP values were measured in this evaluation; the academic baselines are taken from published papers that used real leaked datasets.

| Method | Venue | Approach | SR@10 | SR@100 | SR@1000 | SR@10000 |
|---|---|---|---|---|---|---|
| CCUPP | measured | Rule-based (PII) | 0.0% | 1.0% | 45.5% | 84.0% |
| CUPP | measured | Rule-based | 0.0% | 0.0% | 0.0% | 0.0% |
| TarGuess-III | CCS 2016 | PII-tagged PCFG | 4.6% | 19.7% | 45.4% | - |
| Personal-PCFG | USENIX Sec 2014 | PCFG + PII tags | - | 12.8% | 29.5% | - |
| RFGuess-PII | USENIX Sec 2023 | Random Forest | 7.3% | 24.1% | 48.7% | - |
| PointerGuess | USENIX Sec 2024 | Seq2Seq + Pointer | 8.2% | 25.2% | - | - |
| PassLLM-I | USENIX Sec 2025 | LLM (7B) + LoRA | 9.8% | 31.6% | 52.3% | - |
| RankGuess-PII | S&P 2025 | RL + Ranking | - | 27.8% | 50.1% | - |
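The SR@N definition used in this section can be sketched in a few lines. This is an illustrative implementation under the assumption that each record pairs a target password with that target's ordered guess list; `success_rate_at_n` and the toy records are hypothetical, not the report's evaluation code.

```python
def success_rate_at_n(records, budgets):
    """records: list of (target_password, ranked_guesses).

    Returns {N: fraction of targets found within the first N guesses}.
    """
    ranks = []
    for target, guesses in records:
        try:
            ranks.append(guesses.index(target) + 1)  # 1-based rank of the first hit
        except ValueError:
            ranks.append(None)  # target never generated
    total = len(records)
    return {
        n: (sum(1 for r in ranks if r is not None and r <= n) / total if total else 0.0)
        for n in budgets
    }


# Toy records: the targets sit at ranks 1, 3, not found, and 12.
records = [
    ("wei1990", ["wei1990", "wei123"]),
    ("li520", ["li123", "li1314", "li520"]),
    ("xK#9!q", ["chen123", "chen1990"]),
    ("wang0012", [f"wang{i:04d}" for i in range(1, 21)]),
]
sr = success_rate_at_n(records, budgets=(1, 10, 100))
print(sr)
```

Because SR@N is cumulative, it can only grow with N and is capped by overall coverage: a target the tool never generates is missed at every budget.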

5. Guess Number Statistics

For passwords that were found, at what rank (position) in the generated list did they appear? Lower ranks indicate higher priority placement.

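| Tool | Found | Not Found | Coverage | Min Rank | Median | Mean | Max Rank |
|---|---|---|---|---|---|---|---|
| CCUPP | 168 | 32 | 84.0% | 24 | 686 | 1,775 | 5,993 |
| CUPP | 0 | 200 | 22.5% | - | - | - | - |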
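Given one guess rank per record (or no rank when the target was never generated), the summary columns above follow directly. The sketch below uses Python's standard `statistics` module on made-up ranks; `rank_summary` is a hypothetical helper, not the report's code.

```python
from statistics import mean, median


def rank_summary(ranks):
    """ranks: one entry per record; an int rank if found, None if not found."""
    found = [r for r in ranks if r is not None]
    summary = {
        "found": len(found),
        "not_found": len(ranks) - len(found),
        "coverage": len(found) / len(ranks) if ranks else 0.0,
    }
    if found:
        # Rank statistics are defined only over the passwords that were found.
        summary.update(
            min_rank=min(found),
            median=median(found),
            mean=mean(found),
            max_rank=max(found),
        )
    return summary


# Made-up ranks for five records; one target was never generated.
stats = rank_summary([3, 12, 40, 7, None])
print(stats)
```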

6. Tool Overlap Analysis

How much do CCUPP and CUPP outputs overlap? Low overlap indicates complementary generation strategies.
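One common way to quantify overlap between two candidate lists is the Jaccard index over their deduplicated sets. The sketch below assumes that metric; the report does not state which overlap measure it uses, and `overlap_stats` and the sample lists are illustrative.

```python
def overlap_stats(list_a, list_b):
    """Compare two password lists as sets and report shared and exclusive counts."""
    set_a, set_b = set(list_a), set(list_b)
    shared = set_a & set_b
    union = set_a | set_b
    return {
        "shared": len(shared),
        "only_a": len(set_a - set_b),
        "only_b": len(set_b - set_a),
        # Jaccard index: |A ∩ B| / |A ∪ B|, 0.0 when both lists are empty.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Hypothetical outputs from two tools for the same profile.
tool_a = ["wei1990", "wei520", "zhangwei"]
tool_b = ["zhangwei", "Zhangwei1", "wei1990!"]
overlap = overlap_stats(tool_a, tool_b)
print(overlap)
```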

7. Length Distribution

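| Length | CCUPP | CCUPP % | CUPP | CUPP % |
|---|---|---|---|---|
| 1-6 | 2,407 | 20% | 478 | 2% |
| 7-8 | 2,080 | 17% | 4,518 | 16% |
| 9-12 | 4,388 | 36% | 23,531 | 82% |
| 13-16 | 2,268 | 18% | 0 | 0% |
| 17-24 | 1,044 | 8% | 0 | 0% |
| 25+ | 148 | 1% | 0 | 0% |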
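A distribution like this can be produced by binning each generated password by its length. The following sketch reuses the bucket edges from the table; `bucket_label`, `length_distribution`, and the sample passwords are hypothetical.

```python
from collections import Counter

# Bucket edges mirror the table: (low, high), with None meaning open-ended.
BUCKETS = [(1, 6), (7, 8), (9, 12), (13, 16), (17, 24), (25, None)]


def bucket_label(length: int) -> str:
    for low, high in BUCKETS:
        if length >= low and (high is None or length <= high):
            return f"{low}+" if high is None else f"{low}-{high}"
    return "empty"  # only reached for zero-length strings


def length_distribution(passwords):
    """Return {bucket label: (count, share of all passwords)} in table order."""
    counts = Counter(bucket_label(len(pw)) for pw in passwords)
    total = len(passwords)
    labels = [bucket_label(low) for low, _ in BUCKETS]
    return {
        label: (counts[label], counts[label] / total if total else 0.0)
        for label in labels
    }


# Made-up sample with lengths 6, 7, 12, 16, and 30.
sample = ["wei123", "wei1990", "zhangwei1990", "zhangwei19900315", "a" * 30]
dist = length_distribution(sample)
print(dist)
```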

8. Discussion

Strengths

CCUPP achieves SR@1000 = 45.5%, comparable to TarGuess-III (45.4%, CCS 2016), which requires training on leaked password corpora. Within 10,000 guesses, CCUPP finds 84% of targets. On the zh_full profile, its 48.2% PII embedding rate (vs. CUPP's 33.2%) indicates more targeted generation. CCUPP is 9x faster than CUPP and needs no training data, no GPU, and no dependencies beyond pip install.

Limitations

CCUPP's SR@100 = 1.0% trails the academic models by a wide margin (TarGuess-III 19.7%, PassLLM-I 31.6%). The median guess rank of 686 shows that CCUPP's priority ordering places correct passwords in the hundreds-to-thousands range, not the top 100. For online attack scenarios with strict guess budgets (N ≤ 100), trained probabilistic models therefore substantially outperform rule-based approaches. Improving priority ordering (e.g., frequency-weighted rules) is the key area for future work.
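As a rough illustration of what frequency-weighted ordering could look like, the sketch below sorts candidates by a weight attached to the rule that produced each one. It is not CCUPP's current behavior: `order_by_rule_weight`, the rule names, and the weights are all hypothetical, and in practice the weights would have to be estimated from how often each pattern occurs in real passwords.

```python
def order_by_rule_weight(candidates, rule_weights):
    """candidates: list of (password, rule_name) pairs.

    Returns the passwords sorted by descending rule weight, without duplicates.
    """
    seen = set()
    ordered = []
    # sorted() is stable, so candidates with equal weight keep their original order.
    for password, rule in sorted(candidates, key=lambda c: -rule_weights.get(c[1], 0.0)):
        if password not in seen:
            seen.add(password)
            ordered.append(password)
    return ordered


# Hypothetical rule weights (illustrative values, not measured frequencies).
weights = {"name+digits": 0.30, "name+year": 0.25, "name_only": 0.05}
generated = [
    ("zhangwei", "name_only"),
    ("wei1990", "name+year"),
    ("wei123", "name+digits"),
    ("wei1990", "name_only"),  # duplicate produced by a second rule
]
ranked = order_by_rule_weight(generated, weights)
print(ranked)
```

Under such a scheme, candidates from the most common patterns are tried first, which is the property that matters most for small guess budgets.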

Fair comparison caveat

The academic baselines (TarGuess, PassLLM, etc.) were evaluated on real leaked PII-password datasets (12306, Dodonew) with 100K+ records. Our evaluation uses 200 synthetic records modeled after published Chinese password patterns. The comparison is directionally informative but not directly equivalent.

Positioning

CCUPP occupies a unique niche as the only actively maintained, rule-based, Chinese-localized password profiling tool that requires zero training data and zero GPU resources. Its SR@1000 on our synthetic dataset is on par with the figure published for TarGuess-III, making it a practical alternative for penetration testers who cannot deploy ML infrastructure.

9. References

| # | Paper | Venue |
|---|---|---|
| 1 | Wang et al., "Targeted Online Password Guessing: An Underestimated Threat" | ACM CCS 2016 |
| 2 | Li et al., "A Large-Scale Empirical Analysis of Chinese Web Passwords" | USENIX Security 2014 |
| 3 | Wang & Zou, "Password Guessing Using Random Forest" | USENIX Security 2023 |
| 4 | Xiu & Wang, "PointerGuess: Targeted Password Guessing Using Pointer Mechanism" | USENIX Security 2024 |
| 5 | Zou & Wang, "Password Guessing Using Large Language Models" | USENIX Security 2025 |
| 6 | Yang & Wang, "RankGuess: Password Guessing Using Adversarial Ranking" | IEEE S&P 2025 |