Rough statistics on the leaked CSDN passwords

Tags: Python

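After the CSDN account and password leak, I got curious and wrote a Python script to analyze the more than 6 million leaked passwords.
It turns out that most users' passwords are 8 to 14 characters long, about 290,000 users have a password identical to their user name, and about 2.89 million passwords consist of digits only. The 10 most common passwords are: 123456789, 12345678, 11111111, dearbook, 00000000, 123123123, 1234567890, 88888888, 111111111 and 147258369.
Of these, 123456789 and 12345678 together account for about 448,000 users; and the odd password dearbook was somehow chosen by 46,053 people, which I can't explain...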
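
As a quick sanity check on the figures above, here is a small Python 3 sketch that turns the raw counts into percentages. The numbers are copied from the results listed at the end of this post; the helper name `share` is just something made up for this illustration.

```python
# Counts copied from the results listed at the end of this post.
total = 6428632
same_as_name = 292661
digits_only = 2893401
# Password-length counts for lengths 8 through 14, in that order.
len_8_to_14 = [2338638, 1552173, 930888, 628821, 369529, 167845, 154966]
# Users of 123456789 plus users of 12345678.
top_two = 235012 + 212749


def share(count, whole=total):
    """Return count as a percentage of whole, rounded to one decimal place."""
    return round(100.0 * count / whole, 1)


print('Length 8-14:      %.1f%%' % share(sum(len_8_to_14)))
print('Same as name:     %.1f%%' % share(same_as_name))
print('Digits only:      %.1f%%' % share(digits_only))
print('Top two together: %.1f%%' % share(top_two))
```

By this arithmetic, roughly 95.6% of passwords are 8 to 14 characters long, 4.6% equal the user name, 45.0% are digits only, and the top two passwords alone cover about 7.0% of all accounts.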

Finally, the source code:
# Python 2 script (print statements, xrange, dict.iteritems).
import heapq
import operator
import re

# Each record has three fields separated by ' # '; the first two are
# the user name and the password.
pattern = re.compile(r'(.+) # (.+) # .+')
total = 0
password_eq_to_name = 0
digit_passwords = 0
# Histograms indexed by length: names up to 20, passwords up to 40.
name_length = [0] * 21
password_length = [0] * 41
passwords = {}  # password -> number of occurrences

with open('www.csdn.net.sql') as f:
    for line in f:
        match = pattern.match(line)
        if match:
            total += 1
            name = match.group(1)
            password = match.group(2)

            if name == password:
                password_eq_to_name += 1

            if password.isdigit():
                digit_passwords += 1

            name_length[len(name)] += 1
            password_length[len(password)] += 1

            passwords[password] = passwords.get(password, 0) + 1

print 'Total lines:', total
print 'Password equal to name:', password_eq_to_name
print 'Digit passwords:', digit_passwords
print 'Name length:'
for i in xrange(21):
    if name_length[i]:
        print '\t%d: %d' % (i, name_length[i])
print 'Password length:'
for i in xrange(41):
    if password_length[i]:
        print '\t%d: %d' % (i, password_length[i])
# Pick the 10 most frequent passwords without sorting the whole dict.
top_passwords = heapq.nlargest(10, passwords.iteritems(), key=operator.itemgetter(1))
print 'Top 10 passwords:'
for password, count in top_passwords:
    print password, count
And the results:
Total lines: 6428632
Password equal to name: 292661
Digit passwords: 2893401
Name length:
    1: 8
    2: 297
    3: 3711
    4: 14527
    5: 277094
    6: 595904
    7: 739229
    8: 869899
    9: 903438
    10: 973000
    11: 709963
    12: 531144
    13: 304800
    14: 207898
    15: 125882
    16: 75838
    17: 36862
    18: 25163
    19: 13391
    20: 20584
Password length:
    1: 90
    2: 51
    3: 598
    4: 6675
    5: 33039
    6: 82999
    7: 16901
    8: 2338638
    9: 1552173
    10: 930888
    11: 628821
    12: 369529
    13: 167845
    14: 154966
    15: 75345
    16: 49653
    17: 7024
    18: 5937
    19: 2297
    20: 5080
    21: 4
    22: 13
    23: 6
    24: 11
    25: 5
    26: 13
    27: 1
    28: 4
    29: 7
    30: 5
    31: 1
    32: 2
    36: 2
    38: 2
    39: 1
    40: 6
Top 10 passwords:
123456789 235012
12345678 212749
11111111 76346
dearbook 46053
00000000 34952
123123123 19986
1234567890 17790
88888888 15033
111111111 6995
147258369 5965
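
For anyone who wants to repeat the tally today, here is a rough Python 3 sketch of the same counting logic built on `collections.Counter`. It is not the script that produced the numbers above: the function name `analyze` and the sample records are invented for this illustration, and because Python 3 strings count characters rather than bytes and `str.isdigit()` also accepts non-ASCII digits, the figures could differ slightly on the real file.

```python
import re
from collections import Counter

# Same record shape as the original script: three fields separated by ' # '.
LINE_RE = re.compile(r'(.+) # (.+) # .+')


def analyze(lines, top_n=10):
    """Tally the same statistics as the original script over an iterable of lines."""
    total = 0
    password_eq_to_name = 0
    digit_passwords = 0
    name_length = Counter()
    password_length = Counter()
    passwords = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that are not records
        name, password = match.group(1), match.group(2)
        total += 1
        if name == password:
            password_eq_to_name += 1
        if password.isdigit():
            digit_passwords += 1
        name_length[len(name)] += 1
        password_length[len(password)] += 1
        passwords[password] += 1
    return {
        'total': total,
        'password_eq_to_name': password_eq_to_name,
        'digit_passwords': digit_passwords,
        'name_length': dict(name_length),
        'password_length': dict(password_length),
        'top_passwords': passwords.most_common(top_n),
    }


# Made-up sample records, NOT taken from the leaked file.
sample = [
    'alice # 12345678 # alice@example.com',
    'bob # bob # bob@example.com',
    'carol # 12345678 # carol@example.com',
    'dave # hunter2x # dave@example.com',
    'this line is not a record',
]
result = analyze(sample)
print('Total lines:', result['total'])
print('Password equal to name:', result['password_eq_to_name'])
print('Digit passwords:', result['digit_passwords'])
print('Top passwords:', result['top_passwords'])
```

Using counters keyed by length instead of fixed-size lists means an unexpectedly long name or password cannot raise an `IndexError`, and `Counter.most_common` replaces the `heapq.nlargest` call. To run it on a real file, pass an open file object to `analyze` instead of the sample list.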
