Logging in to Sina Weibo with Python! CAPTCHAs? Anti-scraping? Not a problem here!


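Packet capture analysis

First, open Charles and record every HTTP request from launching the browser through a successful Sina Weibo login.

Open Sina Weibo, wait for the page to finish loading, enter the account name and password, and click the login button. Then stop the Charles capture, close the browser, and save the capture results.

Find the login POST request: https://login.sina.com.cn/sso/login.php?client=ssologin.js(v1.4.19)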

Figure: the login POST request

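In theory, submitting this form in full is all it takes to log in to Sina Weibo. In practice, copying the form verbatim and submitting it with a requests POST does not log you in, so some of its fields must be generated dynamically.

Because the Sina Weibo home page is large and cluttered, we open the URL of the login POST request found above, https://login.sina.com.cn/sso/login.php?client=ssologin.js(v1.4.19), directly in the browser and find that it is a standalone login page.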

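Figure: the login page

Open the developer tools with F12 and locate the login button. Given how the front end and back end interact, the back end presumably relies on some element of the page to tell that the user clicked the login button, so search the JS code in the Sources panel for type:submit.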

Figure: locating the button

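Figure: finding the login operation in the JS code

A first guess is that this JS file performs some encryption.

Converting the username

Following the naming convention, searching for username quickly turns up the code that converts the username: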

this.prelogin = function(config, callback) {
    var url = location.protocol == "https:" ? ssoPreLoginUrl.replace(/^http:/, "https:") : ssoPreLoginUrl;
    var username = config.username || "";
    username = sinaSSOEncoder.base64.encode(urlencode(username));
    delete config.username;
    var arrQuery = {
        entry: me.entry,
        callback: me.name + ".preloginCallBack",
        su: username,
        rsakt: "mod"
    };
    // (the excerpt ends here; the rest of the function is not shown)

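The line username = sinaSSOEncoder.base64.encode(urlencode(username)); shows that the username is URL-encoded and then Base64-encoded, and the key-value pairs show that su carries the encoded account name.

Implementation in Python: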

def get_username(self):
    # URL-encode the username first, then Base64-encode the result
    username_quote = urllib.parse.quote_plus(self.user_name)
    username_base64 = base64.b64encode(username_quote.encode("utf-8"))
    return username_base64.decode("utf-8")
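
As a quick standalone check of this encoding, here is a minimal sketch that uses only the standard library. The function names encode_su and decode_su and the sample account name are illustrative assumptions, not names taken from the Weibo script; decoding simply reverses the two steps.

```python
import base64
import urllib.parse


def encode_su(username: str) -> str:
    """URL-encode the username, then Base64-encode it (the value sent as "su")."""
    quoted = urllib.parse.quote_plus(username)
    return base64.b64encode(quoted.encode("utf-8")).decode("utf-8")


def decode_su(su: str) -> str:
    """Reverse encode_su: Base64-decode first, then URL-decode."""
    quoted = base64.b64decode(su).decode("utf-8")
    return urllib.parse.unquote_plus(quoted)


if __name__ == "__main__":
    sample = "user@example.com"  # hypothetical account name
    su = encode_su(sample)
    print(su)
    print(decode_su(su) == sample)
```

Being able to decode a captured su value this way is a convenient way to confirm that the request sent by the script matches what the browser sends.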

Converting the password

Searching for password next immediately turns up the key line, which names RSA outright:

password = RSAKey.encrypt([me.servertime, me.nonce].join("\t") + "\n" + password)

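RSA encryption needs a public key. Searching for Public finds where it is set: RSAKey.setPublic(me.rsaPubkey, "10001");. That is still not enough, so look for where me.rsaPubkey comes from: me.rsaPubkey = result.pubkey;. The key must therefore arrive in a response, so search the Charles capture for pubkey.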

Figure: the public key

This response contains several familiar values: servertime and nonce are both in it. Note down this useful URL:

https://login.sina.com.cn/sso/prelogin.php?entry=weibo&callback=sinaSSOController.preloginCallBack&su=&rsakt=mod&client=ssologin.js(v1.4.19)&_=1536460959875

sinaSSOController.preloginCallBack({
    "retcode": 0,
    "servertime": 1536460961,
    "pcid": "gz-1c8cc52b95dad5397635083e4ddcd33994aa",
    "nonce": "42KG80",
    "pubkey": "EB2A38568661887FA180BDDB5CABD5F21C7BFD59C090CB2D245A87AC253062882729293E5506350508E7F9AA3BB77F4333231490F915F6D63C55FE2F08A49B353F444AD3993CACC02DB784ABBB8E42A9B1BBFFFB38BE18D78E87A0E41B9B8F73A928EE0CCEE1F6739884B9777E4FE9E88A1BBE495927AC4A799B3181D6442443",
    "rsakv": "1330428213",
    "uid": "2239053435",
    "exectime": 6
})

This URL is fairly complex, with ? and & separators, so pass its query parameters through the params= argument of requests' get:

def get_json_data(self, su_value):
    params = {
        "entry": "weibo",
        "callback": "sinaSSOController.preloginCallBack",
        "rsakt": "mod",
        "checkpin": "1",
        "client": "ssologin.js(v1.4.18)",
        "su": su_value,
        "_": int(time.time()*1000),
    }
    try:
        response = self.session.get("http://login.sina.com.cn/sso/prelogin.php", params=params)
        # The named group "data" captures the JSON between the callback's parentheses
        json_data = json.loads(re.search(r"\((?P<data>.*)\)", response.text).group("data"))
    except Exception as excep:
        json_data = {}
        logging.error("WeiBoLogin get_json_data error: %s", excep)
    logging.debug("WeiBoLogin get_json_data: %s", json_data)
    return json_data
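
The prelogin response is JSONP: a JSON object wrapped in a callback call. The following is a minimal sketch of unwrapping it offline, assuming a shortened sample shaped like the response shown earlier. The helper name parse_jsonp and the SAMPLE constant are hypothetical, and re.DOTALL is added here only so that a response spread over several lines would also match.

```python
import json
import re

# Shortened sample shaped like the prelogin response shown earlier
SAMPLE = (
    'sinaSSOController.preloginCallBack({"retcode":0,"servertime":1536460961,'
    '"nonce":"42KG80","rsakv":"1330428213"})'
)


def parse_jsonp(text: str) -> dict:
    """Return the JSON object wrapped in callback(...), or {} if parsing fails."""
    # Greedy match from the first "(" to the last ")" in the text
    match = re.search(r"\((?P<data>.*)\)", text, re.DOTALL)
    if match is None:
        return {}
    try:
        return json.loads(match.group("data"))
    except json.JSONDecodeError:
        return {}


if __name__ == "__main__":
    data = parse_jsonp(SAMPLE)
    print(data["servertime"], data["nonce"])
```

Returning an empty dict on failure mirrors how get_json_data falls back to {} when the request or the parsing goes wrong.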


Based on password = RSAKey.encrypt([me.servertime, me.nonce].join("\t") + "\n" + password), write the RSA encryption in Python:

def get_password(self, servertime, nonce, pubkey):
    # Plaintext layout: servertime, TAB, nonce, newline, password
    string = (str(servertime) + "\t" + str(nonce) + "\n" + str(self.pass_word)).encode("utf-8")
    # Both the modulus (pubkey) and the exponent "10001" are hexadecimal strings
    public_key = rsa.PublicKey(int(pubkey, 16), int("10001", 16))
    password = rsa.encrypt(string, public_key)
    password = binascii.b2a_hex(password)
    return password.decode()
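
Two details of this step are easy to get wrong: the exact plaintext layout and the fact that both key parts are hexadecimal. The sketch below isolates them with the standard library only, so it performs no encryption itself (that still requires the third-party rsa package used above). The helper names build_rsa_plaintext and parse_rsa_key_parts are assumptions made for illustration.

```python
def build_rsa_plaintext(servertime, nonce, password: str) -> bytes:
    """Join servertime and nonce with a TAB, then append a newline and the password."""
    return f"{servertime}\t{nonce}\n{password}".encode("utf-8")


def parse_rsa_key_parts(pubkey_hex: str, exponent_hex: str = "10001"):
    """Convert the hexadecimal modulus and exponent into integers."""
    return int(pubkey_hex, 16), int(exponent_hex, 16)


if __name__ == "__main__":
    # Hypothetical password; servertime and nonce come from the sample response
    print(build_rsa_plaintext(1536460961, "42KG80", "secret"))
    # "10001" in hexadecimal is the common RSA public exponent 65537
    print(parse_rsa_key_parts("ff")[1])
```

Because servertime and nonce change with every prelogin call, the encrypted password differs on each login attempt, which is why replaying a captured form does not work.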

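All the dynamic fields of the form from the start of the article are now obtained dynamically. What remains is to submit the form by POST and then read user_uniqueid and user_nick, after which you can (with your own account) scrape data that requires login.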

# Excerpt from the login() method; post_data is built earlier in that method
login_url_1 = "http://login.sina.com.cn/sso/login.php?client=ssologin.js(v1.4.18)&_=%d" % int(time.time())
json_data_1 = self.session.post(login_url_1, data=post_data).json()
if json_data_1["retcode"] == "0":
    params = {
        "callback": "sinaSSOController.callbackLoginStatus",
        "client": "ssologin.js(v1.4.18)",
        "ticket": json_data_1["ticket"],
        "ssosavestate": int(time.time()),
        "_": int(time.time()*1000),
    }
    response = self.session.get("https://passport.weibo.com/wbsso/login", params=params)
    # The named group "result" captures the JSON between the callback's parentheses
    json_data_2 = json.loads(re.search(r"\((?P<result>.*)\)", response.text).group("result"))
    if json_data_2["result"] is True:
        self.user_uniqueid = json_data_2["userinfo"]["uniqueid"]
        self.user_nick = json_data_2["userinfo"]["displayname"]
        logging.warning("WeiBoLogin succeed: %s", json_data_2)
    else:
        logging.warning("WeiBoLogin failed: %s", json_data_2)
else:
    logging.warning("WeiBoLogin failed: %s", json_data_1)
return True if self.user_uniqueid and self.user_nick else False
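
The excerpt above posts a post_data dictionary that it does not define. As a minimal sketch, the form can be assembled by a small helper; the field names and fixed values are copied from the full script in this article, while the function name build_post_data and the sample values in the usage part are assumptions for illustration.

```python
def build_post_data(su: str, sp: str, prelogin: dict) -> dict:
    """Assemble the login form from the encoded username, encrypted password,
    and the values returned by the prelogin request."""
    return {
        "entry": "weibo",
        "gateway": "1",
        "from": "",
        "savestate": "7",
        "userticket": "1",
        "vsnf": "1",
        "service": "miniblog",
        "encoding": "UTF-8",
        "pwencode": "rsa2",
        "sr": "1280*800",
        "prelt": "529",
        "url": "http://weibo.com/ajaxlogin.php?framelogin=1"
               "&callback=parent.sinaSSOController.feedBackUrlCallBack",
        # Dynamic fields: these change on every prelogin call
        "rsakv": prelogin["rsakv"],
        "servertime": prelogin["servertime"],
        "nonce": prelogin["nonce"],
        "su": su,
        "sp": sp,
        "returntype": "TEXT",
    }


if __name__ == "__main__":
    # Hypothetical values standing in for real prelogin output
    prelogin = {"rsakv": "1330428213", "servertime": 1536460961, "nonce": "42KG80"}
    form = build_post_data("YWJj", "00ff", prelogin)
    print(form["su"], form["nonce"])
```

Keeping the static and dynamic fields visibly separate makes it easier to see which values must be refreshed before every login attempt.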
Figure: the login result

Full code from: https://github.com/xianhu/LearnPython/blob/master/python_wechat.py

# coding=utf-8
import re
import rsa
import time
import json
import base64
import logging
import binascii
import requests
import urllib.parse


class WeiBoLogin(object):
    """
    class of WeiBoLogin, to login weibo.com
    """

    def __init__(self):
        """
        constructor
        """
        self.user_name = None
        self.pass_word = None
        self.user_uniqueid = None
        self.user_nick = None
        # One session is reused so that cookies persist across all requests
        self.session = requests.Session()
        self.session.headers.update({"User-Agent": "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:41.0) Gecko/20100101 Firefox/41.0"})
        self.session.get("http://weibo.com/login.php")
        return

    def login(self, user_name, pass_word):
        """
        login weibo.com, return True or False
        """
        self.user_name = user_name
        self.pass_word = pass_word
        self.user_uniqueid = None
        self.user_nick = None
        # get json data
        s_user_name = self.get_username()
        json_data = self.get_json_data(su_value=s_user_name)
        if not json_data:
            return False
        s_pass_word = self.get_password(json_data["servertime"], json_data["nonce"], json_data["pubkey"])
        # make post_data
        post_data = {
            "entry": "weibo",
            "gateway": "1",
            "from": "",
            "savestate": "7",
            "userticket": "1",
            "vsnf": "1",
            "service": "miniblog",
            "encoding": "UTF-8",
            "pwencode": "rsa2",
            "sr": "1280*800",
            "prelt": "529",
            "url": "http://weibo.com/ajaxlogin.php?framelogin=1&callback=parent.sinaSSOController.feedBackUrlCallBack",
            "rsakv": json_data["rsakv"],
            "servertime": json_data["servertime"],
            "nonce": json_data["nonce"],
            "su": s_user_name,
            "sp": s_pass_word,
            "returntype": "TEXT",
        }
        # get captcha code (.get avoids a KeyError if "showpin" is absent from the response)
        if json_data.get("showpin") == 1:
            url = "http://login.sina.com.cn/cgi/pin.php?r=%d&s=0&p=%s" % (int(time.time()), json_data["pcid"])
            with open("captcha.jpeg", "wb") as file_out:
                file_out.write(self.session.get(url).content)
            # The captcha is solved by hand: open captcha.jpeg and type what it shows
            code = input("Enter the captcha: ")
            post_data["pcid"] = json_data["pcid"]
            post_data["door"] = code
        # login weibo.com
        login_url_1 = "http://login.sina.com.cn/sso/login.php?client=ssologin.js(v1.4.18)&_=%d" % int(time.time())
        json_data_1 = self.session.post(login_url_1, data=post_data).json()
        if json_data_1["retcode"] == "0":
            params = {
                "callback": "sinaSSOController.callbackLoginStatus",
                "client": "ssologin.js(v1.4.18)",
                "ticket": json_data_1["ticket"],
                "ssosavestate": int(time.time()),
                "_": int(time.time()*1000),
            }
            response = self.session.get("https://passport.weibo.com/wbsso/login", params=params)
            # The named group "result" captures the JSON between the callback's parentheses
            json_data_2 = json.loads(re.search(r"\((?P<result>.*)\)", response.text).group("result"))
            if json_data_2["result"] is True:
                self.user_uniqueid = json_data_2["userinfo"]["uniqueid"]
                self.user_nick = json_data_2["userinfo"]["displayname"]
                logging.warning("WeiBoLogin succeed: %s", json_data_2)
            else:
                logging.warning("WeiBoLogin failed: %s", json_data_2)
        else:
            logging.warning("WeiBoLogin failed: %s", json_data_1)
        return True if self.user_uniqueid and self.user_nick else False

    def get_username(self):
        """
        get legal username
        """
        # URL-encode the username first, then Base64-encode the result
        username_quote = urllib.parse.quote_plus(self.user_name)
        username_base64 = base64.b64encode(username_quote.encode("utf-8"))
        return username_base64.decode("utf-8")

    def get_json_data(self, su_value):
        """
        get the value of "servertime", "nonce", "pubkey", "rsakv" and "showpin", etc
        """
        params = {
            "entry": "weibo",
            "callback": "sinaSSOController.preloginCallBack",
            "rsakt": "mod",
            "checkpin": "1",
            "client": "ssologin.js(v1.4.18)",
            "su": su_value,
            "_": int(time.time()*1000),
        }
        try:
            response = self.session.get("http://login.sina.com.cn/sso/prelogin.php", params=params)
            # The named group "data" captures the JSON between the callback's parentheses
            json_data = json.loads(re.search(r"\((?P<data>.*)\)", response.text).group("data"))
        except Exception as excep:
            json_data = {}
            logging.error("WeiBoLogin get_json_data error: %s", excep)
        logging.debug("WeiBoLogin get_json_data: %s", json_data)
        return json_data

    def get_password(self, servertime, nonce, pubkey):
        """
        get legal password
        """
        # Plaintext layout: servertime, TAB, nonce, newline, password
        string = (str(servertime) + "\t" + str(nonce) + "\n" + str(self.pass_word)).encode("utf-8")
        # Both the modulus (pubkey) and the exponent "10001" are hexadecimal strings
        public_key = rsa.PublicKey(int(pubkey, 16), int("10001", 16))
        password = rsa.encrypt(string, public_key)
        password = binascii.b2a_hex(password)
        return password.decode()


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG, format="%(asctime)s\t%(levelname)s\t%(message)s")
    weibo = WeiBoLogin()
    weibo.login("username", "password")

