Contents
Getting the header (manual)
Moving to Postman (getting your header) (automatic?)
Code
Error notes (a few)
After finding an item you are interested in
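Reference:
"Magic Market search? Using Python to find the second-hand item you want at a reasonable price", Bilibili video (original title: 魔力赏市集搜索?使用python去以一个合适的价格搜到想要的手商品_哔哩哔哩_bilibili)
Written for my own reference; I won't be answering questions.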
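Improvements:
1. Added a filter list, exclude_words
2. Avoided printing the same entry more than once
3. Added a printed loop counter so you know the program is still running (display improved on 2023.12.20 so that the counter refreshes in place)
4. Simplified the output to print only id, name, and price
5. Improved exception handling, since running too many requests triggers Bilibili's security risk-control policy
6. Asked GPT to speed it up with Thread; not sure whether it actually helps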
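Improvements 1 and 2 boil down to one small check per item. The sketch below is a minimal illustration, not the exact code used later: the helper name should_print and the sample items are made up, and only the field names c2cItemsId, c2cItemsName, and showPrice come from the script.

```python
def should_print(item, seen_ids, exclude_words):
    """Return True only the first time an id is seen and only if the
    item name contains none of the excluded words."""
    item_id = item["c2cItemsId"]
    if item_id in seen_ids:
        return False  # already handled in an earlier loop
    seen_ids.add(item_id)
    name = item["c2cItemsName"]
    return not any(word in name for word in exclude_words)


# Made-up sample items, shaped like the fields the script reads.
items = [
    {"c2cItemsId": 1, "c2cItemsName": "Thea figure", "showPrice": "450"},
    {"c2cItemsId": 2, "c2cItemsName": "prize figure helmet", "showPrice": "500"},
    {"c2cItemsId": 1, "c2cItemsName": "Thea figure", "showPrice": "450"},
]
seen = set()
kept = [it for it in items if should_print(it, seen, ["helmet"])]
for it in kept:
    print(f"id: {it['c2cItemsId']}, name: {it['c2cItemsName']}, price: {it['showPrice']}")
```

Excluded items are still added to the seen set, so they are not re-checked on later loops.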
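Getting the header (manual)
Go to https://mall.bilibili.com/neul-next/index.html?page=magic-market_index
Open the browser's developer tools (F12) and select the "Network" tab. It will be empty at first; refresh the page to make the requests appear.
Select "Fetch/XHR" and right-click the request named list
Choose Copy as cURL (bash)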
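Moving to Postman (getting your header) (automatic?)
In My Workspace, click the plus sign
https://mall.bilibili.com/mall-magic-c/internet/c2c/v2/list
Paste the cURL (bash) command you just copied (it seems that pasting just the URL above also works), then click Send
Select Python - Requests and copy the code Postman generates; it shows you your own header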
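If you would rather skip Postman, the copied cURL (bash) text can probably be turned into a header dictionary directly. This is a rough sketch under assumptions: the function name headers_from_curl is made up, the sample command and its cookie value are placeholders, and only the -H/--header and -b/--cookie options are handled (bash $'...' quoting is not).

```python
import shlex


def headers_from_curl(curl_command):
    """Extract request headers from a 'Copy as cURL (bash)' string.

    Handles -H/--header and -b/--cookie only; everything else is ignored.
    """
    # Join the backslash-newline continuations before tokenizing.
    tokens = shlex.split(curl_command.replace("\\\n", " "))
    headers = {}
    it = iter(tokens)
    for token in it:
        if token in ("-H", "--header"):
            name, _, value = next(it).partition(":")
            headers[name.strip()] = value.strip()
        elif token in ("-b", "--cookie"):
            headers["cookie"] = next(it)
    return headers


# Placeholder command: the header and cookie values are not real.
sample = """curl 'https://mall.bilibili.com/mall-magic-c/internet/c2c/v2/list' \\
  -H 'content-type: application/json' \\
  -H 'user-agent: Mozilla/5.0' \\
  -b 'SESSDATA=xxx' \\
  --data-raw '{"nextId":null}'"""

headers = headers_from_curl(sample)
print(headers)
```

The resulting dictionary can be pasted into the headers placeholder in the script.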
Code
The code below cannot be run as-is; you have to fill in your own header
from time import sleep
import requests
import json
from concurrent.futures import ThreadPoolExecutor
import sys


def fetch_data(payload, headers):
    response = requests.post(url, headers=headers, data=payload)
    response.raise_for_status()
    return response.json()


def process_item(item):
    c2c_items_id = item["c2cItemsId"]
    c2c_items_name = item["c2cItemsName"]
    show_price = item["showPrice"]
    # Print each id only once, and skip names containing an excluded word
    if c2c_items_id not in seen_c2c_items_ids:
        seen_c2c_items_ids.add(c2c_items_id)
        if not any(exclude_word in c2c_items_name for exclude_word in exclude_words):
            print(f"\nid: {c2c_items_id}, name: {c2c_items_name}, price: {show_price}")


url = "https://mall.bilibili.com/mall-magic-c/internet/c2c/v2/list"
i_want = []  # never filled in this version, so the summary at the end reports nothing found
keyword = "Thea"  # not used in this version
# Unwanted keywords; kept in Chinese because they are matched against the listing names
exclude_words = ["景品", "早濑优香", "头盔", "英灵旅装", "二次再版", "拉芙塔莉雅", "索米", "天笠缀", "镜华",
                 "宇崎学妹想要玩", "宇崎酱想要玩耍", "炼狱杏寿郎", "噬血代码", "我推的孩子", "春野樱",
                 "战双帕弥什", "孤独摇滚", "后藤独", "嘉然", "路人女主的养成方法", "赵灵儿", "间谍教室",
                 "从零开始的异世界生活", "酒会观测者", "露西亚", "亚丝娜", "绯染天空", "游戏人生", "奥丁领域", ]
seen_c2c_items_ids = set()
count = 0
minutes = 0

headers = {
    # ... (your header)
}

# Set up the ThreadPoolExecutor
with ThreadPoolExecutor(max_workers=5) as executor:
    while True:
        payload = json.dumps({
            "priceFilters": ["40000-90000"],  # price filter: 400 to 900 yuan here; append two zeros to the real price
            "categoryFilter": "2312",  # category filter. figures: 2312, models: 2066, merchandise: 2331, 3C digital products: 2273
            # "discountFilters": ["50-70"],  # discount filter
            "sortType": "TIME_DESC",
            "nextId": None  # always None, so each request presumably fetches the first (newest) page again
        })
        try:
            future = executor.submit(fetch_data, payload, headers)
            response = future.result()  # Retrieve the result of the thread execution
            if response is not None:
                nextId = response["data"]["nextId"]
                items = response["data"]["data"]
                executor.map(process_item, items)
                if nextId is None:
                    break
        except TypeError as e:
            # response["data"] was None; wait briefly and try again
            if "'NoneType' object is not subscriptable" in str(e):
                sleep(2)
            else:
                print("New error")
                raise  # re-raise the exception
        except requests.exceptions.HTTPError as http_err:
            if http_err.response.status_code == 412:
                print("Error 412: the request was rejected because Bilibili's security risk-control policy was triggered.")
                if minutes % 1 == 0:
                    print(f"Slept for {minutes} minutes")
                sleep(30)
                minutes += 0.5
            else:
                raise  # other HTTP errors are not handled here
        except requests.exceptions.JSONDecodeError:
            print("Invalid input parameters; it looks like Bilibili's security risk-control policy was triggered")
            break
        count += 1
        # Overwrite the same line so the loop counter refreshes in place
        sys.stdout.write("\r---{:03d}---".format(count))
        sys.stdout.flush()

print("==============Fetch complete================")
if not i_want:
    print("Nothing found")
else:
    print("\nFound wanted items:")
    print(i_want)
    cheap = min(i_want, key=lambda x: x["price"])
    print("\nCheapest good find:")
    print(cheap)
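The priceFilters value is easy to get wrong because the real price needs two zeros appended. As a small sketch, the conversion and the request body could be wrapped like this; price_filter and build_payload are made-up helper names, and the assumption that the unit is a hundredth of a yuan is inferred only from the "append two zeros" rule.

```python
import json


def price_filter(min_yuan, max_yuan):
    """Turn a yuan range into the 'min-max' string used in priceFilters
    (assumed to be in hundredths of a yuan)."""
    return f"{round(min_yuan * 100)}-{round(max_yuan * 100)}"


def build_payload(min_yuan, max_yuan, category="2312", next_id=None):
    """Build the JSON body with the same fields as the script's payload."""
    return json.dumps({
        "priceFilters": [price_filter(min_yuan, max_yuan)],
        "categoryFilter": category,  # 2312 is the figures category in the script
        "sortType": "TIME_DESC",
        "nextId": next_id,
    })


print(price_filter(400, 900))
print(build_payload(400, 900))
```

Calling price_filter(400, 900) gives the same "40000-90000" string that is hard-coded in the script.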
Sample output (current strategy)
Sample output (old strategy)
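Since long runs tend to end in the 412 risk-control response, one thing to try is pausing at a fixed interval rather than only after an error. The sketch below is a guess at how that might look: the Pacer class is hypothetical, and the numbers (30 seconds after every 50 requests) are only a starting point, because the actual trigger threshold is unknown.

```python
import time


class Pacer:
    """Pause for pause_seconds after every `every` calls to tick()."""

    def __init__(self, every=50, pause_seconds=30, sleep=time.sleep):
        self.every = every
        self.pause_seconds = pause_seconds
        self.sleep = sleep  # injectable so the demo does not really wait
        self.count = 0

    def tick(self):
        """Call once per request; returns True if it paused."""
        self.count += 1
        if self.count % self.every == 0:
            self.sleep(self.pause_seconds)
            return True
        return False


# Demo with a fake sleep that just records the requested pauses.
pauses = []
pacer = Pacer(every=50, pause_seconds=30, sleep=pauses.append)
for _ in range(120):
    pacer.tick()
print(pauses)  # [30, 30]
```

In the real loop, pacer.tick() would be called once per request with the default time.sleep.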
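Error notes (a few)
Error 412: the request was rejected because Bilibili's security risk-control policy was triggered.
Solutions:
Use a VPN or proxy to change your IP (drawback: the connection may be unstable, which causes errors of its own)
Sleep more often, for example 30 seconds after every 50 requests? I don't know exactly what triggers the risk control
After finding an item you are interested in
Copy the id and substitute it for itemsId in this URL: https://mall.bilibili.com/neul-next/index.html?page=magic-market_detail&noTitleBar=1&itemsId=【your id】&from=market_index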
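The substitution can also be done in code. This is a minimal sketch: detail_url is a made-up helper name and the id 123456789 is a placeholder, while the query parameters are copied from the URL above.

```python
from urllib.parse import urlencode

DETAIL_BASE = "https://mall.bilibili.com/neul-next/index.html"


def detail_url(items_id):
    """Build the item detail page URL for a printed id."""
    query = urlencode({
        "page": "magic-market_detail",
        "noTitleBar": 1,
        "itemsId": items_id,
        "from": "market_index",
    })
    return f"{DETAIL_BASE}?{query}"


# Placeholder id, not a real listing.
print(detail_url(123456789))
```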