Finding potential Yuletide offers with Python
I’m doing the same thing for my /yuletide signup as last year, i.e. using the event as an excuse to find myself a new VN to buy and play over the course of November/December. Doing this last year got me a new blorbo, so it was successful on all fronts really.
Helpfully, all the VNs in the tagset are marked with “(Visual Novel)” or something similar, so I wrote a Python script that pulls them out of a downloaded copy of the tagset page, then queries the VNDB API based on my inclusion criteria.
I need the following libraries: re for trimming the canon names from AO3; requests for making the API calls; BeautifulSoup for parsing the tagset HTML:
import re
import requests
from bs4 import BeautifulSoup
I don’t want to have to play anything massively long as I’m on a deadline, so I’ll restrict it to VNs that are 45 hours or less.
maxhours = 45
First, extracting anything tagged as a VN from the tagset, then stripping each title down to just one of the piped alternatives where they exist:
with open("tagset.html","r") as tagset: tagsoup = BeautifulSoup(tagset,"html.parser") fandoms = tagsoup.find_all("li",class_="fandom") vns = [] for fandom in fandoms: title = fandom.find("h4") if "Visual Novel" in title.text: thetitle = re.sub(" \(.*Visual Novel.*\)\n.*\n.*","",re.sub(".* \| ","",title.text.strip())) vns.append(thetitle)
Then looking each of these up via the API and applying a few filters using the syntax it provides: I want VNs available in English, originally in Japanese, and available on one of the platforms I have convenient access to at the moment (this doesn’t include PC, so a lot get excluded, including apparently any BL VNs, unfortunately). After getting each result I exclude it if it’s over the 45-hour limit – this can’t be done when the initial search is performed, as there’s no granular filter for length. I include the match from the tagset as part of the returned result to flag any cases where the search string has returned a false positive.
final = []
for vn in vns:
    x = requests.post(
        "https://api.vndb.org/kana/vn",
        json={
            "filters": ["and",
                        ["lang", "=", "en"],
                        ["search", "=", vn],
                        ["olang", "=", "ja"],
                        ["or",
                         ["platform", "=", "ps4"],
                         ["platform", "=", "ps5"],
                         ["platform", "=", "swi"]]],
            "fields": "title, length_minutes, id",
        },
    )
    try:
        for result in x.json()["results"]:
            # length_minutes can be null for obscure titles, so check it exists
            # before comparing it to the cap
            if result["length_minutes"] and result["length_minutes"] < (maxhours * 60):
                final.append(result["title"] + ": https://vndb.org/" + result["id"] + " (" + vn + ")")
    except requests.exceptions.JSONDecodeError:
        pass
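If a result looks off, a one-off request for a single title makes it easy to eyeball what the API actually sends back (the search string here is just a placeholder – swap in something real from the tagset):

# One-off sanity check of the response shape; the search string is a placeholder.
check = requests.post(
    "https://api.vndb.org/kana/vn",
    json={"filters": ["search", "=", "some title from the tagset"],
          "fields": "title, length_minutes, id"},
)
print(check.status_code)
print(check.json())  # a dict with a "results" list, which is what the loop above unpacks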
Then removing duplicates and sorting the results:
# dict.fromkeys removes duplicates (keeping order), then sorted() alphabetises the list
final = sorted(dict.fromkeys(final))
for vn in final:
    print(vn)
Once I’d taken out the false positives, I was left with about 19 VNs, which I looked up on VNDB to find the ten that seemed most appealing. I’ll make those ten my Yuletide offer and then buy + play whichever ends up being my assignment.
