BeautifulSoup: Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER
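This warning means BeautifulSoup could not decode some bytes with the encoding it guessed. Use UnicodeDammit to decode the content yourself, passing a list of encodings to try, and hand the resulting Unicode string to BeautifulSoup. More in the docs: https://www.crummy.com/software/BeautifulSoup/bs4/doc/#unicode-dammit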
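from bs4 import BeautifulSoup, UnicodeDammit

# Decode `content` (raw bytes) using the candidate encodings, then parse the text.
# Note: "latin-1" and "iso-8859-1" name the same codec, and it accepts any byte,
# so entries listed after it are unlikely to be tried.
self.bs = BeautifulSoup(
    UnicodeDammit(
        content,
        ["latin-1", "iso-8859-1", "windows-1251"]
    ).unicode_markup,
    "html.parser")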
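
To see where the REPLACEMENT CHARACTER comes from and why the order of the encoding list matters, here is a minimal standard-library sketch. The helper `decode_with_candidates` is hypothetical and only a simplified stand-in for the idea of trying encodings in order; it is not how UnicodeDammit is actually implemented.

```python
from typing import List, Optional, Tuple


def decode_with_candidates(data: bytes, candidates: List[str]) -> Tuple[str, Optional[str]]:
    """Hypothetical helper: try each encoding strictly, in the order given."""
    for encoding in candidates:
        try:
            return data.decode(encoding), encoding
        except (UnicodeDecodeError, LookupError):
            continue
    # Nothing decoded cleanly: undecodable bytes become U+FFFD.
    return data.decode("utf-8", errors="replace"), None


# Sample bytes: Cyrillic text encoded as windows-1251.
raw = "Привет".encode("windows-1251")

# Decoding with the wrong codec and errors="replace" produces U+FFFD,
# which is the character the BeautifulSoup warning refers to.
lossy = raw.decode("utf-8", errors="replace")

# With the right codec in the list, the text is recovered.
text, used = decode_with_candidates(raw, ["utf-8", "windows-1251"])

# latin-1 maps every byte to a character, so it never fails and
# windows-1251 is never reached; the result is mojibake, not an error.
wrong_text, wrong_used = decode_with_candidates(raw, ["latin-1", "windows-1251"])

print(repr(lossy))
print(used, text)
print(wrong_used, wrong_text)
```

If your pages are mostly Cyrillic, it is probably worth listing `windows-1251` before `latin-1` in the UnicodeDammit call, since a codec that accepts every byte hides the ones after it.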