YURII DE.


BeautifulSoup: REPLACEMENT CHARACTER

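BeautifulSoup warns: "Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER." This means the raw bytes were decoded with the wrong encoding, and every byte that did not fit was swapped for U+FFFD.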
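
To see where the replacement character comes from, here is a minimal standard-library sketch (no BeautifulSoup involved). The sample string and the choice of windows-1251 are assumptions made for illustration only.

```python
# Cyrillic text encoded as windows-1251, as a legacy page might serve it.
raw = "Привет".encode("windows-1251")

# Decoding with the wrong codec and errors="replace" substitutes
# U+FFFD (REPLACEMENT CHARACTER) for each undecodable byte.
wrong = raw.decode("utf-8", errors="replace")

# Decoding with the codec the bytes were written in recovers the text.
right = raw.decode("windows-1251")

print(repr(wrong))
print(right)
```

The fix is therefore to find the right encoding before parsing, rather than to clean up the output afterwards.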

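Use UnicodeDammit to decode the raw bytes yourself before parsing, passing it the encodings the page is likely to use. More in the docs: https://www.crummy.com/software/BeautifulSoup/bs4/doc/#unicode-dammit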

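from bs4 import BeautifulSoup, UnicodeDammit

# content: the raw bytes of the page, not an already decoded str.
# The list holds candidate encodings. Note that "latin-1" and
# "iso-8859-1" name the same codec, which accepts any byte sequence.
self.bs = BeautifulSoup(
    UnicodeDammit(
        content,
        ["latin-1", "iso-8859-1", "windows-1251"]
    ).unicode_markup,
    "html.parser")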
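
The order of the candidate encodings matters. The sketch below is a standard-library illustration of trying codecs in order with strict decoding. It is not UnicodeDammit's actual implementation, and `first_strict_decode` is a hypothetical helper named here for the example. It assumes the candidates are tried from first to last, which is how I understand the list to be used.

```python
def first_strict_decode(raw, encodings):
    """Hypothetical helper: return (text, encoding) for the first codec
    that decodes raw without errors."""
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    raise ValueError("no candidate encoding decoded the input")


# Sample bytes, assumed for illustration: Cyrillic text in windows-1251.
raw = "Привет".encode("windows-1251")

# latin-1 maps every byte to some character, so it never fails
# and windows-1251 is never reached. The result is mojibake.
text_a, enc_a = first_strict_decode(raw, ["latin-1", "windows-1251"])

# With windows-1251 first, the Cyrillic text is recovered.
text_b, enc_b = first_strict_decode(raw, ["windows-1251", "latin-1"])

print(enc_a, repr(text_a))
print(enc_b, text_b)
```

If your pages are Cyrillic rather than Western European, it is probably worth putting "windows-1251" ahead of "latin-1" in the list.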