
WangLiwen


JavaScript Magic Tricks: HTML Encryption

Goal

The goal of this article is to encrypt HTML source code so that the page still renders and works normally in the browser, while anyone who views the source sees only ciphertext. This approach has an obvious flaw: it only protects against "View Source". Browser developer tools include an element inspector that shows the parsed HTML, and anything the browser can parse can be read there. An additional protection measure is therefore needed to hide the links.

Principle

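This is achieved with JavaScript. First, the original HTML is encoded with escape and the result is saved as the ciphertext that ships in the page. When the page loads, the ciphertext is decoded with unescape and written into the document with document.write, so the page displays exactly as it did before encryption. Note that escape and unescape are deprecated legacy functions and the encoding is trivially reversible, so this step obscures the source rather than securing it.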
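The encode and decode round trip can be tried on its own in Node or a browser console. The following is only a minimal sketch: the `sample` string is a hypothetical snippet chosen for illustration, not the page used later in this article.

```javascript
// Minimal round-trip sketch. `sample` is a hypothetical snippet for illustration.
const sample = '<a href="http://www.jshaman.com/">link</a>';

// escape() percent-encodes characters such as <, >, ", = and spaces,
// so the result no longer reads as HTML markup.
const encoded = escape(sample);

// unescape() reverses the encoding and returns the original text.
const decoded = unescape(encoded);

console.log(encoded);
console.log(decoded === sample);
```

Because unescape is available to every visitor, the same two lines also show how easily the ciphertext can be turned back into HTML.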
Then all links on the page are collected, each href value is saved in memory, and the attribute itself is cleared. This step matters because it hides the links after the page has rendered, so the link addresses do not appear even in the browser's developer tools. When a link receives focus, its href is restored, so the link can still be opened.
Combining the two steps gives both encrypted HTML source and hidden links.
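The link-hiding step can be sketched separately from the rest of the program. This is a variation for illustration, not the article's exact code: the function name `hideLinks` is hypothetical, and it keeps the saved URLs in a WeakMap instead of a numbered attribute on each link.

```javascript
// Sketch of the link-hiding idea. `hideLinks` is a hypothetical name, and the
// WeakMap storage is an assumption that differs from the decode_id attribute
// used in the article's source.
function hideLinks(links) {
  // Original URLs live only in memory, keyed by the link element itself.
  const saved = new WeakMap();

  for (const link of links) {
    saved.set(link, link.getAttribute("href"));

    // Clear the attribute so the URL is not visible in the rendered DOM.
    link.setAttribute("href", "");

    // Put the real URL back when the link receives focus,
    // which happens before the link is followed.
    link.addEventListener("focus", () => {
      link.setAttribute("href", saved.get(link));
    });
  }
  return saved;
}

// In a browser, after the page content has been written:
// hideLinks(document.getElementsByTagName("a"));
```

Declaring `link` with `const` in the loop gives each handler its own binding, which avoids the shared-variable pitfall of `var` inside a `for` loop.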

Source

(function(){
    //HTML source
    var html_source = `
    <html>
        <head>
            <title>HTML encryption</title>
        </head>
        <body>
            <h2>DEMO</h2>
            <p>HTML encryption test</p>
            <a href="http://www.jshaman.com/">JS obfuscation and encryption</a>
        </body>
    </html>
    `;

    //Encrypt the HTML
    var encode_html_source = "<!DOCTYPE html>\n" ;
    encode_html_source += "<script>" + "document.write(unescape(\"" + escape(html_source) + "\"));\n" + "</script>";

    //Encrypt the links
    var link_encode_code = `
    function link_encode(){
        //Array shared by the handlers below; stores the original href of every link
        var pre_href=[];

        //Clear each link's href so that crawlers cannot read the links
        var link = document.getElementsByTagName("a");
        for(var i=0; i<link.length; i++){

            console.log(i, link[i].href,"link encrypted")
            pre_href[i] = link[i].href;
            link[i].href= ""

            //Add a new attribute holding the index, used later to restore the href
            link[i].setAttribute("decode_id", i);

            //Restore the href when the link receives focus (by click or keyboard).
            //addEventListener adds to, rather than replaces, any existing onfocus
            //handler, so that handler still runs and needs no manual call here.
            link[i].addEventListener("focus",function(){
                restore_href(this)
            });
        }

        //Restore the href so the link can be opened
        function restore_href(t){
            t.href = pre_href[t.getAttribute("decode_id")];
        }

    }
    link_encode();
    `;

    encode_html_source += "<script>" + link_encode_code + "</script></html>";
    console.log(encode_html_source);
})()

Execute

Image description

As shown in the image above, running this code in Node prints the encrypted version of the HTML embedded in the program. Save that output as an HTML file and open it in a browser; the result looks like this:

Image description

First, the encrypted HTML works normally. Second, the output also shows that the links have been encrypted. Now inspect the link:

Image description

The image above shows that the link's href is empty. Even so, the link can still be opened, because the focus handler in the source above restores the href when the link receives focus.
In the original HTML file, by contrast, the href has a value, as the following image shows:

Image description

The link has now been hidden and encrypted. One drawback remains: the code that encrypts the links is itself shipped in the page in plain view, so others can read it and understand how it works, as the following image shows:

Image description

To address this, the JS code can be obfuscated with a tool such as JShaman or JScrambler, producing the code shown in the following image. The HTML can then be encrypted as before. The resulting output is much harder to read and reverse-engineer.

Image description

Note: the obfuscation is applied to the link-encryption JS inside this program's source (the link_encode_code string), not to the JS in the generated encrypted HTML.

Summary

This technique is presented as HTML source encryption, but its more useful feature is link encryption. It can stop link-based crawlers from extracting links from a page, which prevents automated tools from following links level by level and crawling the entire site.
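To make the crawler claim concrete, the following sketch compares what a very naive link extractor finds in plain HTML and in the encoded form. The `extractHrefs` function and its regular expression are hypothetical stand-ins for a simple text-based crawler; real crawlers vary, and one that executes JavaScript would see the rendered page instead, which is the case the href-clearing step is meant to cover.

```javascript
// Hypothetical text-based link extractor, standing in for a simple crawler
// that scans the raw page source without executing JavaScript.
function extractHrefs(htmlText) {
  const hrefs = [];
  const pattern = /href\s*=\s*"([^"]*)"/gi;
  let match;
  while ((match = pattern.exec(htmlText)) !== null) {
    hrefs.push(match[1]);
  }
  return hrefs;
}

// Plain page versus the same page wrapped the way the article's program does it.
const page = '<a href="http://www.jshaman.com/">link</a>';
const encodedPage =
  '<script>document.write(unescape("' + escape(page) + '"));</script>';

console.log(extractHrefs(page));        // the link is found in the plain source
console.log(extractHrefs(encodedPage)); // nothing matches in the encoded source
```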
