DEV Community

Ryan Xu


Building a Smart Editor: Automatically Detect URLs and Convert Them to Hyperlinks

This is an idea I came up with at work to improve the user experience: a text box that automatically detects URLs and converts them into hyperlinks as the user types (source code: Github/AutolinkEditor). The feature is somewhat tricky to implement, and the following issues must be addressed.

  • Accurately detect URLs within the text
  • Maintain the cursor position after converting the URL string into a hyperlink
  • Update the target URL accordingly when users edit the hyperlink text
  • Preserve line breaks in the text
  • Support pasting rich text while retaining both text and line breaks, with the text style matching the format of the text box.

...
 if(target && target.contentEditable){
  ...
  target.contentEditable = true;
  target.focus();
 }
...

The conversion is driven by “onkeyup” and “onpaste” events. To reduce the frequency of conversions, a delay mechanism is implemented with “setTimeout”, where the conversion logic is triggered only after the user stops typing for 1 second by default.

idle(func, delay = 1000) {
      ...
      const idleHandler = function(...args) {
        if(this[timer]){
          clearTimeout(this[timer]);
          this[timer] = null;
        }
        this[timer] = setTimeout(() => {
          func(...args);
          this[timer] = null;
        }, delay);

      };
      return idleHandler.bind(this);
    }
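
For readers who want to try the delay mechanism in isolation, here is a minimal, self-contained sketch of the same idea. It is not the editor's actual idle helper: the name `debounce` and the closure-held `timer` variable are assumptions made for illustration, replacing the `this[timer]` bookkeeping above.

```javascript
// Hypothetical stand-alone variant of idle(): run func only after
// calls have stopped for `delay` milliseconds.
function debounce(func, delay = 1000) {
  let timer = null; // pending timeout, kept in the closure
  return function debounced(...args) {
    if (timer !== null) {
      clearTimeout(timer); // a newer call cancels the pending one
    }
    timer = setTimeout(() => {
      timer = null;
      func.apply(this, args);
    }, delay);
  };
}

// Usage: three quick calls result in a single conversion with the last text.
const converted = [];
const convert = debounce(text => converted.push(text), 20);
convert("h");
convert("http");
convert("http://example.com");
```

Nothing runs synchronously; once the 20 ms delay has passed without another call, only the last text is handed to the callback.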

Identify and extract URLs with regular expression

I didn’t intend to spend time crafting the perfect regex for matching URLs, so I found a usable one via a search engine. If anyone has a better one, feel free to let me know!

...
const URLRegex = /^(https?:\/\/(([a-zA-Z0-9]+-?)+[a-zA-Z0-9]+\.)+(([a-zA-Z0-9]+-?)+[a-zA-Z0-9]+))(:\d+)?(\/.*)?(\?.*)?(#.*)?$/;
const URLInTextRegex = /(https?:\/\/(([a-zA-Z0-9]+-?)+[a-zA-Z0-9]+\.)+(([a-zA-Z0-9]+-?)+[a-zA-Z0-9]+))(:\d+)?(\/.*)?(\?.*)?(#.*)?/;
...

if(URLRegex.test(text)){
  result += `<a href="${escapeHtml(text)}">${escapeHtml(text)}</a>`;
}else {
  // text contains url
  let textContent = text;
  let match;
  while ((match = URLInTextRegex.exec(textContent)) !== null) {
    const url = match[0];
    const beforeUrl = textContent.slice(0, match.index);
    const afterUrl = textContent.slice(match.index + url.length);

    result += escapeHtml(beforeUrl);
    result += `<a href="${escapeHtml(url)}">${escapeHtml(url)}</a>`;
    textContent = afterUrl;
  }
  result += escapeHtml(textContent); // Append any remaining text
}
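
One thing to watch for with the patterns above: `(\/.*)?` is greedy and `.` also matches spaces, so when a URL with a path is followed by more text on the same line, the rest of the line can end up inside the link. The sketch below shows one possible alternative that stops a URL at whitespace. It is not the article's method: `looseUrlInText`, `linkify`, and this `escapeHtml` implementation are assumptions for illustration (the editor's own `escapeHtml` is not shown here).

```javascript
// Hypothetical escapeHtml; the editor's real helper may differ.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Deliberately loose pattern: a scheme followed by anything that is not
// whitespace, an angle bracket, or a quote.
const looseUrlInText = /https?:\/\/[^\s<>"']+/g;

// Escape the plain text and wrap every matched URL in an anchor.
function linkify(text) {
  let result = "";
  let lastIndex = 0;
  for (const match of text.matchAll(looseUrlInText)) {
    const url = match[0];
    result += escapeHtml(text.slice(lastIndex, match.index));
    result += `<a href="${escapeHtml(url)}">${escapeHtml(url)}</a>`;
    lastIndex = match.index + url.length;
  }
  return result + escapeHtml(text.slice(lastIndex)); // remaining text
}

const linked = linkify("Docs: http://example.com/a?b=1&c=2 (see also <here>)");
```

The trade-off is that this pattern does not validate the host name at all, and punctuation directly after a URL (such as a closing period) is still treated as part of it.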

Restoring the cursor position after conversion

Using the document.createRange and window.getSelection functions, the editor calculates the cursor position as a character offset within the node’s text. Since converting URLs into hyperlinks only adds tags without modifying the text content, the cursor can be restored from the previously recorded offset.

saveSelection: target => {
      ...
          const range = window.getSelection().getRangeAt(0);
          var preSelectionRange = range.cloneRange();
          preSelectionRange.selectNodeContents(target);
          preSelectionRange.setEnd(range.startContainer, range.startOffset);
          var start = preSelectionRange.toString().length;

          return {
              start: start,
              end: start + range.toString().length
          }
        ...
    }
...
restoreSelection: (target, position) => {
     ...
        const range = document.createRange();
        range.setStart(target, 0);
        range.collapse(true);

        let foundStart = false;
        let stop = false;
        let node;
        let charIndex = 0;
        const nodeStack = [target];
        while (!stop && (node = nodeStack.pop())) {
          switch (node.nodeType) {
            case 1: // element
              for (let i = node.childNodes.length - 1; i >= 0; i--) {
                nodeStack.push(node.childNodes[i]);
              }
              break;
            case 3: // text
              const nextCharIndex = charIndex + node.length;
              if (!foundStart && position.start >= charIndex && position.start <= nextCharIndex) {
                range.setStart(node, position.start - charIndex);
                foundStart = true;
              }
              if (foundStart && position.end >= charIndex && position.end <= nextCharIndex) {
                range.setEnd(node, position.end - charIndex);
                stop = true;
              }
              charIndex = nextCharIndex;
              break;
          }
        }

        if(foundStart) {
          const selection = window.getSelection();
          selection.removeAllRanges();
          selection.addRange(range);
        }
  ...
    }
...
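
Why the saved offsets survive the conversion can be seen without a browser. The sketch below uses plain objects as stand-ins for DOM nodes (the `text` and `element` helpers and the `locateOffset` name are hypothetical) and applies the same depth-first walk as restoreSelection to map a flat character offset to a text node and a local offset.

```javascript
// Hypothetical stand-ins for DOM nodes, so the walk can run outside a browser.
const text = data => ({ nodeType: 3, data, length: data.length });
const element = (...childNodes) => ({ nodeType: 1, childNodes });

// Map a flat character offset to { node, offset } using the same
// depth-first, left-to-right traversal as restoreSelection.
function locateOffset(root, offset) {
  let charIndex = 0;
  const nodeStack = [root];
  let node;
  while ((node = nodeStack.pop())) {
    if (node.nodeType === 1) {
      // Push children in reverse so the leftmost child is popped first.
      for (let i = node.childNodes.length - 1; i >= 0; i--) {
        nodeStack.push(node.childNodes[i]);
      }
    } else if (node.nodeType === 3) {
      const nextCharIndex = charIndex + node.length;
      if (offset <= nextCharIndex) {
        return { node, offset: offset - charIndex };
      }
      charIndex = nextCharIndex;
    }
  }
  return null; // offset lies beyond the end of the text
}

// The same text before and after the URL is wrapped in an anchor element.
const before = element(text("see http://a.io now"));
const after = element(text("see "), element(text("http://a.io")), text(" now"));

const inPlainText = locateOffset(before, 8);
const inAnchor = locateOffset(after, 8);
```

Offset 8 sits right after "see http" in both trees; only the node that holds it changes, which is why recording a plain character offset is enough.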

For more details, please read Can’t restore selection after HTML modify, even if it’s the same HTML.

Update or remove when editing hyperlink

Sometimes we create hyperlinks where the text and the target URL are the same (called ‘simple hyperlinks’ here). For example, the following HTML shows this kind of hyperlink.

<a href="http://www.example.com">http://www.example.com</a>
For such links, when the hyperlink text is modified, the target URL should also be automatically updated to keep them in sync. To make the logic more robust, the link will be converted back to plain text when the hyperlink text is no longer a valid URL.

handleAnchor: anchor => {
  ...
    const text = anchor.textContent;
    if(URLRegex.test(text)){
      return nodeHandler.makePlainAnchor(anchor);
    }else {
      return anchor.textContent;
    }
  ...
}
...
makePlainAnchor: target => {
  ...
  const result = document.createElement("a");
  result.href = target.href;
  result.textContent = target.textContent;
  return result;
  ...
}
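
The decision made for an edited simple hyperlink can be summarized as a small pure function. This is a sketch rather than the editor's code: `resolveEditedLink` and its return shape are hypothetical, and `URLRegex` is copied from the regex section earlier in the article.

```javascript
// Copied from the article's regex section.
const URLRegex = /^(https?:\/\/(([a-zA-Z0-9]+-?)+[a-zA-Z0-9]+\.)+(([a-zA-Z0-9]+-?)+[a-zA-Z0-9]+))(:\d+)?(\/.*)?(\?.*)?(#.*)?$/;

// Hypothetical helper: decide what an edited simple hyperlink should become.
function resolveEditedLink(editedText) {
  if (URLRegex.test(editedText)) {
    // Still a URL: keep the anchor and point href at the new text.
    return { kind: "anchor", href: editedText, text: editedText };
  }
  // No longer a URL: fall back to plain text.
  return { kind: "text", text: editedText };
}

const stillLink = resolveEditedLink("http://www.example.org/docs");
const nowText = resolveEditedLink("www.example.org");
```

Here `stillLink` keeps the anchor with its href following the text, while `nowText` becomes plain text because the scheme was deleted.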

To implement this feature, I store the ‘simple hyperlinks’ in an object and update them in real-time during the onpaste, onkeyup, and onfocus events to ensure that the above logic only handles simple hyperlinks.

target.onpaste = initializer.idle(e => {
  ...
  inclusion = contentConvertor.indexAnchors(target);
}, 0);

const handleKeyup = initializer.idle(e => {
  ...
  inclusion = contentConvertor.indexAnchors(target);
  ...
}, 1000);

target.onkeyup = handleKeyup;
target.onfocus = e => {
  inclusion = contentConvertor.indexAnchors(target);
}

...

indexAnchors(target) {
  const inclusion = {};
  ...
  const anchorTags = target.querySelectorAll('a');
  if(anchorTags) {
    const idPrefix = target.id === "" ? target.dataset.id : target.id;

    anchorTags.forEach((anchor, index) => {
      const anchorId = anchor.dataset.id ?? `${idPrefix}-anchor-${index}`;
      if(anchor.href.replace(/\/+$/, '').toLowerCase() === anchor.textContent.toLowerCase()) {
        if(!anchor.dataset.id){
          anchor.setAttribute('data-id', anchorId);
        }
        inclusion[anchorId] = anchor.href;
      }
    });
  }
  return Object.keys(inclusion).length === 0 ? null : inclusion;
  ...
}
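
The comparison in indexAnchors strips trailing slashes and lowercases both sides because the browser normalizes anchor.href (for example, it appends a trailing slash to a bare origin), so the href rarely equals the visible text character for character. The sketch below reproduces that check as a stand-alone function. `isSimpleHyperlink` is a hypothetical name, and the standard `URL` class is used here as a stand-in for the normalization an anchor element performs on an absolute URL.

```javascript
// Hypothetical helper mirroring the comparison used in indexAnchors.
function isSimpleHyperlink(href, text) {
  return href.replace(/\/+$/, "").toLowerCase() === text.toLowerCase();
}

// What the user typed versus what a normalized href looks like.
const typed = "http://www.Example.com";
const normalized = new URL(typed).href; // host lowercased, trailing slash added

const simple = isSimpleHyperlink(normalized, typed);
const labelled = isSimpleHyperlink("http://www.example.com/docs", "Read the docs");
```

A link whose text is a label rather than its own URL (`labelled`) is not indexed, so editing its text never touches its target.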

Handle line breaks and styles

When handling pasted rich text, the editor automatically restyles the text with its own text styles. To maintain formatting, <br> tags in the rich text and all hyperlinks are preserved. Handling typed input is more complex: when the user presses Enter to add a new line, a div element is added to the editor, which the editor replaces with a <br> to maintain the formatting.

node.childNodes.forEach(child => {
  if (child.nodeType === 1) { 
    if(child.tagName === 'A') { // anchor element
      const key = child.id === "" ? child.dataset.id : child.id;

      if(inclusion && inclusion[key]){
        const disposedAnchor = handleAnchor(child);
        if(disposedAnchor){
          if(disposedAnchor instanceof HTMLAnchorElement) {
            disposedAnchor.href = disposedAnchor.textContent;
          }
          result += disposedAnchor.outerHTML ?? disposedAnchor;
        }
      }else {
        result += makePlainAnchor(child)?.outerHTML ?? "";
      }
    }else { 
      result += compensateBR(child) + this.extractTextAndAnchor(child, inclusion, nodeHandler);
    }
  } 
});

...
const ElementsOfBR = new Set([
  'block',
  'block flex',
  'block flow',
  'block flow-root',
  'block grid',
  'list-item',
]);
compensateBR: target => {
  if(target && 
    (target instanceof HTMLBRElement || ElementsOfBR.has(window.getComputedStyle(target).display))){
      return "<br />";
  }
  return "";
}
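
How block-level wrappers turn into <br /> can be illustrated with a small, browser-free sketch. Everything in it is hypothetical: plain objects stand in for DOM nodes, the `display` field stands in for window.getComputedStyle(node).display, and anchors are left out to keep the example short.

```javascript
// Hypothetical stand-ins for DOM nodes; `display` replaces getComputedStyle().
const text = data => ({ nodeType: 3, data });
const element = (tagName, display, ...childNodes) =>
  ({ nodeType: 1, tagName, display, childNodes });

// Subset of the display values listed in ElementsOfBR.
const blockDisplays = new Set(["block", "list-item"]);

const escapeHtml = s =>
  s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");

// Reduce a node tree to escaped text plus <br /> markers.
function flatten(node) {
  if (node.nodeType === 3) return escapeHtml(node.data);
  if (node.tagName === "BR") return "<br />";
  const inner = node.childNodes.map(flatten).join("");
  // A block-level element (such as the div added on Enter) starts a new line.
  return (blockDisplays.has(node.display) ? "<br />" : "") + inner;
}

const editor = element("SPAN", "inline",
  text("first line"),
  element("DIV", "block", text("second "), element("B", "inline", text("line"))),
  element("BR", "inline"));

const html = flatten(editor);
```

In this sketch the bold wrapper contributes only its text, which mirrors how pasted styling is dropped while line breaks are kept.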

Conclusions

This article describes some practical techniques used to implement a simple editor: handling common events such as onkeyup and onpaste, using Selection and Range to restore the cursor position, and walking the nodes of an element to rebuild its content. Regular expressions are not the focus of this article, but a more complete regex would make the editor more robust at identifying URLs (the regex used here remains open for modification). If it is helpful for your project, you can find more details in the source code at Github/AutolinkEditor.
