Discussion on: Pwned Together: Hacking dev.to

View post

@ben , isn't there an URI class or something like that in Ruby? I think it should handle parsing links much better than custom regex.

rhymes • Aug 31 '18

Probably you can slightly simplify the code using URI::regexp, is this what you mean?

Sergey Kislyakov • Aug 31 '18

Kinda. I think Ruby provides something like:

# pseudocode
link = URI.parse "http://evil.com/#gist.github.com/whatever"
link.host # evil.com

Andy Zhao (he/him) • Aug 31 '18 • Edited

This would work for preventing non-gist-github.com hosts, but I think we strayed away from this because it wouldn't prevent Bypass 2, where JS is injected via a raw gist link.

We could do something like this:

URI.parse(link).host == "gist.github.com" &&
  (link =~ /^https\:\/\/gist\.github\.com\/([a-zA-Z0-9\-]){1,39}\/([a-zA-Z0-9]){32}\s/)
    &.zero?

I think with the regex though it would be redundant to check the host.

Josh Cheek • Sep 3 '18 • Edited

How about something like this?

require 'uri'

def valid?(link)
  uri = URI.parse link
  return false if uri.scheme != 'https'
  return false if uri.userinfo
  return false if uri.host != 'gist.github.com'
  return false if uri.port != 443 # I think it has to be this if its https?
  return false if uri.fragment
  return false if uri.query

  # idk if old gist ids could be arbitrary chars,
  # but the 10 or so I looked at all seemed to be hex
  path, gist = File.split uri.path
  return false unless gist && gist.match?(/\A[0-9a-f]{32}\z/)

  path, user = File.split path
  return false unless user && user.match?(/\A[a-zA-Z0-9\-]{1,39}\z/)

  path == '/'
end

Plus, a bunch of test cases:

valid = [
  # suggested one
  "https://gist.github.com/QuincyLarson/4bb1682ce590dc42402b2edddbca7aaa",
  # max name size (based on the existing regex, I didn't verify that it matches Github's rules)
  "https://gist.github.com/aaaaaaaaaabbbbbbbbbbccccccccccddddddddd/4bb1682ce590dc42402b2edddbca7aaa",
  # min name size
  "https://gist.github.com/a/4bb1682ce590dc42402b2edddbca7aaa",
  # each of the gist id char values (seems to be hex)
  "https://gist.github.com/a/0123456789abcdef0123456789abcdef",
  # user can have a dash in their name
  "https://gist.github.com/quincy-larson/4bb1682ce590dc42402b2edddbca7aaa",
  # just for contrast with some of the tests below
  "https://gist.github.com/a/0000000000111111111122222222223a",
]

invalid = [
  # the same as the valid one, except it's http. I actually thought browsers
  # would refuse to make http requests from https sites, but either way, it
  # should be disallowed as it's MITMable
  "http://gist.github.com/QuincyLarson/4bb1682ce590dc42402b2edddbca7aaa",
  # host
  "https://evil.com/QuincyLarson/4bb1682ce590dc42402b2edddbca7aaa",
  # auth credentials
  "https://user:pass@gist.github.com/QuincyLarson/4bb1682ce590dc42402b2edddbca7aaa",
  # http port (not totally sure this matters)
  "https://gist.github.com:80/QuincyLarson/4bb1682ce590dc42402b2edddbca7aaa",
  # query
  "https://gist.github.com/QuincyLarson/4bb1682ce590dc42402b2edddbca7aaa?q",
  # fragment
  "https://gist.github.com/QuincyLarson/4bb1682ce590dc42402b2edddbca7aaa#f",
  # this one exceeds the max name size
  "https://gist.github.com/aaaaaaaaaabbbbbbbbbbccccccccccddddddddde/4bb1682ce590dc42402b2edddbca7aaa",
  # less than min name size
  "https://gist.github.com//4bb1682ce590dc42402b2edddbca7aaa",
  # invalid gist id chars (outside the hex range)
  "https://gist.github.com/a/0000000000111111111122222222223x",
  # quick test showed that GH was case sensitive about the hex in the gist id
  "https://gist.github.com/a/0000000000111111111122222222223A",
  # too few path segments
  "https://gist.github.com",
  "https://gist.github.com/",
  "https://gist.github.com/a",
  # too many path segments
  "https://gist.github.com/QuincyLarson/4bb1682ce590dc42402b2edddbca7aaa/4bb1682ce590dc42402b2edddbca7aaa",
  # these ones were used to bypass validity in the blog
  "//evil.com/script#gist.github.com",
  "https://gist.github.com@evil.com/",
  "https://gist.github.com/AntonyGarand/a8a0b4a36a040edc6051e888afce8fab/raw/4deb366ddaf0597e82fea808f7f4cb3ad763d98f/poc.js",
]

valid.all?    { |link| valid? link } # => true
invalid.none? { |link| valid? link } # => true

Josh Cheek • Sep 3 '18

Although, we should probably check whether you're allowed to have unicode in your username 🤔

Andy Zhao (he/him) • Oct 12 '18

@joshcheek gonna take that valid? method for checking giphy links. :)

Andy Zhao (he/him) • Oct 12 '18

Also that's pretty great. Might end up implementing it for the gist Liquid tag.