Skip to content

Rack::Request#host returns truncated data for domain names containing underscores, other characters #1604

@pvande

Description

@pvande

In Rack 2.1.2, Rack::Request#host did no hostname validation, returning everything from #host_with_port up to the colon.

      def host
        # Remove port number.
        h = host_with_port
        if colon_index = h.index(":")
          h[0, colon_index]
        else
          h
        end
      end

This did not support IPv6 well, so #1561 was implemented to solve that. This changed the implementation of #host to one that relied on regular expressions. This was released in Rack 2.2.0.

      def host
        split_authority(self.authority)[0]
      end

      def authority
        forwarded_authority || host_authority || server_authority
      end

      def forwarded_authority
        if value = get_header(HTTP_X_FORWARDED_HOST)
          wrap_ipv6(split_header(value).first)
        end
      end

      def host_authority
        get_header(HTTP_HOST)
      end

      def server_authority
        host = self.server_name
        port = self.server_port

        if host
          if port
            "#{host}:#{port}"
          else
            host
          end
        end
      end

      def split_authority(authority)
        if match = AUTHORITY.match(authority)
          if address = match[:ip6]
            return match[:host], address, match[:port]&.to_i
          else
            return match[:host], match[:host], match[:port]&.to_i
          end
        end

        # Give up!
        return authority, authority, nil
      end

      AUTHORITY = /
        # The host:
        (?<host>
          # An IPv6 address:
          (\[(?<ip6>.*)\])
          |
          # An IPv4 address:
          (?<ip4>[\d\.]+)
          |
          # A hostname:
          (?<name>[a-zA-Z0-9\.\-]+)
        )
        # The optional port:
        (:(?<port>\d+))?
      /x

This provided not only better IPv6 support, but also did some validation of hostnames, as per RFC 952.

It was soon discovered that this was improperly parsing hostnames that were valid under the more lenient RFC 1123 (permitting hostnames to begin with digits and hyphens) as IP addresses (#1590), so in #1591, the regular expression was updated to match [an entire line of] the authority. This was released as Rack 2.2.2.

      AUTHORITY = /^
        # The host:
        (?<host>
          # An IPv6 address:
          (\[(?<ip6>.*)\])
          |
          # An IPv4 address:
          (?<ip4>[\d\.]+)
          |
          # A hostname:
          (?<name>[a-zA-Z0-9\.\-]+)
        )
        # The optional port:
        (:(?<port>\d+))?
      $/x

The Issue

While the match being performed in the AUTHORITY regex is technically correct for hostnames, it is not correct for DNS names (as per RFC 2181, Section 11). This also causes issues in environments like Docker, where some tools (e.g. Docker Compose) implicitly create hostnames containing underscores. These hostnames are being truncated in Rack 2.2.0, and being left unprocessed in Rack 2.2.2.

Suggestions

Given RFC 2181 relaxes all restrictions on DNS names (except segment and total length), it seems unnecessarily risky to use anything but a wildcard match in that position. IP addresses, however, have a very predictable form, and regular expressions for matching them correctly are already a part of the Ruby standard library (see Resolv::IPv4::Regex and Resolv::IPv6::Regex) in every supported version of Ruby.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions