realdoomsboygaming
Contributor

HTML Parsing Utilities

  • getElementsByTag(html, tag)

    • Returns an array with the inner HTML of every occurrence of the given tag in an HTML string.
    • Example:
      const items = getElementsByTag(html, 'div'); // returns array of inner HTML for all <div> elements
  • getAttribute(html, tag, attr)

    • Extracts the value of an attribute from the first occurrence of a tag in the HTML string.
    • Example:
      const src = getAttribute(html, 'img', 'src'); // returns the src attribute of the first <img>
  • getInnerText(html)

    • Removes all HTML tags and returns the plain text content.
    • Example:
      const text = getInnerText('<p>Hello <b>world</b></p>'); // 'Hello world'
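A minimal sketch of how these three helpers could behave, assuming a regex-based approach. This is not the merged implementation: the function bodies, the `escapeRegExp` helper, and the `null` return for a missing tag or attribute are all assumptions made for illustration.

```javascript
// Hypothetical regex-based sketches; the code merged in this PR may differ.

// Helper (assumed, not part of the PR): escape a string for literal use in a RegExp.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Collect the inner HTML of every <tag>...</tag> pair.
// The match is non-greedy, so nested tags of the same name are not handled.
function getElementsByTag(html, tag) {
  const t = escapeRegExp(tag);
  const re = new RegExp(`<${t}\\b[^>]*>([\\s\\S]*?)</${t}>`, 'gi');
  return Array.from(html.matchAll(re), (m) => m[1]);
}

// Read one attribute from the first opening tag with the given name.
// Returns null (an assumption) when the tag or the attribute is missing.
function getAttribute(html, tag, attr) {
  const open = html.match(new RegExp(`<${escapeRegExp(tag)}\\b[^>]*>`, 'i'));
  if (!open) return null;
  const attrRe = new RegExp(
    `\\s${escapeRegExp(attr)}\\s*=\\s*(?:"([^"]*)"|'([^']*)')`,
    'i'
  );
  const m = open[0].match(attrRe);
  return m ? (m[1] ?? m[2]) : null;
}

// Drop anything that looks like a tag and keep the remaining text.
function getInnerText(html) {
  return html.replace(/<[^>]*>/g, '');
}
```

Regex matching keeps the helpers dependency-free, at the cost of mishandling nested same-name tags and attribute values that contain `>`.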

Text Extraction Helpers

  • extractBetween(str, start, end)

    • Extracts the substring between two markers.
    • Example:
      const chapter = extractBetween(html, '<h1>', '</h1>');
  • stripHtml(html)

    • Removes all HTML tags from a string.
    • Example:
      const clean = stripHtml('<div>foo</div>'); // 'foo'
  • normalizeWhitespace(str)

    • Collapses all whitespace to single spaces and trims the string.
    • Example:
      const norm = normalizeWhitespace('foo   bar\n\t baz'); // 'foo bar baz'
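The text helpers above can be sketched in a few lines each. The bodies below are assumptions for illustration, not the merged code. In particular, returning `null` when a marker is missing and matching only the first `start` marker are guesses about the intended behavior.

```javascript
// Hypothetical sketches; the code merged in this PR may differ.

// Return the text between the first `start` marker and the next `end` marker.
// Returns null (an assumption) if either marker is not found.
function extractBetween(str, start, end) {
  const from = str.indexOf(start);
  if (from === -1) return null;
  const contentStart = from + start.length;
  const to = str.indexOf(end, contentStart);
  if (to === -1) return null;
  return str.slice(contentStart, to);
}

// Remove anything that looks like an HTML tag.
function stripHtml(html) {
  return html.replace(/<[^>]*>/g, '');
}

// Collapse each run of whitespace (spaces, tabs, newlines) to one space, then trim.
function normalizeWhitespace(str) {
  return str.replace(/\s+/g, ' ').trim();
}
```

Chaining the helpers gives a compact way to pull clean text out of a page fragment, for example `normalizeWhitespace(stripHtml(extractBetween(html, '<h1>', '</h1>') ?? ''))`.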

Encoding/Decoding Utilities

  • urlEncode(str)

    • URL-encodes a string.
    • Example:
      const encoded = urlEncode('hello world'); // 'hello%20world'
  • urlDecode(str)

    • Decodes a URL-encoded string.
    • Example:
      const decoded = urlDecode('hello%20world'); // 'hello world'
  • htmlEntityDecode(str)

    • Decodes the basic HTML entities for &, <, >, ", and ' back to their literal characters (for example, &amp; becomes &).
    • Example:
      const decoded = htmlEntityDecode('Tom &amp; Jerry &lt;3'); // 'Tom & Jerry <3'
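One plausible way to build these helpers is on top of the standard `encodeURIComponent` and `decodeURIComponent` functions plus a small entity table. The sketch below is an assumption, not the merged code. The `ENTITIES` table and the choice to accept both `&#39;` and `&apos;` for the apostrophe are illustrative.

```javascript
// Hypothetical sketches; the code merged in this PR may differ.

// Percent-encode a string for use in a URL component.
function urlEncode(str) {
  return encodeURIComponent(str);
}

// Reverse percent-encoding. decodeURIComponent throws URIError on malformed input.
function urlDecode(str) {
  return decodeURIComponent(str);
}

// Assumed entity table covering &, <, >, " and '.
const ENTITIES = {
  '&amp;': '&',
  '&lt;': '<',
  '&gt;': '>',
  '&quot;': '"',
  '&#39;': "'",
  '&apos;': "'",
};

// Decode in a single pass so that '&amp;lt;' becomes '&lt;' rather than '<'.
function htmlEntityDecode(str) {
  return str.replace(/&(?:amp|lt|gt|quot|apos|#39);/g, (m) => ENTITIES[m]);
}
```

A single-pass replace avoids the double-decoding bug that chained `replace` calls can introduce when `&amp;` is handled before the other entities.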

cranci1 merged commit e449c04 into cranci1:dev on Jul 12, 2025
2 checks passed