IndexOutOfBoundsException in Parser.parse() #2374

@squarejesse

I was using jsoup 1.21.1 to crawl some webpages and it crashed.

java.lang.IndexOutOfBoundsException: Index -1 out of bounds for length 0
	at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:100)
	at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:106)
	at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:302)
	at java.base/java.util.Objects.checkIndex(Objects.java:385)
	at java.base/java.util.ArrayList.remove(ArrayList.java:551)
	at org.jsoup.parser.TreeBuilder.pop(TreeBuilder.java:163)
	at org.jsoup.parser.HtmlTreeBuilderState$8.process(HtmlTreeBuilderState.java:998)
	at org.jsoup.parser.HtmlTreeBuilder.process(HtmlTreeBuilder.java:174)
	at org.jsoup.parser.TreeBuilder.stepParser(TreeBuilder.java:124)
	at org.jsoup.parser.TreeBuilder.runParser(TreeBuilder.java:105)
	at org.jsoup.parser.TreeBuilder.parse(TreeBuilder.java:76)
	at org.jsoup.parser.Parser.parse(Parser.java:238)
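
The top frames of the trace show `ArrayList.remove` being called with index -1 on an empty list from `TreeBuilder.pop`. A minimal sketch of that failure mode, using only the JDK, is below. The `pop` helper and the class name are hypothetical stand-ins for a list-backed stack, not jsoup's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class EmptyStackPopDemo {
    // Hypothetical stand-in for a list-backed element stack; not jsoup's actual code.
    static <T> T pop(List<T> stack) {
        // With an empty list, size() - 1 is -1, which ArrayList.remove(int) rejects.
        return stack.remove(stack.size() - 1);
    }

    // Returns true if popping an empty stack throws IndexOutOfBoundsException.
    static boolean popThrowsOnEmpty() {
        List<String> stack = new ArrayList<>();
        try {
            pop(stack);
            return false;
        } catch (IndexOutOfBoundsException e) {
            // Print the message so it can be compared with the one in the report.
            System.out.println(e.getMessage());
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(popThrowsOnEmpty());
    }
}
```

If this matches what happens in the parser, the stack of open elements was presumably already empty when the state handler at `HtmlTreeBuilderState.java:998` tried to pop it.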

Here’s the code that triggered the crash.

String s = ...; // contents of the attached page.html
String url = "https://energyinnovation.org/report/updated-economic-impacts-of-u-s-senate-passed-one-big-beautiful-bill-act-energy-provisions/";
Document document = Jsoup.parse(s, url);

Thanks for an awesome library.

page.html

Labels

bug, fixed
