
Attribute#sourceRange returns null when parsing a body attribute #2204

@runt0


Hello,
I want to parse my document and read the source range of its attributes. I have read the documentation and issue #1114. I found that org.jsoup.nodes.Attribute#sourceRange sometimes returns null (in practice an untracked range, with every position printed as -1). After testing, I found that this happens when the body tag is preceded by other tags such as p or span; otherwise it works normally. I don't know whether this is a bug. Is there a way to detect this case and still get the range of the attribute?

  • my code:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Attribute;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;

import java.io.FileInputStream;

public class TestMain {
    public static void main(String[] args) throws Exception {
        String text;
        // Read the whole test file; try-with-resources closes the stream.
        try (FileInputStream fis = new FileInputStream("tmp.html")) {
            text = new String(fis.readAllBytes());
        }
        // Position tracking must be enabled on the parser before parsing.
        Parser parser = Parser.htmlParser();
        parser.setTrackPosition(true);
        Document doc = Jsoup.parse(text, parser);
        for (Element element : doc.getAllElements()) {
            Attribute targetAttr = element.attribute("target");
            if (targetAttr != null) {
                // Prints the name range and value range of the attribute.
                System.out.println(targetAttr.sourceRange());
            }
        }
    }
}
  • tmp.html (span before body)
<!DOCTYPE html>
<head></head>
<span></span>
<body target="text">
</body>
</html>
  • result
-1,-1:-1--1,-1:-1=-1,-1:-1--1,-1:-1
  • tmp.html (no tag before body)
<!DOCTYPE html>
<head></head>
<body target="text">
</body>
</html>
  • result (expected result)
3,7:38-3,13:44=3,15:46-3,19:50
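
For anyone reading the two results above: the printed form appears to be `line,column:offset` for each position, laid out as `nameStart-nameEnd=valueStart-valueEnd`, and a range made up entirely of -1 values is one that was not tracked. The sketch below only decodes that printed string with the standard library, so it runs without jsoup. The class and method names (`RangeStringDemo`, `parse`, `isTracked`) are made up for illustration and are not part of jsoup. I believe jsoup's own `Range` type also offers an `isTracked()` check, which would be the more direct route inside real code, but please treat that as an assumption to verify against your jsoup version.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of jsoup: decodes the printed attribute range.
class RangeStringDemo {
    // One position is printed as "line,column:offset"; untracked parts are -1.
    private static final String POS = "(-?\\d+),(-?\\d+):(-?\\d+)";
    // Assumed overall shape: nameStart-nameEnd=valueStart-valueEnd
    private static final Pattern ATTR_RANGE =
            Pattern.compile(POS + "-" + POS + "=" + POS + "-" + POS);

    /** Returns the 12 numbers of the printed range, in order of appearance. */
    static int[] parse(String printed) {
        Matcher m = ATTR_RANGE.matcher(printed.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected range format: " + printed);
        }
        int[] nums = new int[12];
        for (int i = 0; i < nums.length; i++) {
            nums[i] = Integer.parseInt(m.group(i + 1));
        }
        return nums;
    }

    /** Treats the range as tracked only if no component is negative. */
    static boolean isTracked(String printed) {
        for (int n : parse(printed)) {
            if (n < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] samples = {
            "-1,-1:-1--1,-1:-1=-1,-1:-1--1,-1:-1",
            "3,7:38-3,13:44=3,15:46-3,19:50"
        };
        for (String s : samples) {
            if (isTracked(s)) {
                int[] n = parse(s);
                // Indices 2 and 5 are the name offsets, 8 and 11 the value offsets.
                System.out.println("tracked: name " + n[2] + ".." + n[5]
                        + ", value " + n[8] + ".." + n[11]);
            } else {
                System.out.println("untracked: " + s);
            }
        }
    }
}
```

With the two results from this report, the first string is reported as untracked and the second as tracked, with the attribute name at offsets 38 to 44 and the value at offsets 46 to 50.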

Labels

bug, fixed
