Gephi XML parser fails when encountering U+0001 #1164

@DavidBruant

Related to MyWebIntelligence/MyWebIntelligence#216 (comment)
This issue was found in Gephi 0.8. I haven't tried it on Gephi 0.9.

Try importing the following GEXF document in Gephi:

<?xml version="1.0" encoding="UTF-8"?>
<gexf xmlns="http://www.gexf.net/1.2draft" version="1.2">
    <graph mode="static" defaultedgetype="directed">
        <attributes class="node">
            <attribute id="domain_title" title="domain_title" type="string"/>
        </attributes>
        <attributes class="edge">
            <attribute id="weight" title="weight" type="integer"/>
        </attributes>
        <nodes>
            <node id="n1" label="n1">
                <attvalues>
                    <attvalue for="domain_title" value="�a"/>
                </attvalues>
            </node>
        </nodes>
        <edges>
        </edges>
    </graph>
</gexf>

Note the funky character in the attribute value: it is U+0001, and this is where the XML parser trips.

This character is forbidden in XML 1.0 documents, but allowed in XML 1.1 documents, where it has to be written as a character reference such as &#1; rather than as a raw character. See https://en.wikipedia.org/wiki/Valid_characters_in_XML
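
As a possible workaround on the exporting side, here is a minimal sketch in Python (not Gephi code) that strips the characters XML 1.0 forbids before the value is written into the GEXF file. The helper names `strip_invalid_xml10` and `parses_as_xml` are hypothetical and only for illustration. The check uses Python's standard `xml.etree.ElementTree` parser as a stand-in for a strict XML 1.0 parser, on the assumption that Gephi's parser rejects the same characters.

```python
import re
import xml.etree.ElementTree as ET

# Complement of the XML 1.0 `Char` production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML10 = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml10(text: str, replacement: str = "") -> str:
    """Hypothetical helper: remove characters that XML 1.0 does not allow."""
    return _INVALID_XML10.sub(replacement, text)


def parses_as_xml(document: str) -> bool:
    """Return True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# Same shape as the attvalue in the sample above: U+0001 followed by "a".
raw_title = "\u0001a"
bad_doc = f'<attvalue for="domain_title" value="{raw_title}"/>'
good_doc = f'<attvalue for="domain_title" value="{strip_invalid_xml10(raw_title)}"/>'

print(parses_as_xml(bad_doc))   # raw U+0001 is rejected
print(parses_as_xml(good_doc))  # sanitized value is accepted
```

Dropping the character loses information, so passing a visible `replacement` such as "\uFFFD" may be preferable when the original text matters.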
