I have been trying different ways of parsing a JSON-LD file into a Graph, but they all fail.
The file seems fine (I tried several), and it parses correctly in the JSON-LD playground. The scripts below show the parser invocations I tried.
This was after completely removing and reinstalling Python/Anaconda, in a fresh Conda environment (python=3.8) with only "pip3 install rdflib", so there is no stale copy of the old plugin version of the parser hanging around.
parsejsonld_A.py
#!/usr/bin/env python3
from rdflib import Graph
if __name__ == '__main__':
    fn = "example1.jsonld"
    g = Graph()
    g.parse(fn, format="json-ld")
parsejsonld_B.py
#!/usr/bin/env python3
from rdflib import Graph
g = Graph().parse("example1.jsonld", format="json-ld")
g.serialize("test-jsonld.nt", format="nt")
parsejsonld_C.py
#!/usr/bin/env python3
from rdflib import Graph
g = Graph()
g.parse(location="file:feedkgx/example1.jsonld")
print(len(g))
The example file is just taken from Google documentation, see this Gist.
In each case I get this response:
./parsejsonld_A.py
Traceback (most recent call last):
  File "./parsejsonld_A.py", line 8, in <module>
    g.parse(fn, format="json-ld")
  File "/opt/anaconda3/envs/feedkgx/lib/python3.8/site-packages/rdflib/graph.py", line 1258, in parse
    parser.parse(source, self, **args) # type: ignore[call-arg]
  File "/opt/anaconda3/envs/feedkgx/lib/python3.8/site-packages/rdflib/plugins/parsers/jsonld.py", line 125, in parse
    to_rdf(data, conj_sink, base, context_data, version, generalized_rdf)
  File "/opt/anaconda3/envs/feedkgx/lib/python3.8/site-packages/rdflib/plugins/parsers/jsonld.py", line 144, in to_rdf
    return parser.parse(data, context, dataset)
  File "/opt/anaconda3/envs/feedkgx/lib/python3.8/site-packages/rdflib/plugins/parsers/jsonld.py", line 164, in parse
    context.load(local_context, context.base)
  File "/opt/anaconda3/envs/feedkgx/lib/python3.8/site-packages/rdflib/plugins/shared/jsonld/context.py", line 357, in load
    self._prep_sources(base, source, sources, referenced_contexts)
  File "/opt/anaconda3/envs/feedkgx/lib/python3.8/site-packages/rdflib/plugins/shared/jsonld/context.py", line 381, in _prep_sources
    new_ctx = self._fetch_context(
  File "/opt/anaconda3/envs/feedkgx/lib/python3.8/site-packages/rdflib/plugins/shared/jsonld/context.py", line 413, in _fetch_context
    source = source_to_json(source_url)
  File "/opt/anaconda3/envs/feedkgx/lib/python3.8/site-packages/rdflib/plugins/shared/jsonld/util.py", line 43, in source_to_json
    return json.load(use_stream)
  File "/opt/anaconda3/envs/feedkgx/lib/python3.8/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/opt/anaconda3/envs/feedkgx/lib/python3.8/json/__init__.py", line 357, in loads
    return _default_decoder.decode(s)
  File "/opt/anaconda3/envs/feedkgx/lib/python3.8/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/opt/anaconda3/envs/feedkgx/lib/python3.8/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 2 column 1 (char 1)
As far as I can tell, this has something to do with the Schema.org @context URL and our migration from http://schema.org/ plus content negotiation (conneg) to https://schema.org/ with a JSON-LD 1.1-style HTTP Link header as the discovery mechanism for the context. Either way, the error message is pretty uninformative.
If I change the schema.org context in the files to avoid a remote context, it parses.
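For anyone hitting the same problem, the workaround can be scripted rather than done by hand. This is only a minimal sketch: `inline_schema_context` and `SCHEMA_ORG_CONTEXTS` are hypothetical names, the set of context spellings is an assumption, and the `@vocab` value is the one proposed in this issue. An inline `@vocab` also only approximates the published Schema.org context (for example, it does not carry over per-property type coercion), so the resulting triples may differ slightly.

```python
import json

# Spellings of the remote Schema.org context treated as equivalent (assumption).
SCHEMA_ORG_CONTEXTS = {
    "http://schema.org", "http://schema.org/",
    "https://schema.org", "https://schema.org/",
}


def inline_schema_context(doc, vocab="https://schema.org/"):
    """Return a shallow copy of a JSON-LD dict in which a remote Schema.org
    context string is replaced by an inline @vocab mapping, so that parsing
    does not need to fetch anything over HTTP."""
    patched = dict(doc)
    ctx = patched.get("@context")
    # Only rewrite a bare string context; dict or list contexts are left alone.
    if isinstance(ctx, str) and ctx in SCHEMA_ORG_CONTEXTS:
        patched["@context"] = {"@vocab": vocab}
    return patched


raw = '{"@context": "https://schema.org", "@type": "Person", "name": "Jane Doe"}'
patched = inline_schema_context(json.loads(raw))
print(json.dumps(patched, indent=2))
```

The patched document can then be handed to rdflib as a string, e.g. `Graph().parse(data=json.dumps(patched), format="json-ld")`.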
The context lives here:
curl -s --head https://schema.org/ | grep 'link:'
link: </docs/jsonldcontext.jsonld>; rel="alternate"; type="application/ld+json"
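To make the discovery step concrete, here is a rough sketch of how a loader might pick the context URL out of that header. It is not rdflib's code: `find_jsonld_context_link` is a hypothetical helper, and the header parsing is a deliberate simplification of the Web Linking syntax (it assumes no commas inside quoted parameter values).

```python
import re
from urllib.parse import urljoin

# Matches ;name=value pairs, where value is either quoted or a bare token.
_PARAM = re.compile(r';\s*([\w-]+)\s*=\s*("[^"]*"|[^;,\s]+)')


def find_jsonld_context_link(link_header, base_url):
    """Return the absolute URL of the first Link target that has
    rel="alternate" and type="application/ld+json", or None if absent."""
    # Split on commas that are followed by a new <...> target.
    for part in re.split(r",\s*(?=<)", link_header):
        match = re.match(r"\s*<([^>]*)>(.*)", part, re.S)
        if not match:
            continue
        target, params_text = match.groups()
        params = {
            key.lower(): value.strip('"')
            for key, value in _PARAM.findall(params_text)
        }
        rels = params.get("rel", "").split()
        if "alternate" in rels and params.get("type") == "application/ld+json":
            # Relative targets are resolved against the URL that was requested.
            return urljoin(base_url, target)
    return None


header = '</docs/jsonldcontext.jsonld>; rel="alternate"; type="application/ld+json"'
print(find_jsonld_context_link(header, "https://schema.org/"))
# https://schema.org/docs/jsonldcontext.jsonld
```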
Would a PR be welcomed on this?
e.g.
- clearer error message
- treat context as if it were written like this:
  "@context": { "@vocab": "https://schema.org/" },
- or actually fetch the context doc via the link: header mechanism, as JSON-LD playground seems to do.
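As an illustration of the first suggestion (a clearer error message), the JSON decoding step could report which context URL failed. This is a sketch only: `ContextLoadError` and `parse_context_json` are hypothetical names, not part of rdflib, and the HTML body in the usage lines is an assumed stand-in for whatever a non-JSON response contains.

```python
import json


class ContextLoadError(ValueError):
    """Hypothetical exception that names the context URL that failed to load."""


def parse_context_json(text, source_url):
    """Decode a fetched context body, reporting the URL if it is not JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        # Show the start of the body so an HTML page is easy to recognise.
        snippet = text.strip()[:40]
        raise ContextLoadError(
            f"Could not load JSON-LD context from {source_url}: "
            f"response is not valid JSON ({err.msg} at line {err.lineno}, "
            f"column {err.colno}); body starts with {snippet!r}"
        ) from err


try:
    # Assumed example: the server answered with an HTML page instead of JSON.
    parse_context_json("\n<!DOCTYPE html>", "https://schema.org/")
except ContextLoadError as err:
    print(err)
```

Chaining with `from err` keeps the original `JSONDecodeError` in the traceback, so nothing is lost for debugging.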
Related discussion: schemaorg/schemaorg#2578