add @graph support for extracted linked data #168

Merged
merged 2 commits into from
May 12, 2023

julianofnascimento
Contributor

Pull request for issue #167.

@barrust
Collaborator

barrust commented May 8, 2023

It looks like the try/except is there mainly to support what you pulled into the parse_linked_data_nodes function.

Besides the basic tests already there, is there a good example that we could turn into a test?

@julianofnascimento
Contributor Author

I made a mistake there, sorry. I think it's fixed now, and all the tests pass. As for a test for this particular scenario, I looked at how the tests are structured and wasn't sure how to refactor the code to add this new scenario. Here is an example that I took from a TechCrunch page; every element of the list under the "@graph" key is a schema type associated with the site:

{"@context":"https://schema.org","@graph":[{"@type":"NewsArticle","@id":"https://techcrunch.com/2023/05/05/hacked-verified-facebook-pages-impersonating-meta-are-buying-ads-from-meta/#article","isPartOf":{"@id":"https://techcrunch.com/2023/05/05/hacked-verified-facebook-pages-impersonating-meta-are-buying-ads-from-meta/"},"author":[{"@type":"Person","name":"Taylor Hatmaker"}],"headline":"Hacked verified Facebook pages impersonating Meta are buying ads from Meta","datePublished":"2023-05-05T23:00:57+00:00","dateModified":"2023-05-06T00:01:34+00:00","mainEntityOfPage":{"@id":"https://techcrunch.com/2023/05/05/hacked-verified-facebook-pages-impersonating-meta-are-buying-ads-from-meta/","@type":"WebPage"},"wordCount":598,"commentCount":0,"publisher":{"@type":"Organization","name":"TechCrunch","logo":{"@type":"imageObject","url":"https://techcrunch.com/wp-content/themes/techcrunch-2017/images/logo-json-ld.png","width":"600","height":"60"}},"image":{"@type":"ImageObject","url":"https://techcrunch.com/wp-content/uploads/2021/11/facebook-meta-rotate-pattern.jpg?w=680","width":680,"height":383},"thumbnailUrl":"https://techcrunch.com/wp-content/uploads/2021/11/facebook-meta-rotate-pattern.jpg","keywords":["Facebook","malware","meta","security"],"articleSection":["Social"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https://techcrunch.com/2023/05/05/hacked-verified-facebook-pages-impersonating-meta-are-buying-ads-from-meta/#respond"]}],"copyrightYear":"2023","copyrightHolder":{"@id":"https://techcrunch.com/#organization"},"description":"Sketchy Facebook pages impersonating businesses are nothing new, but a flurry of recent scams is particularly brazen. A handful of verified Facebook pages were hacked recently and spotted slinging likely malware through ads approved by and purchased through the platform. 
But the accounts should be easy to catch — in some cases, they were impersonating […]","speakable":{"@type":"SpeakableSpecification","cssSelector":[".alpha","#speakable-summary"]}},{"@type":"WebPage","@id":"https://techcrunch.com/2023/05/05/hacked-verified-facebook-pages-impersonating-meta-are-buying-ads-from-meta/","url":"https://techcrunch.com/2023/05/05/hacked-verified-facebook-pages-impersonating-meta-are-buying-ads-from-meta/","name":"Hacked verified Facebook pages impersonating Meta are buying ads from Meta | TechCrunch","isPartOf":{"@id":"https://techcrunch.com/#website"},"primaryImageOfPage":{"@id":"https://techcrunch.com/2023/05/05/hacked-verified-facebook-pages-impersonating-meta-are-buying-ads-from-meta/#primaryimage"},"image":{"@id":"https://techcrunch.com/2023/05/05/hacked-verified-facebook-pages-impersonating-meta-are-buying-ads-from-meta/#primaryimage"},"thumbnailUrl":"https://techcrunch.com/wp-content/uploads/2021/11/facebook-meta-rotate-pattern.jpg","datePublished":"2023-05-05T23:00:57+00:00","dateModified":"2023-05-06T00:01:34+00:00","description":"Facebook scammers are impersonating Facebook and spreading malware by buying ads from, you guessed it, Facebook.","inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https://techcrunch.com/2023/05/05/hacked-verified-facebook-pages-impersonating-meta-are-buying-ads-from-meta/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https://techcrunch.com/2023/05/05/hacked-verified-facebook-pages-impersonating-meta-are-buying-ads-from-meta/#primaryimage","url":"https://techcrunch.com/wp-content/uploads/2021/11/facebook-meta-rotate-pattern.jpg","contentUrl":"https://techcrunch.com/wp-content/uploads/2021/11/facebook-meta-rotate-pattern.jpg","width":3200,"height":1800,"caption":"abstract Meta logo"},{"@type":"WebSite","@id":"https://techcrunch.com/#website","url":"https://techcrunch.com/","name":"TechCrunch","description":"Startup and Technology 
News","publisher":{"@id":"https://techcrunch.com/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https://search.techcrunch.com/search?p={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https://techcrunch.com/#organization","name":"TechCrunch","url":"https://techcrunch.com/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https://techcrunch.com/#/schema/logo/image/","url":"https://techcrunch.com/wp-content/uploads/2018/04/tc-logo-2018-square-reverse2x.png?resize=1200,1200","contentUrl":"https://techcrunch.com/wp-content/uploads/2018/04/tc-logo-2018-square-reverse2x.png?resize=1200,1200","width":1200,"height":1200,"caption":"TechCrunch"},"image":{"@id":"https://techcrunch.com/#/schema/logo/image/"},"sameAs":["https://www.facebook.com/techcrunch","https://twitter.com/TechCrunch","https://www.linkedin.com/company/techcrunch/","https://www.reddit.com/r/techcrunch/","https://www.youtube.com/channel/UCCjyq_K1Xwfg8Lndy7lKMpA"]},{"@type":"Person","@id":"https://techcrunch.com/#/schema/person/757d241fd925ad471eae43e6da2613c4","name":"Taylor Hatmaker","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https://techcrunch.com/#/schema/person/image/","url":"https://secure.gravatar.com/avatar/747309cb9c48c7ea7566f1e57a1b9298?s=96&d=identicon&r=g","contentUrl":"https://secure.gravatar.com/avatar/747309cb9c48c7ea7566f1e57a1b9298?s=96&d=identicon&r=g","caption":"Taylor Hatmaker"},"url":"https://techcrunch.com/author/taylor-hatmaker/"}]}
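
For readers following along, here is a minimal sketch of how a document shaped like the one above could be flattened into individual nodes. It is not the code merged in this pull request: the function name `collect_linked_data_nodes` and the choice to skip malformed JSON silently are assumptions made for illustration, and the sample is trimmed to two entries.

```python
import json


def collect_linked_data_nodes(raw_json):
    """Return a flat list of JSON-LD nodes, expanding any top-level "@graph" list.

    Hypothetical helper for illustration; not goose3's implementation.
    """
    try:
        data = json.loads(raw_json)
    except ValueError:
        # Assumption: malformed JSON-LD is skipped rather than raised.
        return []

    # A script tag may hold either a single node or a list of nodes.
    candidates = data if isinstance(data, list) else [data]

    nodes = []
    for item in candidates:
        if not isinstance(item, dict):
            continue
        graph = item.get("@graph")
        if isinstance(graph, list):
            # Each entry under "@graph" is treated as its own node.
            nodes.extend(n for n in graph if isinstance(n, dict))
        else:
            nodes.append(item)
    return nodes


# Trimmed version of the sample: only "@type" and one property per node.
sample = (
    '{"@context":"https://schema.org","@graph":['
    '{"@type":"NewsArticle","inLanguage":"en-US"},'
    '{"@type":"WebPage","inLanguage":"en-US"}]}'
)
types = [node.get("@type") for node in collect_linked_data_nodes(sample)]
print(types)
```

With this shape, a document that has no "@graph" key still comes back as a one-element list, so callers can iterate over the result in both cases.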

@barrust
Collaborator

barrust commented May 11, 2023

@julianofnascimento I'm not sure why GitHub Actions isn't running the tests, but I will run them locally later today to verify. Once that is done, I can build a test based on the data you provided.
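
As a rough idea of what such a test might look like, here is a sketch using the standard `unittest` module and a fixture trimmed from the TechCrunch data. The `graph_nodes` function is a hypothetical stand-in for the extractor under test, and the class and test names are assumptions; none of this is taken from the goose3 test suite.

```python
import json
import unittest

# Trimmed from the TechCrunch sample in this thread; most properties omitted.
GRAPH_SAMPLE = json.dumps({
    "@context": "https://schema.org",
    "@graph": [
        {"@type": "NewsArticle", "inLanguage": "en-US"},
        {"@type": "WebPage", "inLanguage": "en-US"},
        {"@type": "Organization", "name": "TechCrunch"},
    ],
})


def graph_nodes(raw_json):
    # Hypothetical stand-in for the extractor under test, not goose3's code.
    data = json.loads(raw_json)
    return data.get("@graph", [data])


class TestGraphExtraction(unittest.TestCase):
    def test_each_graph_entry_becomes_a_node(self):
        nodes = graph_nodes(GRAPH_SAMPLE)
        self.assertEqual(
            [n["@type"] for n in nodes],
            ["NewsArticle", "WebPage", "Organization"],
        )

    def test_document_without_graph_is_a_single_node(self):
        nodes = graph_nodes('{"@type": "Article"}')
        self.assertEqual(nodes, [{"@type": "Article"}])


# Run the suite programmatically so the script does not exit the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestGraphExtraction)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

In a real test the stand-in would be replaced by a call into the library, with the full sample stored as a fixture file.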

@barrust barrust merged commit 54f0f7d into goose3:master May 12, 2023