hugoabonizio
Contributor

This PR adds Portuguese support for the readability score.

Description

Since spaCy supports Portuguese, adding support to textacy is trivial. The Flesch-Kincaid weights are modified following these references:

Motivation and Context

Portuguese is already supported by spaCy but is missing from textacy.

How Has This Been Tested?

To test it, I added another assertion entry to the test_flesch_reading_ease_langs test.
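For readers who want to see what such a check amounts to, here is a minimal sketch of a language-aware Flesch reading ease score. It is not textacy's actual code: the function name `flesch_reading_ease` as written here, the `FLESCH_BASE` table, and the argument order are hypothetical, and the Portuguese constant is an assumption based on the 42-point shift discussed in this thread (the standard English constant 206.835 plus 42).

```python
# Illustrative sketch only; NOT textacy's implementation.
# Assumption: the Portuguese variant differs from the English formula
# only in its base constant, which is shifted by +42.
FLESCH_BASE = {
    "en": 206.835,  # standard English Flesch reading ease constant
    "pt": 248.835,  # assumed: English constant + 42
}


def flesch_reading_ease(n_words, n_sents, n_syllables, lang="en"):
    """Return a Flesch reading ease score for the given counts."""
    if lang not in FLESCH_BASE:
        raise ValueError(f"unsupported language: {lang!r}")
    words_per_sent = n_words / n_sents
    syllables_per_word = n_syllables / n_words
    return FLESCH_BASE[lang] - 1.015 * words_per_sent - 84.6 * syllables_per_word


# Same counts, two languages: the scores should differ by the shift alone.
en_score = flesch_reading_ease(100, 5, 150, lang="en")
pt_score = flesch_reading_ease(100, 5, 150, lang="pt")
print(round(pt_score - en_score, 3))
```

Under these assumptions, the only per-language state is the base constant, so a per-language test only needs one extra expected value for the same input counts.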

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation, and I have updated it accordingly.

@bdewilde
Collaborator

Hi @hugoabonizio, thanks for the PR. Code looks correct to me, although I'm surprised that the Portuguese equivalent is just a 42-point shift in value. 🤷‍♂ I'm going to try to merge this into the develop branch rather than master, since that's where I try to keep new features before cutting a new release.

Thanks again for the submission!

@bdewilde changed the base branch from master to develop July 30, 2019 17:22
@bdewilde merged commit 7395f1a into chartbeat-labs:develop Jul 30, 2019
@hugoabonizio
Contributor Author

Hi @bdewilde, sorry for the confusion about branches. This shift seems to have been developed in the following work and has been used since then.

MARTINS, T. B. et al. Readability formulas applied to textbooks in Brazilian Portuguese. [S.l.]: Icmsc-Usp, 1996.

Thanks for merging!
