Wordcloud and classical music #27

@Merkwurdichliebe

There are two issues with wordcloud when listening to classical music (cf. attached capture).

  1. Tags tend to contain many short words or abbreviations that are meaningless out of context. For example: "iv" (fourth movement), "ma" (as in "allegro ma non troppo"), "d" (as in "Fugue in d minor"), "op" (as in "Opus 2"), or any digit or number, all of which are very common (e.g. "Symphony no. 5"; also notice the frequency of "no" in the attached capture).
  2. Some terms use accented characters, mainly from French, which are not rendered properly: the accented letter is dropped (e.g. "étude" becomes "tude", "exécution" becomes "excution"). Although I haven't checked, this would presumably also affect widely used German titles containing accented characters (e.g. "Verklärte Nacht", "Die Zauberflöte", "Götterdämmerung").

Proposed solutions:

  1. Easy: add an option to ignore numbers and words shorter than a given length. More involved: add an option to ignore Italian musical terms (cf. http://www.musictheory.org.uk/res-musical-terms/italian-musical-terms.php).
  2. Add support for accented characters/Unicode.
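
To make both proposals concrete, here is a rough Python sketch of what the filtering could look like. It is not based on the project's actual code: the names `tokenize`, `min_length`, `stopwords` and `ITALIAN_TERMS` are hypothetical, and the term list is only a tiny sample of what the linked page contains.

```python
import re
import unicodedata

# Hypothetical sample list; a real one would be built from a fuller glossary.
ITALIAN_TERMS = frozenset({"allegro", "ma", "non", "troppo", "andante", "adagio", "presto"})

# Word characters minus digits and underscore, i.e. essentially runs of
# Unicode letters. Python 3 str patterns are Unicode-aware by default,
# so "é", "ä" and "ö" are kept.
WORD_RE = re.compile(r"[^\W\d_]+")


def tokenize(title, min_length=3, stopwords=frozenset()):
    """Split a track title into lowercase words suitable for a word cloud.

    Numbers never match WORD_RE, words shorter than min_length are dropped,
    and anything listed in stopwords is ignored.
    """
    # Compose accents (e + combining acute -> é) so that a decomposed
    # accented letter is not split at the combining mark.
    title = unicodedata.normalize("NFC", title)
    words = (w.lower() for w in WORD_RE.findall(title))
    return [w for w in words if len(w) >= min_length and w not in stopwords]
```

With the default `min_length=3`, a title such as "Étude Op. 10 No. 3 in E major" would contribute only "étude" and "major" to the cloud, and passing `stopwords=ITALIAN_TERMS` would additionally drop tempo markings. Both settings could be exposed as user options.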

Wonderful stuff otherwise!

[Attached capture: cloud]