🕷️ The pipeline for the OSCAR corpus
Updated Dec 18, 2023 - Rust
Parse and export the DGT-Translation Memory (DGT-TM) into an SQLite database.
Tools for processing the Corpus of Historical American English (COHA)
📔 Transcribe XML-formatted Android SMS exports to nested TSV files. Written in Rust.
🚀 A blazing-fast Python extension (wheel) for advanced lexical dispersion metrics, ⚡ powered by Rust and PyO3.