Submission: epubr #222

@leonawicz

Summary

  • What does this package do? (explain in 50 words or less):

Extract, read and parse EPUB format e-book file archive metadata and book text into tidy data frames to prepare for subsequent text analysis.

  • Paste the full DESCRIPTION file inside a code block below:

Package: epubr
Version: 0.4.0.9000
Title: Read EPUB File Metadata and Text
Description: Provides functions supporting the reading and parsing of internal e-book content from EPUB files. E-book formatting is non-standard enough across all literature that no function can curate parsed e-book content across an arbitrary collection of e-books, in completely general form, resulting in a singular, consistently formatted output containing all the same variables. EPUB file parsing functionality in this package is intended for relatively general application to arbitrary e-books.  However, poorly formatted e-books or e-books with highly uncommon formatting may not work with this package. Text is read 'as is'. Additional text cleaning should be performed by the user at their discretion, such as with functions from packages like 'tm' or 'qdap'.
Authors@R: person("Matthew", "Leonawicz", email = "mfleonawicz@alaska.edu", role = c("aut", "cre"))
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
ByteCompile: true
URL: https://github.com/leonawicz/epubr
BugReports: https://github.com/leonawicz/epubr/issues
Suggests:
    testthat,
    knitr,
    rmarkdown,
    lintr,
    covr,
    readr
Imports: 
    xml2,
    xslt,
    magrittr,
    dplyr,
    purrr,
    tidyr
VignetteBuilder: knitr
RoxygenNote: 6.0.1

  • URL for the package (the development repository, not a stylized html page):

https://github.com/leonawicz/epubr

  • Please indicate which category or categories from our package fit policies this package falls under, and why:

Data extraction. This package focuses on importing EPUB file metadata and data into R in a useful format. It strips xml tags and formatting to focus on the readable, meaningful text. While future package versions will expand functionality around the edges, the core purpose will remain the data extraction described.
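
As a rough illustration of the tag stripping described above, here is a minimal sketch using xml2, one of the package's imports. The XHTML fragment and the `section` data frame are hypothetical stand-ins, and this is not necessarily how epubr implements the step internally.

```r
library(xml2)

# Hypothetical XHTML fragment standing in for one section of an EPUB archive.
xhtml <- "<html><body><h1>Chapter 1</h1><p>It was a <em>dark</em> night.</p></body></html>"

# Parse the literal markup into a document object.
doc <- read_html(xhtml)

# Select the paragraph nodes and drop all markup, keeping only readable text.
paragraphs <- xml_text(xml_find_all(doc, "//p"))
section_text <- paste(paragraphs, collapse = "\n")

# Collect the result in a small data frame, one row per section.
section <- data.frame(
  section = "ch1",
  text = section_text,
  nchar = nchar(section_text),
  stringsAsFactors = FALSE
)
```

Inline tags such as `<em>` disappear while their text is kept, so the resulting `text` column holds plain prose ready for downstream cleaning.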

  • Who is the target audience and what are the scientific applications of this package?

Data analysts or researchers performing text mining and other language analysis involving individual books or book collections in a typical, unrestricted EPUB file format.

I have not found other R packages that do this.

This package is already on CRAN (v0.4.0).

  • If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.

Requirements

Confirm each of the following by checking the box. This package:

  • does not violate the Terms of Service of any service it interacts with.
  • has a CRAN and OSI accepted license.
  • contains a README with instructions for installing the development version.
  • includes documentation with examples for all functions.
  • contains a vignette with examples of its essential functions and uses.
  • has a test suite.
  • has continuous integration, including reporting of test coverage, using services such as Travis CI, Coveralls and/or CodeCov.
  • I agree to abide by rOpenSci's Code of Conduct during the review process and in maintaining my package should it be accepted.

Publication options

  • Do you intend for this package to go on CRAN?
  • Do you wish to automatically submit to the Journal of Open Source Software? If so:
    • The package has an obvious research application according to JOSS's definition.
    • The package contains a paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
    • The package is deposited in a long-term repository with the DOI:
    • (Do not submit your package separately to JOSS)
  • Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:
    • The package is novel and will be of interest to the broad readership of the journal.
    • The manuscript describing the package is no longer than 3000 words.
    • You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code).
    • (Scope: Do consider MEE's Aims and Scope for your manuscript. We make no guarantee that your manuscript will be within MEE scope.)
    • (Although not required, we strongly recommend having a full manuscript prepared when you submit here.)
    • (Please do not submit your package separately to Methods in Ecology and Evolution)

Detail

  • Does R CMD check (or devtools::check()) succeed? Paste and describe any errors or warnings:

  • Does the package conform to rOpenSci packaging guidelines? Please describe any exceptions:

To the best of my knowledge, yes, though it is quite possible I missed something small or have something not exactly as required.

  • If this is a resubmission following rejection, please explain the change in circumstances:

  • If possible, please provide recommendations of reviewers - those with experience with similar packages and/or likely users of your package - and their GitHub user names:
