Skip to content

Conversation

jglogan
Copy link
Contributor

@jglogan jglogan commented May 6, 2021

Description

This PR fixes issue #927 with the following changes:

  • This enhancement defines parser() and setParser() methods on the Column interface, and adds a parser argument to the AbstractColumn constructor.
  • All column types have been updated to call super with the type-appropriate DEFAULT_PARSER, to preserve functional compatibility with prior releases.
  • All column appendCell() methods now parse input using the parser field value, instead of a hardcoded DEFAULT_PARSER.

Testing

Added unit tests for simple override of parsing strategy for each column type.

@jglogan
Copy link
Contributor Author

jglogan commented May 6, 2021

Notes:

  • My code formatter got a little busy adding @Override, and it removed a couple casts it deemed unnecessary. I have no problem backing out that diff clutter if you prefer!
  • In the discussion on the previous PR, @benmccann mentioned adding custom parser arguments to create() calls. I have not yet done this, for two reasons.
    • First, I wasn't sure which create calls - Column, ColumnType, both, and on which calls? Looking at all the variants made me worry about method proliferation, so I wanted to hold off on any of that work until after review and discussion.
    • Second, it seemed to me that method chaining after construction might be sufficient to support all use cases without adding create() methods. We could do this PR that allows custom parsing via method chaining, and if we need parser in a create() method, we can follow on with another PR.

@lwhite1
Copy link
Collaborator

lwhite1 commented May 8, 2021

Looks pretty good to me. In hindsight, we should have provided a builder interface rather than all these create methods. My biggest concern with this code is (a) did you miss any create methods, so that the parser may be null at some point, and (b) will someone do that in the future when adding or modifying a create method. I agree that adding custom parser options to each create is overkill.

@jglogan
Copy link
Contributor Author

jglogan commented May 10, 2021

I just resurveyed the code and as best as I can tell:

  • the ColumnType.create(name) methods all call the corresponding Column.create(name)
  • all of the Column.create(name) methods either call a public constructor that calls a protected/private constructor, or call a protected/private constructor directly
  • all Column classes define one or more protected/private constructors that call super() with the default parser for the column type
  • AbstractColumn has a single constructor which requires a parser to be supplied; this constructor invokes setParser(parser) to initialize the field

So AFAICT there is no way to create a column with anything but the default parser for the column type. The only way to get a null parser into a column today would be to invoke column.setParser(null).

One precaution we could take is to add a preconditions.checkNotNull() to setParser() in AbstractColumn, which would cause any attempt to initialize or update a column with a null parser to fail fast.

This behavior would be approximately equivalent to what I believe would happen today should someone try to create or update a column with a null name, as such an operation would fail with an NPE when attempting to trim the name.

jglogan added 3 commits May 10, 2021 08:54
- Column interface defines parser() and
  setParser(), and AbstractColumn
  constructor now requires a parser.
- Concrete column constructors invoke
  super() with DEFAULT_PARSER, and other
  references to DEFAULT_PARSER are now
  replaced with parser().
@jglogan jglogan force-pushed the parser-per-column branch from 7143453 to 58825f2 Compare May 10, 2021 15:54
@jglogan
Copy link
Contributor Author

jglogan commented May 10, 2021

Rebased to current main and force-pushed.

@lwhite1 lwhite1 merged commit 126b0bc into jtablesaw:master May 10, 2021
@jglogan jglogan deleted the parser-per-column branch May 10, 2021 21:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants