Skip to content

Conversation

fverac
Copy link
Collaborator

@fverac fverac commented Jul 14, 2023

Summary of changes:

  • Add enable_missing arg to most CATE estimators.
    _OrthoLearner and subclasses
    Metalearners
    OrthoForest models
  • Note of the estimators that accept missing values, most only accept missing in W.
  • Fix bug where DoWhyWrapper did not work with DMLOrthoForest when discrete_treatment=True

@fverac fverac force-pushed the fverac/enable_missing_values branch from e07c5c2 to ef790e5 Compare July 14, 2023 16:24
Copy link
Collaborator

@kbattocchi kbattocchi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but please add at least one test exercising this functionality.

@fverac fverac force-pushed the fverac/enable_missing_values branch from 90f6fff to 00dd506 Compare July 18, 2023 20:47
@fverac fverac linked an issue Jul 19, 2023 that may be closed by this pull request
Copy link
Collaborator

@kbattocchi kbattocchi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great.

@esbraun
Copy link

esbraun commented Jul 24, 2023

Greetings! Per the office hours call - may we let the user opt in to nan's for X in the non parametric DML, the doubly robust learner, S learner, T learner and X learner? All of these models potentially allow for a CATE model able to handle nan's.

fverac added 6 commits August 15, 2023 10:58
Signed-off-by: Fabio Vera <fabiovera@microsoft.com>
Signed-off-by: Fabio Vera <fabiovera@microsoft.com>
Signed-off-by: Fabio Vera <fabiovera@microsoft.com>
Signed-off-by: Fabio Vera <fabiovera@microsoft.com>
Signed-off-by: Fabio Vera <fabiovera@microsoft.com>
Signed-off-by: Fabio Vera <fabiovera@microsoft.com>
@fverac fverac force-pushed the fverac/enable_missing_values branch from 9812ef4 to e34e85f Compare August 15, 2023 14:58
fverac added 2 commits August 15, 2023 16:55
…at dmlorf

Signed-off-by: Fabio Vera <fabiovera@microsoft.com>
Signed-off-by: Fabio Vera <fabiovera@microsoft.com>
@kbattocchi kbattocchi marked this pull request as ready for review August 21, 2023 22:19
@kbattocchi kbattocchi marked this pull request as draft August 21, 2023 22:20
Copy link
Collaborator

@kbattocchi kbattocchi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These updated changes look good to me.

@fverac fverac changed the title enable nans in W add arg to allow missing values in W and sometimes X Aug 24, 2023
Signed-off-by: Fabio Vera <fabiovera@microsoft.com>
@fverac fverac marked this pull request as ready for review September 29, 2023 19:33
Signed-off-by: Fabio Vera <fabiovera@microsoft.com>
Signed-off-by: Fabio Vera <fabiovera@microsoft.com>
@fverac fverac merged commit 25c3b3b into main Sep 29, 2023
@fverac fverac deleted the fverac/enable_missing_values branch September 29, 2023 21:37
kbattocchi pushed a commit that referenced this pull request Oct 31, 2023
* enable nans in W

Signed-off-by: Fabio Vera <fabiovera@microsoft.com>

* linting

Signed-off-by: Fabio Vera <fabiovera@microsoft.com>

* add tests for ests that handle missing in W

Signed-off-by: Fabio Vera <fabiovera@microsoft.com>

* allow missing in X for some ortholearner subclasses

Signed-off-by: Fabio Vera <fabiovera@microsoft.com>

* refactor keyword arg to be bool only, add more tests

Signed-off-by: Fabio Vera <fabiovera@microsoft.com>

* linting

Signed-off-by: Fabio Vera <fabiovera@microsoft.com>

* enable missing for metalearners and orf, fix dowhywrapped discretetreat dmlorf

Signed-off-by: Fabio Vera <fabiovera@microsoft.com>

* update arg name to allow_missing, add docstrings

Signed-off-by: Fabio Vera <fabiovera@microsoft.com>

* add warning when missing values detected

Signed-off-by: Fabio Vera <fabiovera@microsoft.com>

* dummy commit

Signed-off-by: Fabio Vera <fabiovera@microsoft.com>

* dummy commit revert

Signed-off-by: Fabio Vera <fabiovera@microsoft.com>

---------

Signed-off-by: Fabio Vera <fabiovera@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Can imputation of X, W before cross-fitting induce bias or overconfidence? Best practices to handle NaN/missing values in W or X features?
3 participants