Skip to content
This repository was archived by the owner on Nov 19, 2020. It is now read-only.
This repository was archived by the owner on Nov 19, 2020. It is now read-only.

DecisionTrees\Pruning\ErrorBasedPruning: Possibly-biased comparison between errors #236

@YaronK

Description

@YaronK

There's possibly an issue is with the ErrorBasedPruning class - The three errors (baseline, replace and prune), computed to determine the best action to perform on the current node, are not computed in the same manner.

The algorithm referenced in the comments specifies the procedure performs bottom-up traversal over all nodes and compares the values:
errors

According to the lowest value the procedure either leaves the tree as is, prune the node t, or replaces the node t with the subtree rooted by maxchild(T,t).

The computations of these errors are implemented in the following methods:

  1. computeErrorSubtree
  2. computeErrorWithoutSubtree
  3. computeErrorReplacingSubtrees

computeErrorWithoutSubtree and computeErrorReplacingSubtrees apply the changes (pruning / replacing subtrees) temporarily, call computeError to compute the error, and revert the changes. computeError considers all the observations.
However, computeErrorSubtree computes the local error (only observations which reach the current node):
I believe this might cause the following comparisons to be biased, and that all errors should be computed globally.

These are snippets of computeError and computeErrorSubtree:

private double computeError()
{
    return new ZeroOneLoss(outputs) { Mean = true }.Loss(tree.Decide(inputs));
}
private double computeErrorSubtree(int[] indices)
{
    int error = 0;
    foreach (int i in indices)
        if (outputs[i] != actual[i]) error++;

    return error / (double)indices.Length;
}

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions