Matthews correlation for multiple features?

I'd like to add the Matthews correlation coefficient (MCC) for the multilabel case.
There are essentially a few options:

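* ("micro") flatten predictions and targets across all labels, then calculate a single MCC
* ("macro") calculate MCC per label (output column) and average the results
* Implement both micro and macro, selectable via an `average` argument as in the F1 score.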
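
To make the options concrete, here is a minimal sketch of what I have in mind. The name `multilabel_mcc` and its `average` argument are hypothetical (they are not part of scikit-learn); the only library call used is the existing `sklearn.metrics.matthews_corrcoef`, applied to 1-D binary arrays. It assumes `y_true` and `y_pred` are binary indicator matrices of shape `(n_samples, n_labels)`.

```python
import numpy as np
from sklearn.metrics import matthews_corrcoef


def multilabel_mcc(y_true, y_pred, average="macro"):
    """Hypothetical helper: MCC for binary indicator matrices.

    average="micro": flatten all (sample, label) cells and compute one MCC.
    average="macro": compute MCC per label column, then take the unweighted mean.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    if y_true.ndim != 2 or y_true.shape != y_pred.shape:
        raise ValueError("expected two 2-D arrays of identical shape")

    if average == "micro":
        # Pool every cell into one binary problem.
        return float(matthews_corrcoef(y_true.ravel(), y_pred.ravel()))
    if average == "macro":
        # One binary problem per label, averaged with equal weight.
        per_label = [
            matthews_corrcoef(y_true[:, j], y_pred[:, j])
            for j in range(y_true.shape[1])
        ]
        return float(np.mean(per_label))
    raise ValueError("average must be 'micro' or 'macro'")


y_true = np.array([[1, 0], [0, 1], [1, 1], [0, 0]])
y_pred = np.array([[1, 0], [0, 1], [0, 1], [0, 0]])
micro = multilabel_mcc(y_true, y_pred, average="micro")
macro = multilabel_mcc(y_true, y_pred, average="macro")
```

The two variants can disagree: micro pools all cells, so frequent labels dominate, while macro weights every label equally. One caveat with macro is that a label whose targets or predictions are constant has an undefined MCC; as far as I can tell `matthews_corrcoef` returns 0 in that case, which pulls the average toward 0.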

Since scikit-learn doesn't support this out of the box, there may be a good reason why this is a terrible idea, in which case I'd like to learn why.