
Feature Request: Pipelining Outlier Removal · Issue #9630 · scikit-learn/scikit-learn
Yes, #3855, #4143, #4552 and scikit-learn/enhancement_proposals#2 all relate.

I don't see why resample is entirely inappropriate here.

Meta-estimator (almost; untested):

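from sklearn.base import BaseEstimator, ClassifierMixin, clone


class WithoutOutliersClassifier(BaseEstimator, ClassifierMixin):
    def __init__(self, outlier_detector, classifier):
        self.outlier_detector = outlier_detector
        self.classifier = classifier

    def fit(self, X, y):
        # Clone so the estimators passed to __init__ are left unfitted.
        self.outlier_detector_ = clone(self.outlier_detector)
        # fit_predict returns 1 for inliers and -1 for outliers.
        mask = self.outlier_detector_.fit_predict(X, y) == 1
        # Boolean indexing assumes X and y are NumPy arrays.
        self.classifier_ = clone(self.classifier).fit(X[mask], y[mask])
        return self

    def predict(self, X):
        # Outliers are dropped only during fit; every sample is predicted.
        return self.classifier_.predict(X)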
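
A minimal usage sketch of the meta-estimator above, not part of the original comment. The choice of IsolationForest and LogisticRegression, the contamination value, and the synthetic data are all assumptions made for illustration; any detector whose fit_predict returns 1 for inliers should work the same way.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin, clone
from sklearn.ensemble import IsolationForest
from sklearn.linear_model import LogisticRegression


class WithoutOutliersClassifier(BaseEstimator, ClassifierMixin):
    """Same meta-estimator as in the comment, repeated so this runs alone."""

    def __init__(self, outlier_detector, classifier):
        self.outlier_detector = outlier_detector
        self.classifier = classifier

    def fit(self, X, y):
        self.outlier_detector_ = clone(self.outlier_detector)
        mask = self.outlier_detector_.fit_predict(X, y) == 1
        self.classifier_ = clone(self.classifier).fit(X[mask], y[mask])
        return self

    def predict(self, X):
        return self.classifier_.predict(X)


# Illustrative data (assumed): two Gaussian blobs plus one extreme point.
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(-2.0, 0.5, size=(50, 2)),
    rng.normal(2.0, 0.5, size=(50, 2)),
    [[30.0, 30.0]],
])
y = np.array([0] * 50 + [1] * 50 + [0])

model = WithoutOutliersClassifier(
    outlier_detector=IsolationForest(contamination=0.05, random_state=0),
    classifier=LogisticRegression(),
)
model.fit(X, y)

# Predictions cover every row, including rows flagged as outliers in fit.
preds = model.predict(X)

# Count how many training rows the fitted detector treats as inliers.
n_kept = int((model.outlier_detector_.predict(X) == 1).sum())
print(len(X), n_kept)
```

Because the filtering happens inside fit, the wrapper can be used as the final step of a Pipeline or inside cross-validation without changing the number of samples seen by predict.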
sklearn  pipeline  meta  estimator  transform  target  sample 
january 2019 by dabana
