Feature Selection for Machine Learning

The data features used to train a machine learning model have a great impact on its ultimate performance. Irrelevant or only partially relevant features can negatively influence the model. Common automatic feature selection techniques are:

- Univariate Selection
- Recursive Feature Elimination
- Principal Component Analysis
- Feature Importance

Benefits of feature selection techniques:

- Reduces overfitting
- Improves accuracy
- Reduces training time

Univariate Selection: Univariate selection uses statistical tests to pick the features that have the strongest relationship with the output variable. The example below uses scikit-learn, which provides the SelectKBest class; combined with the chi-squared (chi2) statistical test for non-negative features, it can select the 4 best features from the Pima Indians Diabetes dataset.

Recursive Feature Elimination: RFE works by recursively removing attributes and building a model on the attributes that remain; at each step the weakest features (according to the model's coefficients or feature importances) are eliminated until the desired number of features is left. A sketch follows the univariate selection example below.
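The snippet below is a minimal sketch of the univariate approach described above, using scikit-learn's SelectKBest with the chi2 score function. The file name pima-indians-diabetes.csv and the column names are assumptions about how a local copy of the dataset is stored; adjust them to match your data.

```python
# Univariate selection sketch: keep the 4 features with the highest chi2 score.
# Assumes a headerless local CSV of the Pima Indians Diabetes data
# (8 non-negative input features plus the class label).
import pandas as pd
from sklearn.feature_selection import SelectKBest, chi2

names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
data = pd.read_csv('pima-indians-diabetes.csv', names=names)  # assumed path
X = data.values[:, 0:8]   # input features (non-negative, as chi2 requires)
y = data.values[:, 8]     # class label

# Score each feature against the target and keep the 4 highest-scoring ones
selector = SelectKBest(score_func=chi2, k=4)
fit = selector.fit(X, y)

print(fit.scores_)             # chi-squared score per feature
X_selected = fit.transform(X)  # reduced feature matrix with 4 columns
print(X_selected[:5, :])
```

The features with the largest scores are the ones SelectKBest retains; transform() then returns the dataset restricted to those columns.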
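Below is a minimal sketch of RFE on the same data, paired with a logistic regression estimator to keep 4 features. The estimator choice, file name, and column names are assumptions for illustration; any scikit-learn estimator that exposes coef_ or feature_importances_ can be used.

```python
# RFE sketch: repeatedly fit the model and drop the weakest feature
# until only 4 features remain.
import pandas as pd
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
data = pd.read_csv('pima-indians-diabetes.csv', names=names)  # assumed path
X = data.values[:, 0:8]
y = data.values[:, 8]

model = LogisticRegression(solver='liblinear')  # assumed estimator choice
rfe = RFE(estimator=model, n_features_to_select=4)
fit = rfe.fit(X, y)

print("Num features:", fit.n_features_)
print("Selected:", fit.support_)   # boolean mask of the chosen features
print("Ranking:", fit.ranking_)    # 1 = selected; higher = eliminated earlier
```

The support_ mask and ranking_ array show which attributes survived the recursive elimination and in what order the others were dropped.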