In anticipation of the start of our basic Machine Learning course, we have prepared an interesting translation for you, and we also invite you to watch a free demo lesson on the topic: "How to start making the world a better place with NLP".
Introduction
If you've completed at least a few of your own data science projects, you've probably already figured out that 80% accuracy isn't bad at all. But in the real world, 80% no longer cuts it. In fact, most of the companies I have worked for expect a minimum accuracy (or whatever metric they care about) of at least 90%.
Therefore, I will talk about five things you can do to significantly improve accuracy. I highly recommend that you read all five points as there are many details that beginners may not know.
By the end of this article, you will realize that far more variables play a role in how well your machine learning model performs than you might imagine.
With that said, here are 5 things you can do to improve your machine learning models!
1. Handling Missing Values
One of the biggest mistakes I see is how people handle missing values, and it's not necessarily their fault. A lot of material on the web says that missing values are typically handled with mean imputation, i.e. replacing nulls with the mean of the given feature, and this is usually not the best method.
For example, imagine we have a table with age and fitness score, and an eighty-year-old has a missing fitness score. If we take the average fitness score over an age range from 15 to 80, the eighty-year-old will appear to have a much higher score than he really should.
So the first question you should ask yourself is: why is the data missing in the first place?
Then consider other ways of handling missing data besides mean/median imputation:
Feature prediction modeling: returning to my example with age and fitness score, we can model the relationship between age and fitness score and then use that model to estimate the expected score for a given age. This can be done with several techniques, including regression, ANOVA and others.
K-nearest neighbors imputation: with KNN imputation, the missing value is filled in using the values of its K "nearest neighbors", i.e. the most similar observations (feature similarity).
Deleting the row: finally, you can simply remove the rows with missing data. This is usually not recommended, but it is acceptable when you have an immense amount of data to begin with.
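To make the age/fitness-score example concrete, here is a minimal sketch (with made-up numbers) comparing mean imputation to KNN imputation, using scikit-learn's KNNImputer:

```python
import numpy as np
from sklearn.impute import KNNImputer

# Toy age / fitness-score table; the 80-year-old's score is missing.
X = np.array([
    [20.0, 85.0],
    [25.0, 82.0],
    [70.0, 40.0],
    [78.0, 35.0],
    [80.0, np.nan],
])

# Mean imputation would give the 80-year-old the overall average score,
# far too high for his age group.
mean_filled = np.nanmean(X[:, 1])

# KNN imputation instead borrows from the most similar rows (here the
# 70- and 78-year-olds), giving a much more plausible score.
knn_filled = KNNImputer(n_neighbors=2).fit_transform(X)
print(mean_filled)        # 60.5
print(knn_filled[-1, 1])  # 37.5
```

The KNN estimate (37.5) is close to the scores of the other elderly people in the table, while the plain mean (60.5) is inflated by the younger rows.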
2. Feature Engineering
A great way to significantly improve your model is feature engineering. Feature engineering is the process of transforming raw data into features that better represent the underlying problem you are trying to solve. There is no single prescribed way to do this, which is what makes data science as much an art as a science. That said, here are a few things you can try:
Converting a DateTime variable to extract just the day of the week, the month of the year, etc.
Creating bins or buckets for a numeric variable (e.g. for height: 100-149 cm, 150-199 cm, 200-249 cm, etc.).
Combining several features and/or their values into a new one. For example, one of the most accurate models in the Titanic challenge engineered a new variable called "Is_women_or_child", which was True if the person was a woman or a child and False otherwise.
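All three examples above can be sketched in a few lines of pandas; the column names and values here are invented purely for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "boarded_at": pd.to_datetime(["1912-04-10 09:30", "1912-04-10 14:00"]),
    "height_cm": [112, 183],
    "sex": ["female", "male"],
    "age": [8, 40],
})

# Pull individual parts out of a DateTime column.
df["day_of_week"] = df["boarded_at"].dt.day_name()
df["month"] = df["boarded_at"].dt.month

# Bucket a numeric variable into bins.
df["height_bin"] = pd.cut(
    df["height_cm"],
    bins=[100, 149, 199, 249],
    labels=["100-149", "150-199", "200-249"],
)

# Combine two features into a new boolean flag, Titanic-style.
df["Is_women_or_child"] = (df["sex"] == "female") | (df["age"] < 18)
```

Each new column is a candidate feature; whether it actually helps is something you verify with your model's validation score.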
3. Feature Selection
The third area where you can greatly improve your model is feature selection: choosing the most relevant/valuable features of your dataset. Too many features can cause your algorithm to overfit, and too few can cause it to underfit.
There are two main methods you can use to help with this:
Feature importance: some algorithms, such as random forests or XGBoost, let you determine which features were the most "important" for predicting the target variable. By quickly building one of these models and computing feature importance, you will get an idea of which variables are more useful than others.
Dimensionality reduction: one of the most common dimensionality reduction techniques is principal component analysis (PCA), which takes a large number of features and uses linear algebra to reduce them to fewer features.
4. Ensemble Learning
One of the easiest ways to improve your model is simply to pick a better machine learning algorithm. If you don't yet know what ensemble learning is, now is the time to find out!
Ensemble learning is an approach in which several machine learning algorithms are used together. The point is that it lets you achieve higher predictive performance than any of the individual algorithms on their own.
Popular ensemble learning algorithms include random forests, gradient boosting, XGBoost and AdaBoost. To explain why ensemble algorithms are so powerful, I will give an example with random forests:
A random forest builds many decision trees on bootstrapped datasets (samples drawn with replacement) from the original data. The model then selects the mode (the most frequent value) of all the trees' predictions. What is the point of this? By relying on a "majority wins" scheme, it reduces the risk of error from any individual tree.
For example, if we relied on a single decision tree that predicts 0, we would get 0. But if three out of all 4 trees predict 1, the mode, and therefore the ensemble's answer, would be 1. That is the power of ensemble learning!
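A minimal sketch of this "majority wins" idea, building four decision trees on bootstrapped samples by hand (scikit-learn's RandomForestClassifier does roughly this for you, additionally randomizing the features each tree sees):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=300, random_state=0)

# Grow 4 decision trees, each on its own bootstrapped (sampled with
# replacement) copy of the training data.
trees = []
for seed in range(4):
    idx = rng.integers(0, len(X), size=len(X))
    trees.append(DecisionTreeClassifier(random_state=seed).fit(X[idx], y[idx]))

# Every tree casts a vote; the ensemble answers with the majority (mode).
votes = np.stack([t.predict(X) for t in trees])        # shape (4, 300)
majority = (votes.sum(axis=0) >= 2).astype(int)        # ties go to class 1
print((majority == y).mean())   # ensemble accuracy on the training data
```

With an even number of voters a tie-breaking rule is needed (here ties go to class 1), which is one reason forests usually use many more than 4 trees.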
5. Adjusting Hyperparameters
Finally, something that is not talked about very often but is still very important: adjusting your model's hyperparameters. Here it is essential that you clearly understand the machine learning model you are working with; otherwise it is hard to make sense of what each hyperparameter means.
Take a look at all the hyperparameters of sklearn's RandomForestClassifier:
class sklearn.ensemble.RandomForestClassifier(n_estimators=100, *, criterion='gini', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features='auto', max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, bootstrap=True, oob_score=False, n_jobs=None, random_state=None, verbose=0, warm_start=False, class_weight=None, ccp_alpha=0.0, max_samples=None)
For example, it would probably be a good idea to understand what min_impurity_decrease does, so that when you want your machine learning model to be more "forgiving", you can adjust this parameter! ;)
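One common way to tune these hyperparameters systematically, rather than one by one by hand, is a cross-validated grid search. A small sketch using scikit-learn's GridSearchCV (the grid values below are arbitrary examples):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_breast_cancer(return_X_y=True)

# Search a small grid over a few of the hyperparameters listed above.
grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={
        "n_estimators": [50, 100],
        "max_depth": [3, None],
        "min_impurity_decrease": [0.0, 0.01],
    },
    cv=3,
)
grid.fit(X, y)
print(grid.best_params_)
print(round(grid.best_score_, 3))   # mean cross-validated accuracy
```

Each combination is scored by cross-validation, so the "best" settings are chosen by validation performance rather than by guesswork.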
Thanks for reading!
After reading this article, you should have some ideas on how to go from 80% to 90+% accuracy. This information will also come in handy in your future Data Science projects.