Day 1: Spend some time setting your machine up for machine learning. For Python, look at NumPy, IPython Notebook, SciPy, Pandas, scikit-learn, Matplotlib, Seaborn, NLTK, the XGBoost wrapper, the Vowpal Wabbit wrapper, and Theano + nolearn.
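Once everything is installed, a quick sanity check is to import the core stack and print versions (Theano, XGBoost, and Vowpal Wabbit are left out here since their installation varies a lot by platform):

```python
# Sanity check: import the core scientific Python stack and show versions.
import numpy
import scipy
import pandas
import sklearn
import matplotlib

for mod in (numpy, scipy, pandas, sklearn, matplotlib):
    print(mod.__name__, mod.__version__)
```

If any import fails, fix that package before moving on; everything later in the week builds on these.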
Day 2: Learn how to manipulate Numpy arrays ( http://www.engr.ucsb.edu/~shell/che210d/numpy.pdf ) and how to read and manipulate data with Pandas ( https://www.youtube.com/watch?v=p8hle-ni-DM ).
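As a minimal sketch of the kind of fluency to aim for (the DataFrame below is a made-up stand-in for something you would normally load with `pd.read_csv`):

```python
import numpy as np
import pandas as pd

# NumPy: create, reshape, slice, and aggregate an array.
a = np.arange(12).reshape(3, 4)   # 3x4 matrix of 0..11
col_means = a.mean(axis=0)        # mean of each column
evens = a[a % 2 == 0]             # boolean-mask selection

# Pandas: a small illustrative DataFrame, then a typical filter + group-by.
df = pd.DataFrame({
    "city": ["NY", "NY", "SF", "SF"],
    "price": [10.0, 12.0, 20.0, 18.0],
})
expensive = df[df["price"] > 11]                  # row filtering
mean_price = df.groupby("city")["price"].mean()   # per-group aggregation
print(mean_price)
```

Array reshaping, boolean indexing, and group-by aggregation cover a surprising amount of the data wrangling you will do on Kaggle.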
Day 3: Do the Kaggle Titanic survival prediction challenge with Random Forests. ( https://www.kaggle.com/c/titanic/details/getting-started-wit... )
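The workflow looks roughly like the sketch below. In the real challenge you would load Kaggle's train.csv with `pd.read_csv`; the tiny inline DataFrame here is just a stand-in using the same column names:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Stand-in for pd.read_csv("train.csv") from the Titanic competition.
train = pd.DataFrame({
    "Pclass":   [3, 1, 3, 1, 2, 3, 2, 1],
    "Sex":      ["male", "female", "female", "male",
                 "female", "male", "male", "female"],
    "Age":      [22, 38, 26, 35, 27, 54, 30, 28],
    "Fare":     [7.25, 71.3, 7.9, 53.1, 21.0, 51.9, 13.0, 26.6],
    "Survived": [0, 1, 1, 0, 1, 0, 0, 1],
})

# Encode the categorical Sex column as 0/1 so the trees can use it.
train["Sex"] = (train["Sex"] == "female").astype(int)

features = ["Pclass", "Sex", "Age", "Fare"]
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(train[features], train["Survived"])

print(clf.predict(train[features]))  # predictions on the training rows
```

On the real data you would also handle missing Age values and submit predictions for test.csv, but the fit/predict shape stays the same.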
Day 4: Study the scikit-learn documentation ( http://scikit-learn.org/stable/documentation.html ). Run a few examples. Change RandomForestClassifier into SGDClassifier and play with the results. Scale the data to make it perform better. Combine an RF model and an SGD model through averaging and try to improve the benchmark score.
Day 5: Study the ensemble module of scikit-learn. Try the examples on the wiki of XGBoost ( https://github.com/dmlc/xgboost/tree/master/demo/binary_clas... ) and Vowpal Wabbit ( http://zinkov.com/posts/2013-08-13-vowpal-tutorial/ ). Practically, you want to reach the stage where you can transform the data into a form the algorithm accepts, evaluate the model, and get the predictions back out in a sensible form.
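The data-transformation step is often the real hurdle. For example, Vowpal Wabbit reads plain-text lines of the form `<label> | name:value name:value ...`; a small helper to produce that format from (label, feature-dict) rows might look like this (the feature names here are made up for illustration):

```python
def to_vw_line(label, features):
    """Format one example as a Vowpal Wabbit input line
    (default namespace, numeric features)."""
    feats = " ".join(f"{name}:{value}" for name, value in features.items())
    return f"{label} | {feats}"

# VW conventionally uses -1/1 labels for binary classification.
rows = [
    (1, {"age": 22, "fare": 7.25}),
    (-1, {"age": 38, "fare": 71.3}),
]

vw_text = "\n".join(to_vw_line(label, feats) for label, feats in rows)
print(vw_text)
# You would write vw_text to a file and pass it to the vw command line tool.
```

XGBoost's demo data uses the similar LibSVM format (`<label> index:value ...`), so the same "rows in, text lines out" pattern applies there too.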
Then next week start competing on Kaggle and form a team with people at your level. You will learn a lot that way and start to open up the black box.
I found this series very accessible: http://blog.kaggle.com/2015/04/22/scikit-learn-video-3-machi...
Kaggle also recently released a feature to run machine learning scripts in your browser. You could try those out to explore Python, R, common pipelines, and even more advanced neural nets: https://www.kaggle.com/users/9028/danb/digit-recognizer/big-... .