
I've been running an automated machine learning SaaS for 2 years, and after this time I can tell you that it is a huge problem that data scientists live in their own world (including me! and including data science tools).

Here's a situation I had: my user created 50 ML models (xgboost, lightgbm, NN, rf) and an ensemble of them. Let's say the single best tuned model was 5% better than the best model with default hyper-params, and the ensemble was 2% better than the tuned best single model. To me that was a huge success, but the user didn't care about model performance. He wanted insights about the data, not a tuned black box.
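For concreteness, the workflow above can be sketched roughly like this (a minimal, hypothetical example: synthetic data and scikit-learn models stand in for the xgboost/lightgbm/NN zoo, and the ensemble is a plain prediction average):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the user's data.
X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Two model families stand in for the 50 tuned models.
models = [
    RandomForestRegressor(n_estimators=200, random_state=0),
    GradientBoostingRegressor(random_state=0),
]
preds = []
for m in models:
    m.fit(X_tr, y_tr)
    preds.append(m.predict(X_te))

# Simple averaging ensemble: often (not always) a small extra gain
# over the best single model -- the "2% better" kind of improvement.
ensemble_pred = np.mean(preds, axis=0)
best_single_mae = min(mean_absolute_error(y_te, p) for p in preds)
ensemble_mae = mean_absolute_error(y_te, ensemble_pred)
print(f"best single MAE: {best_single_mae:.2f}, ensemble MAE: {ensemble_mae:.2f}")
```

The catch, as the user pointed out: nothing in those two MAE numbers tells you anything about the data itself.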



I understand every single word you said and fully agree with you. Good point!


In an interview, Ryan Caldbeck from Circle Up describes two categories of models: brainy models and brawny models. The ensemble described above sounds like a brawny model: you don't care how it made the decision, you're glad it did the heavy lifting, and you might even double-check the result.

However, the user's concern about the black box suggests they wanted what Ryan refers to as a brainy model, one with explicable decisions. Even within the features of the model there could be things to learn about the data.
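One cheap way to pull "brainy" insight out of an otherwise brawny model is to inspect its feature importances. A hedged sketch (hypothetical synthetic data; a random forest stands in for the user's models, and impurity-based importances are only a rough signal, not a full explanation):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data: only 3 of the 8 features actually carry signal.
X, y = make_classification(n_samples=300, n_features=8, n_informative=3,
                           random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Rank features by importance -- a first, rough answer to
# "what in my data drives the prediction?"
ranked = sorted(enumerate(clf.feature_importances_), key=lambda t: -t[1])
for i, imp in ranked[:3]:
    print(f"feature {i}: importance {imp:.3f}")
```

Tools like permutation importance or SHAP values go further in the same direction, but even this level of output is closer to what that user was asking for than a leaderboard of MAE scores.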

How else are data scientists stuck within their own world?



