Have a look at the RCV1 benchmarks on this page: http://leon.bottou.org/projects...

ogrisel · on June 25, 2012

One unmentioned caveat of SGD is how to configure the learning rate schedule. scikit-learn is using Bottou's tricks that seem to work reasonably well in practice but it might even be better to implement the online estimate of optimal learning rate schedule from this NIPS 2012 pre-print: http://arxiv.org/abs/1206.1106 (No More Pesky Learning Rates).