"PyTorch is better for research" is a weird, unsubstantiated statement. The fact is that few serious researchers use PyTorch (and even those complain about it). It's mostly grad students in a handful of labs. The only researchers I know who use PyTorch have been from FaceBook, and that's because they were implicitly forced to use it (PyTorch is developed by FaceBook).
According to https://medium.com/@karpathy/icml-accepted-papers-institutio... , 3 of the top research labs in the world are DeepMind, Google Brain (and the rest of Google), and Microsoft Research. Let's see:
* DeepMind: TensorFlow
* Google Brain: TensorFlow
* Microsoft Research: CNTK
Ok, so what about academia? The top deep learning groups in academia are:
* Montreal: Theano
* Toronto: TensorFlow
* IDSIA: TensorFlow
So, what about the greater academic research community? Maybe we could get some data about who uses what by looking at the frameworks cited by researchers in their papers. Andrej did that: it's mainly TensorFlow and Caffe. https://medium.com/@karpathy/a-peek-at-trends-in-machine-lea...
Few people use PyTorch largely because it is relatively new (0.1.12). It doesn't even have distributed training capabilities yet (coming in 0.2). Your arguments don't say anything about the frameworks themselves. It is unfair!
When people say PyTorch is better for research, they mean it is more flexible and that it is easier to implement non-trivial network architectures with it, such as recursive networks, which are cumbersome to build in TensorFlow. MXNet's documentation provides a good overview of these two different programming styles (http://mxnet.io/architecture/program_model.html).
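To make the flexibility point concrete, here is a minimal sketch of a recursive network in PyTorch (written against a recent API; thread-era 0.1.x code would additionally wrap tensors in Variable). The TreeNode and RecursiveNet names are purely illustrative, not from any library; the point is that the per-example tree shape is handled with ordinary Python recursion, exactly the data-dependent control flow that is awkward to express in a static TensorFlow graph.

    # Minimal sketch of a recursive (tree-structured) network in PyTorch.
    # The graph is rebuilt on every forward pass by plain Python recursion,
    # so each example can have a different tree shape.
    import torch
    import torch.nn as nn

    class TreeNode:
        """A leaf holds a word index; an internal node holds two children."""
        def __init__(self, word=None, left=None, right=None):
            self.word, self.left, self.right = word, left, right

    class RecursiveNet(nn.Module):
        def __init__(self, vocab_size, dim):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, dim)
            self.compose = nn.Linear(2 * dim, dim)  # combines two child vectors

        def forward(self, node):
            if node.word is not None:                 # leaf: embedding lookup
                return self.embed(torch.tensor([node.word]))
            left = self.forward(node.left)            # Python recursion drives
            right = self.forward(node.right)          # the graph structure
            return torch.tanh(self.compose(torch.cat([left, right], dim=1)))

    # Usage: every example can come with its own tree.
    net = RecursiveNet(vocab_size=100, dim=8)
    tree = TreeNode(left=TreeNode(word=3),
                    right=TreeNode(left=TreeNode(word=7), right=TreeNode(word=1)))
    out = net(tree)       # shape (1, 8)
    out.sum().backward()  # autograd works over the recursively built graph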
Yep, to make it clear: TensorFlow is like Angular (acclaimed, widely used) and PyTorch is like React (much more flexible and composable). Funnily enough, both pairs are made by Google and Facebook respectively. History repeats itself.
All of these frameworks are "relatively new". TensorFlow: 1.6 years. CNTK: 1 year. PyTorch: 0.5 year. Are they really impossible to compare?
> When people say PyTorch is better for research, they mean
That's not what "people" say. They tend to say the opposite. Maybe we can ask OP what he meant when he said it.
> it is easier to implement non-trivial network architectures with it, such as recursive network
It is interesting that you mention recursive networks. There are only a few dozen researchers who work with recursive networks, and they are all accounted for; we know what tools they use. They use Chainer and DyNet.
With all due respect, you don't appear to know much about what you are commenting on. There is a huge community of research and applications around RNNs and all the other architectures. PyTorch has an extremely lively and fast-growing codebase, across all neural architectures and applications; it's remarkable, actually. One reason might be that it is simple and effective, including easy debugging. Another is that research is now heavily focused on new, possibly complex and exotic architectures, new optimizers, and understanding models' inner workings and behavior, including theoretically. And it appears PyTorch makes that easy. Don't throw it away too fast as a DL enthusiast :)
> All of these frameworks are "relatively new". TensorFlow: 1.6 years. CNTK: 1 year. PyTorch: 0.5 year. Are they really impossible to compare?
1.6 years is a long time in the DL community.
> That's not what "people" say. They tend to say the opposite. Maybe we can ask OP what he meant when he said it.
Go ahead! Ask it.
> They use Chainer and DyNet.
You know Chainer came out well before PyTorch and heavily influenced PyTorch's design. You keep saying that such-and-such a group uses such-and-such a framework. Why not talk about the frameworks themselves?
If you insist on your view, let's leave it at that. I don't want to start a framework war; I just want all frameworks to be considered equally.
As long as you're citing @karpathy, "I've been using PyTorch a few months now and I've never felt better. I have more energy. My skin is clearer. My eye sight has improved." (https://twitter.com/karpathy/status/868178954032513024).
My two cents as a researcher who has used Theano, Caffe, PyTorch and TF: they all have their pros and cons. After starting out with Theano, I really appreciate the dynamic nature of PyTorch: it makes debugging and exploration easier compared to the static frameworks. Researchers tend to value these features over deployability, scalability and raw speed (though PyTorch is no slouch). So I fully expect PyTorch to get a lot of momentum in the near future.
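As a rough illustration of what that dynamic nature buys you (a hedged sketch with made-up layer sizes, not taken from the parent comment): because the graph is built by ordinary Python code as it runs, you can print intermediate tensors or drop into the standard debugger mid-forward-pass, instead of fetching symbolic nodes through a session.

    # Minimal sketch: inspecting an intermediate activation in a
    # define-by-run framework (PyTorch). Model and sizes are invented.
    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
    x = torch.randn(4, 10)

    h = model[0](x)                          # run just the first layer, eagerly
    print(h.mean().item(), h.std().item())   # real values, available right away
    # import pdb; pdb.set_trace()            # or step through with the debugger
    out = model(x)                           # full forward pass as usual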
Your list of who uses what is kind of contaminated by who wrote the code; I don't think it proves anything.
Obviously Google Brain use TF and Montreal use Theano - they wrote them. DeepMind use TF, but they used Torch before the Google takeover. Similarly, Google and Toronto are deeply intertwined.