"PyTorch is better for research" is a weird, unsubstantiated statement. The fact is that few serious researchers use PyTorch (and even those complain about it). It's mostly grad students in a handful of labs. The only researchers I know who use PyTorch have been from FaceBook, and that's because they were implicitly forced to use it (PyTorch is developed by FaceBook).
According to https://medium.com/@karpathy/icml-accepted-papers-institutio... , 3 of the top research labs in the world are DeepMind, Google Brain (and the rest of Google), and Microsoft Research. Let's see:
* DeepMind: TensorFlow
* Google Brain: TensorFlow
* Microsoft Research: CNTK
Ok, so what about academia? The top deep learning groups in academia are:
* Montreal: Theano
* Toronto: TensorFlow
* IDSIA: TensorFlow
So, what about the greater academic research community? Maybe we could get some data about who uses what by looking at the frameworks cited by researchers in their papers. Andrej did that: it's mainly TensorFlow and Caffe. https://medium.com/@karpathy/a-peek-at-trends-in-machine-lea...
Few people use PyTorch largely because it is relatively new (0.1.12). It doesn't even have distributed training capabilities yet (coming in 0.2). Your arguments don't say anything about the frameworks themselves. It is unfair!
When people say PyTorch is better for research, they mean it is more flexible and that it is easier to implement non-trivial network architectures with it, such as recursive networks, which are cumbersome to build in TensorFlow. MXNet's documentation provides a good overview of these two different programming styles (http://mxnet.io/architecture/program_model.html).
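To make the flexibility point concrete, here is a minimal sketch of a recursive network in PyTorch (written against a recent API; thread-era 0.1.x code would additionally wrap tensors in Variable). The TreeNode and RecursiveNet names are purely illustrative, not from any library; the point is that the per-example tree shape is handled with ordinary Python recursion, exactly the data-dependent control flow that is awkward to express in a static TensorFlow graph.

    # Minimal sketch of a recursive (tree-structured) network in PyTorch.
    # The graph is rebuilt on every forward pass by plain Python recursion,
    # so each example can have a different tree shape.
    import torch
    import torch.nn as nn

    class TreeNode:
        """A leaf holds a word index; an internal node holds two children."""
        def __init__(self, word=None, left=None, right=None):
            self.word, self.left, self.right = word, left, right

    class RecursiveNet(nn.Module):
        def __init__(self, vocab_size, dim):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, dim)
            self.compose = nn.Linear(2 * dim, dim)  # combines two child vectors

        def forward(self, node):
            if node.word is not None:                 # leaf: embedding lookup
                return self.embed(torch.tensor([node.word]))
            left = self.forward(node.left)            # Python recursion drives
            right = self.forward(node.right)          # the graph structure
            return torch.tanh(self.compose(torch.cat([left, right], dim=1)))

    # Usage: every example can come with its own tree.
    net = RecursiveNet(vocab_size=100, dim=8)
    tree = TreeNode(left=TreeNode(word=3),
                    right=TreeNode(left=TreeNode(word=7), right=TreeNode(word=1)))
    out = net(tree)       # shape (1, 8)
    out.sum().backward()  # autograd works over the recursively built graph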
Yep, to make it clear: TensorFlow is like Angular (acclaimed, widely used) and PyTorch is like React (much more flexible and composable). Funnily enough, both pairs are made by Google and Facebook respectively. History repeats itself.
All of these frameworks are "relatively new". TensorFlow: 1.6 years. CNTK: 1 year. PyTorch: 0.5 year. Are they really impossible to compare?
> When people say PyTorch is better for research, they mean
That's not what "people" say. They tend to say the opposite. Maybe we can ask OP what he meant when he said it.
> it is easier to implement non-trivial network architectures with it, such as recursive network
It is interesting that you mention recursive networks. There are only a few dozen researchers who work with recursive networks, and they are all accounted for; we know what tools they use. They use Chainer and DyNet.
With all due respect, you don't appear to know much about what you are commenting on. There is a huge community of research and applications around RNNs and all the other architectures. PyTorch has an extremely lively and fast-growing codebase, across all neural architectures and applications; it's remarkable, actually. One reason might be that it is simple and effective, including easy debugging. Another is that research is now heavily focused on new, possibly complex and exotic architectures, new optimizers, and understanding models' inner workings and behavior, including theoretically. And it appears PyTorch makes that easy. Don't throw it away too fast as a DL enthusiast :)
> All of these frameworks are "relatively new". TensorFlow: 1.6 years. CNTK: 1 year. PyTorch: 0.5 year. Are they really impossible to compare?
1.6 years is a long time in the DL community.
> That's not what "people" say. They tend to say the opposite. Maybe we can ask OP what he meant when he said it.
Go ahead! Ask it.
> They use Chainer and DyNet.
You know Chainer came out well before PyTorch and heavily influenced PyTorch's design. You keep saying that such-and-such a group uses such-and-such a framework. Why not talk about the frameworks themselves?
If you insist on your view, let's leave it at that. I don't want to start a framework war; I just want all frameworks to be considered equally.
As long as you're citing @karpathy, "I've been using PyTorch a few months now and I've never felt better. I have more energy. My skin is clearer. My eye sight has improved." (https://twitter.com/karpathy/status/868178954032513024).
My two cents as a researcher who has used Theano, Caffe, PyTorch and TF: they all have their pros and cons. After starting out with Theano, I really appreciate the dynamic nature of PyTorch: it makes debugging and exploration easier compared to the static frameworks. Researchers tend to value these features over deployability, scalability and raw speed (though PyTorch is no slouch). So I fully expect PyTorch to get a lot of momentum in the near future.
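As a rough illustration of what that dynamic nature buys you (a hedged sketch with made-up layer sizes, not taken from the parent comment): because the graph is built by ordinary Python code as it runs, you can print intermediate tensors or drop into the standard debugger mid-forward-pass, instead of fetching symbolic nodes through a session.

    # Minimal sketch: inspecting an intermediate activation in a
    # define-by-run framework (PyTorch). Model and sizes are invented.
    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
    x = torch.randn(4, 10)

    h = model[0](x)                          # run just the first layer, eagerly
    print(h.mean().item(), h.std().item())   # real values, available right away
    # import pdb; pdb.set_trace()            # or step through with the debugger
    out = model(x)                           # full forward pass as usual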
Your list of who uses what is kind of contaminated by who wrote the code; I don't think it proves anything.
Obviously Google Brain use TF and Montreal use Theano - they wrote them. DeepMind use TF, but they used Torch before the Google takeover. Similarly, Google and Toronto are deeply intertwined.