Deep Learning for Natural Language Processing – ICLR 2017 Discoveries (amundtveit.com)
68 points by amund on Nov 12, 2016 | 8 comments


A few brief notes:

Not at all sure the recommender-system papers should be grouped with the QA papers. The techniques used aren't that closely related.

FastText(.zip) continues to be a weird project.

LEARNING A NATURAL LANGUAGE INTERFACE WITH NEURAL PROGRAMMER - yeah, that's something. The neural programmer and related work seem like something out of sci-fi, even more so than normal neural networks.


The neural programmer is indeed interesting. The current research into having neural networks learn algorithms and procedures for data retrieval and computation is really exciting (to me at least). It almost makes the image classification work of a few years ago seem quaint.

For those who don't want to read the paper, it's a way to (try to) have a computer learn to answer questions on datasets, like those contained here: http://nlp.stanford.edu/software/sempre/wikitable/viewer/#20...
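To make the task concrete, here's a toy instance in the spirit of those WikiTableQuestions entries (the table, question, and numbers are made up, and the hand-written "program" at the end just illustrates the kind of operation sequence the model is supposed to induce on its own):

    # Hypothetical table-QA instance, loosely modeled on WikiTableQuestions.
    table = {
        "columns": ["Nation", "Gold", "Silver", "Bronze"],
        "rows": [
            ["Norway", 11, 7, 6],
            ["Germany", 10, 13, 7],
            ["Canada", 10, 10, 5],
        ],
    }
    question = "Which nation won the most gold medals?"

    # The model must learn to compose operations over the table, e.g.
    # argmax over rows by the "Gold" column, then select "Nation":
    gold_idx = table["columns"].index("Gold")
    best_row = max(table["rows"], key=lambda r: r[gold_idx])
    answer = best_row[table["columns"].index("Nation")]
    print(answer)  # Norway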


> FastText(.zip) continues to be a weird project.

What is weird to you about the project? I haven't looked at the details, but the motivation pretty clearly seems to be running deep learning models on people's phones without seriously impacting UX.

Hell, even running a large vocabulary model on a server can be annoying when these models take ~10GB to just store the word vectors.
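Back-of-the-envelope on that figure, assuming a web-scale vocabulary and typical word2vec settings (both numbers are my guesses, not from any paper):

    # Rough memory footprint of a word embedding table.
    vocab_size = 10_000_000   # assumed web-scale vocabulary
    dims = 300                # typical word2vec dimensionality
    bytes_per_float = 4       # float32

    size_gb = vocab_size * dims * bytes_per_float / 1024**3
    print(f"{size_gb:.1f} GB")  # ~11.2 GB, roughly the figure above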


Well, it isn't deep learning for one thing.

Basically it's a reappraisal of early-2000s-style manually engineered features. It's good work, but doesn't add much over Vowpal Wabbit.
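For anyone who hasn't looked at it: the core of FastText is roughly "hash word and character n-grams into an embedding table, average them, feed one linear layer". A minimal numpy sketch of that idea (bucket sizes and the hashing details are made up for illustration, and the weights would of course be trained, not random):

    import numpy as np

    # FastText-style classifier sketch: hashed bag of word/character
    # n-grams, averaged embeddings, a single linear layer. No depth.
    BUCKETS, DIM, CLASSES = 2**20, 50, 2
    E = np.random.randn(BUCKETS, DIM) * 0.01   # embedding table (learned in practice)
    W = np.random.randn(DIM, CLASSES) * 0.01   # linear classifier (learned in practice)

    def features(text, n=3):
        """Words plus character trigrams, hashed into embedding buckets."""
        toks = text.split()
        grams = [w[i:i+n] for w in toks for i in range(max(len(w) - n + 1, 1))]
        return [hash(t) % BUCKETS for t in toks + grams]

    def predict(text):
        idx = features(text)
        h = E[idx].mean(axis=0)    # average of feature embeddings
        logits = h @ W             # one shallow linear layer
        e = np.exp(logits - logits.max())
        return e / e.sum()         # softmax over classes

    print(predict("this movie was great"))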

I haven't read the .zip paper in depth, but the mobile angle doesn't seem convincing to me. Text models generally just aren't that big! Drop the number of dimensions in W2V and it's really pretty small, and still expressive.

Don't get me wrong - I like FastText. But it surprises me that it remains a research direction - almost everyone else is trying other approaches to get an AlexNet-like breakthrough on NLP tasks. It's pretty clear that breakthrough won't come from the FastText approach.


> Well, it isn't deep learning for one thing.

You know, I didn't actually realise that. I had only glanced at it and assumed they were applying these ideas to deep models.

> Drop the number of dimensions in W2V and it's really pretty small, and still expressive.

I don't think it's crazy to want to get better performance within small memory budgets, though.

They're working on other directions too, but maybe this is useful for their product groups.


What do you think are the more promising approaches? If you could link to some papers, I'd love to read some of them.


I put together a write-up of interesting ICLR submissions, primarily for NLP, and included a small snippet for each explaining why I thought the paper was important or how it fits with other recent relevant papers.

http://smerity.com/articles/2016/iclr_2017_submissions.html

I'd also strongly recommend people read through all the submissions themselves, as every person seems to select a different set of papers depending on what they're interested in. My write-up has very few papers discussing CNNs, for example, as I'm focused on RNNs for the most part.

http://openreview.net/group?id=ICLR.cc/2017/conference


"Error establishing a database connection"



