I tried to do this last year, but every single twss-related handle was taken. Don't believe me? Look for yourself. I can't wait for the day when Twitter starts reclaiming unused/squatted names (and offers a paid service that lets you claim one for life).
I think your negative sample set is a little biased. Since all the negative phrases start with verbs ("was in the car", "went to the park"), any phrase of that shape is given a lower probability.
For example:
> twss.prob("was on a stiff pole");
0.016050826334564946
Only 1.6% chance of that's what she said?!?
EDIT: Counter example:
> twss.prob("that's one stiff pole");
0.9767718880285885
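For what it's worth, here's a toy sketch (my own, not twss.js's actual code) of why a verb-heavy negative set drags these scores around: in a Naive Bayes-style word classifier, a token like "was" that only ever appears in the negative samples gets a strong not-twss weight, no matter what follows it.

```javascript
// Toy word-level Naive Bayes with add-one smoothing (my own sketch,
// not twss.js's actual implementation).
function train(positives, negatives) {
  const counts = { pos: {}, neg: {} };
  const totals = { pos: 0, neg: 0 };
  const add = (label, phrase) => {
    for (const tok of phrase.toLowerCase().split(/\s+/)) {
      counts[label][tok] = (counts[label][tok] || 0) + 1;
      totals[label]++;
    }
  };
  positives.forEach(p => add('pos', p));
  negatives.forEach(n => add('neg', n));
  return phrase => {
    let logPos = 0, logNeg = 0; // uniform class priors
    for (const tok of phrase.toLowerCase().split(/\s+/)) {
      logPos += Math.log(((counts.pos[tok] || 0) + 1) / (totals.pos + 2));
      logNeg += Math.log(((counts.neg[tok] || 0) + 1) / (totals.neg + 2));
    }
    return 1 / (1 + Math.exp(logNeg - logPos)); // P(twss | phrase)
  };
}

const prob = train(
  ["that's one stiff pole", "it's so big"],
  ["was in the car", "went to the park", "was at the store"]
);
// "was" only ever appears in the negatives, so it drags the score
// down even when the rest of the phrase looks very twss:
console.log(prob("was on a stiff pole") < prob("that's one stiff pole")); // true
```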
A while back I was interested in implementing a much less naive algorithm for classifying TWSS expressions, based on this [1] paper. Never actually got around to finishing the work.
There was a similar project before this that implemented it as an IRC bot. You can also train that bot by telling it which jokes are good and which are bad. :)
I was wondering if anyone knew of a place where I could learn about this stuff in general.
I know nothing about unigrams, bigrams, trigrams, tf-idf, Bayesian filtering, etc. Maths - while not awful - is not my strongest point, but I think I could grok a well-written tutorial to this stuff (with code examples!).
Does anyone know of sites where I could start learning about this? I find it very interesting, and I'm sure it could be highly useful and applicable to many different kinds of problems...
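If it helps demystify the jargon a bit: an n-gram is just a sliding window of n tokens, and tf-idf weighs a term by how often it appears in one document against how many documents contain it at all. A quick sketch of both (my own toy code, not from any particular library):

```javascript
// Unigrams/bigrams/trigrams are sliding windows of 1/2/3 tokens.
function ngrams(text, n) {
  const toks = text.toLowerCase().split(/\s+/);
  const out = [];
  for (let i = 0; i + n <= toks.length; i++) {
    out.push(toks.slice(i, i + n).join(' '));
  }
  return out;
}

console.log(ngrams("that is what she said", 2));
// [ 'that is', 'is what', 'what she', 'she said' ]

// tf-idf: term frequency in this doc, discounted by how many docs
// contain the term at all (rare terms score higher).
function tfidf(term, doc, docs) {
  const tf = ngrams(doc, 1).filter(t => t === term).length;
  const df = docs.filter(d => ngrams(d, 1).includes(term)).length;
  return tf * Math.log(docs.length / (1 + df));
}
```

Bayesian filtering is then roughly "multiply up per-token probabilities learned from labeled examples"; Paul Graham's "A Plan for Spam" essay is the classic readable intro to that.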
DanielRapp: in twss.js/lib/classifier/knn.js, the number of nearest neighbors should be odd to prevent ties. [EDIT: also, k should be large enough to prevent over-fitting; a small k means the decision boundary between twss and not-twss becomes highly non-linear. You'd want to implement cross-validation to find the best k.]
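The cross-validation part can be this simple: hold out one example at a time, classify it with the rest, and pick the k that gets the most held-out examples right. A leave-one-out sketch against a toy 1-D kNN (hypothetical data and functions, not the repo's actual API):

```javascript
// Toy 1-D kNN: majority vote among the k nearest points.
// With two classes, an odd k can never tie.
function classify(x, pts, lbls, k) {
  const nearest = pts
    .map((p, i) => ({ d: Math.abs(p - x), l: lbls[i] }))
    .sort((a, b) => a.d - b.d)
    .slice(0, k);
  const ones = nearest.filter(n => n.l === 1).length;
  return ones > k / 2 ? 1 : 0;
}

// Leave-one-out cross-validation: try each candidate k, score it by
// how many held-out points it classifies correctly.
function looPickK(pts, lbls, ks) {
  let best = { k: ks[0], acc: -1 };
  for (const k of ks) {
    let correct = 0;
    for (let i = 0; i < pts.length; i++) {
      const rest = pts.filter((_, j) => j !== i);
      const restLbls = lbls.filter((_, j) => j !== i);
      if (classify(pts[i], rest, restLbls, k) === lbls[i]) correct++;
    }
    const acc = correct / pts.length;
    if (acc > best.acc) best = { k, acc }; // ties go to the smaller k
  }
  return best.k;
}

const pts = [1, 2, 3, 10, 11, 12];
const lbls = [0, 0, 0, 1, 1, 1];
console.log(looPickK(pts, lbls, [1, 3, 5])); // 1
```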
Note to self: machine learning using node.js. What's the speed of calculations? What's memory management like in node.js? Can I find a pure JS implementation of SVM?
Thanks. I did do a simple analysis[1] and changed it[2] to 5 neighbors. Though when I look at the graph now, I see that 4 is actually the optimal value...
I don't think that could reasonably be done with a classifier. You could have "In Soviet Russia X Y you" for each X, Y as your classes, but that would be unreasonable.
Yakov Smirnoff's is a structural joke. You would need to parse the sentence, pattern match, transform it, and then do some kind of regression on the resulting phrase to get its humor quotient.
The Stanford Parser for structural parsing, then some custom pattern-matching and transformation code, might get you somewhere.
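To give a feel for how crude the pattern-matching step can start out, here's a regex stand-in for the parse-match-transform pipeline (a real version would use the parser's subject/verb/object roles instead of word positions, and proper verb morphology):

```javascript
// Naive "In Soviet Russia" reverser: match "you <verb> (the) <object>"
// and swap the roles. The regex stands in for a real parse tree, and
// verb inflection is a hard-coded "+s".
function sovietRussia(sentence) {
  const m = sentence.match(/^you (\w+) (?:the )?(\w+)$/i);
  if (!m) return null;
  const [, verb, object] = m;
  return `In Soviet Russia, ${object} ${verb}s you!`;
}

console.log(sovietRussia("you eat the borscht"));
// "In Soviet Russia, borscht eats you!"
console.log(sovietRussia("completely unrelated sentence")); // null
```

The "humor quotient" regression at the end is the genuinely hard part; the transform above only generates candidates.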
I tried adding dropdowns to change the algorithm and threshold, but switching to knn crashed out ("ReferenceError: trainingPrompt is not defined"), so I scrapped that and just left the demo running the defaults.
MRI's have shown that humans are able to do this because of a dedicated site in the brain called "Scott's region". Once activated, this linguistic region is constantly searching for linguistic cues, surfacing signals to our conscious thoughts when the cues are strong enough.
We made our IRC bot respond to TWSS jokes, but ours was just a dumb match against a set of a few thousand jokes that we'd scraped. You can look at the code at:
https://github.com/jfriedly/jenni
Now that I've taken Stanford's Machine Learning class, though, I think I might just duplicate what this guy did for our bot.
While on the surface it seems like a waste of time (albeit an amusing one), I actually expect this is a great project to learn from because of its use of Bayesian classifiers.
In other words, I'm TOTALLY going to be using this on my next project.
I just hate it when people release JavaScript libraries that needlessly depend on specific platforms. For a while that dependency was usually jQuery, then with the rise of server-side JavaScript it was the DOM in general, and now it appears to be Node.js.
Just write "X for JavaScript", dammit.
That said, this doesn't appear to have any Node.js-specific dependencies; it could be used in any CommonJS environment.
The reason I chose node over "browser-js" is that it was originally going to be a Twitter bot, but I decided to simplify the GitHub repo into just a node module to make it more useful.
But you're totally correct. This could've easily been written in any language.
The source is open and it's really easy to port to the browser, so this shouldn't warrant a complaint. Everyone works with what they feel most comfortable with.