Easier, but more importantly safer, in that any potentially memory-trashing code lives inside the language implementation itself (the bug in your C version couldn't really happen in the Lisp version).
I was just assuming zyroth's example was in Java, since that is what the article's author favored. Not sure where you guys got the idea that it was C.
But, just to cover all bases, it wouldn't work as Java either. This would though.
int[] x = {1, 2, 3};
for (int i = 0; i < 3; i++) { x[i] *= 2; }
Yet another reason to use:
(mapcar #'(lambda (x) (* x 2)) '(1 2 3))
... you don't have to think about whether array indexes start at zero or one.
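For what it's worth, the same index-free style exists in Java too; a quick sketch for comparison with the mapcar form above (assumes Java 8+ streams, not something from the article):

import java.util.Arrays;

// Index-free version of the same doubling, shown only for comparison
// with the mapcar one-liner.
public class DoubleAll {
    public static void main(String[] args) {
        int[] x = {1, 2, 3};
        int[] doubled = Arrays.stream(x).map(v -> v * 2).toArray();
        System.out.println(Arrays.toString(doubled)); // prints [2, 4, 6]
    }
}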
Sure... the infinite string of 1s, or the infinite string of 0s, or the infinite decimal string made up of only 3s and 7s. There are uncountably many non-normal real numbers like this, even though the set of normal real numbers is bigger.
I have no proofs; I don't think very many proofs exist with regards to this topic.
From what I've been able to gather, I think the cardinality of the normal and non-normal numbers is the same, even though the normal numbers are greater in measure (in the probabilistic sense, a randomly chosen real is normal with probability 1). This is a paradox that I don't really understand. http://forums.xkcd.com/viewtopic.php?f=3&t=4270
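For anyone who wants that spelled out, here is the usual way to reconcile it (just the statement, no proof; N denotes the set of normal numbers in [0,1] and \lambda is Lebesgue measure):

% Cardinality vs. measure for normal numbers (statement only).
\[
  |N| = |[0,1] \setminus N| = 2^{\aleph_0},
  \qquad\text{yet}\qquad
  \lambda(N) = 1 \quad\text{and}\quad \lambda([0,1] \setminus N) = 0 .
\]
% Both sets have the cardinality of the continuum, but by Borel's normal
% number theorem almost every real number is normal, so the non-normal
% numbers form a measure-zero (probability-zero) set.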
Anything related to the concept of infinity tends to be hard to understand in an "emotional" or "intuitive" way. We can only apply the rules of logic and accept the result.
Since pi is not random, we cannot say that every finite string appears in it. (If it were random, the probability P[s appears in the binary representation] would be 1 for every finite s.)
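To spell out that probability claim for a genuinely random (i.i.d. uniform) bit sequence, which is exactly the hypothesis we cannot assume for pi:

% Fix a string s of length k and split the random sequence into disjoint
% blocks of length k; each block equals s independently with probability 2^{-k}.
\[
  P[\text{$s$ never appears}]
  \;\le\; \lim_{n \to \infty} \bigl(1 - 2^{-k}\bigr)^{n} \;=\; 0
  \quad\Longrightarrow\quad
  P[\text{$s$ appears somewhere}] = 1 .
\]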
Actually, there is no proof and no counterproof that every finite string comes up in pi (...that I know of).
"I've tested it on the data from the KDD 2006 cup a contest in the KDD conference whose goal was to identify Pulmonary Embolism based on data generated from CT scans and it out scored the cup winners by a 50% margin."
There are a lot of different fields in ML. I don't believe that you have an algorithm that beats all the algorithms out there, even if those are specialized for a specific setting.
You are right, there are quite a few problem types in ML and a lot of different algorithms, but because my idea is a basic insight into something that is missing from existing algorithms, I've been able to incorporate the change into several of them.
Can't really get you a score that you'll be able to trust without submitting a result set, and can't submit a result set without agreeing to publicize my algorithm...
Pick a different test. One where you can do the verification without a 3rd party that requires me to relinquish my trade secret.
That's not true. You can submit your results and then refuse to release your algorithm, disqualifying you from the competition.
"Upon qualifying, as described above, the Participant is required to submit within one (1) week for judging a description of their algorithm along with all source code. The Participant warrants that the source code is either fully or substantially developed and functions or will function as represented by the description. Failure to deliver both the description and source code within one (1) week will disqualify that entry and additional qualifying entries will be considered."
Ok, I missed that part in the contract... it was a bit long and in legalese.
I'll tinker with it to fit the problem (different algorithm category: mine is best suited for classification problems and the test is a clustering problem, so the dataset needs to be manipulated a bit first for it to work).
Most AI applications aren't realtime (although some of them, in the chemical industry for example, are).
A doctor doesn't care if his cancer diagnosis takes another 20 milliseconds so long as it's a bit more accurate. And a bank trying to analyze whether a credit card transaction is fraudulent or not doesn't care about the short delays either.
Financial fraud detection bottlenecks are typically between RAM and the processor. Thousands of synchronously incoming transactions have to be examined simultaneously, because they correlate heavily.
It is not like "here is one transaction, is it a fraud?" but "here are 2^20 transactions, what are the frauds?".
You could do this by pipelining, but I guess banks want a zero-downtime system, and I personally would not trust an API in terms of reliability.
Another point is that banks will not give you the original data. They will have to "pseudonymize" several entries, such as credit card numbers, names, ...
This would force them to preprocess the data, which adds a small cost to every transaction (O(n) overall) and might decrease the speed even more.
(I'm not saying it's technically impossible, but I'd say there are better ways, such as releasing it closed source or just using it to predict financial data - which, as we all know, is possible and being done by hedge funds, so that should be the best way IF you have that algorithm ;)
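To illustrate the pseudonymization point, here is a minimal sketch of the kind of preprocessing pass a bank would have to run over every transaction before shipping the data out (the salted-hash scheme and the names here are my assumptions, not anything from the thread):

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class Pseudonymizer {
    // Replaces a card number with a salted SHA-256 hash: records stay joinable
    // on the pseudonym, but the real number never leaves the bank. This is the
    // extra linear pass over all n transactions mentioned above.
    static String pseudonymize(String cardNumber, String salt) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        byte[] digest = md.digest((salt + cardNumber).getBytes(StandardCharsets.UTF_8));
        StringBuilder sb = new StringBuilder();
        for (byte b : digest) sb.append(String.format("%02x", b));
        return sb.toString();
    }
}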
I've taken classes from people who worked on fraud detection for banks (Fair Isaac), and they were working on legacy hardware (shitty old mainframes) with absurdly limited floating point precision.
Performance is of the essence in these situations; any clever trick you can think of to speed things up should be used (but keep it fairly simple: lookup tables and so forth, for example).
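Lookup tables are about as simple as these tricks get; a rough sketch of the idea (the function, range, and resolution are just illustrative assumptions):

public class ExpTable {
    // Precompute exp(x) on [-8, 8] at fixed resolution, then answer queries
    // with a nearest-bucket lookup instead of calling the math library.
    private static final int SIZE = 4096;
    private static final double MIN = -8.0, MAX = 8.0;
    private static final double[] TABLE = new double[SIZE];
    static {
        for (int i = 0; i < SIZE; i++) {
            TABLE[i] = Math.exp(MIN + (MAX - MIN) * i / (SIZE - 1));
        }
    }
    static double fastExp(double x) {
        if (x <= MIN) return TABLE[0];
        if (x >= MAX) return TABLE[SIZE - 1];
        int i = (int) ((x - MIN) / (MAX - MIN) * (SIZE - 1));
        return TABLE[i]; // nearest bucket; interpolate between buckets for more accuracy
    }
}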
Applying the algorithm to financial markets would be highly latency-sensitive, though.
I'm not sure I buy the web service idea, either: wouldn't typical applications require a lot of input data (training set + test set) in order to be effective? Uploading all that data could be annoying, compared with just running the algorithm locally at the customer's site.
Training always takes a ton of time, even more so with my algorithm which is a bit more complex. But training is usually only done once and afterwards results can be generated very quickly.
I wasn't talking about training time -- I was talking about data set size. Frequently uploading a few GB of data over the public Net to do effective training is going to be an annoyance. You may also need to perform the training multiple times, especially if your algorithm takes any parameters.
It does take a few parameters, but there is no reason why the same dataset would have to be uploaded every time you tweak a parameter; I can just store it and let you play with it until you are satisfied with the results.
And again, training usually isn't done that frequently.
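Roughly the flow I mean, as a purely hypothetical interface sketch (the names and signatures are illustrative, not a description of any actual API):

import java.util.Map;

public interface TrainingService {
    String uploadDataset(byte[] datasetBytes);                   // upload once, get a dataset ID back
    String train(String datasetId, Map<String, Double> params);  // retrain against the stored dataset with new parameters
    double[] classify(String modelId, double[][] rows);          // score new rows with a trained model
}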
I think you get used to any syntax. The point is that Lisp et al. are much more compact, which makes the concepts easier to grasp.
I get a (mapcar #'f '(1 2 3)) much quicker than a
x = {1, 2, 3}; for (int i = 1; i < 4; i++) { x[i] *= 2; }