Easier, but more importantly safer, in that any potentially memory-trashing code lives inside the language implementation itself (the bug in your C version couldn't really happen in the Lisp version).
I was just assuming zyroth's example was in Java, since that is what the article's author favored. Not sure where you guys got the idea that it was C.
But, just to cover all bases, it wouldn't work as Java either. This would though.
int[] x = {1, 2, 3};
for (int i = 0; i < 3; i++) { x[i] *= 2; }
Yet another reason to use:
(mapcar #'(lambda (x) (* x 2)) '(1 2 3))
... you don't have to think about whether array indexes start at zero or one.
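For what it's worth, the same index-free style exists in Java too; a quick sketch for comparison with the mapcar form above (assumes Java 8+ streams, not something from the article):

import java.util.Arrays;

// Index-free version of the same doubling, shown only for comparison
// with the mapcar one-liner.
public class DoubleAll {
    public static void main(String[] args) {
        int[] x = {1, 2, 3};
        int[] doubled = Arrays.stream(x).map(v -> v * 2).toArray();
        System.out.println(Arrays.toString(doubled)); // prints [2, 4, 6]
    }
}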
Sure... the infinite string of 1s, or the infinite string of 0s, or the infinite decimal string made up of only 3s and 7s. There are uncountably many non-normal real numbers like this, even though the set of normal real numbers is bigger.
I have no proofs; I don't think very many proofs exist with regards to this topic.
From what I've been able to gather, I think the cardinality of the normal and non-normal numbers is the same, even though the normal numbers are greater in measure (in the probabilistic sense, a randomly chosen real is normal with probability 1). This is a paradox that I don't really understand. http://forums.xkcd.com/viewtopic.php?f=3&t=4270
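For anyone who wants that spelled out, here is the usual way to reconcile it (just the statement, no proof; N denotes the set of normal numbers in [0,1] and \lambda is Lebesgue measure):

% Cardinality vs. measure for normal numbers (statement only).
\[
  |N| = |[0,1] \setminus N| = 2^{\aleph_0},
  \qquad\text{yet}\qquad
  \lambda(N) = 1 \quad\text{and}\quad \lambda([0,1] \setminus N) = 0 .
\]
% Both sets have the cardinality of the continuum, but by Borel's normal
% number theorem almost every real number is normal, so the non-normal
% numbers form a measure-zero (probability-zero) set.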
Anything related to the concept of infinity tends to be hard to understand in an "emotional" or "intuitive" way. We can only apply the rules of logic and accept the result.
Since pi is not random, we cannot say that every finite string appears in it. (If it were random, the probability P[s appears in the binary representation] would be 1 for every finite s.)
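To spell out that probability claim for a genuinely random (i.i.d. uniform) bit sequence, which is exactly the hypothesis we cannot assume for pi:

% Fix a string s of length k and split the random sequence into disjoint
% blocks of length k; each block equals s independently with probability 2^{-k}.
\[
  P[\text{$s$ never appears}]
  \;\le\; \lim_{n \to \infty} \bigl(1 - 2^{-k}\bigr)^{n} \;=\; 0
  \quad\Longrightarrow\quad
  P[\text{$s$ appears somewhere}] = 1 .
\]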
Actually, there is no proof and no counterproof that every finite string comes up in pi (...that I know of).
"I've tested it on the data from the KDD 2006 cup a contest in the KDD conference whose goal was to identify Pulmonary Embolism based on data generated from CT scans and it out scored the cup winners by a 50% margin."
There are a lot of different fields in ML. I don't believe that you have an algorithm that beats all the algorithms out there, even if those are specialized for a specific setting.
You are right, there are quite a few problem types in ML and a lot of different algorithms, but because my idea is a basic insight into something that is missing from existing algorithms, I've been able to incorporate the change into several of them.
Can't really get you a score that you'll be able to trust without submitting a result set, and can't submit a result set without agreeing to publicize my algorithm...
Pick a different test. One where you can do the verification without a 3rd party that requires me to relinquish my trade secret.
That's not true. You can submit your results and then refuse to release your algorithm, disqualifying you from the competition.
"Upon qualifying, as described above, the Participant is required to submit within one (1) week for judging a description of their algorithm along with all source code. The Participant warrants that the source code is either fully or substantially developed and functions or will function as represented by the description. Failure to deliver both the description and source code within one (1) week will disqualify that entry and additional qualifying entries will be considered."
Ok, I missed that part in the contract... it was a bit long and in legalese.
I'll tinker with it to fit the problem (different algorithm category: mine is best suited for classification problems and the test is a clustering problem, so the dataset needs to be manipulated a bit first for it to work).
Most AI applications aren't realtime (although some of them, in the chemical industry for example, are).
A doctor doesn't care if his cancer diagnosis takes another 20 milliseconds so long as it's a bit more accurate. And a bank trying to analyze whether a credit card transaction is fraudulent or not doesn't care about the short delays either.
Financial fraud detection bottlenecks are typically between RAM and the processor. Thousands of synchronously incoming transactions have to be examined simultaneously, because they correlate heavily.
It is not like "here is one transaction, is it a fraud?" but "here are 2^20 transactions, what are the frauds?".
You could do this by pipelining, but I guess banks want a zero-downtime system, and I personally would not trust an API in terms of reliability.
Another point is that banks will not give you the original data. They will have to "pseudonymize" several entries, such as credit card numbers, names, ...
This would force them to preprocess the data, which adds a small cost to every transaction (O(n) overall) and might decrease the speed even more.
(I'm not saying it's technically impossible, but I'd say there are better ways, such as releasing it closed source or just using it to predict financial data - which, as we all know, is possible and being done by hedge funds, so that should be the best way IF you have that algorithm ;)
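To illustrate the pseudonymization point, here is a minimal sketch of the kind of preprocessing pass a bank would have to run over every transaction before shipping the data out (the salted-hash scheme and the names here are my assumptions, not anything from the thread):

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class Pseudonymizer {
    // Replaces a card number with a salted SHA-256 hash: records stay joinable
    // on the pseudonym, but the real number never leaves the bank. This is the
    // extra linear pass over all n transactions mentioned above.
    static String pseudonymize(String cardNumber, String salt) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        byte[] digest = md.digest((salt + cardNumber).getBytes(StandardCharsets.UTF_8));
        StringBuilder sb = new StringBuilder();
        for (byte b : digest) sb.append(String.format("%02x", b));
        return sb.toString();
    }
}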
I've taken classes from people who worked on fraud detection for banks (Fair Isaac), and they were working on legacy hardware (shitty old mainframes) with absurdly limited floating point precision.
Performance is of the essence in these situations; any clever trick you can think of to speed things up should be used (but keep it fairly simple: lookup tables and so forth, for example).
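Lookup tables are about as simple as these tricks get; a rough sketch of the idea (the function, range, and resolution are just illustrative assumptions):

public class ExpTable {
    // Precompute exp(x) on [-8, 8] at fixed resolution, then answer queries
    // with a nearest-bucket lookup instead of calling the math library.
    private static final int SIZE = 4096;
    private static final double MIN = -8.0, MAX = 8.0;
    private static final double[] TABLE = new double[SIZE];
    static {
        for (int i = 0; i < SIZE; i++) {
            TABLE[i] = Math.exp(MIN + (MAX - MIN) * i / (SIZE - 1));
        }
    }
    static double fastExp(double x) {
        if (x <= MIN) return TABLE[0];
        if (x >= MAX) return TABLE[SIZE - 1];
        int i = (int) ((x - MIN) / (MAX - MIN) * (SIZE - 1));
        return TABLE[i]; // nearest bucket; interpolate between buckets for more accuracy
    }
}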
Applying the algorithm to financial markets would be highly latency-sensitive, though.
I'm not sure I buy the web service idea, either: wouldn't typical applications require a lot of input data (training set + test set) in order to be effective? Uploading all that data could be annoying, compared with just running the algorithm locally at the customer's site.
Training always takes a ton of time, even more so with my algorithm which is a bit more complex. But training is usually only done once and afterwards results can be generated very quickly.
I wasn't talking about training time -- I was talking about data set size. Frequently uploading a few GB of data over the public Net to do effective training is going to be an annoyance. You may also need to perform the training multiple times, especially if your algorithm takes any parameters.
It does take a few parameters, but there is no reason why the same dataset would have to be uploaded every time you tweak a parameter; I can just store it and let you play with it until you are satisfied with the results.
And again, training usually isn't done that frequently.
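Roughly the flow I mean, as a purely hypothetical interface sketch (the names and signatures are illustrative, not a description of any actual API):

import java.util.Map;

public interface TrainingService {
    String uploadDataset(byte[] datasetBytes);                   // upload once, get a dataset ID back
    String train(String datasetId, Map<String, Double> params);  // retrain against the stored dataset with new parameters
    double[] classify(String modelId, double[][] rows);          // score new rows with a trained model
}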
I think you get used to any syntax. The point is that Lisp et al. are much more compact, which makes the concepts easier to grasp.
I get a (mapcar #'f '(1 2 3)) much quicker than a
x = {1, 2, 3}; for (int i = 1; i < 4; i++) { x[i] *= 2; }