
I don't think you connected your arguments very well. Just because a human can do something doesn't mean that AIs are the best solution to the task. I'm not sure how your specific example plays to the general case.

Both AIs and humans attempt to ape some fundamental truism or formalism about logic to compile and decompile programs. To me, we're better off knowing what those formalisms are than aping them with intelligent behavior, though finding formalisms is hard, so sometimes we make do.


The set of all possible inputs for a compiler is infinite, too. Does that mean that compilers are all harangued by the halting problem as well? Nope. Having an infinite number of possible inputs (infinite input set size) does not prevent you from showing that a program can actually halt. I think most compilers are written in a way that can be shown to halt, as well.

The Halting Problem does apply in the general case, but if you carve up your programs and reason about them, you can still show that particular programs halt. The Halting Problem just states that there is no single method that, given an arbitrary program, decides whether it halts.
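To make that concrete, here's a rough Python sketch of carving out such a class: accept only straight-line code plus for loops over literal range() bounds. It's illustrative, not airtight (for one, it pretends any called functions terminate), but membership in the class is checkable and everything in the class halts:

    import ast

    def provably_halts(source: str) -> bool:
        # Conservative check for a tiny fragment of Python: no while
        # loops, no function definitions (hence no recursion), and only
        # `for` loops over range() with constant bounds. False means
        # "unknown", not "loops forever".
        tree = ast.parse(source)
        for node in ast.walk(tree):
            if isinstance(node, (ast.While, ast.FunctionDef,
                                 ast.AsyncFunctionDef, ast.Lambda)):
                return False
            if isinstance(node, ast.For):
                it = node.iter
                if not (isinstance(it, ast.Call)
                        and isinstance(it.func, ast.Name)
                        and it.func.id == "range"
                        and all(isinstance(a, ast.Constant)
                                for a in it.args)):
                    return False
        return True

    print(provably_halts("for i in range(10): x = i * i"))  # True
    print(provably_halts("while True: pass"))               # False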


"The set of all possible inputs for a compiler is infinite, too. Does that mean that compilers are all harangued by the halting problem as well? "

No, because unlike this "try all possible inputs" plan for decompilers, a compiler only ever operates on one particular input at a time.

"I think most compilers are written in a way that can be shown to halt, as well."

Not C++ compilers (Turing-complete templates) or Lisp dialects (same issue, but even more so)


The point about the macros being Turing Complete is trivial in this instance, because if you wrote a macro that never terminated, you'd never compile anything to decompile ;-).


No, you are incorrect about the point being trivial. First off, the macros used in the original program will of course terminate, but if you simply "try every input to the compiler" as is being suggested, you will be attempting to compile any number of non-terminating macros (and, thanks to the Halting Problem, you will of course not know which terminate and which do not).

The point is that "compilers terminate for every input" is trivially false.


Non-termination of some compiler inputs is a red herring. The only modification required to the original algorithm is that the search proceed in parallel (such that every input eventually makes arbitrary progress).
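Here's a rough Python sketch of what I mean, with a made-up stepped-compiler interface standing in for a real compiler (toy_compile_steps is hypothetical; sources containing "loop" model a non-terminating macro expansion):

    def toy_compile_steps(src):
        # Stand-in for a compiler driven one step at a time: yields None
        # per unit of work, then ("done", output). Sources containing
        # "loop" never finish, modeling a non-terminating macro.
        while "loop" in src:
            yield None
        yield ("done", src.upper())   # pretend compilation = upper-casing

    def dovetail_search(candidates, compile_steps, target):
        # Admit one new candidate per round and advance every active
        # compilation by one step per round, so each candidate eventually
        # gets arbitrarily many steps despite non-terminating inputs.
        active = []
        for src in candidates:
            active.append((src, compile_steps(src)))
            survivors = []
            for s, gen in active:
                step = next(gen)
                if step is not None and step[0] == "done":
                    if step[1] == target:
                        return s      # found a preimage of target
                    # finished with the wrong output: drop it
                else:
                    survivors.append((s, gen))
            active = survivors

    # "loop" never terminates, yet the search still finds "abc":
    print(dovetail_search(["loop", "abc", "xyz"], toy_compile_steps, "ABC"))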

I agree with your point in a sister thread that decidable doesn't imply practical, but the claim of the original article was undecidable, which has a technical meaning.


I would qualify that statement better: "compilers with Turing-complete macros or type systems terminate for every input" is what's trivially false. The point may not be trivial (which I'm not ready to concede), but it still doesn't matter. The author's method implicitly relies on ruling out programs written in this (rather strange) way. If it's safe to assume you can eliminate programs whose compilation never terminates, then the Halting Problem as you've stated it doesn't apply.

I think you may be right about the point you're trying to make, but the point I'm making is that your point covers a negligible fraction of the programs the author would hope to analyze, so it doesn't matter.


My point is not about checking if the given program will halt or not.

The parent comment's suggestion was: (1) for each possible input program, (1.a) compile it, and (1.b) check if the result equals the given compiled code.

Agreed, steps (1.a) and (1.b) terminate deterministically (for a given compiler).
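Spelled out as a Python sketch (enumerate_sources and compile_fn are placeholders; the toy demo uses string reversal as the "compiler"):

    import itertools
    import string

    def naive_decompile(enumerate_sources, compile_fn, target_binary):
        # (1) for each possible input program ...
        for src in enumerate_sources():          # infinite enumeration
            # (1.a) compile it, (1.b) compare to the given compiled code
            if compile_fn(src) == target_binary:
                return src                       # exact preimage found
        # never returns if the enumeration is infinite and no preimage exists

    def all_sources():
        # Enumerate every string over a-z, shortest first
        for n in itertools.count(1):
            for chars in itertools.product(string.ascii_lowercase, repeat=n):
                yield "".join(chars)

    print(naive_decompile(all_sources, lambda s: s[::-1], "cba"))  # "abc"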

However, the search space for this search procedure is infinite.

Similarly, it would be impossible, in general, to exhaustively test every possible input of a compiler.


And yet, if the output you are decompiling is known to come from a particular compiler, the search will terminate.


If you know the compiler (and every other tool involved in the process), then yes, there exists at least one solution (the original input). The search will eventually find it.


Assuming all possible inputs to the compiler cause it to halt. Not true for many high-profile languages.

And while not strictly related to "is it decidable", in the real world we have to keep in mind the complexity of the solution. The presence of the naive solution for a particular language doesn't say much about how well we can actually do it. Is the problem actually decidable within the confines of the physical observable universe for real world inputs?


"Assuming all possible inputs to the compiler cause it to halt."

If the compiler did not halt, then we don't have anything to disassemble in the first place.

Formally, it is obvious that a diagonalization-based search will eventually find a correct input, regardless of the halting status of any given input. In practice, none of this matters very much.


As I explained in my other comment, the compiler certainly does halt for the correct input, but that does not mean it will halt for all inputs.

You are however correct that this can be circumvented with diagonalization.


It depends on what you want from decompilation. Decompilation, by its basic nature, is just as decidable as compilation: it's a translation from one language to another, plus some semantics-preserving transformations. What a lot of people seem to want from a decompiler, and what makes them intractable, is a guarantee that the output source will look just like the original source that was compiled to produce the decompiler's input.

Program1 -> Compiler -> BinProg1 (compilation)

BinProg1 -> Decompiler -> Program2 (decompilation)

Even if the source language of the compiler and the target language of the decompiler are the same, you're usually still screwed, because the compiler is going to manipulate your program to optimize it or fit it onto your target architecture. The result is going to be a program that looks far more general than your original program.
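As a toy illustration (nothing to do with any real compiler), here's a constant-folding "compiler" that sends two different sources to the same output, so the decompiler can't tell which one you wrote:

    def toy_compile(expr: str) -> int:
        # The "optimizer" folds the constant expression at compile time
        return eval(expr)

    print(toy_compile("3 * 4"))   # 12
    print(toy_compile("2 * 6"))   # 12 (same "binary", different sources)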

Is this undecidable? No way.


We typically reserve the term "undecidable" for the case where there is a mathematical function we are interested in, but it isn't computable by any algorithm. Inverting a many-to-one function is just impossible.


Applying AI to a decompilation problem is just like applying AI to a compilation problem. You're taking a potentially nondeterministic approach to a problem that should be formalized and deterministic. I think AI applied to compilers is something to avoid, in general.


>Applying AI to a decompilation problem is just like applying AI to a compilation problem

I think they're very different. Compilation strips semantic information from code. Decompilation attempts to recover semantic information from program structure and, in the AI case, domain knowledge. They are fundamentally different processes.


That's fine that they're different. My point is that they're both formal processes. Adding AI to the mix will not help you arrive at formal conclusions. The conclusions will be probabilistic, instead. If an AI could use formal routines to show you formal conclusions, then you wouldn't need an AI, you'd just need your routines.


And those formal processes will tell you, in the most formal of manners, that when information is actually removed from a system, you are, by the definition of information, incapable of formally reconstructing it. If you can reconstruct it, by definition it was not removed in the first place.

Intent is gone from compiled code. You cannot formally reconstruct it. All you could formally get are reasonable guesses, and actually that's what the decompiler is giving you; it is not hard to interpret its output as being very precisely vague on exactly the points it does not know about.


Sometimes I wonder if these payment processing companies behave like this so they can keep their books straight/solvent.


I think it might be important to note that while the terms map and reduce do come from Lisp, they're not one-to-one with what these functions do in Lisp. The original MapReduce paper mentions the borrowing, but doesn't really go into specifics. There's a good paper by Ralf Lämmel that describes the relation MapReduce has to "map" and "reduce" at http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.104.... . I liked this paper much better and found it the most informative functional explanation of MapReduce (note, it's in Haskell).

I think MapReduce is really part of a more general pattern where you have (in more Haskell-y terms) an unfold (anamorphism) feeding into a foldr (catamorphism). If your operations on the items in your intermediate data set are associative/commutative, you can work out parallelization more or less for free. It's pretty cool stuff, and really not that complicated when you sit down and think about it.
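A minimal Python sketch of that shape (my own toy model, not Google's actual interface): the map phase unfolds each input into many (key, value) pairs, and the reduce phase folds per key:

    from collections import defaultdict
    from functools import reduce

    def map_reduce(inputs, mapper, reducer, init):
        # "Map" phase is really an unfold: one input may yield many
        # (key, value) pairs, unlike a one-to-one Lisp map.
        groups = defaultdict(list)
        for item in inputs:
            for key, value in mapper(item):
                groups[key].append(value)
        # "Reduce" phase: a fold per key; if `reducer` is associative,
        # each group can be folded in parallel chunks and recombined.
        return {k: reduce(reducer, vs, init) for k, vs in groups.items()}

    # The classic word-count example:
    docs = ["a b a", "b b"]
    print(map_reduce(docs,
                     mapper=lambda d: [(w, 1) for w in d.split()],
                     reducer=lambda x, y: x + y,
                     init=0))
    # {'a': 2, 'b': 3}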


Wow. Thank you. I wish I could upvote you more. I have never looked into Google's "MapReduce", but if what you say is true then my assumptions about it were completely wrong. I assumed it was 1-1 with map and reduce. But if it is generating a set of unfolds into (essentially) a fold (is it? I only skimmed the first pages due to time), then that is a very important detail and makes the name a bit of a misnomer. Why don't people make a bigger deal about this? Unfold/fold is not harder, but it requires a different mindset and approach than map/fold.

If you are approaching it thinking it is just a map and fold, then you are going in disadvantaged and will have to unlearn that assumption to properly leverage something that is actually even more impressive/useful/powerful than a mere fold over a map.


Yeah, I've always been tweaked by the "inborn" trait thing. I think there's a natural affinity that people can have for things, but at the same time, anybody who puts in the hours can get there with this stuff. Yes, some people "get" how to play a piano right off, and they're naturally really good. Many people who are really good, though, put in tons of practice hours honing their "piano qi". My 2 cents, at least.


I'm not sure what's surprising about a compiler with no C in it. They're becoming much more common. I personally think C is an awful language to write a compiler in. One shouldn't be worrying about C's bookkeeping while also worrying about the bookkeeping a compiler requires.

Also, I think it's interesting when people use the word "interesting" when they probably really mean "annoying". If something really bugs you, just say it bugs you.


Perhaps Peter Thiel's real master plan is to increase the value of an education by convincing a lot of people that they shouldn't go to school, and thus increasing scarcity of those getting degrees. AmIright?


Your suggestion that the deans are pro-college-institution simply because they are deans doesn't necessarily follow. It actually is an ad hominem against the deans: you're trying to weaken the author's point by discounting a series of supporting claims the deans made, on the grounds that the deans are acting out of self-interest (or the interest of some educational-industrial complex of which they are a part). Given that one side of this argument likes to frame the educational institution as bad (and sometimes goes as far as saying the people in the institution are actively working to further its negative ends), it would be reasonable to interpret your first remarks as an ad hominem attack.

That said, your initial statement doesn't necessarily follow, because it could also be the case that people become deans precisely because they earnestly believe in the system they're helping to further. Claiming the possibility of bias doesn't entail the presence of bias.

