Hacker News | CarolineW's comments

I agree with you entirely, although my comment making pretty much the same point is seeing wild swings in the voting:

https://news.ycombinator.com/item?id=19980584


There's a lot of incel lurkers here, and they seem to coordinate their actions somewhere else. Typical up/down vote patterns start with sensible upvotes and then a torrent of jealous downvotes.


Your "shell" solution isn't really a shell solution, it's an awk solution.

It's really valuable to know awk, but it's a bit misleading to claim that "it's shell".


This is just pedantry.

I read your other replies and I can't see them adding much value to the discussion.

You are just being rigid about semantics.

What action are we supposed to take having read your comments?

To be careful of not calling a pipeline with awk in it 'shell?'

To be ashamed of using awk because it's a programming language, and that is 'cheating'?


Wow, I would never have expected my comments to have been interpreted that way. Genuinely shocked.

Thank you for the feedback.


The awk language and shell are now in one bundle called POSIX. A POSIX shell environment is not conforming if it doesn't have an awk command. A conforming POSIX implementation could make awk a shell builtin.


That I did not know - thank you. We here still think of awk as fundamentally different from other pipeline facilities such as tr, sed, sort, uniq, and so on, but I can see why it could, perhaps should, be though of as being "shell".

I guess I was triggered by the fact that the proposed shell solution is:

* not on a command line (although it could be),

* is significantly longer than the original command line solution, and

* gives a different result.

But you're right, it's shell. I might, however, given my background, and remembering as I do its first introduction, always have trouble thinking of it as such.


> We here still think of awk as fundamentally different from other pipeline facilities such as [...] sed

If you consider awk "not-shell" because it's an entire language, then it's really inconsistent to consider sed "shell". sed is a stream programming language. For example, this is a Sudoku solver written in sed: http://sed.sourceforge.net/local/games/sedoku.sed.html
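
To make the "stream programming language" point concrete, here is a minimal sketch of sed using a label and a conditional branch to loop, something a plain substitution cannot do. The function name `commas` is hypothetical and only for illustration; the script assumes a POSIX sed.

```shell
# commas: hypothetical helper that inserts thousands separators.
# ":a" defines a label; "ta" jumps back to it whenever the
# substitution on the current line succeeded, so sed keeps
# looping until no digit is followed by three more digits.
commas() {
    printf '%s\n' "$1" |
        sed -e ':a' -e 's/^\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta'
}

commas 1234567   # prints 1,234,567
```

The loop runs twice for this input: the first pass produces `1234,567` and the second `1,234,567`, after which the substitution no longer matches and sed prints the line.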


Actually, we are pretty marginal on sed, but point taken. It feels like there's a difference between "stream mode" and "program mode".

I remember when awk was first implemented. sed was already standard, and awk was this new thing. I love it, and for some things it's my "go to" language. That colors how I think of it - I think of it as a language.

But this has been done to death, everyone is jumping on me, so there seems little to add.


>But this has been done to death, everyone is jumping on me, so there seems little to add.

No one is "jumping on you", as you put it, neither the other people who replied to your comments, nor me. It's absolutely normal in a tech forum (more so in such ones, because they are mostly fact-based) and in fact even in any online or real-life forum, for that matter, for people to point it out if they think some statement a person has made, is wrong.

In fact, you did exactly that (point out that you thought I was wrong - see your comments that I've quoted just below), in your top-level reply to my original comment about my having created solutions in Python and shell, which is what started this whole sub-thread. Going by your logic, I should have complained about you jumping on me :)

Here's where you pointed out that (you thought) I was wrong:

>Your "shell" solution isn't really a shell solution, it's an awk solution.

>It's really valuable to know awk, but it's a bit misleading to claim that "it's shell".

Not only that, you claimed that it was misleading, without having any way to know whether I had any intention to mislead or not. Come off it. You cannot read my mind. It's mine, not yours. Come to think of it, that was a poor judgement call on your part, too, because even if I had some intention of misleading people, which I did/do not, what could I possibly gain by misleading them about whether some code is a shell solution or an awk solution? The whole "misleading" idea is a figment of your imagination, or of your unclear thinking, I'm sorry to have to say.

Anyway, this thread has gone on for too long, with barely any benefit to anyone. I'll just briefly touch on that fundamental flaw in one of your points about my work, that I mentioned in another comment, and then be done with this whole thing. Doing that in a separate comment.


>I remember when awk was first implemented. sed was already standard, and awk was this new thing. I love it, and for some things it's my "go to" language. That colors how I think of it - I think of it as a language.

Google for "sed is Turing complete" and see the results, including the post at catonmat.net - Peteris Krumins' blog :) There's even an HN thread about it.


I can pull off some ok stuff with the shell toolkit, including awk and sed but that's a whole different level.

Whoever authored that, mad props.


If you ever have a low moment, worrying about the marketability of the skills you have developed in your side interests, you can always think of these people that think it's a good use of their time to learn how to develop a Nintendo Game Boy emulator in Sed.


>Your "shell" solution isn't really a shell solution, it's an awk solution.

>It's really valuable to know awk, but it's a bit misleading to claim that "it's shell".

It's more than a bit misleading to accuse someone of being misleading, without checking your facts first.

Maybe you spoke too soon, without reading the full post, as I find people sometimes (often?) do (not just in reply to my comments, but to those of others too), not just on HN, but on many forums.

Read both the header comment and the last line of the script you called "an awk solution":

Header comment:

    # bentley_knuth.sh

Of course, the filename extension does not make it a shell solution instead of an awk solution, but I used .sh because I knew what I was doing [1]. See next point.

Last line of that script:

    ' < $2 | sort -nr +1 | sed $1q

So the script uses all of awk, sort, sed and shell - which you seem to have missed, maybe because you only skimmed the first few lines before replying here. That makes it a shell script in my book, plus see below.

The $2 and the $1 - in the $1q bit - are shell command line parameters, because this code is invoked from the shell as a script, with arguments passed. Also see the pipe symbols (|). All these are part of shell syntax, not awk syntax (although awk has $1, $2, etc. too, they have a different, though related meaning - in fact, I use a $i in my awk script within this shell script too). The whole script is a pipeline, that pipes the awk script's output to sort and sort's to sed.
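
A small sketch of that distinction, using made-up sample input rather than the script above: inside single quotes the shell leaves `$1` alone, so awk sees it as "first field", while inside double quotes the shell expands `$1` to the first argument before sed ever runs. The function name `demo` is hypothetical.

```shell
# demo: hypothetical function; its first argument plays the role
# of the script's "n" parameter.
demo() {
    printf 'alpha beta\ngamma delta\n' |
        awk '{ print $1 }' |   # awk's $1: first field of each input line
        sed "${1}q"            # shell's $1: expands to e.g. "1q" for sed
}

demo 1   # prints alpha
```

Calling `demo 2` instead would let sed pass two lines through, so both `alpha` and `gamma` are printed.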

I don't know if you are a shell/awk newbie or not, but regardless, you missed those points above, like I said, probably due to haste.

[1] Check out this recent article by me in Linux Pro Magazine:

http://www.linux-magazine.com/Issues/2018/217/Exploring-proc

for a bit of shell quoting magic along with some use of awk.

And this post:

UNIX one-liner to kill a hanging Firefox process:

https://jugad2.blogspot.com/2008/09/unix-one-liner-to-kill-h...

and also the interesting comments on that post, from which I learned some things.

After having worked for years on Unix platforms (as both a dev and system engineer), from even before Linux was created, I think I know the difference between shell and awk (and a bit more, although I still do not claim to know everything, or even close).

On a more positive note, I'll use this as an opportunity to put in a plug for my Python and Linux training offerings :) Course outlines and a couple of testimonials here:

https://jugad2.blogspot.com/p/training.html


I read your entire post, and I didn't miss any of the points you make. I simply disagree with you.

Quoting from your post:

    And here is my initial solution in UNIX shell:

    # bentley_knuth.sh

    # Usage:
    # ./bentley_knuth.sh n file
    # where "n" is the number of most frequent words 
    # you want to find in "file".
    
    awk '
        {
            for (i = 1; i <= NF; i++)
                word_freq[$i]++
        }
    END     {
                for (i in word_freq)
                    print i, word_freq[i]
            }
    ' < $2 | sort -nr +1 | sed $1q

So you invoke awk, and then run the output of awk through sort and sed.

You're doing all the word counting in awk.

Yes, you're invoking awk from a shell script, but that's really not the same thing as "using shell." McIlroy’s solution is genuinely shell:

    tr -cs A-Za-z '
    ' |
    tr A-Z a-z |
    sort |
    uniq -c |
    sort -rn |
    sed ${1}q

"awk" is generally accepted as a full programming language, whereas "tr", "sort", "uniq", and "sed" are command line utilities. I don't think "awk" classes as a command line utility, so I don't class your solution as "shell".

Perhaps you don't agree, perhaps you think "awk" is a command line utility. If so, then we'll agree to disagree.
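
For anyone who wants to run the quoted awk-based script today, here is a hedged sketch of the same pipeline wrapped in a function. It is not the original author's code: the function name `word_freq` is hypothetical, the arguments are quoted, and the obsolescent `+1` sort key is written in the portable `-k` form, on the assumption that the intent was "sort numerically, descending, on the count column".

```shell
# word_freq: hypothetical wrapper; usage: word_freq n file
word_freq() {
    awk '
        { for (i = 1; i <= NF; i++) count[$i]++ }     # tally every field
        END { for (w in count) print w, count[w] }    # emit "word count"
    ' < "$2" |
        sort -k2,2nr |   # order by field 2 (the count), largest first
        sed "${1}q"      # keep only the first n lines
}

# Example: word_freq 10 some_file.txt
```

Words with equal counts may come out in either order depending on the sort implementation, so only the ordering between different counts should be relied on.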


Wow. Multiple misunderstandings on your part in one single fairly short comment. I'll of course reply to it, substantiating what I said, as best as I can, but it's late here, and when replying to an argument, I prefer to do it thoroughly enough, so I'll do it, hopefully, by tomorrow night my time, otherwise a day later, if too busy. I think the reply link should be alive until then.

Meanwhile, you might want to scrutinize your own reply (the one to which I am replying here) and think a bit more deeply about what might be wrong with it. And until my full reply to come later, here are a couple of hints:

Hint 1:

>"awk" is generally accepted as a full programming language, whereas "tr", "sort", "uniq", and "sed" are command line utilities.

- a tool can very much be both a full programming language as well as a command line utility at the same time. awk falls into that category [1], as do many other Unix commands. Who made up a rule that it cannot be both at the same time? You?

Hint 2:

Check out your line:

>So you invoke awk, and then run the output of awk through sort and sed.

and compare and contrast its meaning with the meaning of your few lines immediately below it, including the one that says "McIlroy’s solution is genuinely shell:". Try to see the similarity/difference/contradiction.

[1] Finally, read the book The Unix Programming Environment, a classic, by Kernighan and Pike (Unix pioneers). I cut my Unix teeth on it, years ago, although, of course, years do not mean I am right and you are wrong. Facts do. There are chapters in the book on awk and sed. And IIRC they come under the topic of filters (maybe even the chapter name is that) a.k.a. command-line utilities, although not every such utility needs to be, or is, a filter. I think you have some confusion about terms and their meanings, and/or are assigning your own meaning, even though you use words like "generally accepted".

Also skim this article (published by me, years ago, on IBM developerWorks) to get your fundamentals more clear:

Developing a Linux command-line utility:

https://jugad2.blogspot.com/2014/09/my-ibm-developerworks-ar...

Enough for today - will do follow-up comment as I said, if needed, in a day or two.


Then we'll agree to disagree. I think you are wrong on so many points here, it's clear we're not going to agree, and probably won't find common ground.

Thank you, by the way, for your references to various published material. FWIW, I've worked with BCPL, C, AWK, C++, Unix, Linux, GNU, and much, much more, for the last four decades or so, so I'm not inexperienced, and I have read most of the classics. That also doesn't mean I'm right, but it does mean that I have a basis for my opinions.

So thank you for your offer to school me, but I'll decline, and, as I say, accept that we disagree.


>Then we'll agree to disagree. I think you are wrong on so many points here

Thanks for casting aspersions without even so much as a mention of what the "so many points" are that I am supposedly wrong on.

When I said upthread that you have misunderstandings, I at least mentioned some and hinted at or gave a clue to what the others were.

Also interesting that when kazinator said to you that awk is part of shell, you meekly accepted that he was right, thereby contradicting your earlier claim that my shell solution was not a shell but an awk solution. And in that same comment ( https://news.ycombinator.com/item?id=19279030 ) accepting it, you still seem to be neither here nor there, by your own words, where you say things like you "see why it could, perhaps should, be though (sic) of as being "shell", but "always have trouble thinking of it as such".


Just happened to see your reply here before I went off to sleep:

https://news.ycombinator.com/item?id=19276012

Interesting and maybe significant that you say: "Then we'll agree to disagree. I think you are wrong on so many points here, it's clear we're not going to agree, and probably won't find common ground."

First, interesting that you say "I think you are wrong on so many points here ..." but do not deign to offer any points to back up your statement. Kind of a cop-out, looks like. Anyone can say someone else is wrong; such statements do not carry any weight unless backed up with something more substantial.

And about your "four decades", like I said in a previous comment, years or age do not matter, facts do. I care not a whit if the person I am arguing with has 4 years or 4 decades or 4 centuries of experience. They (or I) can still be wrong (or right) about any specific topic we happen to be arguing about. I've been known to acknowledge that I was wrong, in arguments with people less experienced than me, many times, and vice versa has happened too.

Nor is finding "common ground" the goal (this is not some sort of compromise between political parties, it's a technical argument). Getting things right is the goal. For which, sometimes one party or the other may have to admit they are wrong - including me. Just that I do not think I am wrong in this case.

Will still write my fuller reply as I said earlier, to keep my word, and to make the picture more clear for other readers, since you have made these statements, even if you have hastily left the conversation.


OK, so as kazinator has pointed out, awk is now a mandatory part of Posix, and so is genuinely a part of "shell". My reply there says that I and my colleagues still think of awk as fundamentally different from other pipeline facilities such as tr, sed, sort, uniq, and so on, but I can see why it could, perhaps should, be though of as being "shell".

So it's shell. I might, however, given my background, and remembering as I do its first introduction, always have trouble thinking of it as such.


It's not a cop-out, we disagree.

> Nor is finding "common ground" the goal (this is not some sort of compromise between political parties, it's a technical argument).

We disagree. When there is a disagreement, finding what you agree on is the first step in finding where the lines of reasoning diverge. Finding common ground is the first step in resolving differences.

> Getting things right is the goal.

Sometimes in software there are judgement calls. Maybe this is one of them, maybe our definitions differ. Sometimes definitions differ because of context or experience. In each case, the terms used are not right or wrong, they are definitions that are useful in the context.

> For which, sometimes one party or the other may have to admit they are wrong - including me.

This is not an "I'm right, you're wrong" situation. By my experience, in my context, what you wrote would not be called a "shell solution" in the same sense as the original command-line solution would be called a "shell solution."

You think that invoking AWK from the command line means that it's still a command-line script. Your definition of the terms means that you accept that invoking AWK still lets you call it a "shell solution."

I think that is fundamentally and structurally different from using command line utilities such as tr, sed, sort, and uniq.

So my position is clear - your solution that you call "shell" is not, in my opinion, just "shell". To me, your solution is an AWK solution, and you feed the output from your AWK program through shell utilities.

You are using the terms in a manner that is different from how I'm using them, that much is now clear.

Do you agree that you have written a shell script that invokes a program written in AWK?

Would it be different if you wrote a shell script that invoked a C program by calling a C interpreter? Would you still call it a "shell solution to the problem?"

Does it matter? Really? I've made clear why I've said that I don't class your solution as being shell, why do you care?


Well, I'm a few days late to write my final point, due to being busy with other work. I know you've probably left this thread by now, as I didn't see any replies to my other challenges to you (about your misconceptions, about your calling some of my points "wrong" without substantiating why, and about your outright waffling, using terms like "could", "should", "maybe", etc., that I referred to elsewhere in the thread), but as I said, I'm not just making the replies for you, but for others, and also because you made accusations against me, so as to vindicate myself (although I do not need to do it, and the choice to do it or not is solely mine - it's just that I choose to do so this time). So here goes - my last comment in this largely futile thread:

I said I would point out a "fundamental flaw" in your points. The flaw is this:

You thought (and said) that my shell solution was an awk solution. That is wrong. It is a shell solution (and not an awk solution) for multiple reasons, which any slightly-more-than-beginner-person to awk and shell, should have easily known, if they had their fundamentals clear, which implies that you do not. It is a shell solution because:

1) the entire script is a pipeline (which is obvious to see from the pipe signs used, if you knew your stuff and had paid attention to the code, before writing your first reply). awk does not have the pipeline operator (as meaning send the output of the previous command to the input of the next command). That itself should have told you that it is a shell script, not an awk script. There is an awk command embedded in the shell script, but that is very different from saying that it is an awk script.

2) You said elsewhere in this thread, in reply to kazinator:

>That I did not know - thank you. We here still think of awk as fundamentally different from other pipeline facilities such as tr, sed, sort, uniq, and so on, but I can see why it could, perhaps should, be though of as being "shell".

That statement of yours above is wrong on two counts:

a) awk is not fundamentally different from tr, sed, sort, uniq, etc. It is a Unix command-line command like any other. The fact that it happens to be what you and some others may call a full programming language (not a well-defined term, anyway) does not make it any less of a command-line command. A tool can be both of those at the same time, and awk is. So is Perl. So is Python. So are many other languages. In fact as someone else said and I hinted at, sed may be a Turing-complete language. So does that suddenly make my script a sed script, just because I used sed in it? But I used awk in it too. So should it be called an awk-sed script? But I used sort too. So now should I call it an awk-sort-sed script? See what I am getting at? No, it should just be called a shell script, because that is what it is. The shell is a high level language that orchestrates other programs via its syntax and operators. (See below about the shell's operands being whole programs.) You claimed that the main work of my script (the word counting) was done in awk, and the results piped to other commands, therefore it was an awk solution. But it is the shell that is doing the piping, not awk! awk cannot do such orchestration, at least not easily, not without resort to the "system()" library function it has, but that is again implemented using the shell (and other stuff, like fork and exec system calls - I'm simplifying here).

All shell scripts can consist of any command or combination (not just pipelines [1]) of commands, irrespective of the type of the command, whether it is a programming language or not, what language it is written in, etc. In fact there is not even a requirement that the commands used in shell scripts should all be filters; that requirement is only for shell pipelines. [2]

[1] A shell script can: consist of just a sequence of (one or more) command(s), terminated either by semicolons or newlines, or both; consist of one or more pipelines only; consist of any combo of the preceding. And also other variations, including at least an ampersand (&) terminating the command (or pipeline), which makes the preceding command or pipeline run asynchronously from the rest of the overall command/pipeline/script, if any, i.e. in the "background", as we say in Unix.

[2] Here is a shell script that demonstrates many of the above points:

  # a_script.sh
  foo1 # run foo1
  foo2; foo3 # run foo2, then foo3
  foo4 & # run foo4 in the background
  foo5 > f1 # run foo5, redirect its stdout to f1
  foo6 < f1 | foo7 arg1 arg2 | foo8 arg3 arg4 arg5

Any of those foo* commands in the script could be any command at all, without any restrictions. Only the commands in the pipeline on the last line of the script, even need to obey the conventions of filters, that I described above. The commands on the preceding lines do not.

All this is part of the flexibility, beauty and power of the shell, whether used in scripts or on the command line. Which brings me to my next key point: there is essentially (almost) no difference between typing commands interactively at the shell prompt, and invoking the same commands from within shell scripts that are run by the shell. The exact same syntax with the exact same semantics can be used (for all practical purposes, maybe with a few exceptions, in both modes, interactive or script).

In fact, you can even type for and while loops [3] (including with redirection of their input and output) at the shell prompt. (You can even type if statements at the shell! Same for case statements.) I do it all the time, for throwaway "scripts" such as ones to monitor the execution of some processes, and so on. And many standard Unix books - like the classic book, The Unix Programming Environment (UPE), that I mentioned in this thread - show that in examples.

Another thing that UPE says and shows is something to the effect that "the shell is a very high level language - its operands are whole programs (emphasis mine)". That is why we can do things like the example in [3] below, but first, another example:

  # : is a built-in that evaluates to True, saving
  # having to run the true command from disk each time
  while :
  do
    ps -aef | grep foo
    sleep 10
  done

This is a script (but can equally well be typed directly at the shell prompt, for the reason I gave above) that monitors the execution of the foo command. Better versions of it, using while and until commands of the shell, are shown in the book; you can look them up. One version may start like:

[3]

  while ps -aef | grep foo
  do
     sleep 10   # or do something more useful here
  done

which shows the point about the shell's operands being whole programs - the "ps -aef | grep foo" part is used as an operand in the while condition - and it is a pipeline, bigger even than "a whole program"! This works because the exit code of the pipeline is the exit code of the last command in it, which is grep, so the while condition is true if grep finds a match of foo in the ps output.
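
The exit-code behaviour described above can be checked without ps, using fixed sample input. This is only a sketch under default shell settings (no `pipefail`-style option enabled); the function name `has_line` is hypothetical.

```shell
# has_line: hypothetical helper. The if condition is a whole
# pipeline, and its status is the status of grep, the last command.
has_line() {
    if printf 'foo\nbar\n' | grep "$1" > /dev/null
    then
        echo "match"
    else
        echo "no match"
    fi
}

has_line foo   # prints match
has_line baz   # prints no match
```

printf succeeds in both calls, so the different results come only from grep, which exits 0 when it finds the pattern and non-zero when it does not.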

b) You called tr, sed, sort, uniq and so on, "pipeline facilities". They are that, but are not just that. Before and apart from the fact of being "pipeline facilities" (which is anyway, a non-standard term you used, a better and more standard term would be just "Unix commands" or "filters" - filters is a standard term, for programs that read either filename arguments or their standard input, process the input in some way, and write the results to standard output, thereby enabling the whole Unix pipeline paradigm), they are also simply normal commands, or programs. Any of those commands can be used either standalone, or in a pipeline. In fact there are other ways of using them too, for example, you can invoke any of those commands (as well as any other executable) as a child process from some other program you write, in C, Python or other programming language. You are creating distinctions where none exist, for who knows what reason.


RiscOS !Draw

Vector drawing package that was neat, clean, fast, versatile, user-friendly, and powerful.


If you do a search for "black bar" you'll find lots and lots and lots and lots of results, most of which answer your question.


Not a true statement. Where are you searching? I can find nothing to explain this other than the comment below about someone dying, but that doesn't explain anything either.


> Not a true statement. Where are you searching?

At the bottom of this page is a search box. Into it type "black bar" (without the quotation marks). It takes you to this link:

https://hn.algolia.com/?query=black%20bar&sort=byPopularity&...

Have you not seen the search box at the bottom of nearly every HN page? Perhaps that question sounds a bit brusque, but in effect, whether you meant it or not, you called me a liar, and I rather take exception to that.



You say: IPFS promises nothing, nor tries to, in the way of permanent archiving.

From the IPFS web site[0]:

Humanity's history is deleted daily

...

IPFS keeps every version of your files and makes it simple to set up resilient networks for mirroring of data.

This is clearly stating that the system keeps every version of your files, and it says nothing about "pinning", or the fact that the files will, in fact, not be kept. At best the web site is misleading, at worst it is simply lying.

I don't yet know what IPFS really is, nor what it really does, but it's statements like that on the IPFS web site that make me distrustful of the hype.

[0] https://ipfs.io/#why


Imo, the hype is irrelevant. Many people misunderstand IPFS, and the wording that you highlighted doesn't help. Luckily, I view IPFS as a very valuable "technology", and whether or not it specifically wins in this space, I don't care - I care that the technology is useful and I think it (or something like it) will eventually be the future of the web.

So I don't really buy into the hype-drama. So many people are concerned with hype.

Anyway, to your specific points - if you understand IPFS those comments are not entirely off base. However I can understand why they would lead people astray. In reality I see those comments, i.e. human history being deleted, as a reference to the mutable web. I can find a post on Reddit and today it is meaningful, tomorrow it might be deleted. In a general immutable system, if I reference an immutable address to the content I care about I will always find exactly that content. Whether or not it exists permanently is another issue, one that I don't care about honestly - I care that what exists can't change out from under you. Just viewing data in an IPFS-like system naturally makes you own it, as you effectively download a copy of it. No one can take that from you.

Now, whether or not you decide to permanently hold onto the data you want is another story. But again, permanency is not likely to be "solved" by anyone.. and honestly, given how so much "content" can be illegal, I don't think we ever can or should solve the permanency issue.


As the quotation goes: "What Andy giveth, Bill taketh away."


So when you say "longest" you don't mean "longest section of code", I'm guessing you mean "section of code in which execution spends the longest time".

If so, good, but it wasn't clear to me that that's what you meant. If you mean something else then I don't know what you mean at all.


"Longest" seemed clear to me. (Over "largest" and "slowest".)


But you wrote it, so of course it's clear to you. To me, when someone says "the longest part of the program" I immediately think of the routine that stretches over the most lines. That's the longest part of the code.


IDK, "longest part of code" to me clearly seems to refer to length of the code (i.e. lines of code), not to length of its execution time; so I'd say that there definitely seems to be some confusion caused by the choice of words.


Just to further these two posts, that is why I asked my question at the top here. "Hottest" code, I thought, was already well established for most executed. I had never seen "longest" for anything.

To that end, I was taking it to mean the longest "synchronous" path through your system. Not necessarily a single method, by any measure. But systems have plenty of what I will call "checkpoints" where code can be restarted/rerun with no hard-to-recover penalty. That is what I took "longest" to mean.


Very close over human ranges:

    C = (F-30) / 2
    F = 2 * C + 30

Exact conversions:

    C = (F-32) * 5 / 9
    F = C / 5 * 9 + 32

Table:

    C     F
  -40   -40
    0    32 (Approx water freezing point)
   10    50
   20    68 ( ~ 70)
  ~21    70
   30    86
   37    98.6 (body temperature)
   40   104
   50   122
  ...   ...
  100   212 (Approx water boiling point)
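
A quick way to see how close the shortcut stays to the exact formula is to compute both side by side. This is a sketch only; `approx_c` and `exact_c` are hypothetical helper names, and awk is used just for the floating-point arithmetic.

```shell
# approx_c: the shortcut C = (F-30) / 2
approx_c() { awk -v f="$1" 'BEGIN { printf "%.1f\n", (f - 30) / 2 }'; }

# exact_c: the exact conversion C = (F-32) * 5 / 9
exact_c()  { awk -v f="$1" 'BEGIN { printf "%.1f\n", (f - 32) * 5 / 9 }'; }

# Compare the two over a few everyday temperatures.
for f in 32 50 70 86
do
    echo "$f F: approx $(approx_c "$f") C, exact $(exact_c "$f") C"
done
```

The two formulas agree exactly at 50 F (10 C) and drift apart by roughly a degree or two towards either end of that range.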


