It sets a bad precedent to call things like this hacks.
Firstly, calling this redaction implies that the data is missing, and calling what was done "unredacting" is akin to saying someone "decrypted" a cryptographic hash function.
Nobody unredacted anything here; they merely discovered that it hadn't been redacted and only looked like it was.
Calling this a hack places responsibility on the people who discovered the information, rather than on the people who were put in charge of handling the redaction and screwed it up.
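To make the hash analogy concrete, here is a minimal sketch using Python's standard `hashlib` (the inputs are made up for illustration). A digest has a fixed size no matter how large the input is, so it cannot contain the input, and there is no key or decrypt operation to call.

```python
import hashlib

# Two inputs of very different size (illustrative values only).
short = hashlib.sha256(b"a").hexdigest()
long = hashlib.sha256(b"a" * 1_000_000).hexdigest()

# Both digests are the same length, so a million bytes of input
# cannot be "decrypted" back out of 64 hex characters.
print(len(short), len(long))
print(short != long)
```

You can guess inputs and compare digests, but that is searching, not decrypting, which is the distinction the comment is drawing.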
The journalist writing the story has the same level of technical knowledge about how to "redact" properly in the digital realm as the individuals doing the redaction. To the journalist, with zero knowledge of the technical aspects, viewing the "redacted" document, it appears to be "redacted", so when someone "unredacts" it, the action of revealing the otherwise hidden material appears to be "magical" to them (in the vein of the Arthur C Clarke quote of: "Any sufficiently advanced technology is indistinguishable from magic").
To the journalist, it looks like "hackers at work" because the result looks like magic. Hence their editor attaching "hacks" to the title for additional clickbait as well.
To us technical people, who understand the concept of layers in digital editing, it is no big deal at all (and it is not surprising that some percentage of the PDFs have been processed this way).
>How someone like this gets a paying job as a journalist is beyond me.
You seem highly confused about what a journalist's job is in this era. Very few publishers are about correctness. It's about the speed of getting the article out and getting as many eyeballs as possible to look at the ads in the article.
Or as the saying goes, A lie can travel halfway around the world before the truth can get its boots on.
But there is a more-powerful combo we’re beginning to see: journalists can take a story and prompt their way into a list of missing perspectives. The Lindbergh baby, for example.
You could easily replace them with an LLM if that were the case.
Although I don’t completely disagree with your cynical take, I don’t think that’s actually the case for most of the Guardian's journalists; they do a lot of quality reporting too.
And the lawyers should have used an LLM to perform a first pass of the redactions and methods of redaction.
Going forward the full stack of perpetrators, unindicted coconspirators, lawyers, judges, legislators, journalists, editors, fact checkers, ... it'll all be LLM all the way down such that nothing will be trustable save something akin to Stephenson's gargoyles and Flock cameras for which people will conduct spectacles to shape the salience landscape.
Back when LLM chatbots were new and shiny, I was comparing the failure modes to journalism by way of the Gell-Mann amnesia effect.
Sure, deep investigative journalism with real skill and effort behind it is a thing; but it is an expensive thing, and opinion pieces disguised as journalism are much cheaper, as is reporting on other people's reports.
I wish it was true.
From the bullshit jobs in the book, I can only see the box tickers being replaced.
The flunkies, goons, task masters, and duct tapers will probably continue to exist.
Also, unless we come up with something like UBI or a dramatic rethinking of how capitalism works in our society, there will probably be __more__ bullshit jobs.
My wife was a reporter with a top tier news agency in DC and I was shocked how they divvied up topics.
At best, it was "you're good with computers, go report on this hearing on cybersecurity" but more commonly, it was "who has this morning open? You do? Great. Go cover this 9am on the Israel-Palestine negotiations and what the implications are. We'll do a segment in the 11am hour."
It's important to understand who becomes a journalist in this age.
It's people who are very good with words, and at talking to anyone and everyone about anything, both in a friendly and in a confrontational way.
They also have almost no understanding of math, science or technology. If they did, they'd get better paying jobs.
Journalism used to be a well paid prestigious career that attracted brilliant people. There is not enough money in what's left of that industry to do that anymore.
I agree they have no understanding of math, science or technology. But I disagree with your assessment of motivations to get "better paying jobs", most people who went into journalism I knew were in brownstones right out of college. They didn't need the money, they inherited it, it was the lifestyle they were after.. that's why we get the journalism we do..
They're not after money. They're motivated by prestige which CAN be money (ew, tacky) but is actually measured by access to key figures, your name being in the right places with the right people, and the cocktail party circuit.
My wife was a reporter in DC and she was at the White House Correspondents Dinner and everything. Living in those circles is surreal. The namedropping is a whole other level. When I realized I was doing it too (with some legit impressive names at the time), I gtfo. I'd rather be evaluated by what I've done or can do vs who I know or knows me.
I don't think you have an understanding of job specialization.
I know some journalists. They are smart people. However, they are not experts in math, science, or technology. They are experts in journalism. This wasn't any different at any time in the past.
I think you have the source of the problem wrong. It's just rich kids who don't actually need the salary, and want to align to a point of view that gets them a contract to write a book, so they get invited to the right parties. They don't know anything, or care about anything.
Journalism school is "eye-wateringly" expensive:
> J-school attendees might get a benefit from their journalism degree, but it comes at an eye-watering cost. The price tag of the Columbia Journalism School, for instance, is $105,820 for a 10-month program, $147,418 for a 12-month program, or $108,464 per year for a two-year program. That’s a $216,928 graduate degree, on top of all the costs associated with gaining the undergraduate prerequisites. (Columbia, it seems important to say, is also the publisher of Columbia Journalism Review, the publication you’re now reading.)
And FWIW, in my very limited and anecdotal experience, the programs are inhabited by people who fully understand their employment and salary prospects, but believe in the work, and often have above-average family wealth to compensate for the gaps. They're good people, but they are not experts.
Haha. I was a journalist for many years. I went to UC Berkeley. I likely currently have a far better paying job than you and have invented technical concepts that founded the LLM.
Obviously, "who becomes a journalist in this age" does not translate to "every person who is alive now who has ever been a journalist".
I'm not sure if your error lies in parsing colloquial English, or in basic statistics. Either way, I think you have fully illustrated the commenter's point.
Journalists are not reliably selected for, or demonstrative of, comprehension or accuracy.
This is dumb trying to call others dumb. This argument is not just inhumane, it’s also wrong. The average of something assumed does not negate a real data point. If you did even a bit of data science you’d know that. But just another HNer calling someone dumb while confidently wrong. And ironic, calling others dumb because of it. So think on that.
Maybe Christmas just leaves the worst on HN … statistically.
(You can’t engage logically, technically, or even correctly here and keep spouting that others are wrong. Think hard on how poor your comprehension here is, even when it is explained why you are wrong.)
Points have been illustrated contradicting the statement. No points have been made supporting it.
Your argument boils down to “all x is bad is valid by default and all Ys that contradict are inherently ‘statistically invalid’”. Do you not get how horribly dumb your logic is?
By this logic I could state all HNers posting on Christmas are idiots and wrong by default. This of course can’t be contradicted by any statement you make because you are just a data point of one and therefore invalid. Also the original point is supported with exactly 0 data points, so in actuality a data point of 1 > 0. So my guy. Jesus. Learn stats. Or anything.
To us, it's a life skill. To a non-technical person, it's black magic.
Some folks had to be taught on how folder structures work because they grew up with the appliance we called a "phone" as opposed to a real computer that also happened to be known as a "phone".
I can assure you that plenty of people who were using computers before smartphones, and who have used them every day at work for decades, also do not grasp what we could consider the very basics of file management.
I think.. the way to understand it is: levels. After all, files as the abstractions we work with are not exactly there in the form of files in a cabinet. In a sense, even names are made-up fiction, BUT.. a helpful one.
> To us, it's a life skill. To a non-technical person, it's black magic.
I’m sorry, but “this text is black on black background; the actual letters are still there” isn’t “black magic” unless someone is being deliberately obtuse.
So I don't know your specialty, but I'm going to make a wild guess and assume that it isn't stage magic.
Stage magicians have a whole range of different ways to make something seem like it's levitating, or to apparently get a signed playing card inside a fruit that they get someone in the audience to cut open to reveal.
To a magician, these things are cute, not mysterious.
To the general public… a significant percentage have problems with paged results and scroll bars. Including my dad, who developed military IFF simulation software before he retired, and then spent several years of retirement using Google before realising it gave more than three results at a time.
Would he, with experience working with the military, have made this specific mistake about redaction? Perhaps, perhaps not, but the level of ignorance was well within his range. (I'm no better, it's just that my ignorance is about e.g. setting fire to resistors.)
Your analogy fails because the purpose of stage magic is concealing what’s going on. That’s not what happened here. Someone just made a really stupid mistake that even non-technical folks can accidentally discover.
There are undoubtedly some people who would be fooled by this, but you don’t have to be technical in order to not be one of them.
Most journalists are former English majors (or hold some other non-technical degree). I would not expect any (even the supposed tech journalists) to understand the technology they report on to the level that we here on HN understand that same technology.
Their job is to write coherent articles that gather views, not truly understand what it is they are writing about. That's why the Gell-Mann Amnesia [1] aspect so often crops up for any technical article (hint, it also crops up for every article, but we don't recognize the mistakes the journalist makes in the articles where we don't have the underlying knowledge to recognize the mistakes).
In my experience most posters on HN don’t understand technology either. So they understand neither people nor technology, putting them two steps behind a journalist.
Journalists are people, like everyone else, and most people are bad at their jobs.
Plus, what even is the job? For most journalists out there, it's just writing something that draws ad impressions and clicks.
The percentage of journalists that work for outlets where the content itself is the cash source is very small (NYTimes, probably a bunch of other paid subscriptions). And even the NYTimes isn't above clickbait.
No, it is not. But given the abysmal lack of technical knowledge of the "typical computer user" they don't see the redacted PDFs as "having black stick-it notes stuck on top of the text". They see the PDF as having had a "black marker pen" applied that has obliterated the text from view.
When someone then shows them how to copy/paste out the original text, because the PDF was simply black stick-it notes above the text, it appears to them as if that someone is a magical wizard of infinite intelligence.
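For anyone who wants the stick-it note picture spelled out, here is a toy sketch. The `Text` and `BlackBox` classes and both functions are invented for this comment and do not mirror the real PDF object model or any PDF library; the only point is that a page is a list of drawing operations, and text extraction reads the text operations regardless of what was drawn over them.

```python
from dataclasses import dataclass

# Toy model only: these classes are hypothetical, not a real PDF API.

@dataclass
class Text:
    x: int
    y: int
    content: str

@dataclass
class BlackBox:
    x: int
    y: int
    width: int

def render(ops, width=40, height=3):
    """Paint operations in order onto a character grid; later ops cover earlier ones."""
    grid = [[" "] * width for _ in range(height)]
    for op in ops:
        if isinstance(op, Text):
            for i, ch in enumerate(op.content):
                grid[op.y][op.x + i] = ch
        elif isinstance(op, BlackBox):
            for i in range(op.width):
                grid[op.y][op.x + i] = "█"
    return ["".join(row).rstrip() for row in grid]

def extract_text(ops):
    """Roughly what select-all and copy does: collect text objects, ignore shapes."""
    return " ".join(op.content for op in ops if isinstance(op, Text))

page = [
    Text(0, 0, "Witness:"),
    Text(9, 0, "Jane Example"),   # made-up name
    BlackBox(9, 0, 12),           # the "stick-it note" drawn over the name
]

print(render(page)[0])      # what the reader sees: the name is covered
print(extract_text(page))   # what copy/paste sees: the name is still there
```

A marker-pen style redaction would delete the second `Text` entry from `page` instead of appending a `BlackBox` after it.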
The journalist is not necessarily responsible for the title. Editors often change those and they don’t need to get the approval of the journalist. The editor knows what they are doing and that it will irk some tech folks.
I seriously doubt the journalist doesn’t understand exactly how this “hack” worked too. Right in the first paragraph, “simply highlighting text to paste into a word processing file.”
A lot of people in the thread here are calling them a non-technical English major who doesn’t understand the technology. Word processors also happen to be the tools of their trade, I am sure they understand features of Word better than most of the computer science majors in this thread…
Agreed - not sure why so many are being so critical here. They probably didn't write the title and for better or worse "hack" has now become a common word casually used by many to mean "workflow trick" or similar.
As far as creating a click bait title, yep, the editor knows what they are doing, and most likely picked the word for the click bait factor.
But I'd also bet the editor's technical knowledge of how this "revelation" of the hidden material really works is low enough that it appears to be magic to them as well. So they likely think it is a 'hack' too.
This. Similar issue if you introduce someone to how you can "view source" and then edit (your view of) a website. They're like "omg haxors!"
True story: one time I used that technique to ask for a higher credit card limit than the options the website presented. Interestingly enough, they handled it gracefully by sending me a rejection for a higher amount and an acceptance for the maximum offered amount (the one I edited). And I didn't get arrested for hacking!
Using view-source to accomplish something could be considered hacking in the old school MIT sense* of curious exploration of some place or thing for clever purposes.
*: disclaimer, I didn't attend MIT, but did hang out with greybeards on 90s IRC
I have helped someone get an executive job at a Fortune 500 company... by teaching them how to use the dev tools and edit the DOM to replace text and images.
They had been asked for an assignment as part of the interview process, where they were supposed to make suggestions regarding the company's offers. They showed up on the (MS Teams) interview having revamped what looked like the live website (the official www. address was visible in the browser bar).
The interviewers gave them the job pretty much on the spot, but did timidly ask at the end "do you mind putting it back though, for now?", which we still laugh about 5 years later
Typical quality of The Guardian unfortunately. Don't read their energy reporting if you're at all literate about any of those topics. Any time they do a story on fusion I just about have an embolism.
I also like to think this was maybe done as a form of malicious compliance. Someone inside the agency was tasked with redacting this, and found a way to sneak the information through while still getting it past their supervisors, so that the information got out.
To me this is the only explanation that makes sense.
However, wouldn’t they risk repercussions when this is inevitably found out? I assume they have records of who redacted which documents.
> I assume they have records who redacted which documents
Considering (1) it was a rush job, (2) the general ineptness of this administration and (3) that management wouldn't have defined the explicit job description ("completely black out, not use black highlighter"), the likelihood that there is any evidence that this was intentionally malicious is pretty low.
This happens too regularly across both minor and major issues for me to think this is entirely redactors intentionally messing up. It's just a lot of people being pulled onto the job, and not all of them are competent. Maybe some of it is intentional, but not all of it, I'm certain.
Not in this case; this is just a cover for the guilty because this shows that Epstein's estate also works for Trump. The rot runs deep. There is no investigation, that is the point.
Out of a thousand people? Where they probably have an email from a PHB that says something like "put a black box over all references to <this list of things>"?
Furthermore, this happens so often, so frequently, in so many high profile cases that even my 80 year old mother knows this "secret hack to unredact a pdf".
If you are CIA / FBI / Court / Lawyer or professional full time redactor of documents you should know that the highlighter doesn't delete the text underneath it.
I think the more likely cause was precisely that it wasn't a technical professional/lawyer/writer doing the redacting, but someone in the administration or close to it that has no idea how to redact information correctly.
Yeah but there was no lock; somebody put a box around the doorknob without anything holding it there, and somebody removed the box and opened the door.
There's nothing else to say about this. Also, your comment is nested even deeper within the same semantic squabbling, so it's odd that you think that it's a waste of time in light of more important things that you are also not talking about.
I think that doesn’t do the scenario justice. They tried to redact and did so in a way that looks visibly redacted (in screenshots many have seen) but can be uncovered.
If you say “they failed to redact data” to a layperson looking at a visibly redacted document they’re going to be confused.
They're likely viewing the electronic documents by analogy to photocopies with blacked out sections where there is nothing to distinguish the text from the redacting marks and nothing you can project out. They don't know the structure of the file format and how information in it is encoded or rendered, or even that there is a distinction between encoding and rendering.
(A better analogy might be the original physical document with redaction marks. If the text is printed using a laser printer or a typewriter, and the marker used for redaction uses some other kind of ink - let's say one that doesn't dissolve the text's ink or toner in any way - then you can in principle distinguish between the two and thus recover visibility of the text.)
File formats are complicated. The only reliable way to redact is to reduce that complication to one which humans can manage. This is even true for software that is written by humans.
Plain text and flat images are my preferred formats for things which must be redacted. Images require a slight bit of special care, as the example in the underhanded C contest highlights, but it's possible to enforce visible redaction and transcription steps that destroy hidden information.
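A minimal sketch of why plain text is easy to reason about here: the redaction is an overwrite, so the output simply does not contain the original characters and there is no second layer to peel back. The function name and the sample sentence are made up for illustration.

```python
def redact_plain_text(text, secrets, mark="█"):
    """Return a new string with every occurrence of each secret overwritten."""
    for secret in secrets:
        # The replacement is built only from the mark character,
        # so nothing of the secret survives in the result.
        text = text.replace(secret, mark * len(secret))
    return text

original = "Payment from Jane Example to John Sample on 3 May."
flat = redact_plain_text(original, ["Jane Example", "John Sample"])
print(flat)
```

One caveat with this particular sketch: keeping the mark the same length as the secret still leaks the length of what was removed, so a fixed-width mark is the safer choice if that matters.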
To be fair, I put partial blame on the advertisers.
They've been claiming "AI" on their products on anything that has an algorithm basically for the past few years.
They are not. They are factually incorrect. Look up the various definitions of redacted. They fit perfectly for the title. Arguing otherwise suggests you are making up definitions and words, in which case, I am still correct.
> It sets a bad precedent to call things like this hacks.
That ship sailed a long time ago. The “phone hacking scandal” in ~2010¹ was mostly calling answering services that didn't have PINs or other authorisation checks set.
These days any old trick gets called a hack. Heck, tying your shoelaces might get called a miraculous footwear securing hack.
I agree, but this would mean that almost nothing can be called hacking, because it usually relies on vulnerabilities and implementation defects. If something is poorly encrypted and you retrieve data, you didn’t hack because it wasn’t encrypted to begin with. That can’t be the standard.
There is a line. It is fuzzy, but if all you did was find something which was there for anyone to find, I would place that firmly on the not hacking side. If it was ROT13 I would put that marginally closer to hacking than this.
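For scale, this is all ROT13 amounts to, using the `rot_13` codec in Python's standard `codecs` module (the sample string is made up). It is trivial, but it is still a transformation someone has to recognise and apply, whereas copy/paste from under a black box applies none.

```python
import codecs

message = "the name is Jane Example"

# ROT13 shifts each letter 13 places; applying it twice gets you back.
hidden = codecs.encode(message, "rot_13")
print(hidden)
print(codecs.decode(hidden, "rot_13"))
```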
I think we should all come to terms with it that "hack" doesn't mean anything anymore so we don't have to fight over words that were never clearly defined anyways. On most days this site here should be called "frontendnews".
I find it funny to use a hack to argue about the misuse of words and definitions.
Regardless, redaction does not imply that data is missing. The words were censored or obscured. That's it. Simply looking at the documents proves that. Interacting with them showed how easy they were to uncensor, but the simplicity of the method doesn't change facts.
By all means, complain about definitions and words, but get it right.
It also removes blame from the departments that redacted, it's not like they messed up big time, no, some resourceful brainiac hackers did things that were not allowed to undo the redaction process that was put in place to protect victims.