I believe dginev's Docker image https://github.com/dginev/ar5ivist is very close to what runs on arXiv and can be run locally. It uses a recent LaTeXML snapshot from September.
> Just like a lecturer won't suddenly switch to a German accent when saying words like "schadenfreude" or names like "Friedrich Nietzsche"
Is there a middle ground? Whenever I check my content with a screen reader, uncommon foreign names are often mispronounced, sometimes to the point of being almost unrecognisable. Even my own name comes out wrong, although it remains understandable (typically, the stress ends up on the wrong syllable).
The article talks about converting LaTeX to HTML, which is feasible today, albeit buggy and fragile. This is the textbook the author talks about, which is written in LaTeX (but compiled with LaTeXML instead of pdflatex): https://forallx.openlogicproject.org/html/
It does. But why can't LaTeX produce PNG or something? Why does it have to be either PDF or pretty much abandoning the idea of typesetting? Or am I misunderstanding?
By the way, LaTeX and TeX existed long before PDFs were ubiquitous. A common workflow in the 1990s and prior was
TeX/LaTeX -> DVI -> PostScript -> printer
And DVI stands for "device independent", so the idea was that you could take a DVI file and convert it to any format. PDF just eventually became the dominant format.
or dvisvgm, which will produce scalable images. However, images are even worse than PDF in terms of accessibility, which is what the article is talking about.
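For anyone who wants to try the DVI routes mentioned above, here is a minimal Python sketch that only assembles the usual command lines (`latex`, then `dvips`, `dvipng` or `dvisvgm`). The helper names `dvi_pipeline` and `run_pipeline` are made up for illustration, and it assumes a TeX distribution providing those programs is on your PATH. The flags are kept to the bare minimum.

```python
import shutil
import subprocess
from pathlib import Path


def dvi_pipeline(tex_file: str, target: str = "ps") -> list[list[str]]:
    """Return the commands for TeX -> DVI -> target (hypothetical helper)."""
    stem = Path(tex_file).stem
    dvi = f"{stem}.dvi"  # `latex` (unlike `pdflatex`) writes a DVI file
    converters = {
        "ps": ["dvips", dvi, "-o", f"{stem}.ps"],  # DVI -> PostScript
        "png": ["dvipng", dvi],                    # DVI -> bitmap images
        "svg": ["dvisvgm", dvi],                   # DVI -> scalable images
    }
    if target not in converters:
        raise ValueError(f"unknown target: {target!r}")
    return [["latex", "-interaction=nonstopmode", tex_file], converters[target]]


def run_pipeline(commands: list[list[str]]) -> None:
    """Run each step in order, stopping at the first failure."""
    for cmd in commands:
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"{cmd[0]} not found on PATH")
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Only prints the commands; call run_pipeline(...) to actually execute them.
    for step in dvi_pipeline("paper.tex", "svg"):
        print(" ".join(step))
```

Whichever target you pick, the output is still a fixed layout, so the accessibility point stands.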
Even after properly tagging your PDF and using Acrobat Reader to reflow the text, you cannot achieve this level of flexibility: https://forallx.openlogicproject.org/html/
Note how Richard's book adapts to any screen size and can change fonts and color schemes. System settings such as 'high contrast' affect the rendering of the page, and you could even use browser extensions to restyle it, e.g. with a more dyslexia-friendly font of your choice.
This kind of functionality is not afforded by Adobe Reader. Even Adobe's official example of reflowing, which was posted in another thread, is quite bad:
https://helpx.adobe.com/uk/acrobat/using/reading-pdfs-reflow...
The reflowed PDF is just stacking all text and removing all non-text visual cues. For example, pairs of name/role are separated by whitespace in the PDF, but after reflowing they are indistinguishable from each other (who would be the senior VP, Sunny or Daniel?). In HTML, reflowing would preserve semantically relevant whitespace out of the box.
Some people like text which they can reformat easily (increase font size, adapt to different screen dimensions).
What is generally bad about the PDFs that LaTeX produces (and this is a problem with LaTeX, not a problem with PDF) is that they are very inaccessible: they don't work with screen readers.
The reason it's so hard to make LaTeX output HTML (although people are working on it) is that LaTeX is actually a programming language, which is executed to decide where things go in the PDF.
Making LaTeX output HTML is a bit like trying to take (say) a game engine like Unity and change its rendering engine to output HTML instead of graphics. In the worst case it's basically impossible, as the game just generates commands like "draw triangle here", without context or semantics.
The accessibility and reflowability of HTML content, not to mention the ability to customize color schemes, fonts, line spacing, and the like, are not achievable with PDF, even using Adobe software. Even with the latest PDF 2.0 standard, you are ultimately expected to convert to HTML if you need all that flexibility (such as via https://ngpdf.com/).
> The reflowed PDF is just stacking all text and removing all non-text visual cues. For example, pairs of name/role are separated by whitespace in the PDF, but after reflowing they are indistinguishable from each other (who would be the senior VP, Sunny or Daniel?). In HTML, reflowing would preserve semantically relevant whitespace out of the box.
The tagging work is still highly experimental. A major missing element is equation tagging: for now, you need to produce an 'associated MathML' file externally, for instance using LaTeXML. Even then, PDF readers do not support the MathML tags yet! If anything, I am sure that the LaTeX3 team would appreciate you posting minimal examples of mistagged PDFs.
If you want to produce accessible documents from LaTeX, you should convert to HTML. Assistive technologies (ATs) such as screen readers just work miles better with HTML than with PDFs, and given the resources put into developing browsers compared to PDF readers, I don't think this will change any time soon. Luckily, conversion from LaTeX to HTML is very feasible today, as proved by arXiv. (Shameless plug: I maintain BookML specifically to help lecturers with the LaTeX to HTML work.)
A 50-page PDF loads a lot faster and shows a lot smoother than an HTML of equal textual length. And I've never seen any modern tools that turn TeX into multifile HTML (one per section).
> A 50-page PDF loads a lot faster and shows a lot smoother than an HTML of equal textual length.
Very true! Although they are now comparable, if you rely on the browser's native MathML instead of MathJax/KaTeX.
(You can test this on long arXiv HTML papers, e.g. https://ar5iv.labs.arxiv.org/html/1710.07304 is more than 60 pages as PDF. Mind you, the ar5iv default CSS is not great. I would use Latin Modern for formulas, at the very least.)
> I've never seen any modern tools that turn TeX into multifile HTML (one per section).
I believe all of them can do it out of the box now. I know for sure that LaTeXML, tex4ht and lwarp can split by chapter or section.
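As a rough illustration for the LaTeXML case, here is a small Python sketch that builds a `latexmlc` invocation asking for one HTML file per section. The helper name `latexml_split_command` and the output layout are assumptions of mine. The flags are the ones I understand `latexmlc` to accept (`--destination`, `--format`, `--split`, `--splitat`), so please check `latexmlc --help` for your version before relying on them.

```python
import subprocess
from pathlib import Path

# Units I assume LaTeXML can split at; adjust to what your version documents.
SPLIT_UNITS = {"chapter", "section", "subsection"}


def latexml_split_command(tex_file: str, out_dir: str = "html",
                          split_at: str = "section") -> list[str]:
    """Build a latexmlc command for multi-file HTML (hypothetical helper)."""
    if split_at not in SPLIT_UNITS:
        raise ValueError(f"unsupported split unit: {split_at!r}")
    destination = (Path(out_dir) / "index.html").as_posix()
    return [
        "latexmlc",
        f"--destination={destination}",  # entry page; other pages land beside it
        "--format=html5",
        "--split",                       # request multi-file output
        f"--splitat={split_at}",         # one file per chapter/section/...
        tex_file,
    ]


if __name__ == "__main__":
    cmd = latexml_split_command("book.tex")
    print(" ".join(cmd))
    # Uncomment to actually run it (requires LaTeXML to be installed):
    # subprocess.run(cmd, check=True)
```

tex4ht and lwarp have their own switches for the same thing, so this sketch only covers one of the three tools.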
Would you consider adding a web font with the proper OpenType features to render the MathML content? The Igalia work in Chrome is quite decent when the font is good, but for now it needs to be set via CSS (presumably Chrome does not feel like bundling a math font just yet).
Peano axioms do not define multiplication. Multiplication is considered to be part of second-order arithmetic, which includes Peano arithmetic (which is a first-order system) and augments it to form a stronger set of axioms, of which multiplication is one.
The definition of multiplication is indeed in the Wikipedia article, but if you reread it then you'll see it doesn't claim to be a Peano axiom. A better article to read (after reading about Peano axioms!) is here: