
Hi everyone - I'm the author of this post/site. This is last weekend's project.

I got frustrated with not having a good way to print Markdown files from GitHub (root level repo READMEs and individual .md/mdown files).

It's super simple to use: just replace 'github.com' with 'gitprint.com' (when on a GitHub markdown file page) and it will convert it to a beautiful print-friendly PDF and prompt you to print.
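
A minimal sketch of the URL swap described above, assuming a plain host replacement is all that is needed. The function name `to_gitprint` is hypothetical and is not part of Gitprint itself.

```python
from urllib.parse import urlsplit, urlunsplit


def to_gitprint(github_url: str) -> str:
    """Swap the github.com host for gitprint.com, leaving the path untouched."""
    parts = urlsplit(github_url)
    if parts.netloc.lower() not in ("github.com", "www.github.com"):
        raise ValueError(f"not a github.com URL: {github_url}")
    # Only the host changes; path, query, and fragment are kept as-is.
    return urlunsplit(parts._replace(netloc="gitprint.com"))


print(to_gitprint("https://github.com/jgm/gitit/blob/master/README.markdown"))
```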

Any feedback would be appreciated. (It's just a fun side project, but I would like to improve it, as I think it could be useful for like-minded people.)



What is doing the typesetting? I do not recognize:

   Producer: Qt 4.8.4 (C) 2011 Nokia Corporation and/or its subsidiary(-ies)

Why didn't you use pandoc+latex:

  $ pandoc -s -S -o gitprint.pdf FILE.md


As a fun exercise, you should use this Gitprint service to check out the README[0] of Gitit, a cool VCS-managed wiki that renders everything via Pandoc. Or not. I see you or someone else constantly retorting with "Why not Pandoc?" and I agree with you. I started to get interested in Haskell again just because Pandoc, Gitit, and some other very impressive stuff are written in Haskell and seem to do some things very powerfully.

However, a caveat. It is not mentioned in the documentation, but Gitit does not enable PDF export by default, even though Pandoc is more than capable of it. Why? Because, in all fairness, it can add significant load, according to the conf file comments:

> pdf-export: yes

> # if yes, PDF will appear in export options. PDF will be created using

> # pdflatex, which must be installed and in the path. Note that PDF

> # exports create significant additional server load.

How much load, I am not sure. I would love to see a similar Happstack app like this program that runs gitit or the new fork gitit2, rewritten for Happstack updates now that the API has matured. This makes me think performance is an issue.

As an aside, I know ShareLatex opened up their code. I have not had time to check, but I am curious whether they use standard Latex for generating files and, if they do, how they make it perform so fast for potentially thousands of people at once (I know they have a couple of layers, I watched their old Node/JS London talk; I mean here Latex in particular).

[0] https://gitprint.com/jgm/gitit/blob/master/README.markdown


The server load is due to using Latex for the typesetting. Is it surprising that publication quality typesetting takes more resources than the ragged right text that this puts out? No.

Is it surprising that processing inline TeX equations or Bibtex citations takes more resources? No.

Compare and Contrast:

Gitprint's pdf of README: https://gitprint.com/jgm/gitit/blob/master/README.markdown

Gitit's html export of README: http://gitit.net/README?printable

If I don't care about the quality of the text layout, gitit's printable html page is equivalent to the gitprint pdf. Adding Merriweather to default.html is trivial. But as another commenter pointed out, using a new trendy font might not be the best choice if you want decent unicode coverage, just in case one of your users is among the "minority" of people who are unsatisfied by the basic ASCII charset.

Moreover, comparing gitprint's markdown support to pandoc's as if it were apples to apples is silly.


I definitely appreciate what you are saying, but when I read the thread, three separate responses to comments were example incantations of Pandoc. Pandoc is amazing; this appears to be a weekend project.

Is it surprising that Latex requires processing power to generate PDFs that become professional publishing house books? Of course not. Latex is sick. I was a non-tech student in college, and when I discovered Latex in a CS class my mind was blown and I loved it for a while, despite fucking up regularly with its learning curve. (Funny anecdote: in my senior year at my liberal arts college I had a terrible philosophy professor who had gone to Stanford, and I noticed he used Latex for everything, unlike any other professor at my tech-illiterate school; he used it because his friend was this guy Donald Knuth; I kissed that guy's ass for the rest of the semester after that tidbit just for stories about Knuth.) My point was that scaling Latex in a server environment would be difficult, and I was addressing a few people equating this whole thing to Pandoc, which I did not find fair. Running Latex concurrently for lots of people sounds rough to me, but then again I have never implemented anything nearly as complicated as that.

Gitit does not give you just a printable HTML page. If you notice, gitit uses the Pandoc backend to export many formats. It might not set them as printable, but to see gitit+pandoc in action, look at the export menu in the bottom left corner of the default theme on the page you linked. You do not see the PDF export format because, as I mentioned in my original comment, Pandoc's Latex+PDF support is not enabled by default because of said performance load.

I was speculating that writing an equivalent of this service with Pandoc and Haskell is doable, but that generating PDFs alone would be a load that might make it difficult at scale. I do not know. ShareLatex, who are arguably way beyond my speculation because they implemented a powerful web-based Latex editor, used some very unorthodox architecture (compared with just rerunning Latex in the shell every time) to make it fast and resilient.

I am not sure how to respond to the comment re Gitit and Pandoc. Gitit is basically a wrapper around Pandoc: it is merely a web interface to everything Pandoc does for different document formats, and hackage clearly lists Pandoc as a dependency with version 1.10+.[0] I know they are not an apples-to-apples comparison. It was just speculation. In the other post I was actually using Gitit and Pandoc interchangeably because the latter is the integral component of the former.

I feel there is a type checking joke lurking in my last line, but I am too tired to find it.

[0] http://hackage.haskell.org/package/gitit-0.10.3.1


I totally agree: pandoc is amazing, this is just a weekend project that is a solution in search of a problem. I can just highlight github's markdown pretty print and tell iceweasel to print selection and select save as pdf instead of sending to cupsd queue.

To be clear this project:

  1. Converts a certain unknown basic markdown syntax to html 
     (iff the filename is something that it recognizes[^1])
  2. adds css for Merriweather, Open Sans, Dejavu Mono and a silly iconfont
     (I prefer pandoc's pygments syntax highlighting over goofy icon)
  3. renders html->pdf using webkit by way of phantomjs.

Pandoc to pdf at ROFLSCALE? Speculate no more:

  $ pandoc -s -S -t html5 -o out.html README.md 
  $ wkhtmltopdf out.html out.pdf

or

  $ pandoc -s -S -t html5 -o out.html README.md  
  $ phantomjs rasterize.js  ./out.html out.pdf Letter
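
For anyone who wants to script the two-step conversion above, here is a rough sketch that wraps the pandoc and wkhtmltopdf variant. The function names `build_commands` and `render` and the intermediate file name `out.html` are illustrative assumptions, and the `-S` flag from the commands above is left out because newer pandoc releases no longer accept it.

```python
import shutil
import subprocess


def build_commands(markdown_path: str, pdf_path: str, html_path: str = "out.html"):
    """Return the two command lines: Markdown -> HTML5, then HTML -> PDF."""
    return [
        ["pandoc", "-s", "-t", "html5", "-o", html_path, markdown_path],
        ["wkhtmltopdf", html_path, pdf_path],
    ]


def render(markdown_path: str, pdf_path: str) -> None:
    """Run both steps in order, stopping at the first failure."""
    for cmd in build_commands(markdown_path, pdf_path):
        # Fail early with a readable message if a tool is missing.
        if shutil.which(cmd[0]) is None:
            raise RuntimeError(f"{cmd[0]} is not installed or not on PATH")
        subprocess.run(cmd, check=True)
```

Calling `render("README.md", "out.pdf")` would run both tools in sequence, provided pandoc and wkhtmltopdf are installed.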


To be honest using pdflatex is just as fast as phantomjs/wkhtmltopdf.

The type checking joke is that I never compared gitit to pandoc. So don't worry about responding to any comment you think I made regarding gitit and pandoc.

[^1]: https://gitprint.com/jgm/pandoc/master/README = "Something went wrong! Couldn't process https://raw.github.com/jgm/pandoc/master/README/master/READM... = ROFLSATIRE


The Gitprint error is cute. I did not get that when I tried. I guess that is more indication that I should stick to my own tools. As for everything else, thanks for the clarification.


This is great. Would happily pay a few $$$ to be able to use on private repositories.


I haven't looked into private repos yet, but good idea: will add it to the todo list.


Private markdown files should be accessible from a tokenized raw.github.com link. So it'd be neat if https://gitprint.com/josh/secret/master/REAMDE.md?token=123 worked.
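
A small sketch of what that suggestion could look like on the service side, assuming the Gitprint path mirrors the `user/repo/branch/file` layout of raw.github.com and that the `token` query parameter can be passed through unchanged. Both are assumptions, and `raw_url_with_token` is a hypothetical helper, not something Gitprint is known to implement.

```python
from urllib.parse import urlsplit, urlunsplit


def raw_url_with_token(gitprint_url: str) -> str:
    """Map a gitprint.com URL onto raw.github.com, keeping the query string."""
    parts = urlsplit(gitprint_url)
    # Assumption: path and ?token=... carry over verbatim; only the host differs.
    return urlunsplit(("https", "raw.github.com", parts.path, parts.query, ""))


print(raw_url_with_token("https://gitprint.com/josh/secret/master/README.md?token=123"))
```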


Great service, thank you! I've already used it to print out https://gitprint.com/hueniverse/hawk



