Any Adobe Acrobat gurus here?
Sep. 7th, 2007 08:10 pm
I've been collecting free knitting patterns from websites for a while, and while most are in PDF format, many are not. PDFs are certainly easier to collect since each is one file rather than several, so I thought I'd convert the rest to PDF myself.
Unfortunately (or fortunately, maybe) I started with the ones I collected from Berroco. Since the majority of their patterns are already PDFs, I figured I'd just convert the few older ones that aren't. After I removed the site navigation and the extraneous ads for the yarn and other patterns, and otherwise cleaned up the page so it looked the way I wanted it to as a PDF, I tried using Acrobat to convert it.
This is the HTML:

This is the PDF:

Is Acrobat just that bad at converting HTML or is there something I'm missing? There's no reason I can see for the inserted line breaks in the heading or for the double spacing in the body. Is it because the containers are tables? Maybe it doesn't like external stylesheets? I don't want to invest too much time into this project but I want the PDFs to look good, and in Berroco's case, like the PDFs they put out themselves.
ETA: Well, the line spacing issue is just that: a line spacing setting in the CSS. While 1.5em works on a webpage, it needs to be 1em in the PDF. Maybe that's the issue with the lack of padding in the tables as well. Still haven't figured out the problem in that heading cell with the inserted line breaks....
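For anyone trying the same fix, here is a rough sketch of the kind of change described above. It assumes the property involved is `line-height`, and the `td p` selector is a hypothetical stand-in, since the actual rule names in Berroco's stylesheet aren't shown here.

```css
/* Hypothetical selector: the real stylesheet's rule names will differ. */
/* Assumes the spacing in question is set with line-height. */
td p {
  line-height: 1em; /* was 1.5em, which came out double spaced in the PDF */
}
```

Keeping the edited stylesheet as a separate copy used only for conversion leaves the original page styling untouched.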
ETA2: In fine hacker tradition, I have managed to strip the CSS down so it mostly works. I combined a few tags and removed some empty paragraphs. I ended up adding the padding to the paragraph tags in the instructions rather than to the section ID because Acrobat's CSS handling is borked. It's not as pretty as I'd like, and there's a bit more slash and burn than I'd hoped for before I can convert the files, but it works. Thankfully Berroco uses one stylesheet for their pattern pages (except for a few really old patterns, none of which I think I saved), so the actual CSS hacking is done. I just need to cut chunks out of the HTML and tweak it before I convert to PDF.
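The padding workaround can be sketched roughly like this. The `#instructions` ID and the `10px` value are made up for illustration; only the idea of moving the padding from the section's ID rule to the paragraphs inside it comes from the notes above.

```css
/* Hypothetical ID and values, for illustration only. */

/* Before: padding set once on the section container,
   which the conversion did not seem to honor. */
#instructions {
  padding: 10px;
}

/* After: padding set on each paragraph inside the section instead. */
#instructions p {
  padding: 10px;
}
```

The two are not exact equivalents: padding on every paragraph also adds vertical space between paragraphs, so the values may need trimming to get close to the original look.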