
Re: [OCLUG-Tech] looking for perl(?) script to convert list of URLs to PDF files

  • Subject: Re: [OCLUG-Tech] looking for perl(?) script to convert list of URLs to PDF files
  • From: "Robert P. J. Day" <rpjday [ at ] crashcourse [ dot ] ca>
  • Date: Wed, 26 Sep 2012 16:53:22 -0400 (EDT)
On Wed, 26 Sep 2012, Aidan Van Dyk wrote:

> On Wed, Sep 26, 2012 at 4:36 PM, Shawn H Corey <shawnhcorey [ at ] gmail [ dot ] com> wrote:
> > On Wed, 26 Sep 2012 15:43:26 -0400 (EDT)
> > "Robert P. J. Day" <rpjday [ at ] crashcourse [ dot ] ca> wrote:
> >
> >>   for anyone who knows perl, i'm sure this will be trivial given the
> >> right perl module.  if anyone can help me out, i would be grateful to
> >> the extent of a beer or three, or something along those lines.
> >
> > Right off the bat, I think you will need WWW::Mechanize to scrape the
> > website, and PDF::API2 to create the PDF.
> >
> > I don't think your project is going to be as simple as you think.
>
> I'd cheat:
>
>    #!/bin/bash
>    page=0
>    while read URL
>    do
>       page=$((page+1))
>       if [ "$page" ne "SKIP" ]; then
>          html2ps -o - $URL | ps2pdf - $(printf page-%03d.pdf $page)

  hmmmmm ... that has potential, but a quick test of html2ps shows
that it doesn't accurately render, say, oclug.on.ca.

  i ran (on my ubuntu system):

$ html2ps http://oclug.on.ca > oclug.ps
$ gv oclug.ps

and what i get is a *very* simplified version of the page.

  i suspect i just need to look at the html2ps options a bit more to
see what i'm missing.
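
  for the record, a minimal sketch of the WWW::Mechanize + PDF::API2
route shawn mentioned might look something like the script below.  to
be clear, the urls.txt default and the page-%03d.pdf naming are just
my guesses (chosen to mirror aidan's loop), and it only captures the
page text, one PDF page per URL, with no images or layout, which is
exactly the hard part shawn warned about:

#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;
use PDF::API2;

# assumed input: a file of URLs, one per line ("SKIP" entries are
# skipped but still counted, as in aidan's script); defaults to urls.txt
my $list = shift(@ARGV) || 'urls.txt';
open my $fh, '<', $list or die "can't open $list: $!";

my $mech    = WWW::Mechanize->new( autocheck => 1 );
my $page_no = 0;

while ( my $url = <$fh> ) {
    chomp $url;
    $page_no++;
    next if $url eq '' or $url eq 'SKIP';

    # fetch the page and reduce it to plain text (needs HTML::TreeBuilder)
    $mech->get($url);
    my $content = $mech->content( format => 'text' );

    # one letter-sized, text-only PDF page per URL
    my $pdf  = PDF::API2->new();
    my $page = $pdf->page();
    $page->mediabox('letter');

    my $text = $page->text();
    $text->font( $pdf->corefont('Helvetica'), 10 );

    # crude line-by-line placement; anything past one page is dropped
    my $y = 770;
    for my $line ( split /\n/, $content ) {
        last if $y < 40;
        $text->translate( 40, $y );
        $text->text($line);
        $y -= 12;
    }

    $pdf->saveas( sprintf 'page-%03d.pdf', $page_no );
}

close $fh;

obviously that throws away all the page layout, so it's more a
starting point than a real answer, but it shows roughly where the two
modules fit in.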

rday

-- 

========================================================================
Robert P. J. Day                                 Ottawa, Ontario, CANADA
                        http://crashcourse.ca

Twitter:                                       http://twitter.com/rpjday
LinkedIn:                               http://ca.linkedin.com/in/rpjday
========================================================================