[ale] FOP, XML, XSLT, perl, HTML, apache, PS, PDF, oh my.

Fletch fletch at phydeaux.org
Fri Aug 23 10:49:40 EDT 2002


>>>>> "Michael" == Michael Solberg <misha at solberg.music.uga.edu> writes:


[...]

    Michael> The system needs to be a CGI interface to a database that
    Michael> allows people to edit various official documents.  These
    Michael> documents then need to be instantly available to the
    Michael> users in HTML and PDF formats.


        Aiieee.  If at all possible, use mod_perl and Apache::Registry
at the least (unless you're running on a server you don't control, or
you have a requirement that CGIs run as a different user from the one
running the httpd processes).  Or look at one of the Perl-in-HTML
solutions like HTML::Mason (my personal favourite) or Template Toolkit
(which I haven't used as much for web work, but it's pretty much
replaced m4 for my generic text preprocessing needs in anything new
that I write).
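
        By way of illustration, a plain CGI.pm script like the sketch
below runs unchanged under Apache::Registry, which compiles it once
and keeps it resident in the httpd process instead of forking a fresh
perl for every hit.  The document-listing bits are made up:

    #!/usr/bin/perl -w
    # Plain CGI.pm script; Apache::Registry compiles it once and
    # reuses it across requests, so you skip the per-hit fork/exec
    # and compile cost.
    use strict;
    use CGI;

    my $q = CGI->new;
    print $q->header('text/html'),
          $q->start_html('Official documents'),
          $q->h1('Official documents'),
          $q->p('Pull the document list from the database here.'),
          $q->end_html;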


[...]

    Michael> Surely some of you guys are doing this already.  How
    Michael> would this work?  Would I have to write a cron job that
    Michael> ran FOP or Xalan every five minutes to create fresh
    Michael> PDF's?  Is there some magical apache module that would
    Michael> create the PDF's on demand?  Should I just use perl and
    Michael> ghostscript?  Is this XML stuff processor intensive?  Do
    Michael> I need a chunk of hardware for it?  Is this stuff all
    Michael> it's cracked up to be or does it just help pad resumes
    Michael> and increase IPO's?


        If your data isn't changing that frequently, you could
probably get away with periodically rebuilding documents.  Given the
right dependencies you might even get it all down to just one
Makefile. :)  Judicious caching of generated output and dependencies
would cut down how much you'd have to remake from scratch (see the
Cache::Cache family of modules, and the sketch below).
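
        Roughly what that looks like with Cache::FileCache (one of
the Cache::Cache implementations): cache each rendered PDF for five
minutes and regenerate on a miss.  The render_pdf() routine is
hypothetical, standing in for whatever FOP pipeline you end up with:

    use strict;
    use Cache::FileCache;

    # Rendered PDFs live for five minutes, then get rebuilt on the
    # next request for them.
    my $cache = Cache::FileCache->new({
        namespace          => 'rendered-pdfs',
        default_expires_in => 300,
    });

    sub pdf_for {
        my ($doc_id) = @_;
        my $pdf = $cache->get($doc_id);
        unless (defined $pdf) {
            $pdf = render_pdf($doc_id);   # hypothetical FOP wrapper
            $cache->set($doc_id, $pdf);
        }
        return $pdf;
    }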


        As for `to XML or not to XML', consider whether you'll ever
need to use the data you'd be keeping in XML for more than one
purpose.  A database may be better suited to holding your raw data,
and you can always add export-to-XML functionality if you ever need
it.  XML parsing will add some overhead in both time and space; the
exact amounts depend on which tools you use to do your XML
processing.
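
        If you do go the database route, the export is maybe a dozen
lines with DBI and XML::Writer.  A rough sketch; the table and column
names are invented, and XML::Writer writes on stdout by default:

    use strict;
    use DBI;
    use XML::Writer;

    # Dump a (hypothetical) documents table as XML.
    my $dbh = DBI->connect('dbi:Pg:dbname=docs', 'user', 'pass',
                           { RaiseError => 1 });
    my $w   = XML::Writer->new(DATA_MODE => 1, DATA_INDENT => 2);

    $w->startTag('documents');
    my $sth = $dbh->prepare('SELECT id, title, body FROM documents');
    $sth->execute;
    while (my $row = $sth->fetchrow_hashref) {
        $w->startTag('document', id => $row->{id});
        $w->dataElement('title', $row->{title});
        $w->dataElement('body',  $row->{body});
        $w->endTag('document');
    }
    $w->endTag('documents');
    $w->end;
    $dbh->disconnect;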


        Another output path you might consider is generating your
PDFs from LaTeX (produced from XML or straight from your database).
From the same LaTeX source you can generate very similar-looking
output in both PDF and HTML formats.
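
        Something like the following covers that last step, assuming
pdflatex and latex2html are on the box; the publish() wrapper and the
document name are just placeholders:

    use strict;

    # Render $base.tex to PDF with pdflatex and to a single HTML
    # page with latex2html, both from the same source file.
    sub publish {
        my ($base) = @_;
        system('pdflatex', '-interaction=batchmode', "$base.tex") == 0
            or die "pdflatex failed on $base.tex: $?";
        system('latex2html', '-split', '0', "$base.tex") == 0
            or die "latex2html failed on $base.tex: $?";
    }

    publish('annual-report');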


-- 
Fletch                | "If you find my answers frightening,       __`'/|
fletch at phydeaux.org   |  Vincent, you should cease askin'          \ o.O'
770 933-0600 x211(w)  |  scary questions." -- Jules                =(___)=
770 294-0820 (m)      |                                               U
