Creating nicely formatted HTML and PDF documents

Posted by Steve on Wed 31 Aug 2005 at 01:12

There are many ways to produce HOWTOs and other organised documents, but one of the simplest to get started with is the linuxdoc tools, which let you create clean HTML, LaTeX, and PDF output from the same input document.

Most of the documents you will find in The Linux Documentation Project are created with these tools. The tools accept SGML files as input and generate output in various forms including:

  • HTML
  • PDF
  • LaTeX

Getting started with these tools is very straightforward, and the output is both simple to read and something that readers should be used to.

To get started and create HTML output files you will need to install the linuxdoc-tools package:

root@lappy:~# apt-get install linuxdoc-tools
Reading Package Lists... Done
Building Dependency Tree... Done
The following extra packages will be installed:
  libsp1 sp
The following NEW packages will be installed:
  libsp1 linuxdoc-tools sp

For creating PDF output you will also need to install the linuxdoc-tools-latex package:

root@lappy:~# apt-get install linuxdoc-tools-latex 
Reading Package Lists... Done
Building Dependency Tree... Done
The following extra packages will be installed:
  tetex-extra
The following NEW packages will be installed:
  linuxdoc-tools-latex tetex-extra

Once you have installed your chosen packages you're ready to begin. To get started we will look at a basic input file which demonstrates several aspects of creating documents.

The following input file can be used to get started. Save it as intro.sgml:

<!doctype linuxdoc system>
<article>
<title>This is my title.
<author>Steve Kemp
<date>V0.1  2005-08-31
<!-- Primary category: 2.6. {Security} -->
<!-- Keywords: Basic, Introduction -->
<!-- Oneliner: My tagline -->

<abstract>
<nidx>Howto</nidx>
<p>This document exists as a simple sample of the LinuxDoc SGML format.</p>

<!-- Table of contents -->
<toc>


<!-- =========================== -->
<sect>Section Title
<p>Here we have a section.</p>

<p>
 <itemize>
  <item>With a little list.</item>
  <item>And another</item>
 </itemize>
</p>

<sect1>Subsection Title
<p>Where more content could be included.</p>


<sect>Second Section Title
<p>Here we might have more text.</p>
<p>Link to more details: <url url="http://tldp.org/HOWTO/Howtos-with-LinuxDoc-5.html#ss5.2" name="Docbook markup HOWTO"></p>

</article>

This file has several key points:

  • It contains the <toc> entry to generate a table of contents automatically.
  • It contains two sections, denoted with <sect>.
  • It uses a sub-section, denoted with <sect1>.
  • It also uses a couple of markup items, <itemize> and <url>, for creating lists and URLs.
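As a quick way to see that structure, the small sketch below prints every line that opens a <sect> or <sect1>, together with its line number. The function name list_sections is only an illustration and is not part of linuxdoc-tools; it assumes each section tag starts at the beginning of a line, as in the sample above.

```shell
#!/bin/sh
# list_sections FILE: print each line that opens a <sect> or <sect1>,
# prefixed with its line number. Hypothetical helper, not a linuxdoc tool.
list_sections() {
    grep -n -E '^<sect1?>' "$1"
}

# Only run against the sample if it exists in the current directory.
if [ -f intro.sgml ]; then
    list_sections intro.sgml
fi
```

This is nothing more than a grep, so it does not validate the document; it only gives a rough outline to compare against the generated table of contents.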

Once you've created your input SGML file you can use it to generate a nice collection of HTML files with the following command:

sgml2html --split=1 intro.sgml

This will split up each section into its own HTML file.

When it comes to PDF generation you can use:

linuxdoc -B latex --output=pdf intro.sgml

In this case the output will be a single PDF file.

If you wish, you can customize the output by adjusting the system-wide templates. For example, when producing PDF output I like each new section to begin on a new page. This can be accomplished by editing the file /usr/share/linuxdoc-tools/dist/linuxdoc-tools/latex2e/mapping and adding the following to the entry marked <sect>:

\n\\newpage

This will leave you with:

          +       "\n\\newpage\n\\section"
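If you would rather script the change than edit by hand, something like the following sed call might do it. This is only a sketch: it assumes the line in question contains the literal text "\n\\section" (quotes included), which I have inferred from the result shown above, and the function name add_newpage is made up for illustration. It is demonstrated here on a scratch file holding one sample line, not on the system-wide mapping file.

```shell
#!/bin/sh
# add_newpage FILE: print FILE with "\n\\newpage" inserted ahead of
# "\n\\section". Assumes the mapping line holds that exact literal text.
add_newpage() {
    sed 's/"\\n\\\\section"/"\\n\\\\newpage\\n\\\\section"/' "$1"
}

# Demonstrate on a scratch file rather than the real mapping file.
sample=$(mktemp)
printf '%s\n' '<sect>      +       "\n\\section"' > "$sample"
add_newpage "$sample"
rm -f "$sample"
```

Because the pattern includes the opening quote, running it a second time over already-modified text changes nothing. Keep a backup of the original mapping file before replacing it with the output.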

Once you make your change you can rebuild the PDF output file as before. If you wish you can use the supplied Makefile to rebuild all output whenever the input file changes by simply invoking:

make
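The Makefile itself is not reproduced here. If you do not have it to hand, a small shell script can approximate the same idea by rebuilding an output only when intro.sgml is newer than it. This is a rough sketch, not the supplied Makefile: the needs_rebuild helper is hypothetical, and the output names intro.html and intro.pdf are my assumption about what the two commands above write.

```shell
#!/bin/bash
# needs_rebuild SOURCE TARGET: succeed if TARGET is missing or older
# than SOURCE. Hypothetical helper standing in for make's dependency check.
needs_rebuild() {
    [ ! -e "$2" ] || [ "$1" -nt "$2" ]
}

SRC=intro.sgml

# Do nothing unless the source file and both tools are available.
if [ -f "$SRC" ] && command -v sgml2html >/dev/null 2>&1 \
        && command -v linuxdoc >/dev/null 2>&1; then
    # Assumed output name: intro.html (the top-level HTML page).
    if needs_rebuild "$SRC" intro.html; then
        sgml2html --split=1 "$SRC"
    fi
    # Assumed output name: intro.pdf.
    if needs_rebuild "$SRC" intro.pdf; then
        linuxdoc -B latex --output=pdf "$SRC"
    fi
fi
```

Save it next to intro.sgml and run it after each edit; when nothing has changed it does no work.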

For more details on creating SGML input files and the markup available, please consult the Howtos with LinuxDoc Mini-HOWTO.

 

 


Re: Creating nicely formatted HTML and PDF documents
Posted by Anonymous (201.9.xx.xx) on Wed 31 Aug 2005 at 02:52
Please, this is not XML, but SGML. XML mandates that every tag must be closed.

Re: Creating nicely formatted HTML and PDF documents
Posted by Steve (82.41.xx.xx) on Wed 31 Aug 2005 at 04:42
Slip of the fingers. Fixed now, thanks.

Steve
-- Steve.org.uk

Re: Creating nicely formatted HTML and PDF documents
Posted by chme (83.227.xx.xx) on Fri 2 Sep 2005 at 07:50
I would have liked to see some kind of (short) discussion about the advantages of SGML over LaTeX, if there are any.

Can't LaTeX produce the same document just as easily, be turned into more file formats, and be a better way to start, since in the end you will be able to do more advanced documents with it as well?

Maybe you could change the title to "Creating nicely formatted HTML and PDF documents with SGML", and comment briefly on a few alternatives in the beginning, with some links?
