lpml alpha: Weave: Write documentation pages

Weave: Write documentation pages

[ Previous: Tangle: write code output ] [ Top: LPML alpha ] [ Next: Prepare indices ]

The weave step is the other main function of traditional literate programming: it takes the items as defined and creates the documentation. Since the documentation in this case is a set of pages rather than a single file, the philosophy is a little different. For each main item, we write a separate documentation page; subitems stay on the same page as their respective parents, under their own headings.

The formats used to generate pages are, of course, those read in on the initial scan into the format hash. If no format is specified for an item, the format 'default' is used; if a format is named but not defined, then the body of the item is simply printed (i.e. an undefined format is effectively equal to '[##body##]'). This sounds useless but it comes in handy for embedding fixed pages (such as index.html) directly into the XML document.
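
To make the fallback rule concrete, here is a minimal sketch. It assumes a %format hash shaped like the one the initial scan fills in; the helper name template_for and the sample template are hypothetical, not part of LPML itself.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the format hash filled by the initial scan.
my %format = (
    default => "<h1>[##label##]</h1>\n[##body##]\n",
);

# Pick the template text for an item's format attribute:
#   no format given     -> the 'default' format
#   named but undefined -> a template that is nothing but the body
sub template_for {
    my ($formatname) = @_;
    $formatname = 'default' unless defined $formatname && $formatname ne '';
    return exists $format{$formatname} ? $format{$formatname} : '[##body##]';
}

print template_for('rawpage'), "\n";
```

With this rule, an item holding a complete fixed page only needs to name a format that nobody defined, and its body passes through untouched.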

The file <project>_format.html is used to generate each of these pages; a more elaborate formatting control mechanism would be nice later, but this will do the trick for now. Code pieces are written enclosed in <pre> tags (inside a small table that sets them off from the text), and each <insert> tag is replaced with a link showing the label, not the name, of the item it refers to.

The overall code for weave looks a lot like the scan code, because it's doing very similar things. It scans lines of input, using global variables to mark its current state.

Setup is straightforward: we rewind the INPUT filehandle, load the template into a list (because we'll be scanning it once for each item page) and set some globals to blank values.

seek INPUT, 0, 0;

$name = '';
$item = 0;
$piece = 0;

while (<INPUT>) {

So let's get down to business scanning. The thing weave is interested in is items, so let's look for <item> tags first. For each item we encounter, we'll set up replacement tags as follows:

[##next##] and [##prev##]
These contain the URLs of the next and previous items in the list, respectively. The first item has a [##prev##] pointing to the index, and the last has a [##next##] pointing to the index.
[##nextlabel##] and [##prevlabel##]
The labels of the next and previous items, for nice link formatting.
[##body##]
The body of the item is its documentation and properly formatted code. In other words, most of the work of weave is to build up the [##body##] tag.
[##name##], [##url##] and [##label##]
The name, URL and label of the current item.

Eventually we'll want change dates as well, but we're not tracking them yet, so there's not much point. All in good time.
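
The neighbour lookup behind the next and prev tags can be sketched in isolation. This is a simplified illustration, assuming @items, %url and %label are shaped the way the code below uses them; the sample data and the helper nav_tags are hypothetical. It mirrors the wrap-around in the code below (position 0 looks back to the last entry, the last entry looks forward to entry 0) and leaves out the extra step that swaps a child item for its parent.

```perl
use strict;
use warnings;

# Hypothetical data in the shape the scan step is assumed to leave behind.
my @items = ('index', 'scan', 'weave');
my %url   = (index => 'index.html', scan => 'scan.html', weave => 'weave.html');
my %label = (index => 'Index', scan => 'Scan input', weave => 'Weave pages');

# Build the four navigation tags for the item at position $pos in @items.
sub nav_tags {
    my ($pos) = @_;
    my $count = scalar @items;
    my $prev  = $items[ ($pos - 1) % $count ];   # position 0 wraps to the last item
    my $next  = $items[ ($pos + 1) % $count ];   # the last position wraps to item 0
    return (
        prev => $url{$prev}, prevlabel => $label{$prev},
        next => $url{$next}, nextlabel => $label{$next},
    );
}

my %nav = nav_tags(2);
print "prev: $nav{prevlabel}, next: $nav{nextlabel}\n";
```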

   if (/(<item .*>)/i)
   {
      $tag = $1;
      $tag =~ s/^<item\s+//i;
      $attr = "";
      %thisitem = (name => '', label => '', pattern => '', language => '', format => 'default');
      foreach $part (split /"/, $tag) {
         if ($attr eq '') {
            $attr = $part;
            $attr =~ s/^\s*//;
            $attr =~ s/\s*=\s*$//;
         } else {
            $thisitem{$attr} = $part;
            $attr = '';
         }
      }

      $name = $thisitem{name};

      if ($parent{$name} eq '') {
         # Only a top-level item chooses the format; children inherit it.
         $formatname = $thisitem{format};

         $tags{name} = $name;
         $tags{url} = $url{$name};
         $tags{label} = $label{$name};

         if ($item == 0) {
            $i = $items[$#items];
         } else {
            $i = $items[$item-1];
         }
         if ($parent{$i} ne '') { $i = $parent{$i}; }
         $tags{prev} = $url{$i};
         $tags{prevlabel} = $label{$i};

         $tags{body} = '';
      } else {
         $n = $name;
         $n =~ s/^.*?\.//;
         $tags{body} .= "<br><br>\n";
         $tags{body} .= "<a name=\"$n\"></a>\n";
         $tags{body} .= "<i>$label{$name}</i><br>\n";
      }

      if ($item == $#items) {
         $tags{next} = $url{$items[0]};
         $tags{nextlabel} = $label{$items[0]}; 
      } else {
         $tags{next} = $url{$items[$item+1]};
         $tags{nextlabel} = $label{$items[$item+1]};
      }

      next;
   }
   next if $name eq '';
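
The attribute parsing above relies on one trick: splitting the tag on double quotes yields alternating name= and value pieces. Here is that trick pulled out into a standalone sketch. The helper parse_attrs is hypothetical (the code above does the same work inline), and like the original it assumes every value is double-quoted and contains no quote characters.

```perl
use strict;
use warnings;

# Parse name="value" pairs out of a tag such as
#   <item name="weave" label="Weave pages">
# %defaults supplies values for attributes the tag leaves out.
sub parse_attrs {
    my ($line, $tagname, %defaults) = @_;
    my %attrs = %defaults;
    return %attrs unless $line =~ /<\Q$tagname\E\s+(.*)>/i;

    my $attr = '';
    # Odd pieces look like ' label=', even pieces are the values.
    foreach my $part (split /"/, $1) {
        if ($attr eq '') {
            $attr = $part;
            $attr =~ s/^\s*//;        # drop leading whitespace
            $attr =~ s/\s*=\s*$//;    # drop the trailing equals sign
        } else {
            $attrs{$attr} = $part;
            $attr = '';
        }
    }
    return %attrs;
}

my %item = parse_attrs('<item name="weave" label="Weave pages">',
                       'item', format => 'default');
print "$item{name}: $item{label} ($item{format})\n";
```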

And of course we terminate the item similarly to the scanner. The difference is that we also write out the current item's page using the various tags we've accumulated during scanning. Note that one of the things we're doing while writing the file is to convert HTML-like XML tags into bona fide HTML; eventually this sort of thing should be handled in some formatting module, but in the meantime I'm intent on HTML output.

Things are only written when either the current item has no children (and no parent), or the current item is the last child of its parent. If the current item is a child, then its body has been written onto the pre-existing body of its parent, and other tags (like next and prev) have already been set during processing of the parent. And the format to be used is also already set by the parent. (Corollary: if you set a format in a child item, nothing will happen.)
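
The rule for when a page gets written can be stated as a small function. This is a sketch under assumptions: %parent, %children and %lastchild are taken to be shaped the way the code below uses them, the sample data is invented, and the helper page_to_write is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical tables, shaped like the ones the scan step is assumed to build.
my %parent    = ('weave.setup' => 'weave', 'weave.items' => 'weave');
my %children  = (weave => 2, tangle => 0);
my %lastchild = (weave => 'weave.items');

# Given the item whose closing tag we just saw, return the name of the
# item whose page should be written now, or undef if nothing is due yet.
sub page_to_write {
    my ($name) = @_;
    my $parent = $parent{$name} // '';
    if ($parent eq '') {
        my $kids = $children{$name} // 0;
        return $kids == 0 ? $name : undef;   # childless top-level item
    }
    my $last = $lastchild{$parent} // '';
    return $last eq $name ? $parent : undef; # last child closes the parent page
}

foreach my $name ('tangle', 'weave', 'weave.setup', 'weave.items') {
    my $page = page_to_write($name);
    print "$name: ", (defined $page ? "write page for $page" : "nothing yet"), "\n";
}
```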

   if (/(<\/item\s*>)/i) {
      if (($parent{$name} eq '' && $children{$name} == 0) || $lastchild{$parent{$name}} eq $name)
      {
         if ($parent{$name} eq '') {
            open OUT, ">$url{$name}" or die "can't write $url{$name}: $!";
         } else {
            open OUT, ">$url{$parent{$name}}" or die "can't write $url{$parent{$name}}: $!";
         }
         foreach $line (split /\n/, $format{$formatname}) {

            $_ = $line . "\n";
            while (/\[##(.*?)##\]/) {
               $tag=$1;
               s/\[##$tag##\]/$tags{$tag}/e;
            }

            s(</li>)()g;
            s(<p/>)(<p></p>)g;
            s(<br/>)(<br>)g;
            s(<hr(.*?)/>)(<hr$1>)g;
            s(<nbsp/>)(&nbsp;)g;
            s(<li><b>)(<b><li>)g;

            s([##)([##)g;
            s(##])(##])g;

            print OUT;
         }
         close OUT;
      }

      $name = '';
      $item++;

      next;
   }
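
The template application step can also be tried on its own. The sketch below is not the code above: instead of the while loop, which rescans each line until no tags remain, it makes a single pass with s///ge, so text that arrives through a tag is never expanded a second time. The helper apply_template and the sample template are hypothetical, and only three of the XML-to-HTML conversions are shown.

```perl
use strict;
use warnings;

# Fill [##tag##] placeholders in a template and apply a few of the
# HTML-like-XML to HTML conversions.
sub apply_template {
    my ($template, %tags) = @_;
    my $out = '';
    foreach my $line (split /\n/, $template) {
        # Single pass: unknown tags expand to the empty string.
        $line =~ s/\[##(.*?)##\]/defined $tags{$1} ? $tags{$1} : ''/ge;

        $line =~ s{<br/>}{<br>}g;
        $line =~ s{<hr(.*?)/>}{<hr$1>}g;
        $line =~ s{<nbsp/>}{&nbsp;}g;

        $out .= "$line\n";
    }
    return $out;
}

print apply_template("<h1>[##label##]</h1><br/>\n[##body##]\n",
                     label => 'Weave',
                     body  => 'Documentation goes here.');
```

The single pass trades away the ability to nest tags inside tag values; whether that matters depends on how the formats are written.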

While in an <item>, we scan for pieces, just like scan. Text inside pieces is formatted differently: it goes inside <pre>, so spacing and linebreaks stay intact.

   if (/(<piece.*>)/i)
   {
      $tag = $1;
      $tag =~ s/^<piece\s*//i;
      $attr = "";
      %thispiece = ('add-to' => '', language => '');   # quoted: a bare add-to would parse as a subtraction
      foreach $part (split /"/, $tag) {
         if ($attr eq '') {
            $attr = $part;
            $attr =~ s/^\s*//;
            $attr =~ s/\s*=\s*$//;
         } else {
            $thispiece{$attr} = $part;
            $attr = '';
         }
      }

      $tags{body} .= "<table width=100%>\n";
      $tags{body} .= "<tr><td width=30 bgcolor=eeeeee>&nbsp;</td><td width=100%>\n";
      if ($thispiece{'add-to'} ne '') {
         $tags{body} .= "<i>Add the following to \"$label{$thispiece{'add-to'}}\"</i><br>\n";
      }
      $tags{body} .= "<pre>";
      $piece = 1;
      next;
   }
   if (/<\/piece\s*>/i) {
      $tags{body} .= "</pre>";
      $tags{body} .= "</td></tr></table>\n";
      $piece = 0;
      next;
   }
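
To see what the two branches above add up to, here is a sketch that builds the whole wrapper for one piece in a single call. The helper piece_html is hypothetical, and it assumes the code text has already been escaped for HTML.

```perl
use strict;
use warnings;

# Wrap already-escaped code text the way the piece handling above does:
# a one-row table with a narrow gray cell, then the code inside <pre>.
sub piece_html {
    my ($code, $addto_label) = @_;
    my $html = "<table width=100%>\n"
             . "<tr><td width=30 bgcolor=eeeeee>&nbsp;</td><td width=100%>\n";
    if (defined $addto_label && $addto_label ne '') {
        $html .= "<i>Add the following to \"$addto_label\"</i><br>\n";
    }
    $html .= "<pre>$code</pre></td></tr></table>\n";
    return $html;
}

print piece_html("\$item = 0;\n", 'Setup');
```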

Almost done. The only remaining thing is to format <insert> tags.

   if (/(<insert\s.*>)/i)
   {
      $tag = $1;
      $before = $`;
      $after = $';

      $tag =~ s/^<insert\s+//i;
      $attr = "";
      %thisinsert = (name => '');
      foreach $part (split /"/, $tag) {
         if ($attr eq '') {
            $attr = $part;
            $attr =~ s/^\s*//;
            $attr =~ s/\s*=\s*$//;
         } else {
            $thisinsert{$attr} = $part;
            $attr = '';
         }
      }
      if ($thisinsert{name} =~ /^\./) {
         $thisinsert{name} = $name . $thisinsert{name};
      }

      $tags{body} .= $before . "<i>See <a href=\"$url{$thisinsert{name}}\">";
      $tags{body} .= "$label{$thisinsert{name}}</a></i>$after";
      next;
   }
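
The name resolution for <insert> is worth a quick illustration: a name starting with a dot is relative to the current item. This is a sketch only; the helper insert_link, the sample names and the URL given to the child item are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical lookup tables in the shape the code above expects.
my %url   = ('weave.setup' => 'weave.html#setup', tangle => 'tangle.html');
my %label = ('weave.setup' => 'Setup', tangle => 'Tangle: write code output');

# Turn the name attribute of an <insert> into the link text that goes
# into the body. $current is the name of the item being processed.
sub insert_link {
    my ($target, $current) = @_;
    # A leading dot means "a child of the item we are in".
    $target = $current . $target if $target =~ /^\./;
    return "<i>See <a href=\"$url{$target}\">$label{$target}</a></i>";
}

print insert_link('.setup', 'weave'), "\n";
print insert_link('tangle', 'weave'), "\n";
```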

And finally, any line which doesn't contain an <insert> and which is actually within an <item> simply gets tacked onto the current item's body after we make sure all the characters are going to display correctly. Note especially the attention to detail with the [## delimiter: if we didn't replace it with a version containing an HTML entity (&#35; in place of the first #), then we'd have a problem with recursion once we get into the template application code. Which, of course, I realized only after having a problem with recursion once I got into the template application code.

   s/&/&amp;/g if $piece;
   s/\[\[/&lt;/g if $piece;
   s/\[\#\#/[&#35;#/g;
   s/#\^7/&amp;/g if $piece;
   s/#\^lt#/&lt;/g if $piece;

   s([##)([##)g if $piece;
   s(##])(##])g if $piece;

   $tags{body} .= $_ if $name ne '';
}
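
The effect of the escaping is easiest to see on a sample line. The sketch below covers three of the substitutions above (the ampersand, the [[ stand-in for a less-than sign, and the [## delimiter); the helper escape_line is hypothetical, and the #^7 and #^lt# escapes are left out for brevity.

```perl
use strict;
use warnings;

# Escape one line of item text before it is appended to the body.
# $in_piece is true while we are inside a <piece>.
sub escape_line {
    my ($line, $in_piece) = @_;
    if ($in_piece) {
        $line =~ s/&/&amp;/g;     # first, so entities added below are not re-escaped
        $line =~ s/\[\[/&lt;/g;   # [[ stands in for < inside code
    }
    $line =~ s/\[##/[&#35;#/g;    # defuse the template delimiter
    return $line;
}

my $safe = escape_line('print "[##body##]" if $a && $b;', 1);
print "$safe\n";
# The template pattern no longer matches, so there is nothing to recurse on.
print "still a tag\n" if $safe =~ /\[##(.*?)##\]/;
```

A browser renders &#35; as #, so the reader sees the original text while the template code sees nothing it recognizes as a tag.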

This code and documentation are released under the terms of the GNU license. They are copyright (c) 2000-2006, Vivtek. All rights reserved except those explicitly granted under the terms of the GNU license.