@items = (); @objects = (); @formats = (); $name = ''; $piecename = ''; $parentname = ''; $formatname = ''; |
while (<INPUT>) { See Looking for tags if ($piecename ne '') { $pieces{$piecename} .= $_; } if ($formatname ne '') { $format{$formatname} .= $_; } } |
<object>
, <item>
, and
<piece>
. Oh, and I've just added <format>
.
One of the things the tag handlers do is to set and unset various state markers.
For instance, I terminate the <item>
tag by setting the $name
global to blank. I also set the $piecename
global to blank in case the user
forgot to terminate the current piece. I know that violates the principles of XML tokenization,
but again, later we'll get into real XML tokenization and I don't want to mess with it yet.
if (/(<object .*>)/i) { See Handle object tag } if (/(<item .*>)/i) { See Handling item tags next; } if (/(<\/item\s*>)/i) { if ($name !~ /\./) { $parentname = $name; } $name = ''; $piecename = ''; next; } if (/(<piece.*>)/i) { next if $name eq ''; # Pieces are silent outside of items. See Handle piece tag within item next; } if (/(<\/piece\s*>)/i) { $piecename = ''; next; } if (/(<format.*>)/i) { See Handle format tag next; } if (/(<\/format\s*>)/i) { $formatname = ''; next; } |
<object>
tag.
Note that this is assuming that the tag will be the only thing on the line. I don't want to
get into real tokenizing of the XML input, because that will be the province of the QDMT, which
is my next four-letter vowelless acronym. The next version of lpml will use the QDMT to tokenize
its input.
At any rate, if the object tag is encountered, I read its attributes into the
$thistag
hash. The other handlers reuse this hash.
$tag = $1; $tag =~ s/^<object\s+//i; $attr = ""; %thistag = (name => '', language => '', item => ''); foreach $piece (split /"/, $tag) { if ($attr eq '') { $attr = $piece; $attr =~ s/^\s*//; $attr =~ s/\s*=\s*$//; } else { $thistag{$attr} = $piece; $attr = ''; } } |
if ($thistag{name} eq '') { print STDERR "$. : Nameless object encountered.\n"; next; } if ($thistag{item} eq '') { print STDERR "$. : Object '$thistag{item}' has no starting item.\n"; next; } |
@objects = (@objects, $thistag{name}); $starter{$thistag{name}} = $thistag{item}; |
$tag = $1; $tag =~ s/^<item\s+//i; $attr = ""; %thistag = (name => '', label => '', pattern => '', language => '', format => 'default'); foreach $piece (split /"/, $tag) { if ($attr eq '') { $attr = $piece; $attr =~ s/^\s*//; $attr =~ s/\s*=\s*$//; } else { $thistag{$attr} = $piece; $attr = ''; } } if ($thistag{name} eq '') { print STDERR "$. : Nameless item encountered.\n"; next; } $name = $thistag{name}; $lastchild{$name} = $name; $children{$name} = 0; if ($name !~ /\./) { $parentname = ''; $parent{$name} = ''; } else { $parentname = $name; $parentname =~ s/\..*?$//; $parent{$name} = $parentname; $lastchild{$parentname} = $name; $children{$parentname} += 1; } push @items, $name; if (defined $label{$name}) { print STDERR "$. : Duplicate item name '$name'.\n"; } if ($thistag{label} eq '') { $thistag{label} = $name; } $label{$name} = $thistag{label}; if ($parentname eq '') { $url{$name} = "$name.html"; } else { $n = $name; $n =~ s/^.*?\.//; $url{$name} = $url{$parentname} . '#' . $n; } |
<piece>
tag, which is pretty analogous to
<item>
.
$tag = $1; $tag =~ s/^<piece\s*//i; $attr = ""; %thistag = (add-to => '', language => ''); foreach $piece (split /"/, $tag) { if ($attr eq '') { $attr = $piece; $attr =~ s/^\s*//; $attr =~ s/\s*=\s*$//; } else { $thistag{$attr} = $piece; $attr = ''; } } $piecename = $name; $piecename = $thistag{'add-to'} if $thistag{'add-to'} ne ''; |
<format>
tag simply takes its content and stashes it into a hash, just
like pieces. The only attribute we care about in a format tag is its name.
$tag = $1; $tag =~ s/^<format\s*//i; $attr = ""; %thistag = (name => ''); foreach $piece (split /"/, $tag) { if ($attr eq '') { $attr = $piece; $attr =~ s/^\s*//; $attr =~ s/\s*=\s*$//; } else { $thistag{$attr} = $piece; $attr = ''; } } if ($thistag{name} eq '') { print STDERR "$. : Nameless format encountered.\n"; next; } $formatname = $thistag{name}; push @formats, $formatname; |
This code and documentation are released under the terms of the GNU license. They are additionally copyright (c) 2000, Vivtek. All rights reserved except those explicitly granted under the terms of the GNU license. |