One of the neat little things I did over the past few days was a simple Word macro -- at least, it should have been simple, but the problem is one I've had for a long time.
In this case, what I wanted to do was to fix up a few documents I had from a translation customer. This particular end user, for reasons known only to them, captions their figures using fields. The fields are in text boxes for easy positioning, and the field results (the text you see on the screen) are the captions.
Only one problem: the fields are always variable results for variables which don't exist in the document. All I can figure is that the document preparer makes these things in little snippets with some other tool which spits out Word texts, then they paste those into the text boxes.
So, you're asking now (unless you're a professional translator) who cares? You just type your English over the German in the captions, and you're home free, right? Well: no. Everybody who's anybody in the wonderful world of translation nowadays uses translation tools, in this case TRADOS.
TRADOS does two things for you: it stores each and every sentence you translate in a translation memory (a TM), so you (sort of) never need to translate anything twice, and it makes it much easier to step through a document as you translate. It also helps you stay consistent in your use of words and phrases.
Herein lies the problem: those fields were untouchable by TRADOS. There are two modes in TRADOS: one steps through the document using Word macros but doesn't deal well with text boxes (and yes, you'll note they're in text boxes). So that approach was out. The other (the TagEditor) converts the entire document to an XML format, then edits that in a very convenient way. The TagEditor makes short work of text boxes, but those field results were invisible to it.
Stuck! And so for a series of three jobs from that customer, I just didn't use TRADOS on the figure attachments, and hated it. Last week, though, I took screwdriver in hand (metaphorically speaking) and decided it was showdown time.
So as it turns out, the reason TRADOS Word macros don't deal well with text boxes is that, to put it gently, Microsoft Word's handling of text boxes is stupid.
The main document text you see before you turns out to be only one of an arbitrary number of stories in the document. And yes, each text box is itself a story. So any macro which wants to look at all the text in a document must walk through all the stories. Word's own word counter won't do that, by the way. If your text is in text boxes, Word won't count it. (Which is why most translators use some other tool to count words.)
Here is how Microsoft's documentation defines stories:
A document area that contains a range of text distinct from other areas of text in a document. For example, if a document includes body text, footnotes, and headers, it contains a main text story, footnotes story, and headers story.
There are 11 different types of stories that can be part of a document, corresponding to the following WdStoryType constants: wdCommentsStory, wdEndnotesStory, wdEvenPagesFooterStory, wdEvenPagesHeaderStory, wdFirstPageFooterStory, wdFirstPageHeaderStory, wdFootnotesStory, wdMainTextStory, wdPrimaryFooterStory, wdPrimaryHeaderStory, and wdTextFrameStory. The StoryRanges collection contains the first story for each story type available in a document. Use the NextStoryRange method to return subsequent stories.
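To make that concrete, here is a minimal sketch (not the macro this post is about) that walks every story and tallies stories and words per story type. The Sub name and the variable names are made up for illustration. It assumes Range.ComputeStatistics with wdStatisticWords gives a usable per-story word count, and it prints to the VBA Immediate window.

```vba
' Sketch: tally stories and words for each story type in the active document.
' CountStories, firstStory, story, nStories and nWords are illustrative names.
Sub CountStories()
    Dim firstStory As Range   ' first story of a given type
    Dim story As Range        ' current story within that type
    Dim nStories As Long
    Dim nWords As Long

    ' StoryRanges yields only the first story of each type present.
    For Each firstStory In ActiveDocument.StoryRanges
        nStories = 0
        nWords = 0
        Set story = firstStory
        ' NextStoryRange returns Nothing after the last story of this type.
        Do While Not story Is Nothing
            nStories = nStories + 1
            nWords = nWords + story.ComputeStatistics(wdStatisticWords)
            Set story = story.NextStoryRange
        Loop
        Debug.Print "Story type " & firstStory.StoryType & ": " & _
                    nStories & " stories, " & nWords & " words"
    Next firstStory
End Sub
```

The story type prints as its numeric WdStoryType value. A separate "story" variable is used for the inner walk so that "firstStory" still points at something when the line is printed.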
So all I have to do is to loop through the stories, find all the fields in each, select each field, and replace it with its own contents. Then I can run TRADOS on the resulting document and translate my little heart out.
Only one problem (ha!): Word won't loop through the stories directly. Iterating over StoryRanges only gives you the first story of each type in the document; to reach the other stories of that type you have to follow NextStoryRange from there. Ugh!
Here's my code:
    For Each sr In ActiveDocument.StoryRanges
        If sr.StoryType = wdTextFrameStory Then
            done = False
            While Not done
                ct = sr.Fields.Count
                For i = 1 To ct
                    Set f = sr.Fields(1)
                    t = f.Result.Text
                    f.Select
                    Selection.Text = t
                Next i

                Set sr = sr.NextStoryRange
                If sr Is Nothing Then done = True
            Wend
        End If
    Next sr
Note a few things:
- Always use "Set" when setting object references! Otherwise (as I always rediscover, since I use Visual Basic once every two years) VB will helpfully read the object's default property for you (for a Range, its text) and hand you a string. Which, of course, will be useless.
- The "sr" object is the story range, which covers the full text of the current story. For this particular task, I didn't want to mess with headers, footers, etc., so I'm skipping anything that's not a text box (a "text frame story" in object-model parlance).
- The payload starts at "ct = sr.Fields.Count". That's where we look at the fields in the text box. For each one, we grab its result with "f.Result.Text", then select the field with "f.Select", then replace it.
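- Replacing a field removes it from the range's Fields collection, so after each replacement the next field is still field number 1 in the range! That's why we can't do a For Each loop here, and why the loop always asks for sr.Fields(1) rather than sr.Fields(i).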
- Note the double loop: we have a For Each sr around the top, then a While Not done with the test at the end. The For Each iterates through the story range types (by moving from the first of each type to the first of the next type). The "Set sr = sr.NextStoryRange" is responsible for finding all the instances of a given story type while in that loop.
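A possible variation, offered as an untested sketch and not the macro described above: Word's Fields collection has an Unlink method that replaces each field with its most recent result. That would avoid both the Selection and the field-number-1 bookkeeping. Whether it behaves well on these particular fields (results for variables that don't exist in the document) is an assumption, so try it on a copy first. The Sub name and the firstStory/story variables are made up for illustration.

```vba
' Sketch: turn every field in every text box into plain text via Fields.Unlink.
' UnlinkTextBoxFields, firstStory and story are illustrative names.
Sub UnlinkTextBoxFields()
    Dim firstStory As Range   ' first story of each type
    Dim story As Range        ' current story within that type

    For Each firstStory In ActiveDocument.StoryRanges
        ' Same filter as above: only touch text boxes.
        If firstStory.StoryType = wdTextFrameStory Then
            Set story = firstStory
            Do While Not story Is Nothing
                ' Replaces each field in this story with its last result.
                story.Fields.Unlink
                Set story = story.NextStoryRange
            Loop
        End If
    Next firstStory
End Sub
```

The double loop is the same shape as before: For Each moves between story types, and NextStoryRange moves between the text boxes within the type.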
And thus concludes my daily code posting. Join us again tomorrow, kids!