Rant: Is XML for documentation going the way of SGML? (long)
Joined: 13 Jun 2003
Posts: 14
Thank you Jim.
We need more data points like yours to better understand when it
makes sense to move away from the book paradigm, what's involved, at
what cost for what benefits.
Personally, I favor semantically rich schemas too but I freely admit
my preference is for reasons of intellectual and technical
satisfaction and is NOT the right choice in many, perhaps most,
situations.
Your brief overview touched on some of the major drawbacks.
Clearly, the project was:
1. Disruptive to current work and production.
2. More costly, initially and ongoing.
3. Complicated to manage and maintain.
And, most importantly, while "customers loved it," were they willing
to pay for it as measured by increased sales and profits solely based
on this change?
And what about the customers that "hated it"? What do detractors
have to say about it? From your description and my own experience
with such systems, getting good hardcopy is a major exercise. Also,
such a system should be able to produce conventional documentation
for customers that demand it but such flexibility is seldom provided
for.
Finally, what has happened to staffing? Are they spending more time
feeding the system and less time researching and developing content?
Are full time production staff now required to make sure things
continue to work? What are the ramp up costs for new people,
including contractors?
I'm on the run now, heading for work, but I look forward to reading
your write-up. Again, we could use many more data points like yours.
Thanks for being forthcoming.
Tim
Joined: 13 Jun 2003
Posts: 39
> I think you have inadvertently concluded that one of the things that XML
> is "good" for is making technical writers think and talk explicitly
> about structures, including hierarchy, they intuitively already work
> with. This includes not only the obvious structures that make up
> documents but the "structuring" of information to these conventions.
I'd guess that most writers looking at XML are looking too closely,
seeing DTDs, tags, attributes, and not the wide-angle view of *how*
we do what we do and where we could be doing it more effectively.
Maybe this thread is starting to provide a wider angle.
> But, if we look at leading efforts, like DocBook, we can see that
> writers haven't gotten very far. They continue to confuse meta-data -
> data about data - with the "container" in which information is kept.
> It is simply an instance of the general document model of Frame or
> other established tools rendered in XML for a set of specialized
> documents (software).
>
> This gives us a language we can share as writers when talking about
> containers but where is the common language for talking about content?
You've mentioned this a couple of times, and now I'm pretty sure I
understand... again, it's a matter of people focusing too closely on
elements like sect, procedure, and para, rather than what those bits
assembled are talking about. Right? It's like the old jokes about the
human body being worth a few bucks worth of carbon and water, when
the organic compounds and organs comprising those basic elements are
worth much more.
In this view, even the higher-level DITA constructs (topic, concept,
task, reference) miss the forest for the trees. They're still
containers; useful ones but still missing that semantic content.
Topic maps [http://www.topicmaps.org/xtm/] amount to an (external)
catalog of semantic content. Section numbers in UNIX manpages also
provide a very simple content map -- section 1 describes user-level
commands, section 2 for kernel calls, section 3 for the C library
API, and so on. But there are aesthetic as well as practical reasons
for each topic or file to include a description of its semantic
content -- adding new files doesn't mean you have to go back & update
an external map; just rebuild the map using the working content.
Manpages also require a NAME section that contains a one-line
description of the manpage (i.e. what it's about); ideally, the
description should contain keywords that programs like "makewhatis"
can use to build a search index. Is there really anything new under
the sun? :-)
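To make "rebuild the map using the working content" concrete, here is a minimal Python sketch. It assumes each file's NAME line follows the conventional "name - description" form; the function names and the sample entries are invented for illustration, and this is not how makewhatis itself is implemented.

```python
def parse_name_line(line):
    """Split a 'name - description' line into (list of names, description)."""
    names, sep, description = line.partition(" - ")
    if not sep:
        raise ValueError(f"no ' - ' separator in {line!r}")
    return [n.strip() for n in names.split(",")], description.strip()


def build_index(name_lines):
    """Map each lower-cased description word to the set of page names using it."""
    index = {}
    for line in name_lines:
        names, description = parse_name_line(line)
        for word in description.lower().split():
            index.setdefault(word, set()).update(names)
    return index


# Hypothetical NAME lines, as they might be collected from the working files.
SAMPLE = [
    "ls - list directory contents",
    "mkdir - make directories",
    "opendir, readdir - directory stream operations",
]

index = build_index(SAMPLE)
print(sorted(index["directory"]))
```

Because the index is derived from the files themselves, adding a new file only means rerunning the build; nothing external has to be edited by hand.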
> As you might say, "it`s in the text, the title, the chapter heading"
> but remember, we`re talking about a "programmatic conversation," and to
> a program, such text doesn`t mean much without a reference to identify
> it against.
Yes... I talked about the differing needs between human readers and
computer readers. As human writers writing for (up to now) primarily
human readers, we have relied on the shared context of hierarchy and
organization to provide the semantic content. Now we have a second
audience, computers running search programs. This new audience isn't
really interested in the steps of, say, an installation procedure --
they're not going to actually install a widget, after all -- but want
to know that some topic *is* an installation procedure for some
particular product. DocBook actually *can* contain some of this
metadata as-is -- for example, a section can have a sectioninfo
element with a "role" attribute and a "keywordset" element. It's easy
to miss this kind of stuff in a big DTD.
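As a rough sketch of what a search program could do with that metadata, the Python below reads the "role" attribute and the keywords out of a DocBook-style section. The sample markup, the role value, and the function name are assumptions made up for this example; only the element names (section, sectioninfo, keywordset, keyword) come from DocBook.

```python
import xml.etree.ElementTree as ET

# Hypothetical DocBook-style fragment; the role value and keywords are invented.
SAMPLE = """\
<section>
  <sectioninfo role="installation">
    <keywordset>
      <keyword>widget</keyword>
      <keyword>install</keyword>
    </keywordset>
  </sectioninfo>
  <title>Installing the Widget</title>
  <para>Unpack the widget and plug it in.</para>
</section>
"""


def section_metadata(section):
    """Return (role, keywords) from a section's sectioninfo, if it has one."""
    info = section.find("sectioninfo")
    if info is None:
        return None, []
    keywords = [kw.text.strip() for kw in info.iter("keyword") if kw.text]
    return info.get("role"), keywords


role, keywords = section_metadata(ET.fromstring(SAMPLE))
print(role, keywords)
```

The program never looks at the procedure steps; it only needs to learn that the section *is* an installation procedure and what it is about.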
The hard part is remembering to add the metadata, especially in the
heat of a looming deadline. My home-grown DTD has an (in the light
of this discussion) incomplete set of attributes that allow one to
specify a product, interface (HTTP, SNMP, console, etc), and what I
call "use phase" (planning, installation, configuration, operation,
administration, maintenance). I was providing (by design) hooks for
searching, although I hadn't gone so far as to build a search mechanism.
Not all the containers have had those metadata attributes filled in;
making those attributes required instead of optional could go a long
way toward fixing that problem.
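Short of changing the DTD, a small script can report which containers still lack the metadata. This is only a sketch: the element name "section" and the attribute names "product", "interface", and "usephase" are stand-ins for whatever a home-grown DTD actually uses.

```python
import xml.etree.ElementTree as ET

# Assumed attribute names, standing in for product / interface / use phase.
REQUIRED = ("product", "interface", "usephase")

# Hypothetical document in which some containers lack metadata.
SAMPLE = """\
<doc>
  <section id="s1" product="widget" interface="HTTP" usephase="installation"/>
  <section id="s2" product="widget"/>
  <section id="s3"/>
</doc>
"""


def missing_metadata(root, tag="section", required=REQUIRED):
    """Return {element id: [missing attribute names]} for incomplete elements."""
    report = {}
    for elem in root.iter(tag):
        missing = [name for name in required if not elem.get(name)]
        if missing:
            report[elem.get("id", "(no id)")] = missing
    return report


report = missing_metadata(ET.fromstring(SAMPLE))
for ident, missing in report.items():
    print(ident, "is missing", ", ".join(missing))
```

Making the attributes required in the DTD would turn the same gaps into validation errors, which is stricter but also blocks work in progress; a report like this one only nags.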
> Even if that turns out to be the case, you might be right in that the
> lessons of XML can reinforce the "good practices" you mentioned
> writers already practice and even inspire better application of them;
> concise, well cataloged libraries, with documents with good, clear,
> consistent structure, formed appropriately to thoughtfully contain the
> information, with various indexes to give quick, smart access to key
> pieces of information.
>
> It so happens that this is the very prerequisite that seems necessary
> for a relatively easy, non-disruptive, low-cost, successful migration
> to DocBook-like XML and structure. If you meet this prerequisite, then
> you are left with the question "is it worth paying to do in XML what
> you're already doing for "free" (relatively speaking)?"
If writers (individually or in groups) are providing this type of
information, I think doing it in XML *can* be worth it -- a lot
of those catalogs and indexes could be built automatically and any
documents missing proper metadata can be flagged for attention.
> > Complexity has gotten us nowhere. Let's try simplicity.
>
> Or something more fundamental like "at what point are the imagined
> benefits worth the effort and complexity, and just how will we know?"
That one I have no answer for. Putting a price tag on *not* having
documentation (or good documentation) has always been problematic.
--
Larry Kollar, Senior Technical Writer, ARRIS
"Content creators are the engine that drives
value in the information life cycle."
-- Barry Schaeffer, on XML-Doc
Joined: 02 Feb 2005
Posts: 2
Tim,
I appreciate your reply, thanks.
You're right to point out that many folks won't need such a heavyweight
approach to marking up their technical documentation. In our case, it was
justified by the size of the bigger picture -- an all singing, all dancing
portal for customer support, training and consultancy. This was a very early
example of what we all expect now from enterprise software vendors, so the
output of that system would be regarded by many support departments as
essential these days.
As for some customers 'hating' it (the online library), we recreated the
traditional doc sets for them as well, by getting to PDF through FrameMaker.
Having set up the scripts to export the correct document components and
build a FM book, it was a process that essentially ran itself for free,
aside from the regular editing pass.
To answer your question about how the staff spent their time -- it
definitely meant that a disproportionate amount of every day was spent
'feeding the system', but that balanced out over time. The benefits were the
ones we could apply automation to. For example, early in the project, we set
ourselves the goal of creating the first online course to emerge from the
system by building it from components and automating the publishing. We did
this by taking the equivalent of a manual, an instructor-led course, a
trainer's guide, and some reference material, and entering the source text
and graphics into the system and then producing out the other end the
animated online course, PDF and HTML. We put a team of 30 people on it -- a
mixture of trainers, graphic designers, authors, course developers, editors
and publishers (Omnimark experts), and a manager or two ;-) -- and turned it
around in a week. It was rough at the edges, but a phenomenal demonstration
of what repurposing content and multiple outputs are all about.
I haven't worked for the company concerned for some years now, but I gather
that the system has now joined the league of 'legacy' apps. A re-org
distributed the staff around various other cost centers. You need a strong
central team to keep this sort of thing running, so I imagine the knowledge
would have evaporated. Such a team gets very expensive, and can be difficult
to justify these days.
Rgds,
Jim
> And, what about the customers that "hated it;" what do detractors
> have to say about it. From your description and my own experience
> with such systems, getting good hardcopy is a major exercise. Also,
> such a system should be able to produce conventional documentation
> for customers that demand it but such flexibility is seldom provided
> for.
> Finally, what has happened to staffing? Are they spending more time
> feeding the system and less time researching and developing content?
> Are full time production staff now required to make sure things
> continue to work? What are the ramp up costs for new people,
> including contractors?
>
Joined: 17 Jun 2003
Posts: 4
/ Larry Kollar <larry.kollar@...> was heard to say:
[...]
| Shortly after XML appeared came XSL, a highly complex transformation
| and formatting language
I'm not sure how or why XSLT gets criticized (or maybe it's praise?)
for being highly complex; it has a grand total of about 35 elements.
It was designed, from the beginning, to be sufficiently rich to
transform documents (the stuff people write, not the stuff that gets
shipped around in XML-RPC packets) for the purpose of publishing them.
There's gobs and gobs of stuff that could usefully be put in a
transformation language that were explicitly left out because they
weren't needed for documentation formatting. There were long arguments
about whether we even needed the ability to do the identity transform
for goodness sake, because it wasn't clear that that was necessary for
documentation.
It turns out that lots of folks have used XSLT for lots more
interesting and complex things than documentation, and the XSLT 2.0
effort has taken a somewhat broader view of transformation than 1.0
did, but it's hardly a model of intrinsic complexity.
It has two fundamental aspects that force stylesheet writers to think
differently than programmers in modern, procedural languages. It's
expressed in XML (which some people hate but I don't know how I'd live
without since I often want to transform my stylesheets) and it's a
functional language which is really different from Perl, C, etc.
Neither of those is intrinsically complex, just different.
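For readers who have not met the rule-based, functional style, here is a loose Python analogue (not XSLT itself, and not any real XSLT processor's API). It assumes a table of per-tag rules with an identity copy as the fallback, which is roughly the shape of an identity transform plus one overriding template; all names in it are invented for illustration.

```python
import xml.etree.ElementTree as ET


def identity(elem, rules):
    """Default rule: copy the element, then apply the rules to each child."""
    copy = ET.Element(elem.tag, dict(elem.attrib))
    copy.text, copy.tail = elem.text, elem.tail
    copy.extend([apply_rules(child, rules) for child in elem])
    return copy


def apply_rules(elem, rules):
    """Use the rule registered for this tag, falling back to the identity copy."""
    rule = rules.get(elem.tag, identity)
    return rule(elem, rules)


def emphasis_to_em(elem, rules):
    """Example override: rename <emphasis> to <em>, keeping its content."""
    result = identity(elem, rules)
    result.tag = "em"
    return result


source = ET.fromstring("<para>A <emphasis>small</emphasis> example.</para>")
result = apply_rules(source, {"emphasis": emphasis_to_em})
print(ET.tostring(result, encoding="unicode"))
```

Nothing here loops over the document "by hand" or mutates the source tree: each rule returns a new node and recursion does the rest, which is the habit of mind that differs from Perl or C.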
| that was quickly broken into XSLT and XSL-FO components. And we have
There was debate about whether those should be separate specs or not
too. Eventually there were folks on the WG that wanted to do one and
not the other so it was an issue of editorial simplicity as much as
anything else.
| XPath (another amoeba-like split of XSLT),
XPath was perceived by the community as generally valuable. I think
it's been a dreadful mistake in some ways (because it introduced
QNames in content). I regret that I didn't see the train wreck coming
and work harder to preserve the XSL submission's original model of a
pattern and a rule, both expressed in XML, but that's water under the
bridge now.
| XSchema, XQuery, SOAP, etc. More layers, many of which are
| fortunately geared toward the interchange side of things rather than
| documentation.
Why is it bad, in principle, that the simple ideas of XML have been
used by other communities to satisfy their goals?
| So what's the problem? In a word, attitude.
|
| The first problem is the attitude of "everything must be XML"
| once again leads us down the path of ignoring perfectly workable
| (and free, in most cases) technologies in favor of "solutions"
| that add expense and complexity and give little or no gain in
| return. Seriously, what does XSL-FO do that TeX or Groff can't?
I really don't know how to answer that question. Nothing, I suppose,
is one answer. Except, of course, that it's expressed in XML which
makes all those general purpose tools that I've now got in my back
pocket (parsers, validators, editors, transformation languages,
query languages, etc., etc., etc.) immediately useful.
Perhaps you would have been happy experimenting with extensions to TeX
or groff by writing your own special purpose tools for parsing,
validating, editing, transforming, and querying those documents (not
only in English but also in every other language supported by
Unicode?), but I'm just as happy not to have to.
| I know for sure I can define a page layout in Groff using a lot
| fewer lines of code than I could in FO.
Really? Are you sure? And could you write the tool that allows you to
manipulate that code to change the presentation and layout just as
easily? I admit there's a certain unfortunate "bare minimum" of
boilerplate required in an XSL-FO document, but it seemed necessary at
the time in order to support the possibility of richer page models in
the future and fully internationalized printing.
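As an illustration of the "general purpose tools" point, the sketch below uses nothing but a stock XML parser to change the page size in an XSL-FO layout. The FO namespace URI and the simple-page-master attributes are standard; the sample is a deliberately incomplete fragment (no page-sequence), and the function name and the choice of A4 dimensions are assumptions for this example.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)

# Fragment only: a real FO document would also need at least one page-sequence.
SAMPLE = f"""\
<fo:root xmlns:fo="{FO_NS}">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="page" page-width="8.5in"
                           page-height="11in" margin="1in">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>
</fo:root>
"""


def set_page_size(root, width, height):
    """Set page-width/page-height on every simple-page-master; return the count."""
    count = 0
    for master in root.iter(f"{{{FO_NS}}}simple-page-master"):
        master.set("page-width", width)
        master.set("page-height", height)
        count += 1
    return count


root = ET.fromstring(SAMPLE)
changed = set_page_size(root, "210mm", "297mm")
print(changed, "page master(s) switched to A4")
```

No FO-specific parser had to be written; the same dozen lines would work on any attribute of any XML vocabulary.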
| may be. If we wait for the vendors to give us *their* solution,
| we'll be right back to the days of six-figure implementation
| and five-figure support costs -- and XML (as a documentation
| tool) will become "eXcellent, Maybe Later" and join SGML in
| obscurity.
I dunno, I've built an entire XML publishing system around a
reasonable schema (free), a good editor (free), and a reasonable set
of stylesheets (free). Getting to the web is free, getting to print
isn't free yet, but I'm hoping it will become free eventually.
| It doesn't have to be that way. Not only is XML just as open
| (if not more so) than SGML, a plethora of Free, Open Source,
| and low-cost commercial tools have grown up around it or have
| been adapted to work with it. The pieces are lying all around
| us, like several model car kits all jumbled together. We need
| to start picking up pieces, fit them together, and start applying
| glue and paint.
Absolutely!
| Needless to say, this is my personal opinion and not necessarily
| that of my employer.
Mine too.
There's no question that the family of XML specifications has gotten
as complicated as SGML, perhaps more so. But there's nothing that says
we have to use all of them.
I guess I don't fundamentally disagree with the spirit of your rant,
I'm just not sure I perceive the details of the situation in the same
way that you do.
OTOH, as a professional standards wonk, I suppose I'm part of the
problem, not the solution :-)
Be seeing you,
norm
--
Norman Walsh <normyahoo@...> | A man can believe a considerable
http://nwalsh.com/ | deal of rubbish, and yet go about
| his daily work in a rational and
| cheerful manner.--Norman Douglas
Joined: 13 Jun 2003
Posts: 39
Jim Gabriel wrote:
> I haven't worked for the company concerned for some years now, but I
> gather that the system has now joined the league of 'legacy' apps. A
> re-org distributed the staff around various other cost centers. You
> need a strong central team to keep this sort of thing running, so I
> imagine the knowledge would have evaporated. Such a team gets very
> expensive, and can be difficult to justify these days.
That's a great illustration of *how* complex structured documentation
systems aren't working, unfortunately. If the whole shebang breaks
down at the first re-org, it's (IMO, of course) too fragile to deploy.
I'm afraid large documentation departments are a thing of the past
for most industries, as far as I can tell. In these dark days of
tiny departments and nearly nonexistent budgets, a simplified
approach to XML and structured documentation is the only way I can
see for it to reach widespread acceptance.
In the long run, the need for explicit structure may become moot --
eventually, computers (or rather, the software) could become smart
enough to interpret the shared context we already use and extract
structure and metadata for us. It's possible to extract some amount
of structure already, based on formatting (see Exegenix and several
other programs, commercial to Free). But human intervention is still
required at this stage -- for example, to interpret the meaning (or
role) of italic strings -- and metadata isn't on the radar yet.
For now, it's up to us to feed the system, maintain it, and still
find the time to actually *write* something on occasion. A system
that's easy to set up & maintain is the only thing that's going to
work for small groups or lone writers.
--
Larry Kollar, Senior Technical Writer, ARRIS
"Content creators are the engine that drives
value in the information life cycle."
-- Barry Schaeffer, on XML-Doc
Joined: 13 Jun 2003
Posts: 14
Right on Larry.
It's fairly typical in the software industry to have technical writers
directly associated with engineering groups where they can get a
certain amount of cover (cut Documentation but not the engineering
team) but are constrained by the things you and Jim mentioned and
others as well.
This does NOT mean that XML is beyond reach or, if adopted in some
form, is a cost and management landmine. We just have to start facing
facts - actually, start raising them over the din of enthusiastic hype
- and begin to understand the implications and true costs of XML
solutions. As it turns out, the surrounding issues are actually more
challenging and critical than the technology. They include:
1. Sustaining staffing and management.
2. Hardening systems against reorganization and project/product
changes.
3. Scalability and ongoing extensibility (new/changing applications).
4. Legacy conversion.
5. Retreat - how do I get back, if the "solution" fails?
6. Economic/business justification.
7. Maintaining tools and source over time.
8. Inventory management.
9. Training, work complexity, production complexity and other
demands on staffing and productivity.
As for the technology, Norman said it for me. There is nothing to
debate or be confused about; the tools are just too damn useful to not
take up, and not just for documentation. But then I came into
documentation from software engineering. I'm a "writer" who continues
to think "technology, tools and process" before I think "paper and
pencil."
Tim
Joined: 08 Feb 2005
Posts: 8
Hi All
I have been watching this thread with great interest and I think it's time I
chimed in.
Larry Kollar wrote:
> I`m afraid large documentation departments are a thing of the past
> for most industries, as far as I can tell. In these dark days of
> tiny departments and nearly nonexistent budgets, a simplified
> approach to XML and structured documentation is the only way I can
> see for it to reach widespread acceptance.
This is exactly what I'm doing at the moment. I agree it has to be simple to
have any chance of success. The other thing that is the stumbling block for most
documentation departments (or sole writers) is that you can't take your
documents out of circulation for six months while you sort out the whole
toolchain.
To this end, we are using FrameMaker 7.1 as our way in to XML. This gives us
many advantages:
* we can publish at any stage - unstructured, structured, and finally,
round-tripping to XML
* we can polish the print publication before final printing - pagination,
tweaking graphic sizes, etc
* we can use standard single-sourcing tools such as WebWorks or RoboHelp to
create online help, HTML, etc
* our writers are familiar with this type of editor (even Word users quickly
come up to speed); there are some differences with how they work in structured
FrameMaker but they are not having to learn a whole new paradigm
* we don't have to include the whole book in the structure
To cover the last point in more detail, writing a DTD that caters for all the
corner cases of front and back matter is not always necessary. The document I am
currently working on is a catalogue where the guts of the document - the
products - are regular in structure and are going to be single-sourced with a
web publication where the presentation of the information is going to be quite
different. Ideal XML fodder.
The front and back matter, on the other hand, is only for use in the printed
catalogue and contains marketing and graphic designer fluff, and tables
generated from the product information. Very unsuited to XML.
Using FrameMaker I can create a book that contains unstructured and structured
chapters and only round-trip the structured chapters into my CMS so that the
website can access them.
We have got to the structured stage and this is working even though my authors
are half-way round the world and hadn't even used Word styles before - so it
***had*** to be simple.
FrameMaker is by no means the perfect tool and has been poorly supported by
Adobe for years, but it seems to be the best answer to making XML accessible to
the people it was originally designed for - documentation rather than data
transfer. Don't get me wrong, I am happy that the data transfer people have
embraced XML precisely because it gives it traction, but it does mean that there
are conflicting agendas out there.
For example, one feature request mentioned before in this thread was the ability
to have invalid structure at certain stages. This is something a data transfer
person would probably not have a use for, but as a writer I immediately agreed
that this was very important. The perfect example of this is cut and paste -
valid structure in one spot, invalid in the new placement - I don't want to be
prevented from moving text simply because the structure will be temporarily
invalid.
The way FrameMaker handles this is that you can move stuff where you want but
the invalid structure is shown in red in the Structure View and if you try to
export it to XML it will report the error and fail.
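The "tolerate it while editing, refuse it at export" behaviour can be sketched in a few lines of Python. This is an assumption-laden toy, not a description of how FrameMaker works internally: the rule table only says which child tags a parent may contain (a real DTD or EDD also constrains order and counts), and the element names and function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical content rules: which child tags each parent may contain.
ALLOWED = {
    "chapter": {"title", "section"},
    "section": {"title", "para", "procedure"},
    "procedure": {"step"},
}


def invalid_children(root, allowed=ALLOWED):
    """List (parent tag, child tag) pairs that break the rules; never raises."""
    problems = []
    for parent in root.iter():
        permitted = allowed.get(parent.tag, set())
        for child in parent:
            if child.tag not in permitted:
                problems.append((parent.tag, child.tag))
    return problems


def export(root):
    """Serialize the tree, but only if the structure is currently valid."""
    problems = invalid_children(root)
    if problems:
        raise ValueError(f"invalid structure: {problems}")
    return ET.tostring(root, encoding="unicode")


# A step pasted directly into a section: fine while editing, rejected at export.
draft = ET.fromstring(
    "<chapter><title>T</title><section><step>Do it</step></section></chapter>"
)
print(invalid_children(draft))
```

Separating the check from the export is the design point: the editor can call the first function as often as it likes to paint things red, while only the second one is allowed to say no.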
Alternatively, if you are creating a new element you would normally be offered
only valid elements, but you can change a setting to allow you to access all
possible elements if you need to break the structure temporarily (eg if you want
to create structure valid for the new placement before you move it - different
people work in different ways).
"What about proprietary lock-in?" I hear you say - after all, avoiding it is one of the
tenets of using a standard such as XML. This is not high on my priority list
right now, but it is comforting to know that once my structured chapters are
safely round-tripping through my CMS in XML I can use any XML editor and if I
want to publish directly from the XML I can learn XSL-FO or whatever - but
there's no rush.
As for my front and back matter, there's not enough content in those chapters to
cause a drama even if Adobe fell into a black hole and all copies of FrameMaker
installed around the world exploded immediately.
I hope this has given some of you an insight into how XML is being used in the
real world, for real documentation, right now.
Thanks for listening.
-Melanie Kendell
Joined: 13 Jun 2003
Posts: 39
> To this end, we are using FrameMaker 7.1 as our way in to XML. This
> gives us many advantages [...]
>
> writing a DTD that caters
> for all the corner cases of front and back matter is not always
> necessary.
That's pretty much the current setup I'm using, except I'm using 7.0
(because Adobe in their nonexistent wisdom dropped MacOS development).
Like you, I don't try to structure boilerplate stuff like front and
back pages.
Still, setting up Frame for structured writing is nontrivial. The
EDD is a beast and can get out of hand pretty quickly if (like me)
you don't know what you're doing at first. I've recommended to others
looking at using Frame this way that they mentally prepare to throw
out their first efforts and start over. There are benefits to the way
Frame does things -- you can't make changes to a DTD and forget to
fix the stylesheet, for example. Being able to break structure is
another advantage; one item in my bag of tricks transforms OPML
(an XML outliner DTD) to my home-grown DTD, missing a couple of
required elements but enough to capture the outline properly. Open
Structure View, look for red marks, fill in elements. It's not quite
as nice as having a built-in outliner, but it does the job.
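A transform of that kind can be quite small. The Python below is a sketch under stated assumptions, not the actual tool: it reads standard OPML (nested outline elements with a "text" attribute) and emits nested section/title elements under a "doc" root, where "doc", "section", and "title" are placeholder names for the home-grown DTD's elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical outline; only the opml/body/outline structure is standard OPML.
OPML = """\
<opml version="1.0">
  <head><title>Install Guide</title></head>
  <body>
    <outline text="Planning">
      <outline text="Site requirements"/>
    </outline>
    <outline text="Installation"/>
  </body>
</opml>
"""


def outline_to_section(outline):
    """Turn one <outline> (and its nested outlines) into a <section> subtree."""
    section = ET.Element("section")
    ET.SubElement(section, "title").text = outline.get("text", "")
    for child in outline.findall("outline"):
        section.append(outline_to_section(child))
    return section


def opml_to_doc(opml_root):
    """Build a skeleton document holding one section per top-level outline."""
    doc = ET.Element("doc")
    body = opml_root.find("body")
    if body is not None:
        for outline in body.findall("outline"):
            doc.append(outline_to_section(outline))
    return doc


doc = opml_to_doc(ET.fromstring(OPML))
print(ET.tostring(doc, encoding="unicode"))
```

The result contains titles only, so it is deliberately incomplete against a DTD that requires body content in each section; that is exactly the case where an editor that tolerates temporarily invalid structure earns its keep.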
Some of the later talk in this thread discusses the importance of
metadata, "feeding the system" as it were. What kind of metadata
has your department found useful, and how much effort does it take
to properly maintain it?
--
Larry Kollar, Senior Technical Writer, ARRIS
"Content creators are the engine that drives
value in the information life cycle."
-- Barry Schaeffer, on XML-Doc
Joined: 08 Feb 2005
Posts: 8
Hi Larry et al
> Setting up Frame for structured writing is nontrivial. The
> EDD is a beast and can get out of hand pretty quickly if (like me)
> you don't know what you're doing at first. I've recommended to others
> looking at using Frame this way that they mentally prepare to throw
> out their first efforts and start over.
I would agree 100% with you there! To me, getting the structure right is
***the*** most critical aspect of the whole exercise and it`s unlikely that
you`ll get it right first time unless you`ve been through the hoops quite a few
times (and there aren`t many people that have done that yet).
> There are benefits to the way
> Frame does things -- you can`t make changes to a DTD and forget to
> fix the stylesheet, for example.
The way I tackle the whole process (maybe because my background is technical
writing rather than programming and I find working from the text easier) is:
1. Work out the stylesheet on a real example of the text
It doesn't matter if the font isn't exactly the shade of sky-blue-pink you
wanted but the styles will automatically help you think about the structure even
if it is only from a terminal leaf point of view.
2. Start working out an EDD to give hierarchy to the elements
You've got to concentrate on EDD *or* DTD, not both. Personally I prefer EDD 'cos
I can work directly with the real text and immediately see problems with the
structure - mainly by seeing red bits in the structure, but also through how the
text looks. People that disparage WYSIWYG editors for XML don't give credit that
some people *can* distinguish between a problem with a style and a problem with
the structure, and miss out on the fact that WYSIWYG can help to alert you to
problems with structure.
3. Work out the attributes you will need, mainly for manipulation of content for
publications
For my project, even though the website is some way off yet, the most important
consideration for attributes was what search criteria would be useful to website
visitors.
4. Once the EDD is starting to take shape, use the conversion table to fine tune
it
FrameMaker has a really useful feature where you can generate a table for
converting unstructured to structured Frame. This automatically lists all the
styles used and suggests elements. You then edit the table to reflect the real
elements, build hierarchical rules (basically the same rules as in the EDD) that
allow you to wrap elements together under a parent element iteratively, populate
default attributes, etc. This helps you test your emerging EDD 'cos if the rules
don't work in conversion they probably aren't quite right.
5. Once you've tested your conversion six or seven thousand times, do the
conversion for real
You only want to do this once your EDD is looking pretty solid as it can be
harder to change things after this point depending on the volume of text and
exactly what the changes are. This is the easiest way I`ve found for marking up
unstructured content (once the conversion table is debugged).
6. Work on the text formatting and layout side of the EDD
FrameMaker allows you to do some fairly clever but sometimes arcane stuff with
conditional rules and prefixes and suffixes to generate regular text from
attributes. For example, rather than have an element containing the text
"Lyophilised Monoclonal" stored 1,000 times (with no guarantee that it is always
correctly spelled), if the product is of ProductType m and has the CodePrefix
NCL-, the text "Lyophilised Monoclonal (NCL-" is automatically generated before
the content of the ProductCode element. You may find during this process that
you have to change some of your initial ideas on structure (particularly to do
with attributes).
7. When the EDD is perfect (until someone asks for the next change), export to
DTD
This is where I am at now and, while I can get an error free export for the DTD,
I'm having a few headaches applying the resulting DTD to the XML export of the
chapters. I'm sure (like the problems I had at earlier stages) that once I know
what I'm doing it will all fall into place. And that once it works it will be
robust.
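The attribute-driven text generation in step 6 can be illustrated outside FrameMaker. What follows is a Python sketch, not EDD syntax. The element and attribute names (ProductCode, ProductType, CodePrefix) and the "Lyophilised Monoclonal (NCL-" text come from the post. Putting the attributes on a Product element, the closing ")" suffix, the rule table, and the sample code "CD3" are all assumptions.

```python
import xml.etree.ElementTree as ET

# (ProductType, CodePrefix) -> (generated prefix, generated suffix).
# Only the "m" / "NCL-" prefix comes from the post; the ")" suffix is assumed.
PREFIX_RULES = {
    ("m", "NCL-"): ("Lyophilised Monoclonal (NCL-", ")"),
}

def render_product_code(product):
    """Build the display text for a Product's ProductCode from its attributes."""
    key = (product.get("ProductType"), product.get("CodePrefix"))
    prefix, suffix = PREFIX_RULES.get(key, ("", ""))
    code = product.findtext("ProductCode", default="")
    return f"{prefix}{code}{suffix}"

# Hypothetical sample: the stored content is just the bare code.
sample = ET.fromstring(
    '<Product ProductType="m" CodePrefix="NCL-">'
    "<ProductCode>CD3</ProductCode>"
    "</Product>"
)
print(render_product_code(sample))
```

The repeated wording lives in one rule instead of in a thousand elements, so a spelling fix is a one-line change.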
> Some of the later talk in this thread discusses the importance of
> metadata, "feeding the system" as it were. What kind of metadata
> has your department found useful, and how much effort does it take
> to properly maintain it?
Metadata is simply data about data. This means different things to different
people. A lot of people use it to mean the keywords used to promote their sites
to search engines. To me, the metadata is inherent in the elements and
attributes used to mark up the text - the stuff that allows me to identify and
categorise the content.
Which is why having a semantic schema (note the small s) is so important.
Marking up slabs of text in a generic DocBook-type schema will not generally
give you adequate ROI. The schema must reflect the types of manipulation you
intend to perform on the text. In general, this means creating your own schema
for your own content (although DITA is looking promising, but not quite ready
yet).
For example, my products have an atypical hierarchy ProductBlock>optional
MultiCloneProduct>Product>Clone that wouldn't be available in any "standard".
Each of these elements allows me to grab everything to do with a particular
product at a specific level so I can offer all the information about a product,
the details of the thing that is actually for sale, or the clones that represent a
product, depending on the context. Choice-type (or enumeration-type) attributes
on those elements allow me to search for known values that categorise them.
In effect, we don't "feed the system"; we mark up the text (or more generally,
use conversion to mass mark it up) to give it structure, then select attribute
values according to need, and voilà - metadata.
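To make "metadata is inherent in the markup" concrete, here is a small Python sketch that queries a document by element level and by an enumerated attribute. The ProductBlock, MultiCloneProduct, Product and Clone names and the ProductType attribute are from this post. The Catalogue root, the value "p", and the clone names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; only the element hierarchy mirrors the post.
CATALOGUE = """
<Catalogue>
  <ProductBlock>
    <MultiCloneProduct>
      <Product ProductType="m"><Clone>A1</Clone></Product>
      <Product ProductType="p"><Clone>B2</Clone></Product>
    </MultiCloneProduct>
  </ProductBlock>
  <ProductBlock>
    <Product ProductType="m"><Clone>C3</Clone></Product>
  </ProductBlock>
</Catalogue>
"""

def clones_for_type(root, product_type):
    """Return the Clone text of every Product with the given ProductType."""
    xpath = f".//Product[@ProductType='{product_type}']/Clone"
    return [clone.text for clone in root.findall(xpath)]

root = ET.fromstring(CATALOGUE)
print(clones_for_type(root, "m"))         # search by attribute value
print(len(root.findall("ProductBlock")))  # grab content at a chosen level
```

No separate keyword list is maintained here: the search criteria are the attribute values the author already chose while marking up.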
The "page furniture" of the website could well have the other sort of metadata
(the keyword stuff) but the content that I`m dealing with doesn`t require it.
Anyone else got some real-life examples they want to share?
-Melanie
Joined: 09 Feb 2005
Posts: 4
Rant: Is XML for documentation going the way of SGML? (long)
I think this thread is great! It has everyone discussing possibilities,
pitfalls, and real-life work.
>Anyone else got some real-life examples they want to share?
My first structured Frame project was a self-paced tutorial. I created what
amounts to a one-time EDD with elements such as exercise, task, input,
ui-element, etc. Totally unusable for anything other than a self-paced
tutorial written the way I wrote it. <g> (It was also the first time I ever
wrote a self-paced tutorial. <vbg>)
For my next structured project, I analyzed the existing docs (and took into
account other docs, future possibilities, etc.) and developed the structure
in a spreadsheet. Each element has its own tab. On the tab are the allowed
parent elements, allowed child elements, attributes, a description of the
element, and general formatting info for both PDF and HTML. The parent and
child elements are hyperlinks to their tabs.
Once I was comfortable with the structure, I created an initial DTD. I
started with the DTD because I can create a basic DTD more quickly. I
imported the DTD into an EDD file, and finished it off there. Once I had the
EDD "perfected," I generated a new DTD.
The elements are based more on the structure of the docs than pure
semantics. For example, book, chapter, section, procedure, list, figure. I
chose this route to make it easier to port the structure to other projects.
In my limited experience creating highly semantic DTDs, I have found that
they only had limited uses.
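A spreadsheet with allowed parents and allowed children on every tab holds the same relationship twice, so the two sides can drift apart. The Python sketch below shows one way such a spec could be cross-checked before a DTD is written. It is not something the post describes doing. The element names are the ones listed above, but the parent/child assignments and the function name are assumptions.

```python
# One entry per spreadsheet tab: allowed parents and allowed children.
# The relationships below are illustrative guesses, not the author's spec.
SPEC = {
    "book":      {"parents": [], "children": ["chapter"]},
    "chapter":   {"parents": ["book"], "children": ["section"]},
    "section":   {"parents": ["chapter", "section"],
                  "children": ["section", "procedure", "list", "figure"]},
    "procedure": {"parents": ["section"], "children": []},
    "list":      {"parents": ["section"], "children": []},
    "figure":    {"parents": ["section"], "children": []},
}

def find_mismatches(spec):
    """Report parent/child entries that the other element's tab does not mirror."""
    problems = []
    for name, entry in spec.items():
        for child in entry["children"]:
            if child not in spec:
                problems.append(f"{name}: child '{child}' has no tab")
            elif name not in spec[child]["parents"]:
                problems.append(f"{child}: missing parent '{name}'")
        for parent in entry["parents"]:
            if parent not in spec:
                problems.append(f"{name}: parent '{parent}' has no tab")
            elif name not in spec[parent]["children"]:
                problems.append(f"{parent}: missing child '{name}'")
    return problems

print(find_mismatches(SPEC))
```

An empty list means every parent entry has a matching child entry and vice versa.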
I mapped the elements to the paragraph and character styles in an existing
template. I only had to make a few changes to the template, and I probably
would have made them anyway.
Next, I created a conversion table to convert my unstructured docs to
structured. This got the docs most of the way there.
I added a couple read/write rules to handle graphics and XML linefeeds.
I created an XSLT that generates a set of HTML files for online help. In the
help, related topic links are generated dynamically based on a section's
child or sibling sections. X-refs point to the specific HTML files and the
wording is appropriate for the web, i.e., no "see page 24". Full size screen
caps are treated as popups. I used the xsl:document element in Saxon to
generate multiple HTML files.
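The related-topic logic can be sketched without the stylesheet. The post does not show the XSLT, so this Python version is only a guess at the rule: link to a section's child sections if it has any, otherwise to its siblings. The id attribute, the title element, and the one-file-per-section naming are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
DOC = ET.fromstring("""
<chapter id="c1">
  <section id="install"><title>Installing</title>
    <section id="prereq"><title>Prerequisites</title></section>
    <section id="setup"><title>Running setup</title></section>
  </section>
  <section id="config"><title>Configuring</title></section>
</chapter>
""")

def related_topics(root, section_id):
    """Return (href, link text) pairs for a section's children, else its siblings."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    target = next(s for s in root.iter("section") if s.get("id") == section_id)
    related = target.findall("section")
    if not related:
        related = [s for s in parents[target].findall("section") if s is not target]
    return [(s.get("id") + ".html", s.findtext("title")) for s in related]

print(related_topics(DOC, "install"))
print(related_topics(DOC, "prereq"))
```

Because the links are computed from the structure, adding or moving a section updates the related-topic lists on the next build with no hand editing.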
The only nod to unstructured Frame is the use of conditional text for some
of the text and for the Figures/graphics/captions. I could not figure out
how to position the caption under the graphic without putting it in the
anchored frame.
Finally, I created some batch files to test the XML and generate the help
set in case I get hit by a bus.
I am now in the process of setting up the structure for another set of apps.
I will probably get rid of the chapter elements and use the section element
with additional conditions in the EDD to differentiate between introduction,
chapter, appendix, first-level sections, and second-level sections. I will
also add some attributes to mark the nodes for specific products only and
specific media only. I plan to stop generating the HTML locally, and put the
XML on the server. I'll use Java or ASP to perform the XSL processing on the
fly.
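Marking nodes for specific products or media amounts to a filtering pass before output. Here is a minimal Python sketch of that idea. The attribute names product and media, their space-separated values, and the sample content are all assumptions, since the post does not say how the attributes will be defined.

```python
import copy
import xml.etree.ElementTree as ET

def filter_for(root, product, media):
    """Return a copy of the tree without nodes marked for other products or media."""
    result = copy.deepcopy(root)

    def keep(node):
        # An unmarked node applies everywhere; a marked node must list the wanted value.
        for attr, wanted in (("product", product), ("media", media)):
            value = node.get(attr)
            if value is not None and wanted not in value.split():
                return False
        return True

    def prune(parent):
        for child in list(parent):
            if keep(child):
                prune(child)
            else:
                parent.remove(child)  # note: also drops the child's tail text

    prune(result)
    return result

# Hypothetical sample content.
doc = ET.fromstring(
    "<section>"
    "<para>Common text.</para>"
    '<para product="pro">Pro only.</para>'
    '<para media="print">See page 24.</para>'
    '<para media="html" product="pro lite">Click the link.</para>'
    "</section>"
)
html_lite = filter_for(doc, product="lite", media="html")
print([p.text for p in html_lite.iter("para")])
```

The same pass could run on the server before the XSL step, so one XML source serves each product and medium.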
I love
Joined: 10 Feb 2005
Posts: 2
Rant: Is XML for documentation going the way of SGML? (long)
For me, the most important lesson in this thread so far is that -- a bit
contrary to conventional XML wisdom -- you should start with existing
deliverables, especially for typeset stuff.
If your existing deliverables are not already well structured, they won't become
so by magic, just by introducing XML into the workflow. A mess is a mess,
whether XML- or Word-based. Start by cleaning up the layout and logistics of
existing deliverables and source documents.
Given well structured deliverables, it is much easier to build e.g. a FrameMaker
structured application or an XSL-FO stylesheet from the desired output than from
abstract ideas about the deliverable. Typeset stuff is a lot more complex to
reason about than most nerds and pointy-haired bosses acknowledge. To use a
buzzword du jour, you need test-driven development. You know what the result
should look like and how it should behave, and the development process is a
step-wise refinement in which you gradually get wiser about requirements to the
input format.
Of course, the structure that is optimal for typesetting probably isn't optimal
for storage and re-use in other channels. So what? You don't have to use it for
storage. The typeset deliverable is likely composed from a number of source
documents/entities, so you need a 'compilation' step in the processing pipeline
anyway. A storage-to-almost-presentation transformation step is no big deal.
kind regards
Peter Ring
> -----Original Message-----
> From: mike feimster [mailto:mike.feimster@...]
> Sent: 9. februar 2005 14:51
> To: xml-doc@yahoogroups.com; melanie.kendell@...
> Subject: RE: [xml-doc] Re: Rant: Is XML for documentation
> going the way
> of SGML? (long)
<snip/>
Joined: 13 Jun 2003
Posts: 14
Rant: Is XML for documentation going the way of SGML? (long)
Not only the relative structure of your source documents but the other
dimensions as well.
1. Inventory management
What makes you think you can deal with a myriad of "topics,"
"chunks" or whatever you're in the habit of calling labelled
content when you don't know what all your product document
deliverables are, much less where all source files are stored
and secured?
2. Project management
If tools and people are hard to manage for a relatively few,
basically handcrafted books using a well travelled path carved
out over the past 20-30 years, what makes you believe things
will be easier switching to a new technology with unstable
point tools produced and promoted by "come and go" vendors
and "free" sources plus other challenges in a field with
little wisdom and guidance forged from long experience?
3. Maintenance, change and extensibility
In the enthusiasm to embrace a promised panacea technology,
it is easy to think that everything will remain the same.
If you find these concerns difficult and challenging, they
will become even more so with more sophisticated technology,
changing consumer demands and staff turnover.
Don't overlook the fact that technology adoption and change come with
upfront and periodic sunk costs as well as ongoing expense. Like any
investment, do due diligence. Is the expense really worth the actual
(not potential) required benefits? In many cases, it will NOT be. If you
don't know or insist on going ahead despite unfavorable analysis,
you'd better have a good exit strategy in place and don't burn any
bridges behind you; know how you'll extract yourself before you commit.
I'm speaking mostly for typical documentation shops which tend to be
small and far from lavishly funded, and for "clear as a bell" situations
where data is already understood to be naturally well structured and
process automation is the only feasible solution. Bill Hall previously
gave an excellent description of such a situation. See post 3582 and
others related to it.
Tim
--- In xml-doc@yahoogroups.com, "Peter Ring" <pri@m...> wrote:
> <snip/>
Joined: 28 Jan 2005
Posts: 2
Rant: Is XML for documentation going the way of SGML? (long)
Ambot Sakoy wrote:
>
> Not only the relative structure of your source documents but the
> other dimensions as well.
[...]
Tim should be known as the Devil's Advocate >-]
K