DocBook redux
Joined: 13 Jun 2003
Posts: 14
DocBook redux
Let's see if this post makes it.
Dave Pawson has added his voice on behalf of DocBook,
apparently in response to my noting that it "failed"
in terms of utilizing the technology to provide for
a multi-dimensional framework.
Again, despite this failing, DocBook is a great
success in attracting and helping to focus people's
discussion about what documents are and in coming
up with a very useful tool for immediate application.
I especially appreciate it because it was basically
the creation of documentation professionals and
practitioners, and that is good for the profession.
However, this success is also a hurdle that has to
be climbed or circumvented (my suggestion, with a
DocBook2).
Thank you Bill for supporting and reinforcing the
idea. I appreciate your long and deep involvement in
document production and management and your informed,
intelligent way of thinking and writing about the
subject. Like you, DocBook has given me a focal point
to think about, but I haven't used it and have given
it a qualified recommendation to others because of
its inherent limitations. That could change if the
technology were fully exploited in a revision or new
"framework" version.
I'd like to talk about this, and I think many here
would find that the discussion would draw together
conversations about the nature of meta data and
documentation, and how XML can be applied to reveal
and exploit these things. But I'm not sure if this
is the right venue. Perhaps the moderator can
clarify before some of us plunge in and take the
time to compose thoughtful posts that will be deemed
inappropriate.
Tim
Joined: 06 Jun 2003
Posts: 5
DocBook redux
Binisiya wrote:
><snip/>
>I`d like to talk about this, and I think many here
>would find that the discussion would draw together
>conversations about the nature of meta data and
>documentation, and how XML can be applied to reveal
>and exploit these things. But, I`m not sure if this
>is the right venue.
>
I suppose none of us would ignore the irony of the discussion's taking
place on a list other than a DocBook list?
However, I wonder if it isn't, in many ways, appropriate? One advantage
of discussing a new DocBook in this venue (xml-doc) might be the
opportunity for participation from those who have come and gone from
DocBook or from those who looked and left without adopting or from those
who are currently looking, but are not yet among the converted.
I suggested, on the DocBook list, that a good pruning of its elements to
a new core or "small DocBook", i.e., down to something very close to and
easily mapped to XHTML, would be a good idea.
I believe it would be an easier sell and implementation story to be able
to talk about DocBook as super-XHTML and about customizations in terms
of *adding* modules or elements as need arises rather than the current
situation where customization notionally involves *subtracting*
unnecessary elements from huge pools. A generic core would also extend
the potential new user base for DocBook by inviting adoption by people
who may only be on the periphery of the standard (hard core) technical
documentation segment or by people, e.g., small business consultancies,
who need to cover a more fluid application area that is wider (and more
generic) than the hard core corporate technical documentation segment.
...edN
Joined: 13 Jun 2003
Posts: 14
DocBook redux
I agree with you, Ed, since DocBook is one of the few
open, public (non-vendor) experiences with XML that
also produced a useful UNIX/open-source-like tool, but
then, I'm not the monitor of this group.
Twice now, I've responded to topics others have been
allowed to post only to have my responses rejected as
"not appropriate for this group." It is disconcerting
to take the time to make reasonable and reasoned
responses only to see them flushed down the toilet as
wasted effort.
When I stated that this topic (what DocBook might
become) might not be the right venue, I did so
looking over my shoulder, not because I think the
topic inappropriate here.
BTW, the paring down of DocBook to something XHTML-like
is a bit of an oversimplification of XHTML but
not a bad analogy. As I'm sure you know, XHTML is
basically a PROMISE to obey HTML rules plus
partitioning for different purposes. DocBook could
use a framework too, and thinking about how that
would be done would expand on people's understanding
of XML technology and the inherent multi-dimensional
structure of documents... a good thing, if nothing
else came of it.
Tim
Joined: 17 Jun 2003
Posts: 4
DocBook redux
/ ed nixon <ed.nixon@...> was heard to say:
| Binisiya wrote:
|
|><snip/>
|>I`d like to talk about this, and I think many here
|>would find that the discussion would draw together
|>conversations about the nature of meta data and
|>documentation, and how XML can be applied to reveal
|>and exploit these things. But, I`m not sure if this
|>is the right venue.
|>
| I suppose none of us would ignore the irony of the discussion`s taking
| place on a list other than a DocBook list?
|
| However, I wonder if it isn`t, in many ways, appropriate? One advantage
| of discussing a new DocBook in this venue (xml-doc) might be the
| opportunity for participation from those who have come and gone from
| DocBook or from those who looked and left without adopting or from those
| who are currently looking, but are not yet among the converted.
I guess I'll be trying to keep up with this list a little more
aggressively then :-)
| I suggested, on the DocBook list, that a good pruning of its elements to
| a new core or "small DocBook", i.e, down to something very close to and
| easily mapped to XHTML would be a good idea.
How would that differ from Simplified DocBook?
| I believe it would be an easier sell and implementation story to be able
| to talk about DocBook as super-XHTML and about customizations in terms
| of *adding* modules or elements as need arises rather than the current
| situation where customization notionally involves *subtracting*
| unnecessary elements from huge pools. A generic core would also extend
| the potential new user base for DocBook by inviting adoption by people
| who may only be on the periphery of the standard (hard core) technical
| documentation segment or by people, e.g., small business consultancies,
| who need to cover a more fluid application area that is wider (and more
| generic) than hard core corporate technical documentation segment.
I guess a good first question is "what constitutes success"?
Suppose it was possible to recast DocBook so that it was easily
adaptable to documents about financial accountancy. And suppose as a
result, the Accountants of America (if such an organization exists)
adopted it as their national standard. Would that be a success?
Suppose it was possible to recast DocBook so that technical writers,
working on computer hardware and software documentation could more
easily and simply customize DocBook to suit their particular needs,
could more easily and simply customize the tools they use to edit and
produce online and printed documentation, that they could, in short,
get their jobs done more easily. Would that be a success?
Should extending DocBook's reach to new domains be a specific goal of
a refactoring exercise? Why or why not?
Be seeing you,
norm
--
Norman Walsh <normyahoo@...> | If you run after wit you will
http://nwalsh.com/ | succeed in catching
| folly.--Montesquieu
Joined: 06 Jun 2003
Posts: 5
DocBook redux
Norman Walsh wrote:
>/ ed nixon <ed.nixon@...> was heard to say:
>| However, I wonder if it isn`t, in many ways, appropriate? One advantage
>| of discussing a new DocBook in this venue (xml-doc) might be the
>| opportunity for participation from those who have come and gone from
>| DocBook or from those who looked and left without adopting or from those
>| who are currently looking, but are not yet among the converted.
>
>I guess I`ll be trying to keep up with this list a little more
>aggressively then :-)
>
Well, maybe something will happen or not. Who knows? But conceptually I
think broader input might turn out to be useful.
>| I suggested, on the DocBook list, that a good pruning of its elements to
>| a new core or "small DocBook", i.e, down to something very close to and
>| easily mapped to XHTML would be a good idea.
>
>How would differ from Simplified DocBook?
>
It might not differ at all, but it would be the base -- rather than a
derivative -- onto which other modules would be attached if/as needed.
>I guess a good first question is "what constitutes success"?
>
Dunno. More uptake? Fewer of the same sorts of customization support
questions? Fewer lines of stylesheet code that need to be maintained for
more functionality? Fewer "this is too &*$# big!" statements? ;-)
>Suppose it was possible to recast DocBook so that technical writers,
>working on computer hardware and software documentation could more
>easily and simply customize DocBook to suit their particular needs,
>could more easily and simply customize the tools they use to edit and
>produce online and printed documentation, that they could, in short,
>get their jobs done more easily. Would that be a success?
>
>Should extending DocBook`s reach to new domains be a specific goal of
>a refactoring exercise? Why or why not?
>
I haven't suggested that DocBook's reach be changed at all. I did imply
at some point that the base element set, call it Simplified DocBook
Nouveau, can actually cover a very large generic document space. What I
was musing about was the idea that there be a more explicit mapping
between Simplified DocBook Nouveau and XHTML. [One respondent has said
this is either wrong-headed or a misunderstanding about XHTML or
DocBook or design, but hasn't explained it in a way I can understand.]
Think about how much of the DocBook content generated today gets
transformed into some form of (X)HTML already: website, html, xhtml,
htmlHelp. Why not make the connection a little more explicit so it makes
DocBook an easier sell for those who might be trying to sell a
sophisticated but incremental solution into new organizational contexts
or organizations that really *should* be using an xml solution but
aren't because it's "too hard" and it's easier to stay with RTF-based
solutions.
In the current exercise, I'm suggesting that the existing DocBook be
rethought so that it becomes easier to understand and use by newbies,
i.e., smaller, simpler, and that it is notionally easier to customize, i.e.,
to "add to" rather than "take from", for people who have semantic and/or
structural requirements beyond the base (or Simplified DocBook Nouveau)
foundation. Perhaps, if these goals make sense and they can be achieved
(big ifs), then the scope of DocBook would expand spontaneously as
others decided that there was enough, e.g., that mystical 80%, already
there to make the addition relatively straightforward and "economical",
whatever that might mean to them. But I'm not advocating an expansion.
On the other hand, the question I ask myself is, "if you want 'simple',
why not just do xhtml from the word go? You can customize that too if
you want." I don't have a good answer for that except to say that
DocBook is already there in terms of semantic richness; it's just a
longer learning curve than I think it needs to be. If the question of
refactoring or rethinking is open, then I think ease of adoption, ease
of learning and ease of initial implementation should be prime
considerations and then ease of customization might actually fall into
place. But don't throw out the vast knowledge and experience that has
been accumulated and don't open the process (in the first instance at
least) to new domains.
I hope this is at least understandable, even if it doesn't make sense.
...edN
Joined: 14 Jun 2003
Posts: 10
DocBook redux
ed nixon wrote:
> I believe it would be an easier sell and implementation story to be able
> to talk about DocBook as super-XHTML and about customizations in terms
> of *adding* modules or elements as need arises rather than the current
> situation where customization notionally involves *subtracting*
> unnecessary elements from huge pools.
>
This smacks the head of the nail with a mighty big hammer.
I keep wondering what an "extensibility" mechanism would look like
in an XML system, and how extensions might be stylized more easily.
The following is a set of ideas on the subject. (I suspect that some
of them don't really fit the DocBook mold, so I'm posting them here.)
Extensibility
-------------
It seems to me that at a minimum, supertypes of <struct>
and <inline> need to be defined, so that a new element can
"extend" one or the other.
If <note> were then declared as a subtype of <struct>,
schema validation could ensure that <note> elements only occurred
where <struct> elements were allowed -- and not where <inline>
elements might be allowed to occur, for example.
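As a rough illustration of the supertype idea, here is a minimal Python sketch. It is not a schema language, and nothing in it comes from DocBook: the SUPERTYPE and ALLOWED tables are hypothetical stand-ins for what a schema would declare. It only shows that once <note> is registered as a kind of <struct>, a generic check can reject it wherever only <inline> content is permitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry: each extension element names the supertype it extends.
SUPERTYPE = {"note": "struct", "BList": "struct", "b": "inline", "i": "inline"}

# Hypothetical content model: the kinds of children each container accepts.
ALLOWED = {"block": {"struct", "para"}, "para": {"inline"}, "note": {"inline"}}


def kind(tag):
    """Return the supertype of a tag, or the tag itself if it extends nothing."""
    return SUPERTYPE.get(tag, tag)


def violations(root):
    """List (parent, child) pairs where the child's kind is not allowed."""
    found = []
    for parent in root.iter():
        allowed = ALLOWED.get(parent.tag)
        if allowed is None:
            continue  # no content model declared for this element
        for child in parent:
            if kind(child.tag) not in allowed:
                found.append((parent.tag, child.tag))
    return found


good = ET.fromstring("<block><para>See <b>this</b>.</para><note>Careful.</note></block>")
bad = ET.fromstring("<block><para>See <note>this</note>.</para></block>")

print(violations(good))
print(violations(bad))
```

Adding a new element then means adding one entry to the registry rather than editing every content model that should accept it.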
Style
-----
Similarly, instead of having to edit an existing stylesheet, I'd
like to be able to put a new production rule into a table, where the
right hand side of the production rule used previously defined elements.
For example, using xhtml for the production rule, I would add a
style declaration for the new <note> element with something like this:
<note>...</note> --> <blockquote><b>Note:</b><br/>...</blockquote>
The big win here would be not having to edit an XSL stylesheet.
(Behind the scenes, the production rules would most likely be
translated into XSL. I just don't want to have to think about it.)
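One way to picture such a rule table is the Python sketch below. It is only an assumption about how the idea could work, not an existing tool: RULES and render are hypothetical names, and the "..." in each right-hand side marks where the element's rendered content goes, as in the rule above.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical rule table: element name -> output template.
# "..." marks the spot where the element's own content is placed.
RULES = {
    "note": "<blockquote><b>Note:</b><br/>...</blockquote>",
    "para": "<p>...</p>",
}


def render(elem):
    """Render an element by looking up its rule; unknown tags pass content through."""
    inner = escape(elem.text or "")
    for child in elem:
        inner += render(child) + escape(child.tail or "")
    template = RULES.get(elem.tag, "...")
    # Only the template is searched for the placeholder, never the content.
    return template.replace("...", inner, 1)


source = ET.fromstring("<note>Back up <para>first</para>.</note>")
print(render(source))
```

Styling a new element is then a one-line addition to the table, which is the point of the proposal.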
Nesting Depth?
--------------
To make simple production rules possible, a notion of "nesting depth"
might be desirable, at least for sections and bullet lists.
The existence of a parser-supplied nesting level makes it possible
to define rules of the sort:
<sect 1><title>... --> <h2>...</h2>
<sect 2><title>... --> <h3>...</h3>
On the other hand, I suppose that similar results could be achieved
using XPath statements:
/sect/title/... --> <h2>...</h2>
/sect/sect/title/... --> <h3>...</h3>
So maybe nesting depth really isn't a requirement. (It just
seems a little cleaner.)
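For comparison, a processor can also compute the nesting level itself while walking the tree. The Python sketch below assumes the <sect>/<title> element names used above and the same mapping (level 1 to <h2>, level 2 to <h3>); the function name headings is hypothetical.

```python
import xml.etree.ElementTree as ET


def headings(elem, depth=0):
    """Return (heading tag, title text) pairs, one per <sect>, in document order."""
    out = []
    if elem.tag == "sect":
        depth += 1  # entering one more level of section nesting
        title = elem.find("title")
        if title is not None:
            out.append((f"h{depth + 1}", title.text))
    for child in elem:
        out.extend(headings(child, depth))
    return out


doc = ET.fromstring(
    "<sect><title>Intro</title><sect><title>Scope</title></sect></sect>"
)
print(headings(doc))
```

Here one rule covers every level, where the XPath form needs a separate pattern for each depth.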
Meta-Structure
--------------
At bottom, the concept of extensibility rests on the existence of
an implied meta-structure. For example, a simple document
structure can be defined with a very small number of elements:
<sect> ::= <title> + <block> + <sect>*
<title> ::= <title>text</title>
<block> ::= (<para>* | <struct>?)*
<para> ::= <para>text</para>
<text> ::= (chars* | <inline>?)*
That structure leaves out summaries and bylines, but it's pretty
close to an accurate description of a document as a collection of
recursive sections, where text can precede a subsection but
can never follow one.
The majority of elements found in most every document can then
be defined as specializations/extensions of the <inline> and
<struct> elements:
<inline> ::= <a> | <b> | <i> | <u> | <tt> | <sub> | <sup> | <font>
| <size> | <color>
<struct> ::= <BList> | <NList> | <indent> | <inset>
| <image> | <table>
Purple Numbers
--------------
A further advantage of delineating structural elements is the
ease of generating "purple numbers". Purple numbers are
automatically assigned, sequentially numbered, self-referencing
links that appear as light purple numbers at the end of a
paragraph or title, thereby making it possible to create a link
to any structural element in the page.
Note:
Purple Numbers were originally implemented in Douglas Engelbart's
NLS system (http://www.bootstrap.org) and were most recently
implemented by Eugene Kim and Chris Dent at BlueOxen Associates
(http://www.blueoxen.org).
Joined: 16 Jun 2003
Posts: 3
DocBook redux
Eric Armstrong wrote:
>ed nixon wrote:
>
>> I believe it would be an easier sell and implementation story to be able
>> to talk about DocBook as super-XHTML and about customizations in terms
>> of *adding* modules or elements as need arises rather than the current
>> situation where customization notionally involves *subtracting*
>> unnecessary elements from huge pools.
>
>This smacks the head of the nail with a mighty big hammer.
>
>I keep wondering what an "extensibility" mechanism would look like
>in an XML system, and how extensions might be stylized more easily.
>
>The following is a set of ideas on the subject.
It seems to me that you're describing (fairly well, I might add) many of
the pertinent features of DITA specialization, which is the mechanism we
use for extending DITA by adding new information types, new domains of
vocabulary, and (only when necessary) new output mappings or styles.
Specialization lets us define new markup as part of a semantic hierarchy
(so, for example, a <JavaClass> could be defined as a kind of <apiname>
which is a kind of <keyword>), and leverage that hierarchy during
processing (so, for example, a single output rule for <keyword> applies to
all specializations as well, except where overridden).
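The fallback behaviour described here can be sketched in a few lines of Python. This is an illustration of the general idea only and does not reproduce how the DITA DTDs or transforms actually encode or resolve the hierarchy: the PARENT and RULES tables and both function names are hypothetical.

```python
# Hypothetical specialization hierarchy: child element -> the element it specializes.
PARENT = {"JavaClass": "apiname", "apiname": "keyword"}

# Hypothetical output rules; only <keyword> has one by default.
RULES = {"keyword": "<code>{}</code>"}


def rule_for(tag, rules=RULES):
    """Walk up the hierarchy until some ancestor has an output rule."""
    while tag is not None:
        if tag in rules:
            return rules[tag]
        tag = PARENT.get(tag)
    return "{}"  # no rule anywhere: pass the text through unchanged


def render(tag, text, rules=RULES):
    return rule_for(tag, rules).format(text)


print(render("JavaClass", "String"))

# An override for <apiname> takes precedence for it and its specializations.
overridden = {**RULES, "apiname": "<tt>{}</tt>"}
print(render("JavaClass", "String", overridden))
```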
You can implement specialization with DTDs, as we have in the current DITA
package, or in schemas, or in any other language. There's no extra tooling
or non-standard stuff involved; it's just a set of guidelines and
principles that give you semantic extensibility without compromising
interoperability or interchangeability.
In other words, you can have a whole network of DTDs/schemas that still
share a common processing stream, so even though groupA and groupB use
completely different markup they can still produce a book or a website that
includes both their content, as seamlessly as if they were all in the same
DTD.
The mechanism is described in detail at:
http://xml.coverpages.org/DITA-EXTREME-Specialization.pdf
You can get the DITA DTDs and transforms (which use specialization) at:
http://www.ibm.com/developerworks/xml/library/x-dita1/dita10.zip
And you can get a whole bunch more info on DITA in general at:
http://www.oasis-open.org/cover/dita.html
Michael Priestley
DITA Specialization Architect
mpriestl@...
Joined: 14 Jun 2003
Posts: 10
DocBook redux
Thanks, Michael.
I was not aware of this work. From your description, it does
sound like the right technology for the job. I'll take a
deeper look soon.
Have there been any discussions concerning a DocBook-like
specification using DITA?
Joined: 13 Jun 2003
Posts: 39
DocBook redux
Eric's comments also had me thinking, "sounds like DITA from
what I've read," and Michael's response kind of confirmed that.
But the conversation led me back to something I've been thinking
about off & on: how much complexity do you *need* to express
a document's structure?
The JOE outliner[1] uses OPML[2], an extremely simple document
type -- the non-metadata tags are opml, head, body, and outline;
outline uses a "text" attribute for user-visible data (the only
losing feature IMO). Outline also has "isComment" and "isBreakpoint"
attributes, and the specification allows users to create their own
attributes as needed. JOE allows easy toggling of "isComment,"
which lets me use it to specify body text and bulleted lists.
The upshot: I have a 104-line XSLT file, including comments,
that translates an outline into our in-house DTD, giving me an
easy way to get outlines into FrameMaker. When I'm writing a
new manual, I practically *live* in the outliner. You can
imagine how helpful it is to transform an outline & dump it
straight into Frame.
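The XSLT file itself is not shown in this thread, and neither is the in-house DTD. Purely as an illustration of the kind of mapping described, here is a Python sketch. The target element names (doc, section, title, para) are hypothetical, and it assumes comment outlines carry isComment="true" and are treated as body text with no children of their own.

```python
import xml.etree.ElementTree as ET


def convert(outline, parent):
    """Map one OPML <outline> onto hypothetical section/para elements."""
    text = outline.get("text", "")
    if outline.get("isComment") == "true":
        # Comment outlines stand for body text (assumption for this sketch).
        ET.SubElement(parent, "para").text = text
        return
    section = ET.SubElement(parent, "section")
    ET.SubElement(section, "title").text = text
    for child in outline.findall("outline"):
        convert(child, section)


def opml_to_doc(opml_text):
    body = ET.fromstring(opml_text).find("body")
    doc = ET.Element("doc")
    for outline in body.findall("outline"):
        convert(outline, doc)
    return doc


sample = """<opml version="1.0"><head><title>Manual</title></head><body>
<outline text="Installing">
  <outline text="Unpack the unit first." isComment="true"/>
  <outline text="Cabling"/>
</outline></body></opml>"""

print(ET.tostring(opml_to_doc(sample), encoding="unicode"))
```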
Another simple DTD is HTML, which probably everyone on this
list is familiar with to some degree. :-) It's ubiquitous, and I
submit that it's ubiquitous because it's simple. There were
several SGML DTDs that could have done everything that HTML
did and more when the Web was new, and while I can't speak for
Tim Berners-Lee I would be surprised if he wasn't aware of
them. HTML has its flaws, and again I'm sure everyone reading
this list has heard them or enumerated them, but I think the
level of complexity is about right for most documents written
by people for people. In contrast, DocBook is overly complex
for enough people that there's an official Simplified DocBook
as well (which is unfortunately specific to articles).
My question to those who have been working with XML far far
longer than I have is this: do we really *need* separate tags
for decorative elements with similar layout? This ties back
into the discussion about implied structure late last week:
if the context makes it clear that <foo> is a GUI label in one
place and a part of a CLI command in another, do we really
need two elements (especially if the intended presentation
is similar)? Is there really a need to extract information at
that deep of a level? That's not a rhetorical question; if you're
doing it, please enlighten me....
The vision I have, such as it is, may line up pretty well with
DITA (I'm in a busy phase at the moment, but still plan to
give it the study I think it deserves): a core set of tags, no
more than two dozen, to define document structure and some
basic highlighting. These two dozen tags should be chosen
carefully, obviously, because they should be sufficient for
articles and many books (I think the 80/20 principle
applies -- 20% of the tags can do 80% of the work). A short
list is easier to memorize, and faster for an authoring tool
to sort through (definitely a factor for Java-based editors).
However, one important feature would be a <wrapper>
element -- its purpose would be to mark blocks of content
for processing. For example, one could remove background
information from a quick reference or extract it into a
hyperlinked document for an online manual.
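As a sketch of the first use, removing marked blocks to produce a quick reference could look like the Python below. The role attribute and its "background" value are hypothetical, since the proposal does not say how a <wrapper> would be labelled.

```python
import xml.etree.ElementTree as ET


def strip_wrappers(root, role):
    """Remove every <wrapper> whose (hypothetical) role attribute matches."""
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "wrapper" and child.get("role") == role:
                parent.remove(child)  # note: any text trailing the wrapper goes too
    return root


manual = ET.fromstring(
    '<sect><wrapper role="background"><para>History.</para></wrapper>'
    "<para>Press Start.</para></sect>"
)
print(ET.tostring(strip_wrappers(manual, "background"), encoding="unicode"))
```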
Where needed, auxiliary tag sets could provide more complex
metadata, decorative elements for specific interfaces
(such as GUI, CLI, API), structure for information that
appears only in certain types of manuals (such as alarm
messages in troubleshooting guides), and so forth. If all
were put together, the sum might approach DocBook's 400
tag count, but I suspect that most writers would require
far fewer.
--
Larry Kollar, Senior Technical Writer, ARRIS
"Content creators are the engine that drives
value in the information life cycle."
-- Barry Schaeffer, on XML-Doc
[1] http://outliner.sourceforge.net/
[2] http://www.opml.org/
Joined: 13 Jun 2003
Posts: 5
DocBook redux
> Another simple DTD is HTML, which probably everyone on this
> list is familiar with to some degree. :-) It`s ubiquitous, and I
> submit that it`s ubiquitous because it`s simple. There were
> several SGML DTDs that could have done everything that HTML
> did and more when the Web was new, and while I can`t speak for
> Tim Berners-Lee I would be surprised if he wasn`t aware of
> them. HTML has its flaws, and again I`m sure everyone reading
> this list has heard them or enumerated them, but I think the
> level of complexity is about right for most documents written
> by people for people. In contrast, DocBook is overly complex
> for enough people that there`s an official Simplified DocBook
> as well (which is unfortunately specific to articles).
I don't have much to comment on this, but imagine HTML without <div/>,
<span/>, and the class attribute. I think that without these "general
purpose" elements and the ability to sub-class them, HTML would only work
for the simplest of documents. As somebody pointed out on the DocBook list,
<div class="note"> is really no less complex than <note>, and doesn't allow
for any additional constraints on notes when compared with other "div"s.
> My question to those who have been working with XML far far
> longer than I have is this: do we really *need* separate tags
> for decorative elements with similar layout? This ties back
> into the discussion about implied structure late last week:
> if the context makes it clear that <foo> is a GUI label in one
> place and a part of a CLI command in another, do we really
> need two elements (especially if the intended presentation
> is similar)? Is there really a need to extract information at
> that deep of a level? That`s not a rhetorical question; if you`re
> doing it, please enlighten me....
Let me give you an example of where very specific semantic tagging (to the
level you are discussing) saved us hours upon hours of work, and even more
specific semantic tagging could have saved us even more. The application we
are documenting has been around for close to 30 years. It was originally
purely command-driven, but at some point (before I came to the company),
they added a GUI. Throughout our documentation, wherever we discuss a
procedure, we give both the command to perform a task, and a list of
possible menu paths.
When I came to the company about a year ago, they completely re-designed the
GUI, including a major reorganization of their menu structure. I was able
to write a script to find every occurrence of a <guimenu/> element (we use
<guimenu/> to describe the full menu path, rather than the more appropriate
<menuchoice/>). Our writers then went through this list and found and
changed those that needed to be changed. My XSLT skills have since
developed sufficiently that I might be able to automate many of the changes if
I had to do it again. If we had tagged our menu paths appropriately using
the DocBook <menuchoice/> element, that automation could even be quite
simple.
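The original script is not shown here. A minimal Python sketch of the same kind of search, assuming the documents parse as plain XML with no undeclared entities, might look like this (menu_paths and the sample text are hypothetical):

```python
import xml.etree.ElementTree as ET
from collections import Counter


def menu_paths(xml_text, tag="guimenu"):
    """Return the text of every matching element, in document order."""
    root = ET.fromstring(xml_text)
    return ["".join(e.itertext()).strip() for e in root.iter(tag)]


sample = """<section>
<para>Choose <guimenu>File &gt; Open</guimenu> or type <command>OPEN</command>.</para>
<para>Choose <guimenu>File &gt; Open</guimenu>, then <guimenu>Tools &gt; Mesh</guimenu>.</para>
</section>"""

# Counting the distinct paths gives writers a short list to review.
for path, count in Counter(menu_paths(sample)).items():
    print(count, path)
```

The search is only this simple because the menu paths carry their own tag; with generic markup the script would have to guess from the surrounding text.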
Of course, we then have to ask ourselves, was the extra time spent in
tagging our menu paths as <guimenu/> instead of some more generic markup
justified by the time savings in this one instance? To someone who is
really knowledgeable about the vocabulary (DocBook in our case) and the
editing tool, I think the amount of additional time spent in using one tag
as opposed to another is pretty minimal. The problem comes when you have to
look up the best element to use in each case.
Jeff Beal
Documentation Tools Specialist
ANSYS, Inc.
Joined: 06 Jun 2003
Posts: 5
DocBook redux
larry.kollar@... wrote:
><snip/>But the conversation led me back to something I`ve been thinking
>about off & on: how much complexity do you *need* to express
>a document`s structure?
>
Yes. Good question.
><snip/>
>Another simple DTD is HTML, which probably everyone on this
>list is familiar with to some degree. :-)
>
I think I was heard to say something similar, but with respect to XHTML,
or perhaps more precisely XHTML base.
><snip/>
>
>My question to those who have been working with XML far far
>longer than I have is this: do we really *need* separate tags
>for decorative elements with similar layout?
><snip/>
>
>Is there really a need to extract information at
>that deep of a level? That`s not a rhetorical question; if you`re
>doing it, please enlighten me....
>
I guess if presentation is the *only* requirement, there is only as much
need for semantic distinctions as there is for visual or layout
distinctions. On the DocBook list, I communicated a point made by Edward
Tufte {http://www.edwardtufte.com/} in his essay called "The Cognitive
Style of PowerPoint": that Feynman was able to do a pretty good job of
writing a 500+ page book on the complexities of physics using only
chapter titles and section headings. (Tufte forgets to mention the
diagrams and graphics running down the outer-third margin area of the
Feynman page example that he uses.)
But there are other requirements that are related to and may demand
richer semantics, e.g., those that might be needed for a content
management system containing standardized chunks or multiple versions. I
think of these types of requirements as being addressed through relatively
easy, modular incorporation as the need arises.
><snip/>
>a core set of tags, no
>more than two dozen, to define document structure and some
>basic highlighting.
>
Simplified DocBook might be a good place to start. Previously, I`ve
suggested revisiting sDb with an eye to more clearly recognizing
the large amount of processing that gets done converting DB to various
forms of (X)HTML. There have been objections that this is too
Web-centric an approach, not "print" enough. But I agree with Tufte when he
says it`s entirely possible to create respectable, even highly
professional, print output using the (display-oriented) semantics
available in XHTML. Perhaps 80% of generic documents need nothing more.
The added benefit of allowing CSS a more prominent role along with XSLT
and XSL-FO would be relative ease of learning, a faster rate of uptake in
new organizational contexts, and a slackening of support requirements on
the XSLT and XSL-FO lists.
rgrds. ...edN
Joined: 14 Jun 2003
Posts: 4
DocBook redux
--- In xml-doc@yahoogroups.com, larry.kollar@a... wrote:
> But the conversation led me back to something I`ve been thinking
> about off & on: how much complexity do you *need* to express
> a document`s structure?
I`m tempted to repeat that too-obvious come-back, How long is a piece
of string? But even it is too weak for the difficulties implicit in
your conjecture.
Actually, I think your question and observations are a great opening
for a much larger scope of discussion about overlapping hierarchies
in text[1]. The DITA design experience is worth relating on this
point.
If what you mean by "structure" is adequate rendering on paper or on
displays, that`s one thing. Most DTDs will provide at least a title
and paragraph with nesting to represent hierarchy and containment.
Style guides for content introduce the need for basic layout
structures and typography (inline titles, basic lists, tables,
phrases) and relationships (ids, indexing and linking). Visual
affordances may require additional markup (inline and block quotes,
definition lists). Content meta-discourse such as footnotes, blurbs,
annotations, etc., introduce yet more markup.
Now we bring on information architectures: how you organize and
relate that basic content into the form encountered by the user.
Higher level organization represents the typical processing layer
that needs to be applied to fully express a comprehensive, finished
printable product, such as book-oriented part separators, table of
contents, front and back matter, notices, etc. An online
deliverable may require simpler overhead, but there`s still a ToC and
overall handle (project file) upon which to initiate the
compile/build process, and these are necessarily part of your
publishing architecture represented by markup (even if stored
separately from the content itself).
Now we can bring on the semantic layers--any markup that represents
meaningful distinctions that need to be expressed for whatever
reason. Semantic distinction through markup can be applied at the
structural level (top-down analysis) or pervasively at the discourse
level (bottom-up analysis). It can be given either typographic or
functional behaviors, or both, depending on output needs. It`s hard
at first to convince writers that semantic markup is needful; it
tends to look like extraneous markup that does not directly enhance
the writing task. But if your end users can benefit from the
usability affordances that can be attached to semantic markup,
writers are usually willing to accept using (and exploiting) that
additional markup.
Surely there are other overlapping considerations that could also add
to the tag soup. In the DITA experience, we trimmed down the basic
topic to around 100 elements, one-third of which were metadata and
the rest being basic containers and structures. The non-metadata
tags were truly much more terse than HTML at that point. Topic and
domain specialization allow tailored introduction of 100-or-so
semantic or infotyping elements (mostly software oriented so far, but
we are looking into additional domains such as hardware and
terminology). We isolated the structures that represent business
rules (processing for particular deliverables) from topics themselves
by envisioning a Delivery Context layer that uses topics by reference
rather than by direct containment--another handful of tags.
So the total number of tags you see depends on how you slice and dice
your authoring and delivery requirements. But the topic DTD that
represents the simplest DITA structure of four elements:
<topic>
<title>Hello</title>
<body><p>world</p></body>
</topic>
and its 60-or-so base classes of content structures is as spare as
we`ve been able to go. I`m excited now for the next leg of the journey:
specializing on this base to explore taxonomies of markup
vocabularies and how they can be applied to the Semantic Web.
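To make the "four elements" and the nesting concrete, here is a minimal
Python sketch that walks the topic shown above and reports each element with
its depth. The function name outline is an assumption for illustration, and
the sample is the bare structure from this post, not a complete DITA file.

```python
import xml.etree.ElementTree as ET

# The minimal topic from the post, reproduced as a string.
MINIMAL_TOPIC = """<topic>
  <title>Hello</title>
  <body><p>world</p></body>
</topic>"""


def outline(xml_text):
    """Return (depth, element name) pairs in document order."""
    rows = []

    def walk(element, depth):
        rows.append((depth, element.tag))
        for child in element:
            walk(child, depth + 1)

    walk(ET.fromstring(xml_text), 0)
    return rows


# Indent each element name by its depth to show containment.
for depth, name in outline(MINIMAL_TOPIC):
    print("  " * depth + name)
```

The distinct names in the result are topic, title, body and p: hierarchy and
containment expressed with nothing more than a title and a paragraph.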
[1] Refining Our Notion of What Text Really Is: The Problem of
Overlapping Hierarchies,
http://www.stg.brown.edu/resources/stg/monographs/ohco.html
(and many papers and discussions that have been inspired by this
piece!)
--
Don Day
Lead DITA Architect
IBM Corp.
Joined: 14 Jun 2003
Posts: 10
DocBook redux
This may well be the most cogent summary of DITA yet produced.
Worth capturing as an overview/summary document on your site,
I think.
Joined: 17 Jun 2003
Posts: 4
DocBook redux
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
/ ed nixon <ed.nixon@...> was heard to say:
[...]
| distinction. On the DocBook list, I communicated a point made by Edward
| Tufte {http://www.edwardtufte.com/} in his essay called "The Cognitive
| Style of PowerPoint" that Feynman was able to do a pretty good job of
I look forward to seeing that. I`m hundreds of messages behind but I
will catch up. Later this week, I hope.
[...]
| centric an approach, not "print" enough. But I agree with Tufte when he
| says it`s entirely possible to create respectable, even highly
| professional, print output using the (display-oriented) semantics
| available in XHTML. Perhaps 80% of generic documents need nothing more.
I have no trouble believing that. But markup like DocBook isn`t about
respectable print output, it`s about identifying what things are. One
reason to do this is so that you can easily change the respectable
print output. Another is for better navigation.
For example, even though <procedure>s and <orderedlist>s look the same
by default, I still want to mark them up differently so that I can
distinguish between procedures and lists if I want to. And I use
markup like <varname> and <function>, again even though they render
the same, so that I can make indexes of variables and indexes of
functions.
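As a rough illustration of the indexing point, the Python sketch below
collects one index per element name. It assumes variable names are tagged
with DocBook`s <varname> element; the sample text and the function name
build_indexes are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: both elements would typically render the same,
# but they stay distinguishable in the markup.
SAMPLE = """<section>
  <para>Call <function>open_file</function> with <varname>path</varname>.</para>
  <para><function>close_file</function> resets <varname>path</varname> and
  <varname>handle</varname>.</para>
</section>"""


def build_indexes(xml_text, tags=("varname", "function")):
    """Return {element name: sorted unique text values} for each tag."""
    root = ET.fromstring(xml_text)
    indexes = {}
    for tag in tags:
        names = {"".join(el.itertext()).strip() for el in root.iter(tag)}
        indexes[tag] = sorted(names)
    return indexes


# Print one index per element name.
for tag, names in build_indexes(SAMPLE).items():
    print(tag, names)
```

If both kinds of name had been tagged with one generic "monospace" element,
the same loop could only produce a single mixed list.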
Dragging markup uphill (that is, adding semantics after the fact) is
difficult, tedious, and error-prone. Discarding information, going
downhill by analogy, is easy.
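The easy direction can be sketched in a few lines of Python. The mapping and
the function name flatten_semantics are assumptions for illustration; a real
conversion would more likely be written in XSLT.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample containing a procedure.
SAMPLE = """<section>
  <procedure>
    <step><para>Open the file.</para></step>
    <step><para>Save it.</para></step>
  </procedure>
</section>"""

# Each specific element maps to a more generic one.
DOWNHILL = {"procedure": "orderedlist", "step": "listitem"}


def flatten_semantics(xml_text):
    """Rename specific elements to generic ones; content is untouched."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        element.tag = DOWNHILL.get(element.tag, element.tag)
    return root


flattened = flatten_semantics(SAMPLE)
print(ET.tostring(flattened, encoding="unicode"))
```

Going back is the hard part: once everything is an <orderedlist>, nothing in
the markup says which lists used to be procedures, so someone has to read
each one and decide.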
| The added benefit of allowing CSS a more prominent role along with XSLT
| and XSL-FO would be in the relative ease of learning, rate of uptake in
| new organizational contexts and slackening of support requirements on
| the XSLT and XSL-FO lists.
Maybe. Maybe not. CSS isn`t really powerful enough to do the
formatting you`d like, so you have to use some transformation. I
(personally) find it easier to think of it in two stages: first you
transform it, then you style it.
But lots of folks, especially folks with editors that use CSS for
online styling, probably disagree.
Be seeing you,
norm
- --
Norman Walsh <normyahoo@...> | The sudden disappointment of hope
http://nwalsh.com/ | leaves a scar which the ultimate
| fulfillment of that hope never
| entirely removes.--Thomas Hardy
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: Processed by Mailcrypt 3.5.7 <http://mailcrypt.sourceforge.net/>
iD8DBQE+7jFkOyltUcwYWjsRAisRAKCnAFhmqm5SvZLENcdhzGztJ7NTSACePONR
oLRMNBrFhsoQICCKHNDzfv8=
=LWBp
-----END PGP SIGNATURE-----
Joined: 06 Jun 2003
Posts: 5
DocBook redux
Norman Walsh wrote:
><snip/>
>
>Dragging markup up hill (that is, adding semantics after the fact) is
>difficult, tedious, and error prone. Discarding information, going
>down hill by analogy, is easy.
>
I`m sure you are right about this at least from a designer`s and
maintainer`s point of view. The point of view I`ve been advocating is
that of the new user. In that view, the absolute volume of current
markup that the new user has to "discard" or, more appropriately,
"ignore" in order to reach a first comfortable cognitive level with
DocBook is enormous. In fact, it may be that the amount of ignoring
defeats a large number of potential new users in new organizations. If
true, this is a shame, I think.
Yes, Simplified DocBook is available and extremely useful, but the next
step, the first customization, takes the rookie right back into the deep
end to confront the same cognitive swamping problem. What I`ve been
trying to advocate is an approach that allows the user to incrementally
expand into DocBook as need requires by plugging in a collection of
markup elements that seem most appropriate for the newly evolved
requirements.
I guess in simple speak I`m saying the current schema/DTD customization
process is too complicated, too obscure, too difficult and maybe too
time consuming given the learning curve, false starts and actual work
involved.
To try to be clear, I`m *not* advocating tossing out any markup that
currently exists (unless it falls into the cruft, ambiguous, redundant
or not-so-useful categories). I`m suggesting that it be waiting
patiently out of sight, just offstage and within easy hail.
At this point, I`m asking myself whether I shouldn`t just pipe down,
wait, watch and read. I`m thinking, as you are, that a migration to
RELAX NG will make a huge amount of difference to DocBook`s apparent and
real complexity, and hopefully to its cognitive intimidation factor.
Rgrds and thanks. ...edN