
Ryan Germann - Inherent Structure in Documents


binisiya Posted: Thu Jun 12, 2003 10:12 pm


Joined: 13 Jun 2003

Posts: 14
Ryan Germann - Inherent Structure in Documents
Thank you for the Slocombe-Boyd reference titled
"There are no unstructured documents." I've had
exchanges with Slocombe, and we're very much in
agreement about this thesis.

In line with it, one of the exercises I like to
assign my students is to write a simple set of
layout rules and a program that recognizes them
and does something useful... in much the same
way a writer adds visual clues to help impart
meaning, and a reader uses them to recognize that
meaning. For instance, here's an illustration of
simple rules, and I'm sure no one would fail to
recognize it as a valid document form:

1. Clearly a header
A bunch of text that makes up a paragraph.

With a blank line delimiting a second paragraph,
followed by a list.

o Item 1 - preceded by the 'o' character.
o Item 2 - preceded by the 'o' character.
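
A rough sketch of the kind of exercise described above, in Python. The rules are only inferred from the sample (a leading "N. " marks a header, a leading "o " marks a list item, a blank line ends a paragraph); the names recognize, HEADER, ITEM and SAMPLE are made up for illustration and are not from any real assignment.

```python
import re

# Hypothetical layout rules, inferred from the sample document:
#   - a line starting with "<number>. " is a header
#   - a line starting with "o " is a list item
#   - a blank line ends the current paragraph
#   - any other line starts or continues a paragraph
HEADER = re.compile(r"^\d+\.\s+(.*)$")
ITEM = re.compile(r"^o\s+(.*)$")


def recognize(text):
    """Return a list of (kind, content) blocks found in text."""
    blocks = []
    para = []  # lines of the paragraph currently being collected

    def flush():
        # Close the open paragraph, if any, joining its wrapped lines.
        if para:
            blocks.append(("paragraph", " ".join(para)))
            para.clear()

    for line in text.splitlines():
        line = line.strip()
        header = HEADER.match(line)
        item = ITEM.match(line)
        if not line:
            flush()
        elif header:
            flush()
            blocks.append(("header", header.group(1)))
        elif item:
            flush()
            blocks.append(("item", item.group(1)))
        else:
            para.append(line)
    flush()
    return blocks


SAMPLE = """1. Clearly a header
A bunch of text that makes up a paragraph.

With a blank line delimiting a second paragraph,
followed by a list.

o Item 1
o Item 2
"""

if __name__ == "__main__":
    for kind, content in recognize(SAMPLE):
        print(f"{kind}: {content}")
```

The "something useful" step is left open on purpose: once the blocks are recognized, emitting XML, HTML, or an outline is a small loop over the result.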

Within XML, this notion is already so deeply accepted
that few think about things like periods as sentence
delimiters, much less propose their replacement with
sentence tagging, or even a more radical elimination
of "presentation" artifacts (initial capitals
and periods to mark sentences, spaces to delimit words,
and so on) that are used and understood in normal writing.

e.g., The dog barked.

<sentence>
<word>
<character encoding="ASCII">116</character>
<character encoding="ASCII">104</character>
<character encoding="ASCII">101</character>
</word>
<word>
:
etc.
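
To make the reductio concrete, here is a small Python sketch that produces that style of markup and then restores the ordinary presentation from it. It uses the standard library's xml.etree.ElementTree; the function names sentence_to_xml and xml_to_sentence are hypothetical, and the rule "capitalize the first letter, append a period" is an assumption that only holds for simple sentences like this one.

```python
import xml.etree.ElementTree as ET


def sentence_to_xml(sentence):
    """Encode a sentence with every presentation artifact removed."""
    root = ET.Element("sentence")
    # Drop the final period and the initial capital; split on spaces.
    for token in sentence.rstrip(".").lower().split():
        word = ET.SubElement(root, "word")
        for ch in token:
            char = ET.SubElement(word, "character", encoding="ASCII")
            char.text = str(ord(ch))  # character code as decimal text
    return root


def xml_to_sentence(root):
    """Restore the conventional presentation from the markup."""
    words = [
        "".join(chr(int(c.text)) for c in word.findall("character"))
        for word in root.findall("word")
    ]
    text = " ".join(words)
    # Re-add the artifacts a reader expects: capital and period.
    return text[:1].upper() + text[1:] + "."


if __name__ == "__main__":
    tree = sentence_to_xml("The dog barked.")
    print(ET.tostring(tree, encoding="unicode"))
    print(xml_to_sentence(tree))
```

The round trip shows that the capital, the spaces, and the period were already carrying the structure; the tags add nothing a reader did not have.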

Furthermore, besides documents being structured by
human choice using visual and character clues, all
the document authoring tools are based on object
models that include both content and presentation
elements.

Lest this confuse anyone into wondering "why then
XML," the two things it offers are a mechanism for
adding additional levels of semantic meaning to
reflect various domains a document may reside in
AND the opportunity to create a universal vocabulary
for expressing fundamental document elements. If
this last thing is achieved, perhaps authoring tools
can move away from proprietary schemes or, at least,
render them moot by a common medium of exchange.

Tim


> I would like to direct you to a white paper:
>
> There are no unstructured documents, Presented at
> XML Europe 2002 by
> David Slocombe and Rodney Boyd, Exegenix
>
> http://www.exegenix.com/media/pdf/exegenix_xmleurope2002_paper.pdf


ericsilverlight Posted: Sat Jun 14, 2003 12:07 am


Joined: 14 Jun 2003

Posts: 10
Ryan Germann - Inherent Structure in Documents
Marvelous, eye-opening analysis.

I am struck by the observation that we choose an
"appropriate level" of presentation-level markup
in order to get the right balance of readability
and processability -- and the balance may be farther
to one side or the other, depending on the project.


ericsilverlight Posted: Sat Jun 14, 2003 12:28 am


Joined: 14 Jun 2003

Posts: 10
Ryan Germann - Inherent Structure in Documents
Binisiya wrote:
>
> (XML offers) the opportunity to create a universal vocabulary
> for expressing fundamental document elements. If
> this last thing is achieved, perhaps authoring tools
> can move away from proprietary schemes or, at least,
> render them moot by a common medium of exchange.
>
Outstanding.

Were I able to pass around document structures as easily as,
say, ASCII text, then eventually I'd stop getting mail attachments with
papers, spreadsheets, and presentations in a proprietary format
that I can't read.

And if the document became the message, then a more powerful
knowledge-capable version of an email client (knowledge client,
kmail client) would make it possible to store documents, add
metadata for retrieval, create links, and transclude material
for reuse.

There is a lot of potential there for a "common document format"
(CDF?).

There is probably a matrix of use cases here, some more content
oriented, and others more presentation oriented, going from
relatively simple structures to more complex forms. The
presentation-oriented progression might look something like
this:
* email messages (title, paragraphs)
* notes and outlines (lists, headers)
* specs and internal writeups (images, tables)
* articles (centering, indented quotes, sidebars, bylines)
* journals (dual column formatting, superscripts)
* books (auto-numbering cross references, indexing, TOC)

The content-oriented progression, on the other hand, might
look something like:
* email messages (simple content)
* metadata tagging (storage and retrieval)
* quotation and reference (transclude and link)
* collaboration (continually reorganizing structure)

It would be a fascinating project to start with a simple,
extensible framework that allowed users to choose their
own balance of simplicity and complexity with respect
to presentation capabilities and content capabilities.
schaeffer.barry@... Posted: Sat Jun 14, 2003 1:29 am


Joined: 14 Jun 2003

Posts: 2
Ryan Germann - Inherent Structure in Documents
Eric was heard to say:

"And if the document became the message, then a more powerful
knowledge-capable version of an email client (knowledge client,
kmail client) would make it possible to store documents, add
metadata for retrieval, create links, and transclude material
for reuse."

One of the most difficult challenges in the structured information world
has been the development of a means by which authors can relate to
concepts and content so that they may perform at an equivalent level of
excellence. Up to now, proprietary software (editors, etc.) has been the
industry's answer, of course with a series of attendant problems.
However, at least a part of the answer to this may be in front of us
through XUI, the XML user interface markup language. While it doesn't
solve every problem, the concept of an author interface to content defined by
an XML document itself can be the initial step in a process that has gone
a long way on the delivery side but has been almost completely lacking on
the creation side. We work with the Arbortext products that support it,
and I would hope that every piece of code that sits between the creator
and the content will end up doing the same.

Regards,

Barry
__________________________________________
Barry Schaeffer
X.Systems
703-330-1645 ext. 109
fax 703-330-0189
www.xsystems.com
"Content Solutions for the Internet Age"
__________________________________________



ericsilverlight Posted: Sat Jun 14, 2003 2:17 am


Joined: 14 Jun 2003

Posts: 10
Ryan Germann - Inherent Structure in Documents
Could you expand on that with an example or two, Barry?

I'm not sure what you meant by:
* "means by which authors can relate to concepts and content"
* "XUI....author interface to content"
* "Arbortext products that support (it)"

I know of Adept, but given its price point, I haven't had a
chance to play with it, so I'm unfamiliar with the kind of
thing you're talking about.

It sounds interesting though.

Note:
The pointer Don Day gave to The Problem of Overlapping Hierarchies
was an eye-opener. Interestingly, it said in a more formal way
what Don said more succinctly (and with more motivating examples)
in his post.
http://www.stg.brown.edu/resources/stg/monographs/ohco.html

I'm now looking at the whole metadata/reuse issue in a slightly
new way.

For example, one common form of reuse I've seen is in documents
written with a product name. It's nice to put <product/> or
some such in the document and let it auto-configure itself
with the appropriate name.

Sometimes, though, the product name will change its part of
speech, say from "Washing" to "Scrub". Wham. Nice descriptions
like, "When Washing your data..." get clobbered when attempting
to substitute the new name.

Essentially, a substitutable <product> element would need
different forms -- a noun form, verb form, infinitive form,
etc.

The document would then need to be written as
"When <get data="product/infinitive"/> your data...".

If the infinitive form didn't exist, a "broken link" would
occur at document-production time. After a while it would
become second nature to add all possible variants of the
product name, so that documents could be written that were
name-ready, no matter what name was eventually chosen.

Now, that is an extremely simple case of the general
content-reuse problem, where the same content has multiple
forms in different contexts. Such systems might become more
usable, given the ability to identify the forms, generate an
error when a required form was not present, and then fix the
problem by adding the required form.
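
A minimal sketch of that identify / fail / fix loop, in Python. Everything here is an assumption for illustration: the self-closing placeholder syntax, the function produce, the MissingFormError exception, and the form labels "noun" and "participle", which are arbitrary keys rather than a proposed vocabulary.

```python
import re


class MissingFormError(KeyError):
    """Raised when a document asks for a form that was never defined."""


# Hypothetical placeholder syntax: <get data="term/form"/>
PLACEHOLDER = re.compile(r'<get data="(\w+)/(\w+)"\s*/>')


def produce(document, terms):
    """Substitute every placeholder, failing on the first missing form."""

    def lookup(match):
        name, form = match.group(1), match.group(2)
        try:
            return terms[name][form]
        except KeyError:
            # The "broken link" at document-production time.
            raise MissingFormError(f"broken link: {name}/{form}") from None

    return PLACEHOLDER.sub(lookup, document)


DOC = (
    'When <get data="product/participle"/> your data, '
    '<get data="product/noun"/> keeps a log.'
)

OLD_NAME = {"product": {"noun": "Washing", "participle": "Washing"}}
NEW_NAME = {"product": {"noun": "Scrub", "participle": "Scrubbing"}}
INCOMPLETE = {"product": {"noun": "Scrub"}}  # participle not yet added

if __name__ == "__main__":
    print(produce(DOC, OLD_NAME))
    print(produce(DOC, NEW_NAME))
    try:
        produce(DOC, INCOMPLETE)
    except MissingFormError as err:
        print("production failed:", err)
```

The fix for the failing case is exactly the one described: add the missing form to the term table and run production again.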

(I'm not sure that the problems are entirely related, but the
problem of "overlapping hierarchies" stimulated that line of
thinking.)

robincover2002 Posted: Sat Jun 14, 2003 2:33 am


Joined: 14 Jun 2003

Posts: 1
Ryan Germann - Inherent Structure in Documents
> The pointer Don Day gave to The Problem of Overlapping Hierarchies
> was an eye-opener.

See also:

"Markup Languages and (Non-) Hierarchies"
http://xml.coverpages.org/hierarchies.html

But this merely scratches the surface: imagine how one
"marks up" a movie which encodes plot, movement on the
stage, overt human gestures, furtive glances, etc... A
written text is simple by comparison, even if it's
a 5-times edited manuscript by Wittgenstein.

In about (hmmm...) 1994 or so, Yuri Rubinsky [1] invited me
to give a keynote at an SGML conference, largely on the
strength of an observation I made about writing a DTD for
my house (carpets, wires, plumbing, walls, ceiling joists, etc)
-- without any people walking about from room-to-room, of
course... This is just an analogy: when one considers "text"
in multiple dimensions (rhetoric; lexicon; genre) it becomes
clear that the layers revealed visibly by orthography are just
the surface features...

Cheers,

Robin

[1] http://xml.coverpages.org/yuriMemColl.html

-----------------------------------------------------
Robin Cover
XML Cover Pages
WWW: http://xml.coverpages.org
Newsletter: http://xml.coverpages.org/newsletter.html


ericsilverlight Posted: Sat Jun 14, 2003 3:12 am


Joined: 14 Jun 2003

Posts: 10
Ryan Germann - Inherent Structure in Documents
Robin Cover wrote:

> See also:
>
> "Markup Languages and (Non-) Hierarchies"
> http://xml.coverpages.org/hierarchies.html
>
> But this merely scratches the surface: imagine how one
> "marks up" a movie which encodes plot, movement on the
> stage, overt human gestures, furtive glances, etc... A
> written text is simple by comparison, even if it's
> a 5-times edited manuscript by Wittgenstein.
>
> In about (hmmm...) 1994 or so, Yuri Rubinsky [1] invited me
> to give a keynote at an SGML conference, largely on the
> strength of an observation I made about writing a DTD for
> my house (carpets, wires, plumbing, walls, ceiling joists, etc)
> -- without any people walking about from room-to-room, of
> course... This is just an analogy: when one considers "text"
> in multiple dimensions (rhetoric; lexicon; genre) it becomes
> clear that the layers revealed visibly by orthography are just
> the surface features...
>
Hmmm. But the *presentation* layer of any
document is necessarily sequential and hierarchical.

Arguably, it can also be "framed", as for example the dialog
in a play, with corresponding stage directions in a separate frame,
linked by location to the dialog they introduce or describe. But
in each "frame" (aka frame of reference or analytical framework)
the presentation, at least, does correspond to a hierarchy of
content objects.

The problems of content management and effective knowledge
reuse can probably be accurately summarized as a problem of
creating a mapping from a semantic network to such a hierarchy.

Similarly, the problem of creating slices from a play (stage
manager's directions, individual roles (rolls) and a complete
printed version) is once again a problem of mapping abstract
relationships into hierarchies of content objects in different
frames.
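
As a toy illustration of the play example, here is a Python sketch in which one flat list of content objects is projected onto several frames, each yielding its own scene-by-scene hierarchy. The Event fields, the three-person-free miniature "play", and the helper names are all invented for the example; a real system would carry far richer relationships than a kind and a speaker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One content object in a scene (all field names are hypothetical)."""
    scene: int
    kind: str   # "line" for dialog, "direction" for stage directions
    who: str    # speaking role, or "" for a stage direction
    text: str


# Made-up sample content, in performance order.
PLAY = [
    Event(1, "direction", "", "Enter ANNA."),
    Event(1, "line", "ANNA", "Is anyone here?"),
    Event(1, "direction", "", "Enter BEN, carrying a lamp."),
    Event(1, "line", "BEN", "Only me."),
    Event(2, "line", "ANNA", "Then we begin."),
]


def slice_play(events, keep):
    """Project the play onto one frame: scene -> ordered content objects."""
    hierarchy = {}
    for event in events:  # source order supplies the sequence
        if keep(event):
            hierarchy.setdefault(event.scene, []).append(event.text)
    return hierarchy


def stage_manager(event):
    """Frame: stage directions only."""
    return event.kind == "direction"


def role(name):
    """Frame: the lines of a single role."""
    return lambda event: event.kind == "line" and event.who == name


def complete(event):
    """Frame: the full printed version."""
    return True


if __name__ == "__main__":
    print(slice_play(PLAY, stage_manager))
    print(slice_play(PLAY, role("ANNA")))
    print(slice_play(PLAY, complete))
```

Each slice comes out sequential and hierarchical even though the source relationships (who speaks, what a direction refers to) are not themselves a single tree.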

One of the questions posed in Day's writeup, in fact, was whether
hierarchy was intrinsic to the subject, or necessary for human
comprehension. I argue for the latter. Our 5 plus-or-minus 2
limitation forces us to deal with things hierarchically.

The fundamental issue in all writing, in fact, is to disentangle
the circular, nested, and interlinked relationships inherent in
any complex subject, and tease out a "root concept" from which
a hierarchical succession of topics can be derived that has
minimum overlap, minimum circularity, and the minimum number of
forward references -- all of which add up to the maximum in
comprehensibility, because the result is the kind of "mental
hierarchy" we are equipped to deal with.

(The initial perception of HTML documents as a huge morass
of circular, undirected links resulted largely from that
fact -- so that the use of links is now mostly for reference,
rather than for direct exposition -- at least until transclusion
becomes possible.)

So, while I agree that information structures do indeed have
multiple overlapping hierarchies -- and semantic network
structures, as well -- it seems to me that the end result
for a reader *has* to become an "ordered hierarchy of content
objects" at the end of the process.

For example, in the writing process the next step after
determining the sequence of topics is to eliminate overlapping
content and manage the transitions between those topics. And
that is where content management systems undoubtedly run into
trouble. Because *not* doing those things will produce documents
with abrupt jumps and material that partially repeats itself.

Then there is the problem of voice (3rd person, 2nd person) and
other matters of style (emphasis, use of passive voice,
capitalization, spelling), plus questions of "what the author
chose to call it" -- a conveyance in one segment, a car in
another, a Dodge in a 3rd, because each was working at a different
layer of abstraction.

To put these things together into a coherent document, it is
not only necessary to identify them with semantic structures,
it is also necessary to perform the modifications necessary
to make them flow. (Automating that process would be an
intriguing AI challenge.)

My point (I believe, at least at this point in time) is that
even if overlapping content structures are the norm, presentation
in a particular frame must still be a sequential hierarchy, for
the sake of comprehension. So, if it is necessary to move away from
hierarchical structures to make progress, it will also be
necessary, in the end, to come back to them.
dpawson2002 Posted: Sat Jun 14, 2003 3:38 pm


Joined: 14 Jun 2003

Posts: 6
Ryan Germann - Inherent Structure in Documents
At 15:12 13/06/2003 -0700, Eric Armstrong wrote:

>Hmmm. But by necessity, the *presentation* layer of any
>document is necessarily sequential and hierarchical
>
>Arguably, it can also be "framed", as for example the dialog
>in a play, with corresponding stage directions in a separate frame,
>linked by location to the dialog they introduce or describe. But
>in each "frame" (aka frame of reference or analytical framework)
>the presentation at least, does correspond to a hierarch of
>content objects.

That's what SMIL [1] is all about:
synchronising multiple media streams.
It works nicely for text and audio.
I guess others would add reasonably.

regards DaveP

[1] http://www.w3.org/TR/smil20/

All times are GMT