freelanceprogrammers.org Forum Index » XML / XSL
XML (or similar) from C?
Joined: 10 Mar 2005
Posts: 11
XML (or similar) from C?
I was wondering --- wouldn't it be great if it were
possible to process XML (or something similar to it)
from C?
Yes, I know it's possible in theory --- but I mean,
what if there were a library out there to make it easier?
Here's why I ask.
XSLT is really a bondage-and-discipline language.
That is, it's a language so high-level that
it sets not only mechanism, but policy as well.
On the other hand, if you had a simple function that
just takes an XML file and loads it into a tree of
structures in the program's memory -- then you could
write your own program to output it in whatever way
you (the programmer) see fit. No limits, no restrictions.
Designing a set of structures that can cover
everything that is covered in a markup language
is the easy part. (As a matter of fact, I'm almost
done with that part.) The hard part is
writing a function that puts all the XML data *into*
that structure scheme.
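For illustration, here is a minimal sketch of what such a structure scheme might look like in C. It is not the actual design mentioned above: the names md_node, md_new, md_append, md_count and md_free are hypothetical, and attributes, comments and processing instructions are left out to keep it short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type: one struct covers both elements and text runs. */
typedef struct md_node {
    char *name;             /* element name, or NULL for a text node */
    char *text;             /* text content, or NULL for an element  */
    struct md_node *first;  /* first child                           */
    struct md_node *last;   /* last child, kept for O(1) append      */
    struct md_node *next;   /* next sibling                          */
} md_node;

/* Copy a string into freshly allocated memory (NULL stays NULL). */
static char *md_copy(const char *s)
{
    if (s == NULL) return NULL;
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL) memcpy(p, s, n);
    return p;
}

/* Create a node: pass a name for an element, or text for a text node. */
md_node *md_new(const char *name, const char *text)
{
    md_node *n = calloc(1, sizeof *n);
    if (n == NULL) return NULL;
    n->name = md_copy(name);
    n->text = md_copy(text);
    return n;
}

/* Append child to the end of parent's child list. */
void md_append(md_node *parent, md_node *child)
{
    assert(parent != NULL && child != NULL);
    if (parent->last != NULL) parent->last->next = child;
    else parent->first = child;
    parent->last = child;
}

/* Count the nodes in a subtree, including its root. */
size_t md_count(const md_node *n)
{
    if (n == NULL) return 0;
    size_t total = 1;
    for (const md_node *c = n->first; c != NULL; c = c->next)
        total += md_count(c);
    return total;
}

/* Release a subtree and everything it owns. */
void md_free(md_node *n)
{
    if (n == NULL) return;
    md_node *c = n->first;
    while (c != NULL) {
        md_node *next = c->next;
        md_free(c);
        c = next;
    }
    free(n->name);
    free(n->text);
    free(n);
}
```

A loader would call md_new and md_append as it reads the input; an output routine then only has to walk first/next, which is where the "whatever way you see fit" freedom comes from.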
--
Adam Ophir Shapira
Virtual Stoa Discussion Forum -- http://virtualstoa.org
Joined: 14 Oct 2005
Posts: 2
XML (or similar) from C?
"Adam Ophir Shapira" <red_angel@...>
> I was wondering --- wouldn't it be great if it were
> possible to process XML (or something similar to it)
> from C?
It would be great! It would be especially great if someone implemented a
standard C library for processing XML! It would also be great if someone
included that in a standard Linux module, like GNOME! And it would also
be great if it were widely available on other platforms, too, like Mac OS
X and Windows!
Maybe it could be called libxml2, and maybe it could be found at <URL:
http://xmlsoft.org/ >.
~Chris
And maybe one poster could do a little more research before posting, and
another could be a little less sarcastic. Sorry.
--
Christopher R. Maden, Principal Consultant, crism consulting
XML-SGML-HTML-DTDs-schemas-XSL-DSSSL-conversion-training-ebooks-B2B
<URL: http://crism.maden.org/consulting/ >
PGP Fingerprint: BBA6 4085 DED0 E176 D6D4 5DFC AC52 F825 AFEC 58DA
Joined: 13 Apr 2005
Posts: 5
XML (or similar) from C?
Christopher R. Maden wrote:
> "Adam Ophir Shapira" <red_angel@...>
>
>>I was wondering --- wouldn't it be great if it were
>>possible to process XML (or something similar to it)
>>from C?
>
>
> It would be great! It would be especially great if someone implemented a
> standard C library for processing XML! It would also be great if someone
> included that in a standard Linux module, like GNOME! And it would also
> be great if it were widely available on other platforms, too, like Mac OS
> X and Windows!
>
> Maybe it could be called libxml2, and maybe it could be found at <URL:
> http://xmlsoft.org/ >.
>
> ~Chris
>
> And maybe one poster could do a little more research before posting, and
> another could be a little less sarcastic. Sorry.
Or the OP could roll their own. It involves a state engine, a stack, a
tree (or map), and a function to traverse the elements in the result
(or, if you are only interested in extracting data, a function that
takes a known path as a string and traverses the tree for you).
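As a rough sketch of those four pieces (state engine, stack, tree, path lookup), assuming a deliberately tiny subset of XML: elements and character data only, with no attributes, comments, entities, self-closing tags or encoding handling. All names (tiny_parse, tiny_find, tiny_free) and the size limits are made up for the example, and this is nowhere near a conforming parser.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TINY_NAME  32   /* max element name length, including the NUL */
#define TINY_TEXT  128  /* max character data kept per element        */
#define TINY_DEPTH 32   /* max nesting depth (size of the stack)      */

typedef struct node {
    char name[TINY_NAME];
    char text[TINY_TEXT];       /* character data directly inside the element */
    struct node *child, *next;  /* first child, next sibling */
} node;

enum state { IN_TEXT, IN_OPEN, IN_CLOSE };

/* Free a node, its children, and any following siblings. */
void tiny_free(node *n)
{
    while (n != NULL) {
        node *next = n->next;
        tiny_free(n->child);
        free(n);
        n = next;
    }
}

/* Append kid at the end of parent's child list. */
static void add_child(node *parent, node *kid)
{
    node **slot = &parent->child;
    while (*slot != NULL) slot = &(*slot)->next;
    *slot = kid;
}

/* Parse s into a tree; returns NULL if the tags do not balance. */
node *tiny_parse(const char *s)
{
    node *stack[TINY_DEPTH], *root = NULL;
    int depth = 0;
    enum state st = IN_TEXT;
    char name[TINY_NAME];
    size_t len = 0;

    for (; *s != '\0'; s++) {
        switch (st) {
        case IN_TEXT:
            if (*s == '<') {                 /* a tag starts here */
                len = 0;
                st = IN_OPEN;
                if (s[1] == '/') { st = IN_CLOSE; s++; }
            } else if (depth > 0) {          /* text goes to the open element */
                char *t = stack[depth - 1]->text;
                size_t n = strlen(t);        /* overlong text is truncated */
                if (n + 1 < TINY_TEXT) { t[n] = *s; t[n + 1] = '\0'; }
            }
            break;
        case IN_OPEN:
        case IN_CLOSE:
            if (*s != '>') {                 /* still collecting the name */
                if (len + 1 >= TINY_NAME) goto fail;
                name[len++] = *s;
                break;
            }
            name[len] = '\0';
            if (st == IN_OPEN) {             /* push a new element */
                node *n;
                if (len == 0 || depth == TINY_DEPTH) goto fail;
                if (depth == 0 && root != NULL) goto fail;  /* second root */
                n = calloc(1, sizeof *n);
                if (n == NULL) goto fail;
                strcpy(n->name, name);
                if (depth > 0) add_child(stack[depth - 1], n);
                else root = n;
                stack[depth++] = n;
            } else {                         /* pop; names must match */
                if (depth == 0 || strcmp(stack[depth - 1]->name, name) != 0)
                    goto fail;
                depth--;
            }
            st = IN_TEXT;
            break;
        }
    }
    if (st == IN_TEXT && depth == 0 && root != NULL) return root;
fail:
    tiny_free(root);
    return NULL;
}

/* Follow a slash-separated path such as "a/c", starting at n itself. */
const node *tiny_find(const node *n, const char *path)
{
    size_t seg = strcspn(path, "/");
    if (n == NULL || strlen(n->name) != seg || strncmp(n->name, path, seg) != 0)
        return NULL;
    if (path[seg] == '\0') return n;
    for (const node *c = n->child; c != NULL; c = c->next) {
        const node *hit = tiny_find(c, path + seg + 1);
        if (hit != NULL) return hit;
    }
    return NULL;
}
```

Given the input "<a><b>hi</b><c>yo</c></a>", tiny_find(root, "a/c") returns the node for c. Everything that makes real XML hard is exactly what this sketch skips.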
A lightweight XML parser can be up to five times faster than heavyweight
parsers like libxml2. XML isn't known for its performance, so every
little bit helps if you don't need anything beyond data storage and
retrieval.
--
Thomas Hruska
Joined: 24 Jun 2003
Posts: 5
XML (or similar) from C?
> A lightweight XML parser can be up to five times faster than heavyweight
> parsers like libxml2. XML isn't known for its performance, so every
> little bit helps if you don't need anything beyond data storage and
> retrieval.
It is worth confirming that a "lightweight" parser still actually conforms
to the XML specification; if you are only working with data you create
yourself you might not mind, but if you need to process XML from other
people then you will probably need an XML parser that fully supports the
XML specification (e.g. Unicode, DTDs, external entities) and that does not
silently accept non-well-formed XML.
Cheers,
Michael
--
Print XML with Prince!
http://www.princexml.com
Joined: 14 Oct 2005
Posts: 2
XML (or similar) from C?
Expat is also highly recommended:
http://www.jclark.com/xml/expat.html
Michael.
> -----Original Message-----
> From: xml-doc@yahoogroups.com
> [mailto:xml-doc@yahoogroups.com] On Behalf Of Christopher R. Maden
> Sent: Thursday, October 13, 2005 17:02
> To: xml-doc@yahoogroups.com
> Subject: Re: [xml-doc] XML (or similar) from C?
>
> "Adam Ophir Shapira" <red_angel@...>
> > I was wondering --- wouldn't it be great if it were possible to
> > process XML (or something similar to it) from C?
>
> It would be great! It would be especially great if someone
> implemented a standard C library for processing XML! It
> would also be great if someone included that in a standard
> Linux module, like GNOME! And it would also be great if it
> were widely available on other platforms, too, like Mac OS X
> and Windows!
>
> Maybe it could be called libxml2, and maybe it could be found at <URL:
> http://xmlsoft.org/ >.
>
> ~Chris
>
> And maybe one poster could do a little more research before
> posting, and another could be a little less sarcastic. Sorry.
> --
> Christopher R. Maden, Principal Consultant, crism consulting
> XML-SGML-HTML-DTDs-schemas-XSL-DSSSL-conversion-training-ebooks-B2B
> <URL: http://crism.maden.org/consulting/ > PGP Fingerprint:
> BBA6 4085 DED0 E176 D6D4 5DFC AC52 F825 AFEC 58DA
Joined: 10 Mar 2005
Posts: 11
XML (or similar) from C?
Michael Day wrote:
>
> It is worth confirming that a "lightweight" parser still actually conforms
> to the XML specification; if you are only working with data you create
> yourself you might not mind, but if you need to process XML from other
> people then you will probably need an XML parser that fully supports the
> XML specification (eg. UNICODE, DTDs, external entities)
Yes -- I was thinking about a mutated version of the
concept of a C string that can support Unicode -- as
well as (in addition to that) a more intuitive way of
supporting alternative writing systems.
I just need to see which 2 or 3 byte values are most
expendable for that purpose (out of the 255 that
remain if you discount the null terminator).
I'll be very honest. I don't like Unicode -- I think
it was a colossal mistake. But I see that any project
of the kind that I'm talking about would have to have
at least fossil support for it. But I would like it
to also have the ability to support a better system
of internationalization.
What don't I like about Unicode?
Well -- my *main* gripe against it is that it treats
all the different writing systems as though they were
merely different character sets. It completely ignores
the fact that each alphabet has its own personality
and quirks and should be supported as such.
> and that does not
> silently accept non-well-formed XML.
Of course, I'm talking about a function to load
data into a structure scheme in memory -- not
a command tool.
But how about writing the function in such a way
that it can put any one of three status codes
in a variable (a pointer to which is passed to it
as an argument)? There could be one value if the
XML document was properly formed -- another one
if it was improperly formed yet still decipherable
-- and a third one if loading failed because the
document was improperly formed.
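A minimal sketch of that calling convention, under stated assumptions: the names md_status and md_scan are hypothetical, the function only checks that tags balance instead of building a tree, and "still decipherable" is taken to mean "input ended with elements left open", which is just one possible reading.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical three-way result, written through a pointer argument. */
enum md_status { MD_OK, MD_RECOVERED, MD_FAILED };

#define MD_DEPTH 32  /* arbitrary nesting limit for the sketch */

/* Scan the tags in src. Returns the number of elements opened,
 * or -1 on failure; *status always receives one of the three codes. */
int md_scan(const char *src, enum md_status *status)
{
    const char *open[MD_DEPTH];  /* start of each open element's name */
    size_t olen[MD_DEPTH];       /* length of each open element's name */
    int depth = 0, count = 0;

    assert(src != NULL && status != NULL);
    *status = MD_FAILED;         /* pessimistic default */

    for (const char *p = strchr(src, '<'); p != NULL; p = strchr(p, '<')) {
        const char *end = strchr(p, '>');
        if (end == NULL) return -1;              /* tag never terminated */
        if (p[1] == '/') {                       /* closing tag */
            size_t n = (size_t)(end - (p + 2));
            if (depth == 0 || n != olen[depth - 1] ||
                strncmp(p + 2, open[depth - 1], n) != 0)
                return -1;                       /* mismatched close */
            depth--;
        } else {                                 /* opening tag */
            if (depth == MD_DEPTH || end == p + 1) return -1;
            open[depth] = p + 1;
            olen[depth] = (size_t)(end - (p + 1));
            depth++;
            count++;
        }
        p = end;                                 /* resume after this tag */
    }
    if (count == 0) return -1;                   /* no elements at all */
    *status = (depth == 0) ? MD_OK : MD_RECOVERED;
    return count;
}
```

A caller would write `enum md_status st; int n = md_scan(text, &st);` and branch on st. Note that the XML 1.0 specification treats well-formedness violations as fatal errors, so the middle status falls outside what a conforming XML processor is allowed to do.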
Christopher R. Maden wrote:
>
> It would be great! It would be
> especially great if someone implemented a
> standard C library for processing
> XML! It would also be great if someone
> included that in a standard Linux module,
> like GNOME! And it would also
> be great if it were widely available
> on other platforms, too, like Mac OS
> X and Windows!
>
> Maybe it could be called libxml2, and
> maybe it could be found at <URL:
> http://xmlsoft.org/ >.
Yes, it would be great if that library
you point to hadn't had trouble installing
on a Macintosh last time I tried.
And it would also be great if the structure
scheme were a lot simpler than it is
-- while still having the ability to store
in full everything that could possibly
be included in a markup document.
--
Adam Ophir Shapira
Virtual Stoa Discussion Forum -- http://virtualstoa.org
Joined: 14 Oct 2005
Posts: 2
XML (or similar) from C?
On 14/10/2005, at 07:16, Adam Ophir Shapira wrote:
> Michael Day wrote:
>
>>
>> It is worth confirming that a "lightweight" parser still actually
>> conforms
>> to the XML specification;
>>
>
> Yes -- I was thinking about a mutated version of the
> concept of a C string that can support unicode -- as
> well as (in addition to that) a more intuitive way of
> supporting alternative writing systems.
>
> I just need to see which 2 or 3 byte-values are most
> expendable for that purpose (out of the 255 that are
> remaining if you discount the null-terminator).
Hi, I'd suggest Expat (expat.sourceforge.net); it is fast,
conforming, parses external entities, and is used by many libraries and
tools as the underlying implementation. Python is just one example.
David
Joined: 13 Apr 2005
Posts: 5
XML (or similar) from C?
Adam Ophir Shapira wrote:
> I'll be very honest. I don't like Unicode -- I think
> it was a colossal mistake. But I see that any project
> of the kind that I'm talking about would have to have
> at least fossil support for it. But I would like it
> to also have the ability to support a better system
> of internationalization.
>
> What don't I like about Unicode?
>
> Well -- my *main* gripe against it is that it treats
> all the different writing systems as though they were
> merely different character sets. It completely ignores
> the fact that each alphabet has its own personality
> and quirks and should be supported as such.
Actually, I don't like UNICODE at ALL. Most Asian-speaking peoples see
it as a mutilation. The Japanese actually hate it. It was the source
of the weakness in IIS that allowed the CodeRed/CodeRed II/Nimda worms
to exist in the first place (not to mention numerous other exploits
involving IE and the OS itself). And it simply doesn't work properly
under Win95/98/Me with the crappy MSLU (Microsoft Layer for Unicode).
And it still has issues under the latest OSes.
So, you aren't alone. I've actually devised a _MUCH_ better system.
While it would consume considerably more memory than UNICODE, it would
actually be universal - and then we could actually get Klingon added as
a real language. K'plah!
--
Thomas Hruska
Joined: 10 Mar 2005
Posts: 11
XML (or similar) from C?
Thomas J. Hruska wrote:
> Adam Ophir Shapira wrote:
>
>>I'll be very honest. I don't like Unicode -- I think
>>it was a colossal mistake. But I see that any project
>>of the kind that I'm talking about would have to have
>>at least fossil support for it. But I would like it
>>to also have the ability to support a better system
>>of internationalization.
>>
>>What don't I like about Unicode?
>>
>>Well -- my *main* gripe against it is that it treats
>>all the different writing systems as though they were
>>merely different character sets. It completely ignores
>>the fact that each alphabet has its own personality
>>and quirks and should be supported as such.
>
>
> Actually, I don't like UNICODE at ALL. Most Asian-speaking peoples see
> it as a mutilation. The Japanese actually hate it.
[snip]
>
> So, you aren't alone.
And it's good to know that. I was beginning to think
that I was the only person who was aware of the very
obvious flaw in Unicode's very premise.
> I've actually devised a _MUCH_ better system.
> While it would consume considerably more memory than UNICODE, it would
> actually be universal - and then we could actually get Klingon added as
> a real language. K'plah!
Cool! I'd like to see more of what you've got.
If it's close enough to what I was thinking,
I might even be able to stop what I was working on.
What I was thinking of is having a few byte values
(probably no more than 5) designated for punctuating
sequences that I would call "transition sequences".
A transition sequence would be a code a
few bytes long that tells the processor to stop
interpreting the string as one writing system, and
to start interpreting it as a different one.
I could have somewhere a document that lists each
writing system, specifying (a) the code for that
writing system that would be used in a transition sequence
and (b) information on who is in charge of writing
the standard for that particular writing system. (I
would have information set (b) because I believe that
each writing system's specification should be outsourced
to a group that is intimately familiar with languages
that use that writing system - rather than letting one
group specify it for all languages.)
Each group would be allowed to use the remaining 250
byte values in whatever way that group chooses. They
can have one character per byte - or they can have
each character take up multiple bytes (of fixed or
variable length), however they choose. However, all
these groups (with the possible exception of whoever
does the fossil support for Unicode) would be advised
to always leave room for expansion in the future.
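To make the idea concrete, here is a small sketch of how a processor might split such a string into runs. The details are assumptions for illustration only: byte 0x01 opens a transition sequence, byte 0x02 closes it, the bytes in between are a short ASCII code for the writing system, and the names ts_run and ts_split are invented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TS_BEGIN 0x01   /* hypothetical reserved byte: opens a transition  */
#define TS_END   0x02   /* hypothetical reserved byte: closes a transition */
#define TS_CODE_MAX 8   /* max writing-system code length, including NUL   */

typedef struct ts_run {
    char system[TS_CODE_MAX];  /* writing-system code, "" before any transition */
    const char *bytes;         /* first payload byte of the run */
    size_t len;                /* payload length in bytes */
} ts_run;

/* Split s into runs of payload bytes, each tagged with the writing
 * system in force. Returns the number of runs stored in out, or -1
 * on a malformed transition sequence or when out is too small. */
int ts_split(const char *s, ts_run *out, int max_runs)
{
    int n = 0;
    char current[TS_CODE_MAX] = "";

    while (*s != '\0') {
        if ((unsigned char)*s == TS_BEGIN) {
            /* Transition sequence: TS_BEGIN, code bytes, TS_END. */
            const char *end = strchr(s + 1, TS_END);
            size_t code_len;
            if (end == NULL) return -1;
            code_len = (size_t)(end - (s + 1));
            if (code_len == 0 || code_len >= TS_CODE_MAX) return -1;
            memcpy(current, s + 1, code_len);
            current[code_len] = '\0';
            s = end + 1;
        } else if ((unsigned char)*s == TS_END) {
            return -1;             /* stray terminator in the payload */
        } else {
            /* Payload: everything up to the next reserved byte. */
            size_t len = strcspn(s, "\001\002");
            if (n == max_runs) return -1;
            strcpy(out[n].system, current);
            out[n].bytes = s;
            out[n].len = len;
            n++;
            s += len;
        }
    }
    return n;
}
```

Interpreting the payload of each run would then be left to whatever code handles that writing system, which matches the idea of each group defining its own use of the remaining byte values.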
So -- what's your system like?
Do you have a specification published on the web yet?
--
Adam Ophir Shapira
Virtual Stoa Discussion Forum -- http://virtualstoa.org