mirror of
https://github.com/michaelrsweet/mxml.git
synced 2024-11-08 13:39:58 +00:00
bf73da6782
Add SAX load APIs. Add man page output for mxmldoc. Add types for various callback functions.
457 lines
14 KiB
HTML
<html>
<body>

<h1 align='right'><a name='BASICS'>2 - Getting Started with
Mini-XML</a></h1>

<p>This chapter describes how to write programs that use
Mini-XML to access data in an XML file.</p>

<h2>The Basics</h2>

<p>Mini-XML provides a single header file which you include:</p>

<pre>
    #include <mxml.h>
</pre>

<p>The Mini-XML library is linked with your program using the
<kbd>-lmxml</kbd> option:</p>

<pre>
    <kbd>gcc -o myprogram myprogram.c -lmxml ENTER</kbd>
</pre>

<p>If you have the <tt>pkg-config(1)</tt> software installed,
you can use it to determine the proper compiler and linker options
for your installation:</p>

<pre>
    <kbd>pkg-config --cflags mxml ENTER</kbd>
    <kbd>pkg-config --libs mxml ENTER</kbd>
</pre>

<h2>Nodes</h2>

<p>Every piece of information in an XML file (elements, text,
numbers) is stored in memory in "nodes". Nodes are defined by
the <a
href='#mxml_node_t'><tt>mxml_node_t</tt></a>
structure. The <a
href='#mxml_type_t'><tt>type</tt></a> member
defines the node type (element, integer, opaque, real, or text),
which determines which member of the <a
href='#mxml_value_t'><tt>value</tt></a> union holds the node's data.</p>

<p>New nodes can be created using the <a
href='#mxmlNewElement'><tt>mxmlNewElement()</tt></a>, <a
href='#mxmlNewInteger'><tt>mxmlNewInteger()</tt></a>, <a
href='#mxmlNewOpaque'><tt>mxmlNewOpaque()</tt></a>, <a
href='#mxmlNewReal'><tt>mxmlNewReal()</tt></a>, <a
href='#mxmlNewText'><tt>mxmlNewText()</tt></a>, <a
href='#mxmlNewTextf'><tt>mxmlNewTextf()</tt></a>, and <a
href='#mxmlNewXML'><tt>mxmlNewXML()</tt></a> functions. Only
elements can have child nodes, and the top node must be an
element, usually <tt><?xml version="1.0"?></tt>.</p>

<p>Each node has a <tt>user_data</tt> member which allows you to
associate application-specific data with each node as needed.</p>

<p>Nodes also have pointers for the node above (<tt>parent</tt>),
below (<tt>child</tt>), to the left (<tt>prev</tt>), and to the
right (<tt>next</tt>) of the current node. If you have an XML
file like the following:</p>

<pre>
    <?xml version="1.0"?>
    <data>
        <node>val1</node>
        <node>val2</node>
        <node>val3</node>
        <group>
            <node>val4</node>
            <node>val5</node>
            <node>val6</node>
        </group>
        <node>val7</node>
        <node>val8</node>
    </data>
</pre>

<p>the node tree for the file would look like the following in
memory:</p>

<pre>
    ?xml
      |
    data
      |
    node - node - node - group - node - node
      |      |      |      |       |      |
    val1   val2   val3     |     val7   val8
                           |
                         node - node - node
                           |      |      |
                         val4   val5   val6
</pre>

<p>where "-" is a pointer to the next node and "|" is a pointer
to the first child node.</p>

<p>Once you are done with the XML data, use the <a
href='#mxmlDelete'><tt>mxmlDelete()</tt></a>
function to recursively free the memory that is used for a
particular node or the entire tree:</p>

<pre>
    mxmlDelete(tree);
</pre>

<!-- NEW PAGE -->
<h2>Creating XML Documents</h2>

<p>You can create and update XML documents in memory using the
various <tt>mxmlNew</tt> functions. The following code will
create the XML document described in the previous section:</p>

<pre>
    mxml_node_t *xml;    /* <?xml ... ?> */
    mxml_node_t *data;   /* <data> */
    mxml_node_t *node;   /* <node> */
    mxml_node_t *group;  /* <group> */

    xml = mxmlNewXML("1.0");

    data = mxmlNewElement(xml, "data");

    node = mxmlNewElement(data, "node");
    mxmlNewText(node, 0, "val1");
    node = mxmlNewElement(data, "node");
    mxmlNewText(node, 0, "val2");
    node = mxmlNewElement(data, "node");
    mxmlNewText(node, 0, "val3");

    group = mxmlNewElement(data, "group");

    node = mxmlNewElement(group, "node");
    mxmlNewText(node, 0, "val4");
    node = mxmlNewElement(group, "node");
    mxmlNewText(node, 0, "val5");
    node = mxmlNewElement(group, "node");
    mxmlNewText(node, 0, "val6");

    node = mxmlNewElement(data, "node");
    mxmlNewText(node, 0, "val7");
    node = mxmlNewElement(data, "node");
    mxmlNewText(node, 0, "val8");
</pre>

<p>We start by creating the <tt><?xml version="1.0"?></tt>
node common to all XML files using the <a
href="#mxmlNewXML"><tt>mxmlNewXML</tt></a> function:</p>

<pre>
    xml = mxmlNewXML("1.0");
</pre>

<p>We then create the <tt><data></tt> node used for this
document using the <a
href="#mxmlNewElement"><tt>mxmlNewElement</tt></a> function. The
first argument specifies the parent node (<tt>xml</tt>) while the
second specifies the element name (<tt>data</tt>):</p>

<pre>
    data = mxmlNewElement(xml, "data");
</pre>

<p>Each <tt><node>...</node></tt> in the file is
created using the <tt>mxmlNewElement</tt> and <a
href="#mxmlNewText"><tt>mxmlNewText</tt></a> functions. The first
argument of <tt>mxmlNewText</tt> specifies the parent node
(<tt>node</tt>). The second argument specifies whether whitespace
appears before the text (0, or false, in this case). The last
argument specifies the actual text to add:</p>

<pre>
    node = mxmlNewElement(data, "node");
    mxmlNewText(node, 0, "val1");
</pre>

<p>The resulting in-memory XML document can then be saved or
processed just like one loaded from disk or a string.</p>

<!-- NEW PAGE -->
<h2>Loading XML</h2>

<p>You load an XML file using the <a
href='#mxmlLoadFile'><tt>mxmlLoadFile()</tt></a>
function:</p>

<pre>
    FILE *fp;
    mxml_node_t *tree;

    fp = fopen("filename.xml", "r");
    tree = mxmlLoadFile(NULL, fp,
                        MXML_TEXT_CALLBACK);
    fclose(fp);
</pre>

<p>The first argument specifies an existing XML parent node, if
any. Normally you will pass <tt>NULL</tt> for this argument
unless you are combining multiple XML sources. The XML file must
contain a complete XML document including the <tt>?xml</tt>
element if the parent node is <tt>NULL</tt>.</p>

<p>The second argument specifies the stdio file to read from, as
opened by <tt>fopen()</tt> or <tt>popen()</tt>. You can also use
<tt>stdin</tt> if you are implementing an XML filter
program.</p>

<p>The third argument specifies a callback function which returns
the value type of the immediate children for a new element node:
<tt>MXML_CUSTOM</tt>, <tt>MXML_IGNORE</tt>,
<tt>MXML_INTEGER</tt>, <tt>MXML_OPAQUE</tt>, <tt>MXML_REAL</tt>,
or <tt>MXML_TEXT</tt>. Load callbacks are described in detail in
<a href='#LOAD_CALLBACKS'>Chapter 3</a>. The example code uses
the <tt>MXML_TEXT_CALLBACK</tt> constant which specifies that all
data nodes in the document contain whitespace-separated text
values. Other standard callbacks include
<tt>MXML_IGNORE_CALLBACK</tt>, <tt>MXML_INTEGER_CALLBACK</tt>,
<tt>MXML_OPAQUE_CALLBACK</tt>, and
<tt>MXML_REAL_CALLBACK</tt>.</p>

<p>The <a href='#mxmlLoadString'><tt>mxmlLoadString()</tt></a>
function loads XML node trees from a string:</p>

<!-- NEED 10 -->
<pre>
    char buffer[8192];
    mxml_node_t *tree;

    ...
    tree = mxmlLoadString(NULL, buffer,
                          MXML_TEXT_CALLBACK);
</pre>

<p>The first and third arguments are the same as used for
<tt>mxmlLoadFile()</tt>. The second argument specifies the
string or character buffer to load and must be a complete XML
document including the <tt>?xml</tt> element if the parent node
is <tt>NULL</tt>.</p>

<!-- NEW PAGE -->
<h2>Saving XML</h2>

<p>You save an XML file using the <a
href='#mxmlSaveFile'><tt>mxmlSaveFile()</tt></a> function:</p>

<pre>
    FILE *fp;
    mxml_node_t *tree;

    fp = fopen("filename.xml", "w");
    mxmlSaveFile(tree, fp, MXML_NO_CALLBACK);
    fclose(fp);
</pre>

<p>The first argument is the XML node tree to save. It should
normally be a pointer to the top-level <tt>?xml</tt> node in
your XML document.</p>

<p>The second argument is the stdio file to write to, as opened
by <tt>fopen()</tt> or <tt>popen()</tt>. You can also use
<tt>stdout</tt> if you are implementing an XML filter
program.</p>

<p>The third argument is the whitespace callback to use when
saving the file. Whitespace callbacks are covered in detail in <a
href='#SAVE_CALLBACKS'>Chapter 3</a>. The previous example code
uses the <tt>MXML_NO_CALLBACK</tt> constant to specify that no
special whitespace handling is required.</p>

<p>The <a
href='#mxmlSaveAllocString'><tt>mxmlSaveAllocString()</tt></a>
and <a href='#mxmlSaveString'><tt>mxmlSaveString()</tt></a>
functions save XML node trees to strings:</p>

<pre>
    char buffer[8192];
    char *ptr;
    mxml_node_t *tree;

    ...
    mxmlSaveString(tree, buffer, sizeof(buffer),
                   MXML_NO_CALLBACK);

    ...
    ptr = mxmlSaveAllocString(tree, MXML_NO_CALLBACK);
</pre>

<p>The first and last arguments are the same as used for
<tt>mxmlSaveFile()</tt>. The <tt>mxmlSaveString()</tt> function
takes pointer and size arguments for saving the XML document to
a fixed-size buffer, while <tt>mxmlSaveAllocString()</tt>
returns a string buffer that was allocated using
<tt>malloc()</tt> and must be released with <tt>free()</tt> when
it is no longer needed.</p>

<h3>Controlling Line Wrapping</h3>

<p>When saving XML documents, Mini-XML normally wraps output
lines at column 75 so that the text is readable in terminal
windows. The <a
href='#mxmlSetWrapMargin'><tt>mxmlSetWrapMargin</tt></a> function
overrides the default wrap margin:</p>

<pre>
    /* Set the margin to 132 columns */
    mxmlSetWrapMargin(132);

    /* Disable wrapping */
    mxmlSetWrapMargin(0);
</pre>

<!-- NEW PAGE-->
<h2>Finding and Iterating Nodes</h2>

<p>The <a
href='#mxmlWalkPrev'><tt>mxmlWalkPrev()</tt></a>
and <a
href='#mxmlWalkNext'><tt>mxmlWalkNext()</tt></a> functions
can be used to iterate through the XML node tree:</p>

<pre>
    mxml_node_t *node;

    node = mxmlWalkPrev(current, tree,
                        MXML_DESCEND);

    node = mxmlWalkNext(current, tree,
                        MXML_DESCEND);
</pre>

<p>In addition, you can find a named element/node using the <a
href='#mxmlFindElement'><tt>mxmlFindElement()</tt></a>
function:</p>

<pre>
    mxml_node_t *node;

    node = mxmlFindElement(tree, tree, "name",
                           "attr", "value",
                           MXML_DESCEND);
</pre>

<p>The <tt>name</tt>, <tt>attr</tt>, and <tt>value</tt>
arguments can be passed as <tt>NULL</tt> to act as wildcards,
e.g.:</p>

<!-- NEED 4 -->
<pre>
    /* Find the first "a" element */
    node = mxmlFindElement(tree, tree, "a",
                           NULL, NULL,
                           MXML_DESCEND);
</pre>
<!-- NEED 5 -->
<pre>
    /* Find the first "a" element with "href"
       attribute */
    node = mxmlFindElement(tree, tree, "a",
                           "href", NULL,
                           MXML_DESCEND);
</pre>
<!-- NEED 6 -->
<pre>
    /* Find the first "a" element with "href"
       to a URL */
    node = mxmlFindElement(tree, tree, "a",
                           "href",
                           "http://www.easysw.com/",
                           MXML_DESCEND);
</pre>
<!-- NEED 5 -->
<pre>
    /* Find the first element with a "src"
       attribute */
    node = mxmlFindElement(tree, tree, NULL,
                           "src", NULL,
                           MXML_DESCEND);
</pre>
<!-- NEED 5 -->
<pre>
    /* Find the first element with a "src"
       = "foo.jpg" */
    node = mxmlFindElement(tree, tree, NULL,
                           "src", "foo.jpg",
                           MXML_DESCEND);
</pre>

<p>You can also iterate with the same function:</p>

<pre>
    mxml_node_t *node;

    for (node = mxmlFindElement(tree, tree,
                                "name",
                                NULL, NULL,
                                MXML_DESCEND);
         node != NULL;
         node = mxmlFindElement(node, tree,
                                "name",
                                NULL, NULL,
                                MXML_DESCEND))
    {
      ... do something ...
    }
</pre>

<!-- NEED 10 -->
<p>The <tt>MXML_DESCEND</tt> argument can actually be one of
three constants:</p>

<ul>

<li><tt>MXML_NO_DESCEND</tt> means not to look at any
child nodes in the element hierarchy, just look at
siblings at the same level or parent nodes until the top
node or top-of-tree is reached.

<p>The previous node from "group" would be the "node"
element to the left, while the next node from "group" would
be the "node" element to the right.<br><br></p></li>

<li><tt>MXML_DESCEND_FIRST</tt> means that it is OK to
descend to the first child of a node, but not to descend
further when searching. You'll normally use this when
iterating through direct children of a parent node, e.g. all
of the "node" and "group" elements under the "data" parent
node in the example above.

<p>This mode is only applicable to the search function; the
walk functions treat this as <tt>MXML_DESCEND</tt> since
every call is a first time.<br><br></p></li>

<li><tt>MXML_DESCEND</tt> means to keep descending until
you hit the bottom of the tree. The previous node from
"group" would be the "val3" node and the next node would
be the first node element under "group".

<p>If you were to walk from the root node "?xml" to the end
of the tree with <tt>mxmlWalkNext()</tt>, the order would
be:</p>

<p><tt>?xml data node val1 node val2 node val3 group node
val4 node val5 node val6 node val7 node val8</tt></p>

<p>If you started at "val8" and walked using
<tt>mxmlWalkPrev()</tt>, the order would be reversed,
ending at "?xml".</p></li>

</ul>
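
<p>The walk order described above can be reproduced with a small
toy model. The sketch below is not Mini-XML's implementation: it
uses a hypothetical <tt>toy_node_t</tt> structure that only has the
four link pointers and a label, plus hypothetical
<tt>toy_add</tt>, <tt>toy_walk_next</tt>, and
<tt>toy_example_order</tt> functions, to show how a descending
walk and a non-descending walk could differ.</p>

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a node: a label and four link pointers */
typedef struct toy_node_s
{
  const char        *label;
  struct toy_node_s *parent, *child, *prev, *next;
} toy_node_t;

/* Append "kid" as the last child of "parent" */
void toy_add(toy_node_t *parent, toy_node_t *kid)
{
  toy_node_t *last = parent->child;

  kid->parent = parent;

  if (last == NULL)
  {
    parent->child = kid;
    return;
  }

  while (last->next != NULL)
    last = last->next;

  last->next = kid;
  kid->prev  = last;
}

/* Return the node after "node"; "descend" of 0 skips child nodes */
toy_node_t *toy_walk_next(toy_node_t *node, toy_node_t *top, int descend)
{
  if (node == NULL)
    return (NULL);

  if (descend && node->child != NULL)
    return (node->child);

  /* Otherwise try siblings, climbing through parents up to "top" */
  while (node != NULL && node != top)
  {
    if (node->next != NULL)
      return (node->next);

    node = node->parent;
  }

  return (NULL);
}

/* Build the chapter's example tree and list its labels in walk order */
void toy_example_order(char *buf, size_t bufsize)
{
  static const char *vals[8] = { "val1", "val2", "val3", "val4",
                                 "val5", "val6", "val7", "val8" };
  toy_node_t xml = { "?xml" }, data = { "data" }, group = { "group" };
  toy_node_t elems[8], texts[8];
  toy_node_t *n;
  int        i;

  toy_add(&xml, &data);

  for (i = 0; i < 8; i ++)
  {
    memset(&elems[i], 0, sizeof(toy_node_t));
    memset(&texts[i], 0, sizeof(toy_node_t));
    elems[i].label = "node";
    texts[i].label = vals[i];

    if (i == 3)
      toy_add(&data, &group);   /* "group" follows the third "node" */

    /* val4 through val6 live under "group", the rest under "data" */
    toy_add((i >= 3 && i <= 5) ? &group : &data, &elems[i]);
    toy_add(&elems[i], &texts[i]);
  }

  buf[0] = '\0';

  for (n = &xml; n != NULL; n = toy_walk_next(n, &xml, 1))
  {
    if (buf[0] != '\0')
      strncat(buf, " ", bufsize - strlen(buf) - 1);

    strncat(buf, n->label, bufsize - strlen(buf) - 1);
  }
}
```

<p>Calling <tt>toy_walk_next()</tt> on the "group" node with a
<tt>descend</tt> value of 0 returns the "node" element to its right
instead of the first "node" element beneath it, which mirrors the
difference between <tt>MXML_NO_DESCEND</tt> and
<tt>MXML_DESCEND</tt> described in the list.</p>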

</body>
</html>