Gentoo GuideXML Guide
1.
GuideXML basics
GuideXML design goals
The GuideXML syntax is lightweight yet expressive: it is easy to learn, yet
it provides all the features we need for creating web documentation. The
number of tags is kept to a minimum -- just those we need. This makes it easy
to transform guides into other formats, such as DocBook XML/SGML or web-ready
HTML.
The goal is to make it easy to create and transform GuideXML
documents.
Further Resources
If you are planning on contributing documentation to Gentoo, or you want to
test GuideXML, please read our Doc Tips 'n' Tricks guide
which contains tips and tricks for documentation development.
You may want to look at the XML source of this
document while you read it.
2.
GuideXML
Basic structure
Let's start learning the GuideXML syntax. We'll start with the initial
tags used in a GuideXML document:
Code Listing 2.1: The initial part of a GuideXML document |
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE guide SYSTEM "/dtd/guide.dtd">
<!-- $Header$ -->
<guide link="/doc/en/guide.xml" lang="en">
<title>Gentoo Documentation Guide</title>
<author title="Author">
  <mail link="yourname@gentoo.org">Your Name</mail>
</author>
<abstract>
This guide shows you how to compose web documentation using
our new lightweight Gentoo GuideXML syntax. This syntax is the official
format for Gentoo web documentation, and this document itself was created
using GuideXML.
</abstract>
<!-- The content of this document is licensed under the CC-BY-SA license -->
<!-- See http://creativecommons.org/licenses/by-sa/2.5 -->
<license/>
<version>1.0</version>
<date>2004-12-25</date>
|
The first lines contain the requisite XML declaration, which identifies this
as an XML document, and the DOCTYPE declaration, which specifies its DTD. The
<!-- $Header$ --> line will be automatically modified by the CVS server and
helps to track revisions. Next, there's a <guide> tag -- the entire guide
document is enclosed within a <guide> </guide> pair.
The link attribute is optional and should preferably contain the
absolute path to the document relative to the document root, even though the
file name alone will work. It is only used to generate a link to a
printer-friendly version of your document and to check whether a translation
is up-to-date. Our XSL back-end passes the actual path to our XSL stylesheet.
The link attribute is only used as a fall-back value in case the XML is
processed by other means.
The lang attribute should be used to specify the language code of your
document. It is used to format the date and insert strings like "Note",
"Content", etc. in the specified language. The default is English.
Next, there's a <title> tag, used to set the title for the entire
guide document.
Then, we come to the <author> tags, which contain information
about the various authors of the document. Each <author> tag
allows for an optional title attribute, used to specify the author's
relationship to the document (author, co-author, editor, etc.). In this
particular example, the author's name is enclosed in another tag -- a
<mail> tag, used to specify an email address for this particular
person. The <mail> tag is optional and can be omitted, but at
least one <author> element is required per guide document.
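As a minimal sketch of the rules above, the fragment below lists two authors, one with a <mail> tag and one without. The names and the address are placeholders invented for illustration, not real contributors.

```xml
<!-- Fragment: these elements go directly inside <guide>, after <title>. -->
<!-- Names and addresses are placeholders made up for this sketch. -->
<author title="Author">
  <mail link="jane.doe@example.com">Jane Doe</mail>
</author>
<author title="Editor">
  John Smith
</author>
```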
Next, we come to the <abstract>, <version> and
<date> tags, used to specify a summary of the document, the
current version number, and the current version date (in YYYY-MM-DD format)
respectively. Dates that are invalid or not in the YYYY-MM-DD format will
appear verbatim in the rendered document.
This sums up the tags that should appear at the beginning of a guide document.
Besides the <title> and <mail> tags, these tags
shouldn't appear anywhere else except immediately inside the
<guide> tag, and for consistency it's recommended (but not
required) that these tags appear before the content of the document.
Finally we have the <license/> tag, used to publish the document
under the Creative
Commons - Attribution / Share Alike license as required by the Documentation Policy.
Chapters and sections
Once the initial tags have been specified, you're ready to start adding the
structural elements of the document. Guide documents are divided into
chapters, and each chapter can hold one or more sections. Every chapter and
section has a title. Here's an example chapter with a single section,
consisting of a paragraph. If you append this XML to the XML in the previous excerpt and append a
</guide> to the end of the file, you'll have a valid (if minimal)
guide document:
Code Listing 2.2: Minimal guide example |
<chapter>
<title>This is my chapter</title>
<section>
<title>This is section one of my chapter</title>
<body>
<p>
This is the actual text content of my section.
</p>
</body>
</section>
</chapter>
|
Above, I set the chapter title by adding a child <title>
element to the <chapter> element. Then, I created a section by
adding a <section> element. If you look inside the
<section> element, you'll see that it has two child elements -- a
<title> and a <body>. While the <title>
is nothing new, the <body> is -- it contains the actual text
content of this particular section. We'll look at the tags that are allowed
inside a <body> element in a bit.
Note:
A <guide> element must contain at least one <chapter>
element, a <chapter> must contain at least one
<section> element, and a <section> element must
contain at least one <body> element.
|
An example <body>
Now, it's time to learn how to mark up actual content. Here's the XML code for
an example <body> element:
Code Listing 2.3: Example of a body element |
<p>
This is a paragraph. <path>/etc/passwd</path> is a file.
<uri>http://forums.gentoo.org</uri> is my favorite website.
Type <c>ls</c> if you feel like it. I <e>really</e> want to go to sleep now.
</p>
<pre caption="Code Sample">
This is text output or code.
# <i>this is user input</i>
Make HTML/XML easier to read by using selective emphasis:
<foo><i>bar</i></foo>
<comment>(This is how to insert a comment into a code block)</comment>
</pre>
<note>
This is a note.
</note>
<warn>
This is a warning.
</warn>
<impo>
This is important.
</impo>
|
Now, here's how the <body> element above is rendered:
This is a paragraph. /etc/passwd is a file.
http://forums.gentoo.org is my favorite website.
Type ls if you feel like it. I really want to go to sleep now.
Code Listing 2.4: Code Sample |
This is text output or code.
# this is user input
Make HTML/XML easier to read by using selective emphasis:
<foo>bar</foo>
(This is how to insert a comment into a code block)
|
Note:
This is a note.
|
Warning:
This is a warning.
|
Important:
This is important.
|
The <body> tags
We introduced a lot of new tags in the previous section -- here's what you need
to know. The <p> (paragraph), <pre> (code block),
<note>, <warn> (warning) and <impo>
(important) tags all can contain one or more lines of text. Besides the
<table>, <ul>, <ol> and
<dl> elements (which we'll cover in just a bit), these are the
only tags that should appear immediately inside a <body> element.
Another thing -- these tags should not be stacked -- in other words,
don't put a <note> element inside a <p> element. As
you might guess, the <pre> element preserves its whitespace
exactly, making it well-suited for code excerpts. You must name the
<pre> tag with a caption attribute:
Code Listing 2.5: Named <pre> |
<pre caption="Output of uptime">
# <i>uptime</i>
16:50:47 up 164 days, 2:06, 5 users, load average: 0.23, 0.20, 0.25
</pre>
|
Epigraphs
Delegates from the original 13 states formed the Contented Congress. Thomas
Jefferson, a Virgin, and Benjamin Franklin were two singers of the Declaration
of Independence. Franklin discovered electricity by rubbing two cats backwards
and declared, "A horse divided against itself cannot stand." Franklin died in
1790 and is still dead.
—Anonymous student
Epigraphs are sometimes used at the beginning of chapters to illustrate what is
to follow. An epigraph is simply a paragraph with a by attribute that contains
the signature.
Code Listing 2.6: Short epigraph |
<p by="Anonymous student">
Delegates from the original 13 states formed the...
</p>
|
<path>, <c>, <b>, <e>, <sub> and <sup>
The <path>, <c>, <b>, <e>,
<sub> and <sup> elements can be used inside any child
element of <body>, except for <pre>.
The <path> element is used to mark text that refers to an
on-disk file -- either an absolute or relative path, or a
simple filename. This element is generally rendered with a monospaced
font to offset it from the standard paragraph type.
The <c> element is used to mark up a command or user
input. Think of <c> as a way to alert the reader to something
that they can type in that will perform some kind of action. For example, all
the XML tags displayed in this document are enclosed in a <c>
element because they represent something that the user could type in that is
not a path. By using <c> elements, you'll help your readers
quickly identify commands that they need to type in. Also, because
<c> elements are already offset from regular text, it is rarely
necessary to surround user input with double-quotes. For example, don't
refer to a "<c>" element like I did in this sentence. Avoiding
the use of unnecessary double-quotes makes a document more readable -- and
adorable!
As you might have guessed, <b> is used to boldface some
text.
<e> is used to apply emphasis to a word or phrase; for example:
I really should use semicolons more often. As you can see, this text is
offset from the regular paragraph type for emphasis. This helps to give your
prose more punch!
The <sub> and <sup> elements are used to specify
subscript and superscript.
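The following sketch combines the inline elements described above in a single paragraph. The file name, command and wording are illustrative only.

```xml
<body>

<p>
Edit <path>/etc/make.conf</path>, then run <c>emerge --sync</c>. This step is
<b>required</b>, and you <e>really</e> should not skip it. Water is
H<sub>2</sub>O and a kilobyte is 2<sup>10</sup> bytes.
</p>

</body>
```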
Code samples and colour-coding
To improve the readability of code samples, the following tags are allowed
inside <pre> blocks:
- <i>
- Distinguishes user input from displayed text
- <comment>
- Comments relevant to the action(s) that appear after the comment
- <keyword>
- Denotes a keyword in the language used in the code sample
- <ident>
- Used for an identifier
- <const>
- Used for a constant
- <stmt>
- Used for a statement
- <var>
- Used for a variable
Note:
Remember that all leading and trailing spaces and all line breaks in
<pre> blocks will appear in the displayed HTML page.
|
Sample colour-coded <pre> block:
Code Listing 2.7: My first ebuild |
DESCRIPTION="Exuberant ctags generates tags files for quick source navigation"
HOMEPAGE="http://ctags.sourceforge.net"
SRC_URI="mirror://sourceforge/ctags/${P}.tar.gz"
LICENSE="GPL-2"
SLOT="0"
KEYWORDS="~mips ~sparc ~x86"
IUSE=""
src_compile() {
econf --with-posix-regex
emake || die "emake failed"
}
src_install() {
make DESTDIR="${D}" install || die "install failed"
dodoc FAQ NEWS README
dohtml EXTENDING.html ctags.html
}
|
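The source markup behind a colour-coded listing like the one above might look roughly like this excerpt. Which tag wraps which token is an assumption made for this sketch; the rendered listing does not reveal the markup actually used.

```xml
<pre caption="My first ebuild">
<!-- Which tag wraps which token is an assumption made for this sketch -->
<var>HOMEPAGE</var>=<const>"http://ctags.sourceforge.net"</const>
<var>SLOT</var>=<const>"0"</const>

<ident>src_compile</ident>() {
  <keyword>econf</keyword> --with-posix-regex
  <keyword>emake</keyword> || die <const>"emake failed"</const>
}
</pre>
```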
<mail> and <uri>
We took a look at the <mail> tag earlier; it's used to link
some text with a particular email address, and takes the form <mail
link="foo.bar@example.com">Mr. Foo Bar</mail>. If you want to display the
email address itself, you can use <mail>foo.bar@example.com</mail>, which
would be displayed as foo.bar@example.com.
Shorter forms make it easier to use names and emails of Gentoo developers. Both
<mail>neysx</mail> and <mail link="neysx"/>
would appear as Xavier Neys. If you want to link a Gentoo dev's email
to text other than their full name, use the second form with some
content. For instance, to use a dev's first name: <mail
link="neysx">Xavier</mail> appears as Xavier.
This is particularly useful when you want to name a developer whose name
contains "funny" characters that you can't type.
The <uri> tag is used to point to files/locations on the Internet.
It has two forms -- the first can be used when you want to have the actual URI
displayed in the body text, such as this link to
http://forums.gentoo.org/. To create this link, I typed
<uri>http://forums.gentoo.org/</uri>. The alternate form is
when you want to associate a URI with some other text -- for example, the Gentoo Forums. To create
this link, I typed <uri link="http://forums.gentoo.org/">the
Gentoo Forums</uri>. You don't need to write
http://www.gentoo.org/ to link to other parts of the Gentoo web site.
For instance, a link to the documentation main index
should be simply <uri link="/doc/en/index.xml">documentation main
index</uri>. You can even omit index.xml when you link to a
directory index, e.g. <uri link="/doc/en/">documentation main
index</uri>. Keeping the trailing slash saves an extra HTTP request.
You should not use a <uri> tag with a link attribute that
starts with mailto:. In this case, use a <mail> tag.
Please avoid the click here
syndrome as recommended by the W3C.
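To pull the <mail> and <uri> forms together, here is a short sketch of a paragraph that uses both link styles. The addresses and links are the ones used in the text above; the surrounding wording is illustrative.

```xml
<p>
Contact <mail link="foo.bar@example.com">Mr. Foo Bar</mail> or write to
<mail>foo.bar@example.com</mail>. Questions are welcome on
<uri link="http://forums.gentoo.org/">the Gentoo Forums</uri>, and the
<uri link="/doc/en/">documentation main index</uri> lists further guides.
</p>
```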
Figures
Here's how to insert a figure into a document -- <figure
link="mygfx.png" short="my picture" caption="my favorite picture of all
time"/>. The link attribute points to the actual graphic image,
the short attribute specifies a short description (currently used for
the image's HTML alt attribute), and the caption attribute supplies a
caption. Not too difficult :) We also support the standard HTML-style
<img src="foo.gif"/> tag for adding images without captions, borders, etc.
Tables
GuideXML supports a simplified table syntax similar to that of HTML. To start a
table, use a <table> tag. Start a row with a <tr>
tag. However, for inserting actual table data, we don't support the HTML
<td> tag; instead, use <th> if you are inserting a
header, and <ti> if you are inserting a normal informational
block. You can use a <th> anywhere you can use a <ti>
-- there's no requirement that <th> elements appear only in the
first row.
Besides, both table headers (<th>) and table items
(<ti>) accept the colspan and rowspan attributes to
span their content across rows, columns or both.
Furthermore, table cells (<ti> & <th>) can be
right-aligned, left-aligned or centered with the align attribute.
| This title spans 4 columns                                    |
| This title spans 6 rows | Item A1     | Item A2 | Item A3     |
|                         | Item B1     | Blocky 2x2 title      |
|                         | Item C1     |                       |
|                         | Item D1..D3                         |
|                         | Item E1..F1 | Item E2..E3           |
|                         |             | Item F2..F3           |
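A table like the one above could be written roughly as follows. The colspan and rowspan values are inferred from the cell labels. The align attribute is left out because the alignment of each cell cannot be recovered from the rendered table, but it can be added to any <th> or <ti>.

```xml
<table>
<tr>
  <th colspan="4">This title spans 4 columns</th>
</tr>
<tr>
  <th rowspan="6">This title spans 6 rows</th>
  <ti>Item A1</ti>
  <ti>Item A2</ti>
  <ti>Item A3</ti>
</tr>
<tr>
  <ti>Item B1</ti>
  <th colspan="2" rowspan="2">Blocky 2x2 title</th>
</tr>
<tr>
  <ti>Item C1</ti>
</tr>
<tr>
  <ti colspan="3">Item D1..D3</ti>
</tr>
<tr>
  <ti rowspan="2">Item E1..F1</ti>
  <ti colspan="2">Item E2..E3</ti>
</tr>
<tr>
  <ti colspan="2">Item F2..F3</ti>
</tr>
</table>
```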
Lists
To create ordered or unordered lists, simply use the XHTML-style
<ol>, <ul> and <li> tags. Lists may only
appear inside the <body> and <li> tags, which means
that you can have lists inside lists. Don't forget that you are writing XML and
that, unlike in HTML, you must close all tags, including list items.
Definition lists (<dl>) are also supported. Please note that
neither the definition term tag (<dt>) nor the definition data tag
(<dd>) accepts any other block level tag such as paragraphs or
admonitions. A definition list comprises:
- <dl>
- A Definition List Tag containing
- <dt>
- Pairs of Definition Term Tags
- <dd>
- and Definition Data Tags
The following list copied from w3.org shows
that a definition list can contain ordered and unordered lists. It may not
contain another definition list though.
- The ingredients:
  - 100 g. flour
  - 10 g. sugar
  - 1 cup water
  - 2 eggs
  - salt, pepper
- The procedure:
  - Mix dry ingredients thoroughly
  - Pour in wet ingredients
  - Mix for 10 minutes
  - Bake for one hour at 300 degrees
- Notes:
  The recipe may be improved by adding raisins
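The recipe above might be marked up as in the following sketch. Using <ul> for the ingredients and <ol> for the procedure is an assumption, since the rendered list does not show which list type each part uses.

```xml
<dl>
  <dt>The ingredients:</dt>
  <dd>
    <ul>
      <li>100 g. flour</li>
      <li>10 g. sugar</li>
      <li>1 cup water</li>
      <li>2 eggs</li>
      <li>salt, pepper</li>
    </ul>
  </dd>
  <dt>The procedure:</dt>
  <dd>
    <ol>
      <li>Mix dry ingredients thoroughly</li>
      <li>Pour in wet ingredients</li>
      <li>Mix for 10 minutes</li>
      <li>Bake for one hour at 300 degrees</li>
    </ol>
  </dd>
  <dt>Notes:</dt>
  <dd>The recipe may be improved by adding raisins</dd>
</dl>
```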
Intra-document references
GuideXML makes it really easy to reference other parts of the document using
hyperlinks. You can create a link pointing to Chapter
One by typing <uri link="#doc_chap1">Chapter
One</uri>. To point to section two of
Chapter One, type <uri link="#doc_chap1_sect2">section two of
Chapter One</uri>. To refer to figure 3 in chapter 1, type
<uri link="#doc_chap1_fig3">figure 1.3</uri>. Or, to refer
to code listing 2 in chapter 2, type
<uri link="#doc_chap2_pre2">code listing 2.2</uri>.
However, some guides change often and using such "counting" can lead to broken
links. In order to cope with this, you can define a name for a
<chapter>, <section> or a <tr> by using
the id attribute, and then point to that attribute, like this:
Code Listing 2.8: Using the id attribute |
<chapter id="foo">
<title>This is foo!</title>
...
<p>
More information can be found in the <uri link="#foo">foo chapter</uri>
</p>
|
Disclaimers and obsolete documents
A disclaimer attribute can be applied to guides and handbooks to display
a predefined disclaimer at the top of the document. The available disclaimers
are:
- articles is used for republished articles
- draft is used to indicate a document is still being worked on and
  should not be considered official
- oldbook is used on old handbooks to indicate they are not maintained
  anymore
- obsolete is used to mark a document as obsolete
When marking a document as obsolete, you might want to add a link to a new
version. The redirect attribute does just that. The user might be
automatically redirected to the new page but you should not rely on that
behaviour.
Code Listing 2.9: Disclaimer sample |
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE guide SYSTEM "/dtd/guide.dtd">
<!-- $Header$ -->
<guide disclaimer="obsolete" redirect="/doc/en/handbook/handbook-x86.xml">
<title>Gentoo x86 Installation Guide</title>
<author title="Author">
...
|
FAQs
FAQ documents need to start with a list of questions with links to their
answers. Creating such a list is both time-consuming and error-prone. The list
can be created automatically if you use a faqindex element as the first
chapter of your document. This element has the same structure as a
chapter to allow some introductory text. The structure of the document
is expected to be split into chapters (at least one chapter) containing
sections, each section containing one question specified in its title
element with the answer in its body. The FAQ index will appear as one
section per chapter and one link per question.
A quick look at a FAQ and its source should make the above
obvious.
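A rough sketch of the structure described above follows. The titles, question and answer are placeholders invented for illustration, and the header tags of the guide are omitted.

```xml
<!-- Fragment of a FAQ guide; header tags (title, author, ...) are omitted. -->
<!-- All titles and texts below are placeholders made up for this sketch. -->
<faqindex>
<title>Questions</title>
<section>
<title>Introduction</title>
<body>

<p>
Introductory text for the FAQ goes here.
</p>

</body>
</section>
</faqindex>

<chapter>
<title>Getting Started</title>
<section>
<title>How do I install the example package?</title>
<body>

<p>
The answer to the question goes in the body of the section.
</p>

</body>
</section>
</chapter>
```

With this layout, the generated index would be expected to contain one section for the Getting Started chapter with a single link to the question.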
3.
Handbook Format
Guide vs Book
For high-volume documentation, such as the Installation Instructions, a
broader format was needed. We designed a GuideXML-compatible enhancement that
allows us to write modular and multi-page documentation.
Main File
The first change is the need for a "master" document. This document contains no
real content, but links to the individual documentation modules. The syntax
doesn't differ much from GuideXML:
Code Listing 3.1: Example book usage |
<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE book SYSTEM "/dtd/book.dtd">
<!-- $Header$ -->
<book>
<title>Example Book Usage</title>
<author...>
...
</author>
<abstract>
...
</abstract>
<!-- The content of this document is licensed under the CC-BY-SA license -->
<!-- See http://creativecommons.org/licenses/by-sa/2.5 -->
<license/>
<version>...</version>
<date>...</date>
|
So far no real differences (except for the <book> instead of
<guide> tag). Instead of starting with the individual
<chapter>s, you define a <part>, which is the
equivalent of a separate part in a book:
Code Listing 3.2: Defining a part |
<part>
<title>Part One</title>
<abstract>
...
</abstract>
</part>
|
Each part is accompanied by a <title> and an
<abstract> which gives a small introduction to the part.
Inside each part, you define the individual <chapter>s. Each
chapter must be a separate document. As a result it is no surprise that
a special tag (<include>) is added to allow including the separate
document.
Code Listing 3.3: Defining a chapter |
<chapter>
<title>Chapter One</title>
<include href="path/to/chapter-one.xml"/>
</chapter>
|
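Putting the two previous listings together, a part holding two chapters might look like the sketch below. The titles, the abstract text and the second file name are hypothetical.

```xml
<part>
<title>Part One</title>
<abstract>
This part is a placeholder introduction written for this sketch.
</abstract>

<chapter>
<title>Chapter One</title>
  <include href="path/to/chapter-one.xml"/>
</chapter>

<chapter>
<title>Chapter Two</title>
  <!-- Hypothetical second module, named by analogy with the first -->
  <include href="path/to/chapter-two.xml"/>
</chapter>
</part>
```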
Designing the Individual Chapters
The content of an individual chapter is structured as follows:
Code Listing 3.4: Chapter Syntax |
<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE sections SYSTEM "/dtd/book.dtd">
<!-- $Header$ -->
<!-- The content of this document is licensed under the CC-BY-SA license -->
<!-- See http://creativecommons.org/licenses/by-sa/2.5 -->
<sections>
<abstract>
This is a small explanation on chapter one.
</abstract>
<version>...</version>
<date>...</date>
</sections>
|
Inside each chapter you can define <section>s (equivalent of
<chapter> in a Guide) and <subsection>s (equivalent
of <section> in a Guide).
Each individual chapter should have its own date and version elements. The
latest date of all chapters and master document will be displayed when a user
browses through all parts of the book.
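As an illustration of this nesting, a chapter file with one section and one subsection could be laid out as follows. The version, date, titles and text are placeholder values.

```xml
<sections>

<abstract>
This is a small explanation on chapter one.
</abstract>

<!-- Placeholder version and date for this sketch -->
<version>1.0</version>
<date>2004-12-25</date>

<section>
<title>Placeholder section title</title>
<subsection>
<title>Placeholder subsection title</title>
<body>

<p>
The text content goes here, just like in the body of a guide section.
</p>

</body>
</subsection>
</section>
</sections>
```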
4.
Advanced Handbook Features
Global Values
Sometimes, the same values are repeated many times in several parts of a
handbook. Global search and replace operations tend to miss some occurrences
or introduce unwanted changes. Besides, it can be useful to define different
values to be used in shared chapters depending on which handbook includes the
chapter.
Global values can be defined in a handbook master file and used in all included
chapters.
To define global values, add a <values> element to the handbook
master file. Each value is then defined in a <key> element whose
id attribute identifies the value, i.e. it is the name of your variable.
The content of the <key> is its value.
The following example defines three values in a handbook master file:
Code Listing 4.1: Define values in a handbook |
<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE book SYSTEM "/dtd/book.dtd">
<!-- $Header$ -->
<book>
<title>Example Book Usage</title>
<values>
<key id="arch">x86</key>
<key id="min-cd-name">install-x86-minimal-2007.0-r1.iso</key>
<key id="min-cd-size">57</key>
</values>
<author...>
...
</author>
...
|
The defined values can then be used throughout the handbook with the in-line
<keyval id="key_id"/> element. Specify the name of the key in its
id attribute, e.g. <keyval id="min-cd-name"/> would be replaced by
"install-x86-minimal-2007.0-r1.iso" in our example.
Code Listing 4.2: Using defined values |
<p>
The Minimal Installation CD is called <c><keyval id="min-cd-name"/></c>
and takes up only <keyval id="min-cd-size"/> MB of diskspace. You can use this
Installation CD to install Gentoo, but <e>only</e> with a working Internet
connection.
</p>
|
To make life easier on our translators, only use actual values, i.e. content
that does not need to be translated. For instance, we defined the
min-cd-size value as 57 and not 57 MB.
Conditional Elements
Chapters that are shared by several handbooks such as our Installation Handbooks often have small
differences depending on which handbook includes them. Instead of adding
content that is irrelevant to some handbooks, authors can add a condition to
the following elements: <section>, <subsection>,
<body>, <note>, <impo>,
<warn>, <pre>, <p>,
<table>, <tr>, <ul>, <ol>
and <li>.
The condition must be an XPath expression that will be
evaluated when transforming the XML. If it evaluates to true, the
element is processed; if not, it is ignored. The condition is specified in a
test attribute.
The following example uses the arch value that is defined in each
handbook master file to condition some content:
Code Listing 4.3: Using conditional elements |
<body test="contains('AMD64 x86',func:keyval('arch'))">
<p>
This paragraph applies to both x86 and AMD64 architectures.
</p>
<p test="func:keyval('arch')='x86'">
This paragraph only applies to the x86 architecture.
</p>
<p test="func:keyval('arch')='AMD64'">
This paragraph only applies to the AMD64 architecture.
</p>
<p test="func:keyval('arch')='PPC'">
This paragraph will never be seen!
The whole body is skipped because of the first condition.
</p>
</body>
<body test="contains('AMD64 PPC64',func:keyval('arch'))">
<p>
This paragraph applies to the AMD64, PPC64 and PPC architectures because
the 'AMD64 PPC64' string does contain 'PPC'.
</p>
<note test="func:keyval('arch')='AMD64' or func:keyval('arch')='PPC64'">
This note only applies to the AMD64 and PPC64 architectures.
</note>
</body>
|
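Because contains() does a plain substring match, a value such as 'PPC' also matches a list that only names 'PPC64'. One way to avoid this is to pad both the list and the value with spaces so that only whole words match. This is a sketch using the standard XPath concat() function and assumes func:keyval is available exactly as in the listing above.

```xml
<body test="contains(' AMD64 PPC64 ', concat(' ', func:keyval('arch'), ' '))">

<p>
This paragraph applies to AMD64 and PPC64 only. A value of 'PPC' no longer
matches, because ' PPC ' is not a substring of ' AMD64 PPC64 '.
</p>

</body>
```

The explicit comparison with or, as used in the note of the listing above, achieves the same result and may be easier to read when only two or three values are involved.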
5.
Coding Style
Introduction
Since all Gentoo Documentation is a joint effort and several people will
most likely change existing documentation, a coding style is needed.
The coding style has two parts. The first one concerns
internal coding -- how the XML tags are placed. The second one
concerns the content -- how not to confuse the reader.
Both parts are described next.
Internal Coding Style
Newlines must be placed immediately after every
GuideXML-tag (both opening and closing), except for:
<version>, <date>, <title>,
<th>, <ti>,
<li>, <i>, <e>,
<uri>, <path>, <b>, <c>,
<comment>, <mail>.
Blank lines must be placed immediately after every
<body> (opening tag only) and before every
<chapter>, <p>, <table>,
<author> (set), <pre>, <ul>,
<ol>, <warn>, <note> and
<impo> (opening tags only).
Word-wrapping must be applied at 80 characters except inside
<pre>. You may only deviate from this rule when there is no other
choice (for instance when a URL exceeds the maximum number of characters). The
editor must then wrap at the first whitespace that occurs. You should try to
keep the rendered content of <pre> elements within 80
columns to help console users.
Indentation may not be used, except with the XML-constructs of which the
parent XML-tags are <tr> (from <table>),
<ul>, <ol>, <dl>, and
<author>. If indentation is used, it must be two spaces for
each level of indentation. That means no tabs and no more than two spaces.
Besides, tabs are not allowed in GuideXML documents.
In case word-wrapping happens in <ti>, <th>,
<li> or <dd> constructs, indentation must be used for
the content.
An example for indentation is:
Code Listing 5.1: Indentation Example |
<table>
<tr>
  <th>Foo</th>
  <th>Bar</th>
</tr>
<tr>
  <ti>This is an example for indentation</ti>
  <ti>
    In case text cannot be shown within an 80-character wide line, you
    must use indentation if the parent tag allows it
  </ti>
</tr>
</table>

<ul>
  <li>First option</li>
  <li>Second option</li>
</ul>
|
Attributes may not have spaces in between the attribute, the "=" mark,
and the attribute value. As an example:
Code Listing 5.2: Attributes |
Wrong  : <pre caption = "Attributes">
Correct: <pre caption="Attributes">
|
External Coding Style
Inside tables (<table>) and listings (<ul>,
<ol> and <dl>), periods (".") should not be used
unless multiple sentences are used. In that case, every sentence should end
with a period (or another punctuation mark).
Every sentence, including those inside tables and listings, should start
with a capital letter.
Code Listing 5.3: Periods and capital letters |
<ul>
  <li>No period</li>
  <li>With period. Multiple sentences, remember?</li>
</ul>
|
Code Listings should always have a caption.
Try to use <uri> with the link attribute as much as
possible. In other words, the Gentoo
Forums is preferred over http://forums.gentoo.org.
When you comment something inside a <pre> construct, use
<comment> and parentheses or the comment marker for the language
that is being used (# for bash scripts and many other things, //
for C code, etc.). Also place the comment before the subject of the
comment.
Code Listing 5.4: Comment example |
(Substitute "john" with your user name)
# id john
|
6.
Resources
Start writing
GuideXML has been specially designed to be "lean and mean" so that developers
can spend more time writing documentation and less time learning the actual XML
syntax. Hopefully, this will allow developers who aren't unusually "doc-savvy"
to start writing quality Gentoo documentation. You might be interested in our
Documentation Development Tips
& Tricks. If you'd like to help (or have any questions about
GuideXML), please post a message to the gentoo-doc mailing list stating what you'd like
to tackle. Have fun!
The contents of this document are licensed under the Creative Commons -
Attribution / Share Alike license.
|