DITA2InDesign Plugin Project

DITA Project Gutenberg Samples

The DITA2InDesign project materials include a number of books from Project Gutenberg that have been converted to DITA format. These books serve primarily as realistic test documents for exercising the DITA-to-InDesign process, but they also serve more generally as non-trivial, interesting, unencumbered sample DITA documents that can be used for demonstration, product testing, and experimentation. The materials served here are provided as-is. They have been converted as quickly as possible, and the markup, while valid, is not necessarily optimal.

Contributions to this data set are welcome: if you convert a Gutenberg document to DITA and would like to add it to this collection, simply contact one of the project members or post a note to the dita-users Yahoo group. The only requirements are that the documents be syntactically valid DITA (using either standard doctypes or, if they use non-standard shells or specializations, including the necessary non-standard declaration sets) and that they include all required Project Gutenberg license and attribution statements. You can use the existing documents as models for how to represent the Project Gutenberg statements.
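
A quick pre-submission check along these lines can be sketched in Python. This is a minimal sketch and not part of the project's tooling: the function name `check_sample`, the list of file extensions, and the "Project Gutenberg" marker string are assumptions made for illustration. It tests only XML well-formedness with the standard library, which is weaker than validation against the DITA doctypes, so a validating parser is still needed to confirm that a document is valid DITA.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Assumed file extensions for DITA topics and maps; adjust to the sample's layout.
DITA_SUFFIXES = {".dita", ".ditamap", ".xml"}
# Assumed marker text; the real statements should follow the existing samples.
LICENSE_MARKER = "Project Gutenberg"


def check_sample(folder):
    """Return a list of problem descriptions for the DITA files under folder."""
    problems = []
    marker_found = False
    for path in sorted(Path(folder).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in DITA_SUFFIXES:
            continue
        try:
            root = ET.parse(path).getroot()
        except ET.ParseError as err:
            # Well-formedness failure; DTD validity is NOT checked here.
            problems.append(f"{path}: not well-formed ({err})")
            continue
        # Look for the marker anywhere in the element text of this file.
        if LICENSE_MARKER in "".join(root.itertext()):
            marker_found = True
    if not marker_found:
        problems.append(f"no file mentions {LICENSE_MARKER!r}")
    return problems
```

Calling `check_sample("path/to/converted_book")` returns an empty list when every file parses and at least one of them contains the marker text. Whether the statements are complete still has to be reviewed by hand.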

The samples are available in source form from the dita2indesign code repository within the trunk/dita_gutenberg_samples folder as well as in packages available through the project file download area. The HTML and PDF renderings are served through the project's Web site.

The available samples are:

Table 1. DITA Project Gutenberg Sample Documents
Document Title: Encyclopaedia Britannica 11th Edition, Vol 4, Part 3 of 4
Source Location: eb_vol_04_part_03_of_04
Renditions: HTML, no frames; HTML with frames and ToC; PDF
Notes: Uses one topic for each entry. Contains about 450 entries and renders to about 575 pages.

Document Title: The Outline of Science, Volume 1 of 4
Source Location: outline_of_science
Renditions: HTML, no frames; HTML with frames and ToC; PDF
Notes: Uses one top-level topic per chapter, with nested topics, representing a typical "narrative" document. Nine chapters, renders to about 150 pages. Uses a bookmap rather than a generic map.

Document Title: T. S. Eliot's The Waste Land
Source Location: ts_eliot_wasteland
Renditions: HTML, no frames; HTML with frames and ToC; PDF
Notes: Entire poem is one topic, with a nested topic for Eliot's notes. Uses a generic map. Uses outputclass= to indicate formatting specifics for the poem itself. Note that these outputclassed elements would be natural candidates for specialization in a "poetry" domain or "poem" topic type.

Document Title: 20,000 Leagues Under the Sea by Jules Verne
Source Location: 20000_leagues
Renditions: HTML, no frames; HTML with frames and ToC; PDF
Notes: A translation from the original French of the 19th-century novel.
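
The note on The Waste Land suggests a simple way to spot specialization candidates: tally which element and outputclass pairs a document uses. The sketch below is illustrative only. The function name `outputclass_counts` and the inline fragment, including its `stanza` and `verse-line` values, are hypothetical and do not reproduce the markup of the actual sample.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical fragment; the element choices and outputclass values are
# invented for illustration and are not taken from the real sample.
SAMPLE = """
<topic id="poem">
  <title>Example poem</title>
  <body>
    <lines outputclass="stanza">First line
Second line</lines>
    <p outputclass="verse-line">A single line</p>
    <p outputclass="verse-line">Another line</p>
  </body>
</topic>
"""


def outputclass_counts(xml_text):
    """Count (element name, outputclass value) pairs in an XML document."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for elem in root.iter():
        value = elem.get("outputclass")
        if value:
            counts[(elem.tag, value)] += 1
    return counts


# Print the pairs from most to least frequent.
for (tag, value), n in outputclass_counts(SAMPLE).most_common():
    print(f"{tag} outputclass={value!r}: {n}")
```

Pairs that recur often are the ones most likely to repay a dedicated element in a "poetry" domain or "poem" topic type.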