The Darwin Information Typing Architecture (DITA) is an XML-based architecture for authoring, producing, and delivering technical information.

The name of the architecture was derived as follows:

Darwin: Named for the naturalist Charles Darwin, DITA uses the principles of specialization and inheritance.
Information Typing: DITA capitalizes on the semantics of topics (concept, task, reference) and of content (messages, typed phrases, semantic tables).
Architecture: DITA provides vertical headroom (new applications) and edgewise extension (specialization into new types) for information.

DITA divides content into small, self-contained topics that can be reused in different deliverables. The extensibility of DITA permits organizations to define specific information structures and still use standard tools to work with them. The ability to define company-specific and even group-specific information architectures enables DITA to support content reuse and reduce information redundancy.

DITA specifies three basic topic types: Task, Concept, and Reference.

Each of the three basic topic types is a specialization of a generic Topic type, which contains a title element, a prolog element for metadata, and a body element. The body element contains paragraph, table, and list elements, similar to HTML.
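As an illustration, a minimal generic topic might look like the following sketch. The id, title text, author name, and keyword are hypothetical placeholders, and the DOCTYPE assumes the standard OASIS topic DTD is available to the parser.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE topic PUBLIC "-//OASIS//DTD DITA Topic//EN" "topic.dtd">
<!-- Hypothetical example: ids and text are placeholders -->
<topic id="widget-overview">
  <title>Widget overview</title>
  <!-- The prolog carries metadata about the topic -->
  <prolog>
    <author>Jane Doe</author>
    <metadata>
      <keywords>
        <keyword>widget</keyword>
      </keywords>
    </metadata>
  </prolog>
  <!-- The body holds paragraphs, lists, and tables, much as in HTML -->
  <body>
    <p>The widget filters incoming water.</p>
    <ul>
      <li>Removes sediment</li>
      <li>Reduces chlorine taste</li>
    </ul>
  </body>
</topic>
```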

A Task topic is intended for a procedure that describes how to accomplish a task. A Task topic lists a series of steps that users follow to produce an intended outcome. The steps are contained in a taskbody element, which is a specialization of the generic body element. The steps element is a specialization of an ordered list element.
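A short Task topic could be sketched as follows. The procedure, ids, and wording are invented for illustration; only the element names come from the standard task type.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE task PUBLIC "-//OASIS//DTD DITA Task//EN" "task.dtd">
<!-- Hypothetical example: ids and text are placeholders -->
<task id="replace-filter">
  <title>Replacing the filter</title>
  <!-- taskbody takes the place of the generic body element -->
  <taskbody>
    <prereq>Switch off the appliance.</prereq>
    <!-- steps behaves like an ordered list; each step requires a cmd -->
    <steps>
      <step>
        <cmd>Open the cover.</cmd>
      </step>
      <step>
        <cmd>Remove the old filter.</cmd>
        <info>Dispose of it according to local regulations.</info>
      </step>
      <step>
        <cmd>Insert the new filter and close the cover.</cmd>
      </step>
    </steps>
    <result>The appliance is ready for use.</result>
  </taskbody>
</task>
```

The specialization relationship is recorded in each element's class attribute, which the DTD normally supplies as a default value so that authors do not type it. Written out explicitly, the ancestry of the elements above would look roughly like this:

```xml
<!-- class values list the ancestor element first, then the specialized one -->
<steps class="- topic/ol task/steps ">
  <step class="- topic/li task/step ">
    <cmd class="- topic/ph task/cmd ">Open the cover.</cmd>
  </step>
</steps>
```

A processor that does not know the task type can fall back on the topic/ol and topic/li values and treat the steps as an ordinary ordered list.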

A Concept topic presents more objective information, such as definitions, rules, and guidelines.
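A Concept topic might be sketched as follows. The subject matter and id are hypothetical; the conbody element plays the role of the generic body.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<!-- Hypothetical example: ids and text are placeholders -->
<concept id="about-filters">
  <title>About filters</title>
  <conbody>
    <p>A filter is a replaceable cartridge that removes sediment from water.</p>
    <p>Replace the filter every six months.</p>
  </conbody>
</concept>
```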

A Reference topic describes command syntax, programming instructions, and other reference material, and usually contains detailed, factual content.
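A Reference topic for a command could look like the sketch below. The copy command and its parameters are hypothetical, and only base elements are used inside the refbody to keep the example independent of any particular domain specialization.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE reference PUBLIC "-//OASIS//DTD DITA Reference//EN" "reference.dtd">
<!-- Hypothetical example: the command and its parameters are invented -->
<reference id="copy-command">
  <title>copy</title>
  <refbody>
    <!-- refsyn holds the syntax summary -->
    <refsyn>copy source target</refsyn>
    <section>
      <title>Parameters</title>
      <dl>
        <dlentry>
          <dt>source</dt>
          <dd>The file to copy.</dd>
        </dlentry>
        <dlentry>
          <dt>target</dt>
          <dd>The location of the new copy.</dd>
        </dlentry>
      </dl>
    </section>
  </refbody>
</reference>
```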

The DITA architecture and a related DTD and XML Schema were originally developed by IBM. DITA is now an OASIS standard.


  • DITA encourages writing of content as modular topics, as opposed to long "book-oriented" files. Topics can be easily reused in different deliverables.
  • Fragments of content within topics (or less commonly, the topics themselves) can be reused through the use of content references.
  • Conditional text allows filtering or styling of content based on attributes for audience, platform, product, and other properties.
  • Extensive metadata makes topics easier to find.
  • The element types and structures in DITA topics are similar to popular languages like HTML. For example, a bulleted or numbered list can be copied and pasted directly from HTML to DITA.
  • DITA allows adding new elements through specialization of base DITA elements. Through specialization, DITA can accommodate new topic types and element types as needed for specific industries or companies.
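Conditional text can be illustrated with a small fragment. In this sketch the attribute values (administrator, linux, windows) are hypothetical; the attribute names audience and platform are the ones named in the list above.

```xml
<!-- Fragment of a topic body; attribute values are placeholders -->
<body>
  <p>All users can view the report.</p>
  <p audience="administrator">Administrators can also delete the report.</p>
  <ul>
    <li platform="linux">Run the installer from a shell.</li>
    <li platform="windows">Double-click the installer.</li>
  </ul>
</body>
```

A filter file then tells the processor which values to drop. Assuming the DITAVAL format defined alongside DITA 1.1, a file that removes administrator-only content might read:

```xml
<!-- Hypothetical filter file, for example user-guide.ditaval -->
<val>
  <prop att="audience" val="administrator" action="exclude"/>
</val>
```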

Creating DITA content consists of writing topics and maps. A map contains links to topics, organized in the sequence (which may be hierarchical) in which they are intended to appear in finished documents. A DITA map defines the table of contents for deliverables, and can also specify which topics link to each other.
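A small map might be sketched as follows. The file names and title are hypothetical. Nested topicref elements give the hierarchy of the table of contents, and the reltable is one way a map can state that two topics should link to each other.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE map PUBLIC "-//OASIS//DTD DITA Map//EN" "map.dtd">
<!-- Hypothetical example: file names are placeholders -->
<map>
  <title>Widget User Guide</title>
  <!-- Nesting defines sequence and hierarchy in the output -->
  <topicref href="about-filters.dita" type="concept">
    <topicref href="replace-filter.dita" type="task"/>
    <topicref href="copy-command.dita" type="reference"/>
  </topicref>
  <!-- Topics in the same row are treated as related to each other -->
  <reltable>
    <relrow>
      <relcell>
        <topicref href="about-filters.dita"/>
      </relcell>
      <relcell>
        <topicref href="replace-filter.dita"/>
      </relcell>
    </relrow>
  </reltable>
</map>
```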

DITA map and topic documents are XML files, so any XML editor can be used to write DITA content, with the exception of editors that support only a limited set of XML schemas (such as XHTML editors). Images, video files, and other files that need to appear in output are inserted by reference. DITA-compliant XML editors validate documents against multiple schemas and DTDs.
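Inserting a file by reference means the topic stores only a path to it. A sketch for an image, with a hypothetical file path and caption:

```xml
<!-- Fragment of a topic body; the image path is a placeholder -->
<fig>
  <title>Pump assembly</title>
  <image href="images/pump-assembly.png">
    <alt>Exploded view of the pump assembly</alt>
  </image>
</fig>
```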

DITA is conceived as an end-to-end architecture. In addition to indicating what elements, attributes, and rules are part of the DITA language, the DITA specification includes rules for publishing DITA content in print, HTML, online Help, and other formats. For example, the DITA specification indicates that if the conref attribute of element A contains a path to element B, the contents of element B will be displayed in the location of element A. DITA-compliant publishing solutions, known as DITA processors, must handle the conref attribute according to the specified behaviour. Rules also exist for processing other rich features such as conditional text, index markers, and topic-to-topic links.
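The conref behaviour can be sketched with two files. The file name, ids, and warning text are hypothetical. First, a topic that holds the reusable element (element B in the description above):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE topic PUBLIC "-//OASIS//DTD DITA Topic//EN" "topic.dtd">
<!-- Hypothetical file: warnings.dita -->
<topic id="warnings">
  <title>Shared warnings</title>
  <body>
    <note id="hot-surface" type="caution">The surface stays hot after use.</note>
  </body>
</topic>
```

Then, in any other topic, an empty element of the same type (element A) points at it using the form file#topic-id/element-id:

```xml
<!-- Fragment of another topic; the processor pulls in the note's content -->
<note conref="warnings.dita#warnings/hot-surface"/>
```

After processing, the output shows the caution text at the location of the referencing note.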

When DITA was released as a public XML standard in 2001, IBM contributed the DITA Open Toolkit, the first DITA-compliant processor. The toolkit transforms DITA content into output formats like PDF, HTML, and Online Help, and can be extended to handle arbitrary specializations and arbitrary output formats. Out of the box, it handles all valid DITA specializations and several output formats.

The toolkit continues to be the foundation of most publishing of DITA content. Many DITA users use it directly, and some DITA authoring tools and content management tools now integrate parts of the toolkit into their own publishing workflows.
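Direct use of the toolkit has traditionally meant calling its Ant build. The following build file is a rough sketch under several assumptions: the toolkit is unpacked at the placeholder path shown, its environment (Ant and the required libraries) is already set up, and the installed version accepts the args.input, transtype, and output.dir properties and the xhtml transformation type. The map name and output directory are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical wrapper build file; paths are placeholders -->
<project name="build-guide" default="html" basedir=".">
  <!-- Location of the unpacked DITA Open Toolkit (assumed) -->
  <property name="dita.dir" location="/path/to/DITA-OT"/>

  <target name="html">
    <!-- Delegate to the toolkit's own build file -->
    <ant antfile="${dita.dir}/build.xml">
      <property name="args.input" location="guide.ditamap"/>
      <property name="transtype" value="xhtml"/>
      <property name="output.dir" location="out/xhtml"/>
    </ant>
  </target>
</project>
```

Property names and supported transformation types vary between toolkit releases, so the documentation for the installed version should be checked before relying on this layout.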

The DITA Open Toolkit is an active open-source project, with contributions from several companies.

  • March 2001 Introduction by IBM
  • May 2002 Domain specialization added to topic specialization
  • April 2004 OASIS Technical Committee for DITA formed
  • February 2005 SourceForge begins DITA Open Toolkit support
  • June 2005 DITA v1.0 approved as an OASIS standard
  • August 2005 DITA Open Toolkit v1.1 is released
  • March 2006 OASIS launches
  • August 2007 DITA v1.1 is approved by OASIS, including Bookmap specialization

