Microsoft Office Open XML

From Wikipedia, the free encyclopedia

WinZip screenshot showing the files of a .docx document

Microsoft Office Open XML (OOXML) is a file format developed by Microsoft to be used by the upcoming release of Microsoft Office 2007.

Microsoft's Office Open XML format uses a ZIP container for packaging XML and other data files. The resulting files are smaller than the binary files created by the previous Office formats. Microsoft maintains that its primary goal has to be backward compatibility with existing documents and full support of its extensive feature set. The Microsoft Office Open XML format is Microsoft's direct answer to the OpenDocument format (ISO/IEC DIS 26300), which was created by the OASIS consortium and uses similar technologies (XML files contained in a ZIP package). A comparison can be found in Comparison of OpenDocument and Microsoft XML formats.

Standardization

Microsoft has stated that the format will be an open standard and has submitted it to the Ecma standardization process. The charter of the Ecma Technical Committee requires it to submit the completed standard to the ISO. Ecma announced on December 9, 2005 that it had accepted Microsoft's proposal to document the format as a proposed standard. It will be referred to as Ecma Office Open XML.

The Ecma technical committee developing the proposal includes representatives from Apple, the British Library, Canon, Intel, Microsoft, NextPage, Novell, Pioneer, Statoil ASA, Toshiba and the United States Library of Congress.[1]

The final draft of the Office Open XML standard was submitted to the Ecma Secretary General on October 6, 2006, and the General Assembly of Ecma International will vote on this proposal during its meeting on December 7-8, 2006.[2]

A liaison from ISO/IEC SC34 has been working with the Ecma technical committee during the standardization process to prepare the Open XML submission to ISO/IEC.

Licensing

The Microsoft Office Open XML format will be available under a free and perpetual license from Microsoft.[3]

There has been considerable debate about whether open-source software (OSS) can use the format, even under this fairly open license. Microsoft has tried to address these concerns by officially stating in a covenant not to sue[4] [5] that it will not sue any organisation for using the format if the implementation complies with the official Ecma OOXML file format standard. This has given greater reassurance that the OOXML formats will also be available for use in OSS software, a view also expressed by OSS licensing expert Larry Rosen.[6]

A further indication of the free and open use of the format was given by Microsoft XML program manager Brian Jones, who presented a legal analysis of the covenant not to sue and stated that there is "no license needed to use the Office Open XML formats."[7]

File format and structure

An Open XML file is a ZIP package containing a set of individual XML files that together form the basis of the Office document. The ZIP package can also include embedded (binary) files such as PNG, JPEG or GIF images. A basic Open XML file contains an XML file called [Content_Types].xml at the root level of the ZIP package, along with three folders: _rels, docProps, and a directory specific to the document type (for example, in a .docx wordprocessing file this is a word directory). The word directory contains the basic wordDocument.xml file, which is the basis for the Office document. The name of the document-specific directory will vary depending on the type of Office file created.

[Content_Types].xml file 
This file describes the content of the ZIP package. It also contains a mapping for file extensions and overrides for specific URIs.
_rels folder
The _rels folders hold the relationships for any given part within the package. To find the relationships for a specific part, one looks for the _rels folder that is a sibling of that part. If the part has relationships, the _rels folder will contain a file named after the original part with .rels appended. For example, if the content types part had any relationships, there would be a file called [Content_Types].xml.rels inside the _rels folder.
_rels/.rels
The root-level _rels folder always contains a part called .rels. This URI (/_rels/.rels) and /[Content_Types].xml are the only two reserved URIs for parts in files that adhere to Open XML conventions. This part is where the "package relationships" are located. Whenever one opens a file using these conventions, one always starts at the _rels/.rels file. All relationship files are represented in XML; opened in a text editor, such a file shows XML that outlines each relationship for that part. In a minimal Word document containing only the basic wordDocument.xml, the top-level parts are two metadata parts and the wordDocument.xml part.
word/wordDocument.xml 
This is the main part for any Word document. If one views it in an XML editor, one will see a pretty basic XML file. The body of the wordprocessing document is contained in this part.
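
As a rough illustration of the layout described above, the following Python sketch builds a tiny ZIP package in memory and lists its parts. It is only a sketch under stated assumptions: the part contents are placeholders rather than valid Open XML, and docProps/core.xml is a hypothetical metadata part name chosen for the example (the article only names the docProps folder). Only the standard library zipfile module is used.

```python
import io
import zipfile


def build_minimal_package() -> bytes:
    """Build an in-memory ZIP laid out like the minimal .docx described above.

    The XML bodies are placeholders, not valid Open XML markup.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("[Content_Types].xml", "<Types/>")
        zf.writestr("_rels/.rels", "<Relationships/>")
        # Hypothetical metadata part name; the article only names the folder.
        zf.writestr("docProps/core.xml", "<coreProperties/>")
        zf.writestr("word/wordDocument.xml", "<wordDocument/>")
    return buf.getvalue()


def list_parts(package: bytes) -> list[str]:
    """Return the names of all entries in the package, sorted."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        return sorted(zf.namelist())


def top_level_entries(package: bytes) -> set[str]:
    """Return the names that appear at the root level of the package."""
    return {name.split("/", 1)[0] for name in list_parts(package)}


if __name__ == "__main__":
    for name in list_parts(build_minimal_package()):
        print(name)
```

The same list_parts function could be pointed at the bytes of a real .docx file, since any ZIP tool can open the container.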

Relationship files in Open XML

An example relationship file in Open XML (for example word/_rels/wordDocument.xml.rels):

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<Relationships xmlns="http://schemas.microsoft.com/package/2005/06/relationships">
        <Relationship Id="rId1" Type="http://schemas.microsoft.com/office/2006/relationships/image"
                Target="http://en.wikipedia.org//images/wiki-en.png" TargetMode="External" />
        <Relationship Id="rId2" Type="http://schemas.microsoft.com/office/2006/relationships/hyperlink"
                Target="http://www.wikipedia.org" TargetMode="External" />
</Relationships>

The relationship files allow one to navigate through the package quickly without having to open each part. To find all images referenced in a wordDocument, one does not even need to open the wordDocument.xml part: it is enough to open the relationships file and look for all relationships of type http://schemas.microsoft.com/office/2006/relationships/image. To point a reference at a different image, one edits the relationship and does not need to modify the application-level XML. This is especially useful for external relationships.
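
The lookup described above can be sketched in Python with the standard library xml.etree.ElementTree module. The namespace and type URIs are copied from the example relationship file; the function name targets_of_type is made up for this illustration, and the code is a minimal sketch rather than a complete package reader.

```python
import xml.etree.ElementTree as ET

# URIs copied from the example relationship file in this article.
RELS_NS = "http://schemas.microsoft.com/package/2005/06/relationships"
IMAGE_TYPE = "http://schemas.microsoft.com/office/2006/relationships/image"

SAMPLE_RELS = """<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<Relationships xmlns="http://schemas.microsoft.com/package/2005/06/relationships">
        <Relationship Id="rId1" Type="http://schemas.microsoft.com/office/2006/relationships/image"
                Target="http://en.wikipedia.org//images/wiki-en.png" TargetMode="External" />
        <Relationship Id="rId2" Type="http://schemas.microsoft.com/office/2006/relationships/hyperlink"
                Target="http://www.wikipedia.org" TargetMode="External" />
</Relationships>
"""


def targets_of_type(rels_xml: str, rel_type: str) -> list[str]:
    """Return the Target of every relationship whose Type equals rel_type."""
    root = ET.fromstring(rels_xml)
    return [
        rel.get("Target")
        for rel in root.findall(f"{{{RELS_NS}}}Relationship")
        if rel.get("Type") == rel_type
    ]


if __name__ == "__main__":
    # Lists the image targets without touching wordDocument.xml at all.
    for target in targets_of_type(SAMPLE_RELS, IMAGE_TYPE):
        print(target)
```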

Hyperlink relations

The inline markup tag for a hyperlink is:

<w:hyperlink w:rel="rId2" w:history="1"> 

The markup does not contain the URL inline. Just as references to other parts in the ZIP package use relationships, so do external references. In the relationships file for wordDocument.xml, each entry shows whether it is an internal relationship or, for instance, an external relationship of type hyperlink. This works similarly not just for hyperlinks but for any external reference, such as linked images and templates. Because the relationship file contains all references, link fix-up is much easier when files move from one server to another. To remove all external references for security reasons, one only needs to edit the relationships.
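
Both maintenance tasks mentioned above (link fix-up and removing external references) can be sketched as edits to the relationship file alone. The Python below is an illustration under assumptions: the function names, the internal media/image1.png relationship and the intranet.example host are all hypothetical, and a real tool would probably also have to deal with markup that still refers to the IDs of removed relationships.

```python
import xml.etree.ElementTree as ET

RELS_NS = "http://schemas.microsoft.com/package/2005/06/relationships"
# Serialize the relationships namespace as the default namespace again.
ET.register_namespace("", RELS_NS)

# rId1 is a hypothetical internal relationship added for contrast;
# rId2 is the external hyperlink from the article's example.
SAMPLE_RELS = (
    f'<Relationships xmlns="{RELS_NS}">'
    '<Relationship Id="rId1"'
    ' Type="http://schemas.microsoft.com/office/2006/relationships/image"'
    ' Target="media/image1.png" />'
    '<Relationship Id="rId2"'
    ' Type="http://schemas.microsoft.com/office/2006/relationships/hyperlink"'
    ' Target="http://www.wikipedia.org" TargetMode="External" />'
    "</Relationships>"
)


def rewrite_external_targets(rels_xml: str, old_prefix: str, new_prefix: str) -> str:
    """Link fix-up: replace old_prefix with new_prefix in external targets."""
    root = ET.fromstring(rels_xml)
    for rel in root.findall(f"{{{RELS_NS}}}Relationship"):
        target = rel.get("Target", "")
        if rel.get("TargetMode") == "External" and target.startswith(old_prefix):
            rel.set("Target", new_prefix + target[len(old_prefix):])
    return ET.tostring(root, encoding="unicode")


def drop_external(rels_xml: str) -> str:
    """Remove every relationship marked TargetMode="External"."""
    root = ET.fromstring(rels_xml)
    for rel in list(root):  # copy the child list, since we mutate root
        if rel.get("TargetMode") == "External":
            root.remove(rel)
    return ET.tostring(root, encoding="unicode")


def targets(rels_xml: str) -> list[tuple[str, str]]:
    """Return (Id, Target) pairs, in document order."""
    root = ET.fromstring(rels_xml)
    return [(r.get("Id"), r.get("Target")) for r in root]


if __name__ == "__main__":
    moved = rewrite_external_targets(
        SAMPLE_RELS, "http://www.wikipedia.org", "http://intranet.example"
    )
    print(targets(moved))
    print(targets(drop_external(SAMPLE_RELS)))
```

In neither case is wordDocument.xml itself opened or modified.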

Embedded or linked media file relations

Pictures can be embedded or linked in the XML files using a tag:

<v:imagedata w:rel="rId1" o:title="example" />

This is the reference to the image file. In Open XML, all references are made via relationships. For example, a wordDocument.xml part has a relationship to the image part. To find the image, one goes to the relationships file for wordDocument.xml and finds the relationship with the ID "rId1". Looking back at the ZIP package, notice that there is a _rels folder in the same directory as the wordDocument.xml part. That folder contains a file called wordDocument.xml.rels. In this file there is a relationship definition that contains a type, an ID and a location. The ID is the one referenced in the XML document. The type is a reference to a schema definition for the media type, and the location is either an internal location within the ZIP package or an external location given as a URL.
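
The lookup steps above can be sketched in Python as two small helpers: one derives the relationship file name from a part name (sibling _rels folder, part name plus .rels), the other resolves a relationship ID inside a package. The helper names and the in-memory sample package are assumptions made for this illustration; the URIs are taken from the example relationship file in this article.

```python
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET

RELS_NS = "http://schemas.microsoft.com/package/2005/06/relationships"

SAMPLE_RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>\n'
    f'<Relationships xmlns="{RELS_NS}">\n'
    '  <Relationship Id="rId1"'
    ' Type="http://schemas.microsoft.com/office/2006/relationships/image"'
    ' Target="http://en.wikipedia.org//images/wiki-en.png"'
    ' TargetMode="External" />\n'
    "</Relationships>\n"
)


def rels_path_for(part_name: str) -> str:
    """Return the relationship file for a part: <dir>/_rels/<file>.rels."""
    directory, filename = posixpath.split(part_name)
    return posixpath.join(directory, "_rels", filename + ".rels")


def build_sample_package() -> bytes:
    """Build a tiny in-memory package with one part and its .rels file."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/wordDocument.xml", "<wordDocument/>")  # placeholder
        zf.writestr(rels_path_for("word/wordDocument.xml"), SAMPLE_RELS)
    return buf.getvalue()


def resolve(package: bytes, part_name: str, rel_id: str) -> dict:
    """Look up rel_id in the relationship file that belongs to part_name."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        root = ET.fromstring(zf.read(rels_path_for(part_name)))
    for rel in root.findall(f"{{{RELS_NS}}}Relationship"):
        if rel.get("Id") == rel_id:
            return {
                "type": rel.get("Type"),
                "target": rel.get("Target"),
                "external": rel.get("TargetMode") == "External",
            }
    raise KeyError(rel_id)


if __name__ == "__main__":
    print(rels_path_for("word/wordDocument.xml"))
    print(resolve(build_sample_package(), "word/wordDocument.xml", "rId1"))
```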

Document markup languages

Office Open XML contains three main markup languages for creating the main XML office document types. These are:

  • WordProcessingML
  • SpreadsheetML
  • PresentationML

In addition to these main markup languages, Office Open XML contains several supporting markup languages. The most important of these are:

  • DrawingML
  • VML

Market adoption

Microsoft will be the main software developer making use of the Office Open XML format, beginning with its office suite Microsoft Office 2007, which is scheduled to launch at the end of 2006 or the beginning of 2007. Currently, Microsoft Office 2007 beta 2 supports the most recent format. Microsoft also released a compatibility pack for older versions on November 6, 2006. Using the compatibility pack, users can create or edit Office Open XML files from within Microsoft Office versions 2000, XP and 2003. Microsoft Office 2003 also supports the predecessor Office 2003 XML formats.

Because Microsoft Office is the current market leader in office products, the other main office suites are expected to support the Office Open XML format, at the least via an import function. OpenOffice.org already supports the predecessor of the Open XML wordprocessing format, namely WordProcessingML 2003. Corel has already indicated that its WordPerfect Office suite will also support Open XML.[8]

The OSS Gnumeric spreadsheet is the first program to have (limited) Open XML support in a final software version.[9]

Notes

  1. TC45 - Office Open XML Formats. Ecma International. Retrieved on 2006-07-28.
  2. Ecma Office Open XML File Formats Standard - Final draft - 9th of October 2006. Ecma International (2006-06-21). Retrieved on 2006-07-28.
  3. Paoli, Jean. Clarification of License Terms for Office XML Schema. Microsoft. Retrieved on 2006-07-28.
  4. Microsoft Covenant Regarding Office 2003 XML Reference Schemas. Microsoft. Retrieved on 2006-07-11.
  5. Microsoft Open Specification Promise. Microsoft (2006-10-23). Retrieved on 2006-11-06.
  6. Blankenhorn, Dana (2005-11-29). Rosen approves Microsoft Office format license. ZDNet. Retrieved on 2006-07-28.
  7. Jones, Brian. No license needed to use the Office Open XML formats. Brian Jones: Open XML Formats. http://blogs.msdn.com/brian_jones/archive/2006/08/04/688932.aspx. Retrieved on 2006-08-04.
  8. Akass, Clive (2006-01-17). New Wordperfect will support Office 12 formats. The Testbed. PCW. Retrieved on 2006-07-28.
  9. GNOME Office / Gnumeric. GNOME.org. Retrieved on 2006-07-28.
