Open Document Format (ODT)
The Open Document Format for Office Applications (ODF), also known as OpenDocument, is an open file format for word processing documents, spreadsheets, presentations, and graphics that uses ZIP-compressed XML files. It was developed with the aim of providing an open, XML-based file format specification for office applications.
The standard is developed and maintained by a technical committee in the Organization for the Advancement of Structured Information Standards (OASIS) consortium. It was based on the Sun Microsystems specification for OpenOffice.org XML, the default format for OpenOffice.org and LibreOffice. It was originally developed for StarOffice "to provide an open standard for office documents."
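Because an ODT file is a ZIP archive of XML files, its text can be inspected with the Python standard library alone. The following is a minimal sketch, not how any particular loader works internally: `read_odt_paragraphs` and `build_minimal_odt` are hypothetical helper names, the sketch assumes the document body is stored in `content.xml`, and it only looks at heading and paragraph elements (nested paragraphs, for example inside text frames, may be reported twice).

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# XML namespaces used by OpenDocument text and office elements.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"


def read_odt_paragraphs(source):
    """Return the text of each heading and paragraph in an ODT file.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        # Assumption: the document body lives in content.xml.
        root = ET.fromstring(archive.read("content.xml"))
    wanted = {f"{{{TEXT_NS}}}h", f"{{{TEXT_NS}}}p"}
    paragraphs = []
    for element in root.iter():
        if element.tag in wanted:
            text = "".join(element.itertext()).strip()
            if text:
                paragraphs.append(text)
    return paragraphs


def build_minimal_odt(paragraphs):
    """Build a tiny in-memory ODT-like archive for demonstration only."""
    body = "".join(f"<text:p>{escape(p)}</text:p>" for p in paragraphs)
    content = (
        f'<office:document-content xmlns:office="{OFFICE_NS}" '
        f'xmlns:text="{TEXT_NS}">'
        f"<office:body><office:text>{body}</office:text></office:body>"
        "</office:document-content>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        archive.writestr("content.xml", content)
    buffer.seek(0)
    return buffer
```

Calling `read_odt_paragraphs(build_minimal_odt(["Hello"]))` exercises the round trip without needing a file on disk. A real file path such as `example_data/fake.odt` can be passed instead.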
The UnstructuredODTLoader is used to load Open Office ODT files.
from langchain_community.document_loaders import UnstructuredODTLoader

# mode="elements" keeps each detected element (title, paragraph, ...) as its own Document
loader = UnstructuredODTLoader("example_data/fake.odt", mode="elements")
docs = loader.load()
docs[0]
Document(page_content='Lorem ipsum dolor sit amet.', metadata={'source': 'example_data/fake.odt', 'category_depth': 0, 'file_directory': 'example_data', 'filename': 'fake.odt', 'last_modified': '2023-12-19T13:42:18', 'languages': ['por', 'cat'], 'filetype': 'application/vnd.oasis.opendocument.text', 'category': 'Title'})
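The output above shows that, in elements mode, each document carries a `category` entry in its metadata (here `'Title'`). One possible way to use that is to group element texts by category. The sketch below is illustrative only: `group_by_category` is a hypothetical helper, the `"Uncategorized"` fallback label is an assumption of this example, and the function relies only on each item exposing `page_content` and `metadata` attributes.

```python
from collections import defaultdict


def group_by_category(docs):
    """Group element texts by their 'category' metadata value.

    Works with any objects that expose `page_content` and `metadata`.
    """
    grouped = defaultdict(list)
    for doc in docs:
        # "Uncategorized" is this example's own fallback label.
        category = doc.metadata.get("category", "Uncategorized")
        grouped[category].append(doc.page_content)
    return dict(grouped)
```

With the loader output from above, `group_by_category(docs).get("Title", [])` would return the texts of all elements tagged as titles.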
Related
- Document loader conceptual guide
- Document loader how-to guides