Cheshire3 Object Model - DocumentFactory
API
- class cheshire3.baseObjects.DocumentFactory(session, config, parent=None)
A DocumentFactory takes raw data, returns one or more Documents.
A DocumentFactory can be used to return Documents from, for example, a file, a directory containing many files, an archive file, a URL, or a web-based API.
- load(session, data, cache=None, format=None, tagName=None, codec='')
Load documents into the document factory from data.
Returns the DocumentFactory itself, which acts as an iterator. The load function takes session, plus:
- data := the data to load. Could be a filename, a directory name, the data as a string, a URL to the data, etc.
- cache := setting for how to cache documents in memory when reading them in.
- format := format of the data parameter. There are many options; the most common are:
  - xml – an XML file, which may contain multiple records
  - dir – a directory containing files to load
  - tar – a tar file containing files to load
  - zip – a zip file containing files to load
  - marc – a file of MARC records (library catalogue data)
  - http – a base HTTP URL to retrieve
- tagName := name of the tag which starts (and ends!) a Record.
- codec := name of the codec in which the data is encoded.
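The "load returns the factory itself, which acts as an iterator" pattern described above can be sketched in plain Python. This is a toy stand-in only, not Cheshire3's implementation: the class name ToyDocumentFactory is hypothetical, the session argument is an unused placeholder, only the dir format is handled specially, and a "document" here is just the decoded text of one file.

```python
import os
import tempfile


class ToyDocumentFactory:
    """Hypothetical stand-in illustrating the load-then-iterate pattern."""

    def __init__(self):
        self._docs = []

    def load(self, session, data, format=None, codec=""):
        # session is accepted only to mirror the documented argument order.
        if format == "dir":
            # Treat data as a directory and load every regular file in it.
            names = sorted(os.listdir(data))
            paths = [os.path.join(data, name) for name in names]
            paths = [path for path in paths if os.path.isfile(path)]
        else:
            # Otherwise treat data as the path of a single file.
            paths = [data]
        self._docs = []
        for path in paths:
            with open(path, "rb") as handle:
                raw = handle.read()
            # Decode only when a codec name was supplied.
            self._docs.append(raw.decode(codec) if codec else raw)
        # Return the factory itself so the caller can iterate over it.
        return self

    def __iter__(self):
        return iter(self._docs)


# Usage: write two small files, load the directory, iterate the result.
with tempfile.TemporaryDirectory() as tmp:
    samples = [("a.xml", "<record>one</record>"),
               ("b.xml", "<record>two</record>")]
    for name, text in samples:
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as handle:
            handle.write(text)
    factory = ToyDocumentFactory()
    loaded = factory.load(None, tmp, format="dir", codec="utf-8")
    documents = list(loaded)
```

Because load returns the factory rather than a list, the call and the loop can be chained in one expression. A real factory could also defer reading until iteration, which is presumably what the cache setting controls.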
Implementations
The following implementations are included in the distribution by default: