Package | Description |
---|---|
org.opencms.search.extractors | Contains a generic, low-level framework for extraction of plain text content from various popular file formats. |
Modifier and Type | Class and Description |
---|---|
class | CmsExtractorHtml: Extracts the text from an HTML document. |
class | CmsExtractorMsOfficeOLE2: Extracts text data from a VFS resource that is an OLE 2 MS Office document. |
class | CmsExtractorMsOfficeOOXML: Extracts text data from a VFS resource that is an OOXML MS Office document. |
class | CmsExtractorOpenOffice: Extracts the text from OpenOffice documents (.ods, .odf). |
class | CmsExtractorPdf: Extracts the text from a PDF document. |
class | CmsExtractorRtf: Extracts the text from an RTF document. |