POI操作office文档
上一篇 /
下一篇 2014-05-09 11:11:11
/ 个人分类:学习笔记
office有两种标准:
1.(office open XML standars)OOXML,为office 2007以上版本的文档格式。后缀名为xlsx,docx,pptxMFC serialization API based file formats,等。可以使用
2.OLE2 compound document format,为office 97格式。后缀名为xls,docx,pptx等。
Excel (SS=HSSF+XSSF)
Word (HWPF+XWPF)
PowerPoint (HSLF+XSLF)
OpenXML4J (OOXML)
OLE2 Filesystem (POIFS)
OLE2 Document Props (HPSF)
Outlook (HSMF)
Visio (HDGF)
TNEF (HMEF)
Publisher (HPBF)
low level访问OLE2文档可以使用POIFS和HPSF,如果访问OOXML文档可以使用openxml4j。
high level访问对应关系:
xls:HSSF
xlsx:XSSF
SS=HSSF+XSSF
doc:HWPF
docx:XWPF
ppt:HSLF
pptx:XSLF
下面的代码为两种格式转换。
FileInputStream fis = new FileInputStream(inputFile);
POIFSFileSystem fileSystem = new POIFSFileSystem(fis);
// Firstly, get an extractor for the Workbook
POIOLE2TextExtractor leTextExtractor =
ExtractorFactory.createExtractor(fileSystem);
// Then a List of extractors for any embedded Excel, Word, PowerPoint
// or Visio objects embedded into it.
POITextExtractor[] embeddedExtractors =
ExtractorFactory.getEmbededDocsTextExtractors(oleTextExtractor);
for (POITextExtractor textExtractor : embeddedExtractors) {
// If the embedded object was an Excel spreadsheet.
if (textExtractor instanceof ExcelExtractor) {
ExcelExtractor excelExtractor = (ExcelExtractor) textExtractor;
System.out.println(excelExtractor.getText());
}
// A Word Document
else if (textExtractor instanceof WordExtractor) {
WordExtractor wordExtractor = (WordExtractor) textExtractor;
String[] paragraphText = wordExtractor.getParagraphText();
for (String paragraph : paragraphText) {
System.out.println(paragraph);
}
// Display the document's header and footer text
System.out.println("Footer text: " + wordExtractor.getFooterText());
System.out.println("Header text: " + wordExtractor.getHeaderText());
}
// PowerPoint Presentation.
else if (textExtractor instanceof PowerPointExtractor) {
PowerPointExtractor powerPointExtractor =
(PowerPointExtractor) textExtractor;
System.out.println("Text: " + powerPointExtractor.getText());
System.out.println("Notes: " + powerPointExtractor.getNotes());
}
// Visio Drawing
else if (textExtractor instanceof VisioTextExtractor) {
VisioTextExtractor visioTextExtractor =
(VisioTextExtractor) textExtractor;
System.out.println("Text: " + visioTextExtractor.getText());
}
}
收藏
举报
TAG: