Working with Office documents using POI

2014-05-09 11:11:11 / Category: Study notes

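Office documents come in two formats:
1. OOXML (Office Open XML standards), the document format of Office 2007 and later. File extensions include xlsx, docx, and pptx.
2. OLE2 Compound Document Format, the Office 97 format. File extensions include xls, doc, and ppt; MFC serialization API based file formats also use this container.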
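The two containers can be told apart by their first bytes: an OLE2 compound document starts with a fixed 8-byte signature, while an OOXML file is a ZIP archive and starts with the ZIP local file header. Below is a minimal sketch using only the JDK. The class `OfficeFormatSniffer` and its enum are hypothetical names, not part of POI, and a ZIP signature only shows the file is a ZIP archive, so `OOXML_ZIP` is a best guess rather than a guarantee.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

/** Hypothetical helper (not a POI class): guesses the container format from the leading bytes. */
final class OfficeFormatSniffer {
    enum Container { OLE2, OOXML_ZIP, UNKNOWN }

    // Signature at the start of every OLE2 compound document.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // ZIP local file header signature "PK\003\004"; OOXML files are ZIP packages.
    private static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    /** Classifies a buffer holding the first bytes of a file. */
    static Container sniff(byte[] head) {
        if (startsWith(head, OLE2_MAGIC)) return Container.OLE2;
        if (startsWith(head, ZIP_MAGIC)) return Container.OOXML_ZIP; // assumption: any ZIP is treated as OOXML
        return Container.UNKNOWN;
    }

    /** Reads up to 8 bytes from the stream and classifies them. */
    static Container sniff(InputStream in) throws IOException {
        byte[] head = new byte[8];
        int read = 0;
        while (read < head.length) {
            int n = in.read(head, read, head.length - read);
            if (n < 0) break; // stream ended before 8 bytes were available
            read += n;
        }
        return sniff(Arrays.copyOf(head, read));
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(sniff(OLE2_MAGIC));
        System.out.println(sniff(ZIP_MAGIC));
    }
}
```

A check like this can decide between the OLE2 components (HSSF, HWPF, HSLF) and the OOXML ones (XSSF, XWPF, XSLF) when the file extension is missing or unreliable.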
POI provides a component for each document type:
Excel (SS=HSSF+XSSF)
Word (HWPF+XWPF)
PowerPoint (HSLF+XSLF)
OpenXML4J (OOXML)
OLE2 Filesystem (POIFS)
OLE2 Document Props (HPSF)
Outlook (HSMF)
Visio (HDGF)
TNEF (HMEF)
Publisher (HPBF)
For low-level access, use POIFS and HPSF for OLE2 documents and openxml4j for OOXML documents.
For high-level access, each file type maps to a component:
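At the lowest level, an OOXML document is a ZIP package made up of named parts. The sketch below uses only `java.util.zip` from the JDK, not openxml4j, to list the part names in such a package. The class `OoxmlPartLister` is a hypothetical name, and `tinyPackage()` builds a stand-in archive with two typical part names rather than a valid Office file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper (not a POI class): lists the parts of a ZIP-based package. */
final class OoxmlPartLister {

    /** Returns the entry names in archive order. */
    static List<String> listParts(byte[] zipBytes) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return names;
    }

    /** Builds a stand-in package in memory; it is NOT a valid xlsx file. */
    static byte[] tinyPackage() {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(bos)) {
            for (String name : new String[] {"[Content_Types].xml", "xl/workbook.xml"}) {
                zout.putNextEntry(new ZipEntry(name));
                zout.write("<x/>".getBytes(StandardCharsets.UTF_8)); // placeholder content
                zout.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(listParts(tinyPackage()));
    }
}
```

Running the same listing on a real xlsx, docx, or pptx file shows the XML parts that the high-level components read and write.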
xls:HSSF 
xlsx:XSSF
SS=HSSF+XSSF
doc:HWPF
docx:XWPF
ppt:HSLF
pptx:XSLF
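The mapping above can be written as a small lookup table. This is a minimal sketch using only the JDK; `PoiComponentLookup` and `componentFor` are hypothetical names for illustration and return the component name as a string, not a POI object.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper (not a POI class): maps a file extension to the POI component name. */
final class PoiComponentLookup {
    // Extension to component name, following the list above.
    private static final Map<String, String> BY_EXTENSION = Map.of(
        "xls", "HSSF",
        "xlsx", "XSSF",
        "doc", "HWPF",
        "docx", "XWPF",
        "ppt", "HSLF",
        "pptx", "XSLF");

    /** Returns the component name for a file name, or empty if the extension is not in the table. */
    static Optional<String> componentFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Optional.empty(); // no extension present
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return Optional.ofNullable(BY_EXTENSION.get(ext));
    }

    public static void main(String[] args) {
        System.out.println(componentFor("report.xlsx").orElse("unknown"));
        System.out.println(componentFor("notes.txt").orElse("unknown"));
    }
}
```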
 
The code below opens an OLE2 document and extracts the text of the documents embedded in it.
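// Imports as packaged in POI 3.x; some of these classes moved in later releases.
import java.io.FileInputStream;
import org.apache.poi.POIOLE2TextExtractor;
import org.apache.poi.POITextExtractor;
import org.apache.poi.extractor.ExtractorFactory;
import org.apache.poi.hdgf.extractor.VisioTextExtractor;
import org.apache.poi.hslf.extractor.PowerPointExtractor;
import org.apache.poi.hssf.extractor.ExcelExtractor;
import org.apache.poi.hwpf.extractor.WordExtractor;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

FileInputStream fis = new FileInputStream(inputFile);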
POIFSFileSystem fileSystem = new POIFSFileSystem(fis);
// Firstly, get an extractor for the Workbook
POIOLE2TextExtractor oleTextExtractor = 
   ExtractorFactory.createExtractor(fileSystem);
// Then a List of extractors for any embedded Excel, Word, PowerPoint
// or Visio objects embedded into it.
POITextExtractor[] embeddedExtractors =
   ExtractorFactory.getEmbededDocsTextExtractors(oleTextExtractor);
for (POITextExtractor textExtractor : embeddedExtractors) {
   // If the embedded object was an Excel spreadsheet.
   if (textExtractor instanceof ExcelExtractor) {
      ExcelExtractor excelExtractor = (ExcelExtractor) textExtractor;
      System.out.println(excelExtractor.getText());
   }
   // A Word Document
   else if (textExtractor instanceof WordExtractor) {
      WordExtractor wordExtractor = (WordExtractor) textExtractor;
      String[] paragraphText = wordExtractor.getParagraphText();
      for (String paragraph : paragraphText) {
         System.out.println(paragraph);
      }
      // Display the document's header and footer text
      System.out.println("Footer text: " + wordExtractor.getFooterText());
      System.out.println("Header text: " + wordExtractor.getHeaderText());
   }
   // PowerPoint Presentation.
   else if (textExtractor instanceof PowerPointExtractor) {
      PowerPointExtractor powerPointExtractor =
         (PowerPointExtractor) textExtractor;
      System.out.println("Text: " + powerPointExtractor.getText());
      System.out.println("Notes: " + powerPointExtractor.getNotes());
   }
   // Visio Drawing
   else if (textExtractor instanceof VisioTextExtractor) {
      VisioTextExtractor visioTextExtractor = 
         (VisioTextExtractor) textExtractor;
      System.out.println("Text: " + visioTextExtractor.getText());
   }
}
