Code:
fs = new POIFSFileSystem(new FileInputStream("c:\\Data.docx"));
HWPFDocument doc = new HWPFDocument(fs);
readParagraphs(doc);
Where the readParagraphs method is:
Code:
public static void readParagraphs(HWPFDocument doc) throws Exception {
    WordExtractor we = new WordExtractor(doc);
    // Get the total number of paragraphs
    String[] paragraphs = we.getParagraphText();
    System.out.println("Total Paragraphs: " + paragraphs.length);
    for (int i = 0; i < paragraphs.length; i++) {
        System.out.println("Length of paragraph " + (i + 1) + ": " + paragraphs[i].length());
        System.out.println(paragraphs[i].toString());
    }
}
When I run it, I get this exception:
Code:
org.apache.poi.poifs.filesystem.OfficeXmlFileException: The supplied data appears to be in the Office 2007+ XML. You are calling the part of POI that deals with OLE2 Office Documents. You need to call a different part of POI to process this data (eg XSSF instead of HSSF)
    at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:131)
    at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:104)
    at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:138)
    at anagram.Anagram.main(Anagram.java:55)
Does anyone know how to fix it?
I have managed to read the document using the getText method of XWPFWordExtractor, but I need to extract the data line by line in order to process it. Is there another method for that?
Regards, and thanks!
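One possible workaround, as a rough sketch rather than a confirmed solution: since getText already returns the whole document as a single String, that String can be split on line breaks with plain Java. The class and method names below (ParagraphLines, splitLines) are made up for illustration, the sample text is only a stand-in for the extractor output, and it assumes the extractor puts a line break between paragraphs. POI's XWPFDocument also has a getParagraphs method that might be a more direct route, but that is not tried here.
Code:
```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Apache POI.
final class ParagraphLines {

    // Splits the extracted text into lines and drops blank ones.
    static List<String> splitLines(String text) {
        List<String> lines = new ArrayList<>();
        // \R matches any line break sequence (\n, \r\n, \r); needs Java 8 or later.
        for (String line : text.split("\\R")) {
            if (!line.trim().isEmpty()) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Stand-in for the String returned by XWPFWordExtractor.getText().
        String text = "first\nsecond\r\n\r\nthird\n";
        List<String> lines = splitLines(text);
        System.out.println("Total lines: " + lines.size());
        for (int i = 0; i < lines.size(); i++) {
            System.out.println((i + 1) + ": " + lines.get(i));
        }
    }
}
```
In the real program the text variable would be filled from the extractor instead of the hard-coded sample, and each entry of the returned list could then be processed on its own.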