DEV Community

CodeSharing

Java: Extract Text from a Word Document

TXT is a plain-text format that can be opened on almost any computer or mobile device. Because it carries no formatting, a .txt file is small and convenient for storing text content. This article demonstrates how to extract the text content of a Word document and save it as a .txt file using Free Spire.Doc for Java.

Import JAR Dependency to Your Java Application
Method 1: Download Free Spire.Doc for Java and unzip it, then add the Spire.Doc.jar file to your Java application as a dependency.
Method 2: If you use Maven, add the following configuration to your project's pom.xml.

<repositories>
   <repository>
      <id>com.e-iceblue</id>
      <name>e-iceblue</name>
      <url>http://repo.e-iceblue.com/nexus/content/groups/public/</url>
   </repository>
</repositories>
<dependencies>
   <dependency>
      <groupId>e-iceblue</groupId>
      <artifactId>spire.doc.free</artifactId>
      <version>3.9.0</version>
   </dependency>
</dependencies>
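
Whichever method you use, it can help to confirm that the JAR is visible to your application before running the extraction code. The following is a minimal sketch, not part of Free Spire.Doc: `ClasspathCheck` and `isOnClasspath` are hypothetical names introduced here for illustration, and the only library name assumed is `com.spire.doc.Document`, the class imported in the example in this article.

```java
public class ClasspathCheck {

    // Hypothetical helper: returns true if the named class can be found
    // on the current classpath, without running its static initializers.
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The class imported by the extraction example in this article
        String spireClass = "com.spire.doc.Document";
        System.out.println(spireClass + " available: " + isOnClasspath(spireClass));
    }
}
```

If the check reports that the class is not available, the JAR was most likely not added to the build path (Method 1) or the Maven dependency was not resolved (Method 2).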

Extract Text

import com.spire.doc.Document;

import java.io.FileWriter;
import java.io.IOException;

public class ExtractText {

    public static void main(String[] args) throws IOException {

        // Load the Word document
        Document document = new Document();
        document.loadFromFile("Island.docx");

        // Get the document's text as a single string
        String text = document.getText();

        // Write the string to a .txt file
        writeStringToTxt(text, "Extracted.txt");
    }

    public static void writeStringToTxt(String content, String txtFileName) throws IOException {

        // The second argument (true) opens the file in append mode,
        // so running the program again adds to the existing file.
        // try-with-resources flushes and closes the writer even if write() fails.
        try (FileWriter fWriter = new FileWriter(txtFileName, true)) {
            fWriter.write(content);
        }
    }
}
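
One thing to be aware of: `FileWriter` as used above encodes text with the platform's default charset (on Java versions before 18 this depends on the operating system) and appends to the file on every run. If the Word document contains non-ASCII characters, or you want each run to replace the previous output, a variant along the following lines may be preferable. This is a sketch using only the standard `java.nio.file` API; `Utf8TextWriter` and `writeUtf8` are hypothetical names, not part of Spire.Doc, and the sample string stands in for the text returned by `document.getText()`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class Utf8TextWriter {

    // Hypothetical helper: writes content as UTF-8. With no open options given,
    // Files.write creates the file or truncates an existing one (no appending).
    public static void writeUtf8(String content, String txtFileName) throws IOException {
        Path target = Paths.get(txtFileName);
        Files.write(target, content.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the string returned by document.getText()
        String text = "Caf\u00e9 on the island\n";
        writeUtf8(text, "Extracted.txt");

        // Read the file back as UTF-8 to confirm the round trip
        String roundTrip = new String(
                Files.readAllBytes(Paths.get("Extracted.txt")), StandardCharsets.UTF_8);
        System.out.println("Round trip matches: " + roundTrip.equals(text));
    }
}
```

To use it with the example above, you would call `Utf8TextWriter.writeUtf8(text, "Extracted.txt")` in place of `writeStringToTxt`.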

