asposewords

Extracting images from a document using Aspose.Words Cloud API (C# / .NET)

The sample code below shows how to extract existing images from a DOCX, or other document file, using the Aspose.Words Cloud API.

You can call the Aspose.Words REST API directly, but to keep the code simple this post uses the Aspose.Words Cloud SDK for .NET. SDKs are also available for other languages.

Before we can extract image data from a document file using the Cloud API, the file needs to be available in Cloud Storage. The sample code below assumes the document has already been uploaded. See Uploading a document to Cloud Storage (C# / .NET).

The sample code relies on having both the Storage and Words SDKs downloaded (either as code or DLLs) and referenced from the C# project.

My solution in Visual Studio looks like this:

[Screenshot: Solution]

The code below shows a sample ExtractImages method.

1) Set up a connection to the WordsApi by passing in the AppSid and AppKey obtained from the Cloud Dashboard.

2) Create a GetDocumentRequest and call the GetDocument method to verify that the file exists in Storage and can be opened by the API.

The Console.WriteLine call that follows shows how to access properties of the document (SourceFormat in this case).

The example uses a simple for loop to grab each image by index. If you need other information related to the images, a better approach might be to call the GetDocumentDrawingObjects method to retrieve a list of the drawing objects in the document.

3) Create a GetDocumentDrawingObjectImageDataRequest object with the details of the image we want (based on index).

4) Call GetDocumentDrawingObjectImageData to open a Stream to the object.

5) Save the Stream to a file (with a unique name based on the index).

The code then loops back to 3) to grab the next image.

The sample code saves the image data to files with a .png extension.
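The image data returned for a drawing object is not necessarily PNG (this is an assumption about the API's behaviour, not something the sample checks), so a hard-coded .png extension may not match the file contents. A minimal sketch of one way to pick the extension from the leading bytes of the data follows. ImageFormatSniffer and GuessExtension are hypothetical names introduced here for illustration and are not part of the Aspose SDK.

```csharp
using System;

public static class ImageFormatSniffer
{
    // Guesses a file extension from the well-known signature ("magic") bytes
    // at the start of the data. Returns ".bin" when nothing matches.
    public static string GuessExtension(byte[] header)
    {
        if (StartsWith(header, 0x89, 0x50, 0x4E, 0x47)) return ".png"; // \x89 P N G
        if (StartsWith(header, 0xFF, 0xD8, 0xFF)) return ".jpg";       // JPEG start-of-image marker
        if (StartsWith(header, 0x47, 0x49, 0x46, 0x38)) return ".gif"; // "GIF8"
        if (StartsWith(header, 0x42, 0x4D)) return ".bmp";             // "BM"
        return ".bin";
    }

    // True when data begins with every byte of signature, in order.
    private static bool StartsWith(byte[] data, params byte[] signature)
    {
        if (data == null || data.Length < signature.Length) return false;
        for (int i = 0; i < signature.Length; i++)
        {
            if (data[i] != signature[i]) return false;
        }
        return true;
    }
}
```

To use something like this with the sample, you could copy the returned Stream into a MemoryStream, pass the resulting byte array to GuessExtension, and build the file name from the result before writing the bytes to disk.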

The sample DOCX file that I used contains a SmartArt object. The GetDocumentDrawingObjectImageData call can see it as a drawing object, but it is not an image, so the call generates an error and the sample code skips it.
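The loop therefore has two outcomes on failure: stop when the index runs past the last drawing object, or skip the object and carry on for any other error. A minimal sketch of that control flow, separated from the SDK so it can be run on its own, is shown here. IndexProbe, CollectImageIndexes and the fetch delegate are hypothetical stand-ins for the real API call, and the "not found" message test simply mirrors the check used in the sample.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class IndexProbe
{
    // fetch is a hypothetical stand-in for the SDK call: it returns a stream
    // for an image, or throws when the index is invalid or not an image.
    public static List<int> CollectImageIndexes(Func<int, Stream> fetch, int maxIndex)
    {
        var found = new List<int>();
        for (int index = 0; index < maxIndex; index++)
        {
            try
            {
                using (Stream image = fetch(index))
                {
                    found.Add(index); // a real caller would save the stream here
                }
            }
            catch (Exception ex) when (
                ex.Message.IndexOf("not found", StringComparison.OrdinalIgnoreCase) >= 0)
            {
                break; // ran past the last drawing object, so stop probing
            }
            catch (Exception ex)
            {
                // Any other failure (for example a non-image object) is skipped.
                Console.WriteLine($"Skipping index {index}: {ex.Message}");
            }
        }
        return found;
    }
}
```

Matching on message text is fragile. If the SDK exposes a status code on its exception type, testing that would be a sturdier stop condition.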

As an alternative to the above call you can use GetDocumentDrawingObjectByIndex to specify the format of the returned image.

using System;
using System.IO;
using Aspose.Words.Cloud.Sdk;
using Aspose.Words.Cloud.Sdk.Model;
using Aspose.Words.Cloud.Sdk.Model.Requests;

namespace AsposeSamples
{
    public class ExtractImageSample
    {
        private bool ExtractImages(string appSid, string appKey, string extractFolder,
            string remoteFileName, string remoteFolderName, string storageName)
        {
            bool ok = false;
            // 1 - Connect to API
            var wordsApi = new WordsApi(appKey, appSid);
            try
            {
                // 2 - Retrieve info about existing document (if it exists)
                GetDocumentRequest request = new GetDocumentRequest(remoteFileName, remoteFolderName, storageName);
                DocumentResponse exampleDocument = wordsApi.GetDocument(request);
                if (exampleDocument == null)
                {
                    Console.WriteLine("NULL returned from GetDocument");
                }
                else
                {
                    Console.WriteLine($"Example document available in storage - format: {exampleDocument.Document.SourceFormat}");
                    // Simple example: cycle through the indexes, extract each image and save it to a file
                    Console.WriteLine("Saving images based on index ...");
                    for (int imageIndex = 0; imageIndex < 99; imageIndex++)
                    {
                        try
                        {
                            // 3 - Build a request for the image at this index
                            GetDocumentDrawingObjectImageDataRequest imageRequest =
                                new GetDocumentDrawingObjectImageDataRequest(remoteFileName, imageIndex, remoteFolderName, storageName);
                            // 4 - Open a stream to the image (disposed when the using block ends)
                            using (Stream imageStream = wordsApi.GetDocumentDrawingObjectImageData(imageRequest))
                            {
                                string saveFile = Path.Combine(extractFolder, "imageViaIndex" + imageIndex + ".png");
                                Console.WriteLine($"\tSaving image {imageIndex} to file: {saveFile} ...");
                                // 5 - Save the image to a file
                                SaveFileStream(saveFile, imageStream);
                            }
                        }
                        catch (Exception fetchEx)
                        {
                            // "not found" means we have run past the last drawing object, so stop
                            if (fetchEx.Message.ToLower().Contains("not found"))
                                break;
                            // Any other error (e.g. a non-image object) is reported and skipped
                            Console.WriteLine($"Error fetching image via index: {fetchEx.Message}");
                        }
                    }
                    ok = true;
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine($"Error getting document: {ex.Message}");
            }
            return ok;
        }

        private static void SaveFileStream(string path, Stream stream)
        {
            // Save image to file; the using block closes the file even if the copy fails
            using (var fileStream = new FileStream(path, FileMode.Create, FileAccess.Write))
            {
                stream.CopyTo(fileStream);
            }
        }
    }
}
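
ExtractImages takes the AppSid and AppKey as parameters, so the caller decides where they come from. As a minimal sketch, assuming you would rather not hard-code them, they could be read from environment variables. CloudCredentials and the variable names ASPOSE_APP_SID and ASPOSE_APP_KEY are hypothetical choices made for this example, not names defined by Aspose.

```csharp
using System;

public static class CloudCredentials
{
    // Hypothetical variable names; use whatever suits your environment.
    public const string AppSidVariable = "ASPOSE_APP_SID";
    public const string AppKeyVariable = "ASPOSE_APP_KEY";

    // Reads both values and reports whether each one is present and non-blank.
    public static bool TryLoad(out string appSid, out string appKey)
    {
        appSid = Environment.GetEnvironmentVariable(AppSidVariable);
        appKey = Environment.GetEnvironmentVariable(AppKeyVariable);
        return !string.IsNullOrWhiteSpace(appSid) && !string.IsNullOrWhiteSpace(appKey);
    }
}
```

If TryLoad returns true, the two values can be passed straight through to ExtractImages. Otherwise the caller can report the missing configuration before making any API calls.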
