January 31, 2015

Notes on exporting HTML document with images to Word on Linux

by Joe Kuan

We have a web application (running on an Ubuntu web server) with a WYSIWYG HTML editor that lets users create a report template with custom tags. These custom tags are then resolved to technical graph images and embedded into the report. The application allows users to

  1. preview the report with graphs in the web browser
  2. schedule the report to be generated, exported to a Word document, and delivered as an email attachment.

To generate the HTML document, the obvious choice is to format the IMG tag with embedded base64 image content, especially for preview purposes. That way we can send the whole HTML document to client web browsers without worrying about how to resolve the IMG src paths under the document root. However, this approach raises another issue. Currently, none of the conversion tools on Linux, such as Abiword, LibreOffice, OpenOffice, or wkhtmltopdf, can fully export an HTML document with embedded images: the images are missing when the result is opened in MS Word (even opening the HTML document directly in MS Word shows no images). This type of HTML document is only supported by web browsers.
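For illustration, an embedded IMG tag of the kind described above can be built as follows. This is a minimal sketch rather than our actual code; the function name `img_tag_embedded` and its parameters are hypothetical.

```python
import base64
import html


def img_tag_embedded(image_bytes: bytes, mime_type: str = "image/png", alt: str = "") -> str:
    """Return an IMG tag whose src is a base64 data URI (hypothetical helper)."""
    # Base64 output is pure ASCII, so it can be placed directly in the attribute.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    # Escape the alt text so quotes or angle brackets cannot break the markup.
    safe_alt = html.escape(alt, quote=True)
    return f'<img src="data:{mime_type};base64,{encoded}" alt="{safe_alt}">'


if __name__ == "__main__":
    # Placeholder bytes stand in for a real rendered graph image.
    print(img_tag_embedded(b"abc", alt="CPU usage"))
```

Because the image travels inside the document, the preview needs no files under the document root, which is exactly why this form is convenient for the browser and awkward for the converters.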

The only alternative is to construct the HTML document with IMG tags that use relative paths. This approach works much better with Abiword: the exported Word document shows the images in MS Word. MS Word itself can also display this type of HTML document with its images.
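A scheduled export along these lines could call Abiword from a script. The sketch below assumes Abiword's `--to` and `--to-name` command-line options (check `abiword --help` on your system) and runs the conversion from the HTML file's own directory so that relative IMG paths resolve. The function names and file names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def abiword_command(html_name: str, doc_path: str) -> list:
    """Build the Abiword command line (assumes the --to / --to-name options)."""
    return ["abiword", "--to=doc", f"--to-name={doc_path}", html_name]


def export_to_word(html_path: str, doc_path: str) -> Path:
    """Convert an HTML report with relative IMG paths to a Word document."""
    if shutil.which("abiword") is None:
        raise RuntimeError("abiword is not installed or not on PATH")
    source = Path(html_path).resolve()
    target = Path(doc_path).resolve()
    # Run inside the report directory so relative image paths can be found.
    subprocess.run(
        abiword_command(source.name, str(target)),
        cwd=source.parent,
        check=True,
    )
    return target
```

A call such as `export_to_word("reports/weekly.html", "out/weekly.doc")` would then produce the attachment for the email step.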

As a result, we create two versions of each HTML document with different types of IMG tags. We use embedded IMG tags for preview, and IMG tags with src links for export to other document formats and for archiving.
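One way to produce the second version is to derive it from the first: write each embedded image to disk and point the IMG src at the saved file. The following is a rough sketch of that idea, not our production code. The names `externalize_images`, `images`, and `graphN` are hypothetical, and the regular expression assumes double-quoted src attributes and simple image types such as PNG or JPEG.

```python
import base64
import re
from pathlib import Path

# Matches src="data:image/<ext>;base64,<payload>" (assumed attribute layout).
DATA_URI = re.compile(
    r'src="data:image/(?P<ext>[a-zA-Z0-9]+);base64,(?P<data>[A-Za-z0-9+/=]+)"'
)


def externalize_images(html_doc: str, out_dir: str, subdir: str = "images") -> str:
    """Save embedded images under out_dir/subdir and return HTML with relative src paths."""
    image_dir = Path(out_dir) / subdir
    image_dir.mkdir(parents=True, exist_ok=True)
    counter = 0

    def replace(match):
        nonlocal counter
        counter += 1
        name = f"graph{counter}.{match.group('ext')}"
        # Decode the base64 payload and store it next to the HTML file.
        (image_dir / name).write_bytes(base64.b64decode(match.group("data")))
        return f'src="{subdir}/{name}"'

    return DATA_URI.sub(replace, html_doc)
```

The returned HTML should be saved in `out_dir` so that the relative paths stay valid for both the converter and the archive copy.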
