pdftohtml is a program that converts PDF files into HTML or XML, with the document's graphics written out as PNG images. It also supports encrypted PDF files. In Ubuntu Gutsy, pdftohtml is bundled with the poppler-utils package, so that is the package we need to install.
Install poppler-utils in Ubuntu
sudo aptitude install poppler-utils
This completes the installation.
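To confirm that the tool is actually available after installing, a quick check along these lines may help. This is only a sketch: the function name have_pdftohtml is made up for this example, and it relies on nothing more than the shell's standard command -v lookup.

```shell
#!/bin/sh
# have_pdftohtml is a hypothetical helper name, not part of poppler-utils.
# It succeeds only if a pdftohtml executable can be found on the PATH.
have_pdftohtml() {
    command -v pdftohtml >/dev/null 2>&1
}

if have_pdftohtml; then
    echo "pdftohtml found at: $(command -v pdftohtml)"
else
    echo "pdftohtml not found; install poppler-utils first"
fi
```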
Using pdftohtml
pdftohtml Syntax
pdftohtml [options] [pdf file] [html file]
Available options
A summary of the options is included below.
-h, -help - Show summary of options.
-f - first page to print
-l - last page to print
-q - don’t print any messages or errors
-v - print copyright and version info
-p - exchange .pdf links with .html
-c - generate complex output
-i - ignore images
-noframes - generate no frames. Not supported in complex output mode.
-stdout - use standard output
-zoom - zoom the pdf document (default 1.5)
-xml - output for XML post-processing
-enc - output text encoding name
-opw - owner password (for encrypted files)
-upw - user password (for encrypted files)
-hidden - force hidden text extraction
-dev - output device name for Ghostscript (png16m, jpeg etc)
-nomerge - do not merge paragraphs
-nodrm - override document DRM settings
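Several of these options can be combined in one call. As a rough sketch, the helper below converts only a range of pages into a single frameless HTML file, using the -q, -noframes, -f and -l options from the list above. The function name convert_pages and the PDFTOHTML override variable are inventions for this example (the variable just lets you point at a different binary); the file names are placeholders.

```shell
#!/bin/sh
# PDFTOHTML is a hypothetical override variable; it defaults to the real tool.
PDFTOHTML="${PDFTOHTML:-pdftohtml}"

# convert_pages is a hypothetical helper.
# $1 = first page, $2 = last page, $3 = input PDF, $4 = output HTML
convert_pages() {
    # -q        suppress messages and errors
    # -noframes write the result without frames
    # -f / -l   first and last page to convert
    "$PDFTOHTML" -q -noframes -f "$1" -l "$2" "$3" "$4"
}

# Example call (placeholder file names), pages 2 to 5 only:
# convert_pages 2 5 test.pdf test.html
```

Note that -noframes is not combined with -c here, since the list above says frameless output is not supported in complex mode.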
pdftohtml Examples
pdftohtml test.pdf test.html
This command gives you a simple HTML file suitable for reading or copying the textual content of the PDF file. You can grab the text from your browser and paste it into other applications. It doesn’t produce any PNG files, so you won’t be able to see any embedded graphics. It’s a great utility if you just want to extract the text from a PDF file.
If you want to see graphics, you’ll need to use the -c (as in “complex”) option:
pdftohtml -c test.pdf test.html
This option produces individual HTML files, one for each page of the PDF file, with the PNG references mixed in. The graphics in the original PDF file show up in a browser, and the text can still be cut and pasted. The total size of the HTML and PNG files generated with the -c option tends to be roughly equivalent to that of the original PDF.
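If you have a whole directory of PDF files, the same -c conversion can be run in a loop. The following is a minimal sketch, not a polished script: convert_all and the PDFTOHTML override variable are names made up for this example, and each output name is simply the input name with .pdf replaced by .html.

```shell
#!/bin/sh
# PDFTOHTML is a hypothetical override variable; it defaults to the real tool.
PDFTOHTML="${PDFTOHTML:-pdftohtml}"

# convert_all is a hypothetical helper.
# $1 = directory to scan (defaults to the current directory)
convert_all() {
    dir="${1:-.}"
    for pdf in "$dir"/*.pdf; do
        # Skip the unexpanded pattern when the directory has no PDF files
        [ -e "$pdf" ] || continue
        # Complex mode, so graphics are kept; report failures and carry on
        "$PDFTOHTML" -c "$pdf" "${pdf%.pdf}.html" || echo "failed: $pdf" >&2
    done
}

# Example call (placeholder directory):
# convert_all ~/Documents/pdfs
```

Because complex mode writes one HTML file per page plus the PNG images, it may be worth running this in a directory set aside for the output.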