Print to PDF in Firefox creates non-searchable PDF - please help
Adam
2012-03-27 20:07:39 UTC
Host OS: Ubuntu 10.04 LTS
Guest OS: Windows XP Pro SP3 (via VirtualBox)
Browser: Firefox 3.6.28
PDF Writer: Adobe Acrobat 8 Professional / PDF Plug-In for Firefox


The following was originally posted to "mozilla.support.firefox" ...

Print to PDF (of some web pages) in Firefox creates non-searchable PDF.
Here's a problem link ...
http://course.ucsc-extension.edu/modules/shop/index.html?action=section&OfferingID=1532219&SectionID=5270686

The problem does not occur with IE but I prefer to find a fix for Firefox.

Any ideas?
Ghostrider
2012-03-27 20:31:42 UTC
Post by Adam
Host OS: Ubuntu 10.04 LTS
Guest OS: Windows XP Pro SP3 (via VirtualBox)
Browser: Firefox 3.6.28
PDF Writer: Adobe Acrobat 8 Professional / PDF Plug-In for Firefox
The following was originally posted to "mozilla.support.firefox" ...
Print to PDF (of some web pages) in Firefox creates non-searchable PDF.
Here's a problem link ...
http://course.ucsc-extension.edu/modules/shop/index.html?action=section&OfferingID=1532219&SectionID=5270686
The problem does not occur with IE but I prefer to find a fix for Firefox.
Any ideas?
Unless the user directs the Adobe PDF printer to send the
*.pdf file to a specific folder, it would end up in a default
WinXP folder in Documents and Settings. I had no problem in
printing the link as a file to a location of my choice and
then finding it.

GR
Adam
2012-03-27 20:36:07 UTC
Post by Ghostrider
Post by Adam
Host OS: Ubuntu 10.04 LTS
Guest OS: Windows XP Pro SP3 (via VirtualBox)
Browser: Firefox 3.6.28
PDF Writer: Adobe Acrobat 8 Professional / PDF Plug-In for Firefox
The following was originally posted to "mozilla.support.firefox" ...
Print to PDF (of some web pages) in Firefox creates non-searchable PDF.
Here's a problem link ...
http://course.ucsc-extension.edu/modules/shop/index.html?action=section&OfferingID=1532219&SectionID=5270686
The problem does not occur with IE but I prefer to find a fix for Firefox.
Any ideas?
Unless the user directs the Adobe PDF printer to send the
*.pdf file to a specific folder, it would end up in a default
WinXP folder in Documents and Settings. I had no problem in
printing the link as a file to a location of my choice and
then finding it.
GR
Are you able to "search" for text in the newly generated PDF?
Paul
2012-03-27 20:49:30 UTC
Post by Adam
Host OS: Ubuntu 10.04 LTS
Guest OS: Windows XP Pro SP3 (via VirtualBox)
Browser: Firefox 3.6.28
PDF Writer: Adobe Acrobat 8 Professional / PDF Plug-In for Firefox
The following was originally posted to "mozilla.support.firefox" ...
Print to PDF (of some web pages) in Firefox creates non-searchable PDF.
Here's a problem link ...
http://course.ucsc-extension.edu/modules/shop/index.html?action=section&OfferingID=1532219&SectionID=5270686
The problem does not occur with IE but I prefer to find a fix for Firefox.
Any ideas?
This is what I see.

http://img696.imageshack.us/img696/2702/searchable.gif

Method:

1) Firefox print to Postscript.
2) Toss PostScript in Acrobat Distiller.
3) Open in Reader. Search for the word "fundamental" and it is located.

The file did have some image content, but I think that's the logo that
was on the first page.
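
One rough way to tell a searchable PDF from one that was rendered as a single picture is to look for text-drawing operators inside its content streams. The sketch below is only a heuristic, not how Reader or Distiller decide anything. It assumes streams are either uncompressed or Flate-compressed, `has_text_operators` is a made-up helper name, and binary image data can occasionally trigger a false positive. A proper PDF library would be more reliable.

```python
import re
import zlib

# Raw bytes between "stream" and "endstream" in a PDF file.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
# A string or array operand followed by a text-showing operator (Tj or TJ).
TEXT_OP_RE = re.compile(rb"[)\]>]\s*T[jJ]\b")


def has_text_operators(pdf_bytes: bytes) -> bool:
    """Heuristic: True if any stream appears to draw text."""
    for match in STREAM_RE.finditer(pdf_bytes):
        data = match.group(1)
        try:
            # decompressobj tolerates the end-of-line bytes left before "endstream"
            data = zlib.decompressobj().decompress(data)
        except zlib.error:
            pass  # not Flate-compressed; inspect the raw bytes instead
        # Text is drawn between BT ... ET using Tj / TJ
        if b"BT" in data and TEXT_OP_RE.search(data):
            return True
    return False
```

To try it on the Distiller output, something like `has_text_operators(open("page.pdf", "rb").read())` should do, where `page.pdf` is a placeholder name.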

I can't see a good reason, for a print of a web page, to have the
"do not copy" security setting in PDF enabled. The file can be
virtually untouchable, if all the security flags are enabled.
Check the Distiller settings, and see if something in there has
broken loose. Check while viewing the document in Reader, and
see what security settings in there are properties of the
document.
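
The copy restriction mentioned above lives in the PDF's encryption dictionary. An encrypted file has an `/Encrypt` entry in its trailer, and with the standard security handler the `/P` integer holds the permission flags, where bit 5 covers copying and extracting text. Here is a minimal sketch for checking that without opening Reader. `copy_allowed` is a hypothetical helper, and the regex assumes `/P` appears after `/Filter /Standard` in the same dictionary, which real files are not required to honour.

```python
import re
from typing import Optional

ENCRYPT_RE = re.compile(rb"/Encrypt\b")
# Assumption: /P follows /Filter /Standard inside the encryption dictionary.
ENC_DICT_RE = re.compile(rb"/Filter\s*/Standard\b.*?/P\s+(-?\d+)", re.DOTALL)
COPY_BIT = 1 << 4  # bit 5 of /P: copy or extract text and graphics


def copy_allowed(pdf_bytes: bytes) -> Optional[bool]:
    """None if the file looks unencrypted or /P was not found, else the copy flag."""
    if not ENCRYPT_RE.search(pdf_bytes):
        return None
    match = ENC_DICT_RE.search(pdf_bytes)
    if match is None:
        return None
    return bool(int(match.group(1)) & COPY_BIT)
```

A result of `None` on the Distiller output would suggest that security settings are not what is blocking the search.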

Paul
Adam
2012-03-27 22:44:08 UTC
Post by Paul
Post by Adam
Host OS: Ubuntu 10.04 LTS
Guest OS: Windows XP Pro SP3 (via VirtualBox)
Browser: Firefox 3.6.28
PDF Writer: Adobe Acrobat 8 Professional / PDF Plug-In for Firefox
The following was originally posted to "mozilla.support.firefox" ...
Print to PDF (of some web pages) in Firefox creates non-searchable PDF.
Here's a problem link ...
http://course.ucsc-extension.edu/modules/shop/index.html?action=section&OfferingID=1532219&SectionID=5270686
The problem does not occur with IE but I prefer to find a fix for Firefox.
Any ideas?
This is what I see.
http://img696.imageshack.us/img696/2702/searchable.gif
1) Firefox print to Postscript.
2) Toss PostScript in Acrobat Distiller.
3) Open in Reader. Search for the word "fundamental" and it is located.
The file did have some image content, but I think that's the logo that
was on the first page.
I can't see a good reason, for a print of a web page, to have the
"do not copy" security setting in PDF enabled. The file can be
virtually untouchable, if all the security flags are enabled.
Check the Distiller settings, and see if something in there has
broken loose. Check while viewing the document in Reader, and
see what security settings in there are properties of the
document.
Paul
Thanks (Guru Paul), but I am still not able to search for text in
the PDF generated from the PS file in Distiller. I get the "crosshair" cursor
when the cursor is positioned over the generated PDF.

Here's the Adobe PDF Document Properties (from Print Properties) ...

Here's the Adobe PDF Settings (from Distiller) ...

Here's the Adobe PDF Security (from distiller) ...

Where do you see "do not copy" security setting in PDF enabled?

Also, I wonder if any of the following has to do with my troubles ...
=====================================================================
The ANSI University outreach program is now being distributed through the
ANSI Site License Portal.
The following components are required to access your documents:
1. Internet Access - http://slportal.ansi.org/
2. Adobe Reader 5.0 or newer - http://get.adobe.com/reader/
3. FileOpen DRM Plug-in for Acrobat - http://plugin.fileopen.com/
=====================================================================
David H. Lipman
2012-03-27 23:18:13 UTC
Post by Adam
Post by Paul
Post by Adam
Host OS: Ubuntu 10.04 LTS
Guest OS: Windows XP Pro SP3 (via VirtualBox)
Browser: Firefox 3.6.28
PDF Writer: Adobe Acrobat 8 Professional / PDF Plug-In for Firefox
The following was originally posted to "mozilla.support.firefox" ...
Print to PDF (of some web pages) in Firefox creates non-searchable PDF.
Here's a problem link ...
http://course.ucsc-extension.edu/modules/shop/index.html?action=section&OfferingID=1532219&SectionID=5270686
The problem does not occur with IE but I prefer to find a fix for Firefox.
Any ideas?
This is what I see.
http://img696.imageshack.us/img696/2702/searchable.gif
1) Firefox print to Postscript.
2) Toss PostScript in Acrobat Distiller.
3) Open in Reader. Search for the word "fundamental" and it is located.
The file did have some image content, but I think that's the logo that
was on the first page.
I can't see a good reason, for a print of a web page, to have the
"do not copy" security setting in PDF enabled. The file can be
virtually untouchable, if all the security flags are enabled.
Check the Distiller settings, and see if something in there has
broken loose. Check while viewing the document in Reader, and
see what security settings in there are properties of the
document.
Paul
Thanks (Guru Paul), but I am still not able to search for text in
the PDF generated from PS file in Distiller. I get the "crosshair" cursor
when cursor is positioned over the PDF generated.
< snip >

Why have you not tried Firefox v11 instead of v3.6.28 ?
--
Dave
Multi-AV Scanning Tool - http://multi-av.thespykiller.co.uk
http://www.pctipp.ch/downloads/dl/35905.asp
BillW50
2012-03-28 00:19:38 UTC
Post by David H. Lipman
Why have you not tried Firefox v11 instead of v3.6.28 ?
".... because newer is not always better!"
http://www.oldversion.com/
--
Bill
Gateway M465e ('06 era) - Windows Live Mail 2009
Centrino Core2 Duo T7400 2.16 GHz - 1.5GB - Windows 8 CP
Adam
2012-03-28 00:31:53 UTC
Post by BillW50
Post by David H. Lipman
Why have you not tried Firefox v11 instead of v3.6.28 ?
".... because newer is not always better!"
http://www.oldversion.com/
--
Bill
Gateway M465e ('06 era) - Windows Live Mail 2009
Centrino Core2 Duo T7400 2.16 GHz - 1.5GB - Windows 8 CP
Thanks! Also, ...

Some interesting stats here ...
http://gs.statcounter.com/#browser_version-ww-monthly-201109-201202-bar

Firefox 3.6 is Mozilla's Windows XP ...
http://www.zdnet.com/blog/hardware/firefox-36-is-mozillas-windows-xp/16098

The frequent unnecessary/cosmetic updates/upgrades do not seem very
appealing.
Bill in Co
2012-03-28 03:00:20 UTC
Post by BillW50
Post by David H. Lipman
Why have you not tried Firefox v11 instead of v3.6.28 ?
".... because newer is not always better!"
http://www.oldversion.com/
And that's the understatement of the year.
(Newer is quite often worse, I might add. More bloat, more useless
features, and changing the GUI to impress some bimbos with cosmetic glitter.)
Bill in Co
2012-03-28 02:58:06 UTC
Post by David H. Lipman
Post by Adam
Post by Paul
Post by Adam
Host OS: Ubuntu 10.04 LTS
Guest OS: Windows XP Pro SP3 (via VirtualBox)
Browser: Firefox 3.6.28
PDF Writer: Adobe Acrobat 8 Professional / PDF Plug-In for Firefox
The following was originally posted to "mozilla.support.firefox" ...
Print to PDF (of some web pages) in Firefox creates non-searchable PDF.
Here's a problem link ...
http://course.ucsc-extension.edu/modules/shop/index.html?action=section&OfferingID=1532219&SectionID=5270686
The problem does not occur with IE but I prefer to find a fix for Firefox.
Any ideas?
This is what I see.
http://img696.imageshack.us/img696/2702/searchable.gif
1) Firefox print to Postscript.
2) Toss PostScript in Acrobat Distiller.
3) Open in Reader. Search for the word "fundamental" and it is located.
The file did have some image content, but I think that's the logo that
was on the first page.
I can't see a good reason, for a print of a web page, to have the
"do not copy" security setting in PDF enabled. The file can be
virtually untouchable, if all the security flags are enabled.
Check the Distiller settings, and see if something in there has
broken loose. Check while viewing the document in Reader, and
see what security settings in there are properties of the
document.
Paul
Thanks (Guru Paul), but I am still not able to search for text in
the PDF generated from PS file in Distiller. I get the "crosshair" cursor
when cursor is positioned over the PDF generated.
< snip >
Why have you not tried Firefox v11 instead of v3.6.28 ?
But WHY would one want to?
Why would someone want to change something that is already fine as it is??
To change, for change's sake? Thanks, but no thanks. :-)

Sounds like the same logic as going to the newer versions of Office. More
bloat, and more useless features.
David H. Lipman
2012-03-28 10:52:01 UTC
Post by Bill in Co
Post by David H. Lipman
< snip >
Why have you not tried Firefox v11 instead of v3.6.28 ?
But WHY would one want to?
Why would someone want to change something that is already fine as it is?? To change,
for change's sake? Thanks, but no thanks. :-)
Sounds like the same logic as going to the newer versions of Office. More bloat, and
more useless features.
Because when you print from FF v11 to PDFCreator or Adobe Printer the PDF is searchable as
Adam requires.
--
Dave
Multi-AV Scanning Tool - http://multi-av.thespykiller.co.uk
http://www.pctipp.ch/downloads/dl/35905.asp
David H. Lipman
2012-03-27 21:27:25 UTC
Post by Adam
Host OS: Ubuntu 10.04 LTS
Guest OS: Windows XP Pro SP3 (via VirtualBox)
Browser: Firefox 3.6.28
PDF Writer: Adobe Acrobat 8 Professional / PDF Plug-In for Firefox
The following was originally posted to "mozilla.support.firefox" ...
Print to PDF (of some web pages) in Firefox creates non-searchable PDF.
Here's a problem link ...
http://course.ucsc-extension.edu/modules/shop/index.html?action=section&OfferingID=1532219&SectionID=5270686
The problem does not occur with IE but I prefer to find a fix for Firefox.
Any ideas?
To add to Adam's question, I will state the findings I previously provided in
reply to his initial query in the Mozilla Firefox newsgroup.

If printed from Firefox v3.6.28 to Adobe Professional v9.5.0, to PDFCreator,
or to a PostScript file that is then distilled to a PDF, the PDF is rendered
as a graphic and is not searchable.

If printed to PDFCreator from Firefox v11 the PDF is searchable.

I believe this to be a FF v3.6.28 rendering issue.
--
Dave
Multi-AV Scanning Tool - http://multi-av.thespykiller.co.uk
http://www.pctipp.ch/downloads/dl/35905.asp
Paul
2012-03-29 17:28:42 UTC
Post by David H. Lipman
Post by Adam
Host OS: Ubuntu 10.04 LTS
Guest OS: Windows XP Pro SP3 (via VirtualBox)
Browser: Firefox 3.6.28
PDF Writer: Adobe Acrobat 8 Professional / PDF Plug-In for Firefox
The following was originally posted to "mozilla.support.firefox" ...
Print to PDF (of some web pages) in Firefox creates non-searchable PDF.
Here's a problem link ...
http://course.ucsc-extension.edu/modules/shop/index.html?action=section&OfferingID=1532219&SectionID=5270686
The problem does not occur with IE but I prefer to find a fix for Firefox.
Any ideas?
To add to Adam's question, I will state my findings previously provided
in his initial query in the Mozilla Firefox news group.
If printed to Adobe Professional v9.5.0 or PDFCreator or to a PostScript
file and distilled to a PDF from Firefox v3.6.28 the PDF is rendered as
a graphic and is not searchable.
If printed to PDFCreator from Firefox v11 the PDF is searchable.
I believe this to be a FF v3.6.28 rendering issue.
After investigating a bit, it seems Firefox v3 switched to the Cairo graphics
library (cairographics). When Cairo runs into something it cannot express with
simple primitives (a letter maps to a letter primitive, a line to a line
primitive, a straight mapping), it falls back to bitmap rendering. If you get a
solid image coming out of a PDF printer, it could be something like that. This
is purely a guess, as I can't really see how Cairo would help in this situation.
You'd be doing something like HTML ---> Cairo ---> GDI??? ---> AdobePDFprinter ---> PDF.
I don't see how Cairo really helps in a major way, so I must be missing the point.
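
To make the guess above concrete, here is a toy model of a "vector if possible, bitmap otherwise" renderer. It is not Cairo's code or API. `NATIVE_OPS`, `DrawOp` and `render_page` are invented for illustration, and the choice to rasterise the whole page on the first unsupported operation is an assumption picked to match the symptom seen in this thread, not a statement about how Cairo selects fallback regions.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical set of operations the output device can take as primitives.
NATIVE_OPS = {"text", "line", "rect"}


@dataclass
class DrawOp:
    kind: str      # e.g. "text", "line", "blend"
    payload: str   # whatever the operation draws


def render_page(ops: List[DrawOp]) -> Tuple[str, List[DrawOp]]:
    """Return ("vector", ops) if every op is native, else one bitmap op."""
    if all(op.kind in NATIVE_OPS for op in ops):
        return "vector", list(ops)
    # Assumed behaviour: one unsupported op turns the page into a single image,
    # so the text no longer exists as text in the output.
    return "bitmap", [DrawOp("image", "rasterised page")]
```

In this model a page containing only text and lines stays searchable, while adding a single unsupported operation (here the made-up "blend") loses all the text, which would also fit the observation that only some pages are affected.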

A workaround is to try the "PrintPDF" add-on, which did yield a searchable
PDF for the ucsc-extension.edu web page. It adds File : Print To PDF
to the Firefox menu after installation and a restart of Firefox.

https://addons.mozilla.org/en-US/firefox/addon/printpdf/?src=api

"PrintPDF 0.76 by Pavlov"

I actually built up (compiled from source) v3.6.28 in Visual C++ 2005 Express
on a Win2K virtual machine. (I was using that for debugging.) When I
added "PrintPDF 0.76 by Pavlov", I was seeing an Assert failure ("float
manager state") from v3.6.28. I don't expect that affects the real Firefox,
but it was curious nonetheless. I think that add-on is still worth a try. My
debug build emits output into an MSDOS-like window as it runs, and that
plus the debugger in the IDE is what I was using to watch how it works.
Too damn complicated to figure out how it works with a debugger though
(like, how the print architecture actually works, instead of my guess).

If anyone else wants to make a debug build of Firefox, this is the
"mozconfig" file I used in the mozilla-1.9.2 folder. The disable-ipc
was added to stop the build from breaking, in some code hooks for
connecting a debugger and collecting a stack trace of some sort.
The parallel compilation is set to "j1" since my virtual environment
only has one computing core.

mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/obj-@CONFIG_GUESS@
mk_add_options MOZ_MAKE_FLAGS="-j1"
ac_add_options --with-windows-version=502
ac_add_options --enable-debug
ac_add_options --enable-application=browser
ac_add_options --disable-ipc

http://img16.imageshack.us/img16/1456/v3628running.gif

Have fun,
Paul
David H. Lipman
2012-03-29 18:28:47 UTC
Post by Paul
Post by David H. Lipman
Post by Adam
Host OS: Ubuntu 10.04 LTS
Guest OS: Windows XP Pro SP3 (via VirtualBox)
Browser: Firefox 3.6.28
PDF Writer: Adobe Acrobat 8 Professional / PDF Plug-In for Firefox
The following was originally posted to "mozilla.support.firefox" ...
Print to PDF (of some web pages) in Firefox creates non-searchable PDF.
Here's a problem link ...
http://course.ucsc-extension.edu/modules/shop/index.html?action=section&OfferingID=1532219&SectionID=5270686
The problem does not occur with IE but I prefer to find a fix for Firefox.
Any ideas?
To add to Adam's question, I will state my findings previously provided
in his initial query in the Mozilla Firefox news group.
If printed to Adobe Professional v9.5.0 or PDFCreator or to a PostScript
file and distilled to a PDF from Firefox v3.6.28 the PDF is rendered as
a graphic and is not searchable.
If printed to PDFCreator from Firefox v11 the PDF is searchable.
I believe this to be a FF v3.6.28 rendering issue.
After investigating a bit, it seems Firefox v3 switched to cairographics.
When Cairo runs into situations it cannot handle with simple primitives
(letter uses letter primitive, line uses line primitive, a straight mapping),
it uses bitmap rendering as a fallback. If you get a solid image going through
a PDF printer output, it could be something like that. Purely a guess, as
I can't really see in this situation, how Cairo would help. You'd be
doing something like HTML ---> Cairo ---> GDI??? ---> AdobePDFprinter ---> PDF.
I don't see how Cairo really helps in a major way. Must be missing the point.
The workaround is to try "PrintPDF" add-on, which did yield a searchable
PDF for the ucsc-extension.edu web page. Using this, adds File : Print To PDF
to the Firefox menu, after installation and a restart of Firefox.
https://addons.mozilla.org/en-US/firefox/addon/printpdf/?src=api
"PrintPDF 0.76 by Pavlov"
I actually built up (compiled from source) v3.6.28 in Visual C++ 2005 Express
on a Win2K virtual machine. (I was using that for debugging.) When I
added "PrintPDF 0.76 by Pavlov", I was seeing an Assert failure ("float
manager state") from v3.6.28. I don't expect that affects the real Firefox,
but it was curious nonetheless. I think that add-on is still worth a try. My
debug build emits output into an MSDOS-like window as it runs, and that
plus the debugger in the IDE is what I was using to watch how it works.
Too damn complicated to figure out how it works with a debugger though
(like, how the print architecture actually works, instead of my guess).
If anyone else wants to make a debug build of Firefox, this is the
"mozconfig" file I used in the mozilla-1.9.2 folder. The disable-ipc
was added to stop the build from breaking, in some code hooks for
connecting a debugger and collecting a stack trace of some sort.
The parallel compilation is set to "j1" since my virtual environment
only has one computing core.
mk_add_options MOZ_MAKE_FLAGS="-j1"
ac_add_options --with-windows-version=502
ac_add_options --enable-debug
ac_add_options --enable-application=browser
ac_add_options --disable-ipc
http://img16.imageshack.us/img16/1456/v3628running.gif
Have fun,
Very interesting information and findings. Thanx!
--
Dave
Multi-AV Scanning Tool - http://multi-av.thespykiller.co.uk
http://www.pctipp.ch/downloads/dl/35905.asp
Hot-Text
2012-03-29 06:32:44 UTC
Post by Adam
Host OS: Ubuntu 10.04 LTS
Guest OS: Windows XP Pro SP3 (via VirtualBox)
Browser: Firefox 3.6.28
PDF Writer: Adobe Acrobat 8 Professional / PDF Plug-In for Firefox
The following was originally posted to "mozilla.support.firefox" ...
Print to PDF (of some web pages) in Firefox creates non-searchable PDF.
Here's a problem link ...
http://course.ucsc-extension.edu/modules/shop/index.html?action=section&OfferingID=1532219&SectionID=5270686
The problem does not occur with IE but I prefer to find a fix for Firefox.
Any ideas?
I see that the problem does not occur with IE8.
Looks like Silicon Valley does not like Firefox.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="X-UA-Compatible" content="IE=8">
<title>Areas of Study and Courses | UCSC Extension Silicon Valley</title>
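
To see whether other problem pages carry the same `X-UA-Compatible` hint, the value can be pulled out of saved page source with Python's standard `html.parser`. This is a sketch only: `UACompatFinder` and `ua_compatible` are invented names. As far as I know the tag is aimed at Internet Explorer, so its presence alone probably does not explain what Firefox does when printing.

```python
from html.parser import HTMLParser
from typing import Optional


class UACompatFinder(HTMLParser):
    """Remembers the content of a <meta http-equiv="X-UA-Compatible"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.value: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)  # attribute names arrive lower-cased
        if (attr.get("http-equiv") or "").lower() == "x-ua-compatible":
            self.value = attr.get("content")


def ua_compatible(html: str) -> Optional[str]:
    """Return the X-UA-Compatible content value, or None if absent."""
    finder = UACompatFinder()
    finder.feed(html)
    return finder.value
```

Feeding it the head quoted above should give back the `IE=8` string from the meta tag.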
s***@cohodata.com
2013-11-07 22:07:01 UTC
Post by Adam
Host OS: Ubuntu 10.04 LTS
Guest OS: Windows XP Pro SP3 (via VirtualBox)
Browser: Firefox 3.6.28
PDF Writer: Adobe Acrobat 8 Professional / PDF Plug-In for Firefox
The following was originally posted to "mozilla.support.firefox" ...
Print to PDF (of some web pages) in Firefox creates non-searchable PDF.
Here's a problem link ...
http://course.ucsc-extension.edu/modules/shop/index.html?action=section&OfferingID=1532219&SectionID=5270686
The problem does not occur with IE but I prefer to find a fix for Firefox.
Any ideas?
It is more than a year later and this is still a problem.

Can anyone explain how to solve this issue? That is, when printing some websites to PDF, FF/Adobe Distiller renders all the pages as images, rather than text.

It only happens with some pages. For instance, this page renders as text (and is searchable) but a page like http://www.ehow.com/how_4558279_prepare-tear-sheet.html renders entirely as an image (and is NOT searchable).

If I print to PDF (via Adobe Distiller) in another browser (e.g. Chrome), not only does it leave the text as text (and searchable) but it looks better too. Firefox v24 doesn't even render the aforementioned page correctly!

Anyone?
David H. Lipman
2013-11-08 11:37:03 UTC
Post by s***@cohodata.com
Post by Adam
Host OS: Ubuntu 10.04 LTS
Guest OS: Windows XP Pro SP3 (via VirtualBox)
Browser: Firefox 3.6.28
PDF Writer: Adobe Acrobat 8 Professional / PDF Plug-In for Firefox
The following was originally posted to "mozilla.support.firefox" ...
Print to PDF (of some web pages) in Firefox creates non-searchable PDF.
Here's a problem link ...
http://course.ucsc-extension.edu/modules/shop/index.html?action=section&OfferingID=1532219&SectionID=5270686
The problem does not occur with IE but I prefer to find a fix for Firefox.
Any ideas?
It is more than a year later and this is still a problem.
Can anyone explain how to solve this issue? That is, when printing some websites to PDF,
FF/Adobe Distiller renders all the pages as images, rather than text.
It only happens with some pages. For instance, this page renders as text (and is
searchable) but a page like, http://www.ehow.com/how_4558279_prepare-tear-sheet.html
renders entirely as an image (and is NOT searchable).
If I print to PDF (via Adobe Distiller) in another browser (i.e. Chrome) not only does it
leave the text as text (and searchable) but it looks better too. Firefox v24 doesn't even
render the aforementioned page correctly!
Anyone?
Since Acrobat 8 was EoL'd long ago, switch to Acrobat Professional 11 (aka Acrobat XI).
--
Dave
Multi-AV Scanning Tool - http://multi-av.thespykiller.co.uk
http://www.pctipp.ch/downloads/dl/35905.asp