Free Republic
Browse · Search
General/Chat
Topics · Post Article

OCR, Macs and PCs

Posted on 06/16/2006 1:09:02 PM PDT by Snoopers-868th

I belong to a Vets' association and I am attempting to put all the past newsletters on CD in a format that is secure (not allowing changes) and searchable on both a Mac and a PC. Below is the process I used as a test. I am in dire need of some solutions and hope there is someone out there who has used various software to accomplish a similar result.

My dilemma:

First, I OCR scan using MS Office Document & Imaging. I save the .tif file and send the text to Word 2002.

Next, I run spellcheck on the newly created Word text to correct the errors that occurred during the optical read conversion.

Finally, I rescan the original pages that contain pictures. Using MGI PhotoSuite I am able to cut and paste the pictures in the appropriate place in my newly generated MS Word file.

Question 1:

Does anyone have an idea for using the mixed data (pictures & text in Word 2002) to yield a secure, readable and searchable format, usable on both Mac and PC, that can be burned to CD?

I know nothing about Adobe and sure don't want to purchase a $500 program. Further, I have no experience with Adobe. On Adobe's web-site they have an on-line subscription service (monthly fee) which looks reasonable. However, for the life of me I can't figure out if the on-line service allows conversion of my mixed Word document to PDF while retaining the search ability OR if I have to use Adobe's OCR program, OR if the on-line feature even offers the search ability in the $10/month service.

https://createpdf.adobe.com/?v=AHP

Question 2:

Is anyone familiar with Adobe who knows the answer(s)? Has anyone used this service in this manner?

Question 3:

To all you MS Office 2003 users: is there an option in Word 2003 to save a PDF file? Or is there some utility or plug-in that will let me do this?

My computer runs XP Pro with Office XP (2002), and Word has no such save option. There are many more save options in Excel. I tried saving as .rtf (as I think a Mac can read it) and 13 pages came to 20,000+ KB. Wayyyyy toooo big.

I used to save files as PDFs when I was working, but I cannot remember how and can't find anything on it. I have spent two days searching for answers and hope one of you has a solution. Different options are welcome. Thanks


TOPICS: Computers/Internet
KEYWORDS: adobe; computer; mac; pc; software
I hope I have provided enough information. LOL I didn't want to leave anything out.
1 posted on 06/16/2006 1:09:04 PM PDT by Snoopers-868th
[ Post Reply | Private Reply | View Replies]

To: Snoopers-868th
You would do best with a PC-based solution using Omnipage 15 or Omnipage 15 Professional.

Take a look at your needs between them. You would likely qualify for an upgrade of some sort, assuming your scanner came with any form of OCR package.
2 posted on 06/16/2006 1:13:58 PM PDT by ConservativeMind
[ Post Reply | Private Reply | To 1 | View Replies]

To: Snoopers-868th; N3WBI3

Tech Support ping


3 posted on 06/16/2006 1:14:02 PM PDT by ShadowAce (Linux -- The Ultimate Windows Service Pack)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Snoopers-868th
Does anyone have an idea for using the mixed data (pictures & text in Word 2002) to yield a secure, readable and searchable format, usable on both Mac and PC, that can be burned to CD?

Open Office will read in MS Office files and you can then export them as PDF. OOo is free for the download.

4 posted on 06/16/2006 1:16:03 PM PDT by ShadowAce (Linux -- The Ultimate Windows Service Pack)
[ Post Reply | Private Reply | To 1 | View Replies]
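
Editor's sketch for post 4: the export step could be scripted roughly as below. This is a minimal sketch, not something any poster described. It assumes a current LibreOffice or OpenOffice install whose `soffice` binary accepts the `--headless --convert-to pdf` options (the OOo release of the thread's era may not have), and the function names and file names are hypothetical. Text that is real, editable text in the Word file is normally written into the PDF as text, so it should stay searchable; pasted-in pictures stay pictures.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(doc_path, out_dir, office_binary="soffice"):
    """Return the argument list for a headless Word-to-PDF conversion.

    Assumption: office_binary is a LibreOffice/OpenOffice launcher that
    understands --headless, --convert-to and --outdir.
    """
    return [
        office_binary,
        "--headless",            # run without opening a window
        "--convert-to", "pdf",   # target format
        "--outdir", str(out_dir),
        str(doc_path),
    ]


def convert_to_pdf(doc_path, out_dir, office_binary="soffice"):
    """Run the conversion and return the expected PDF path.

    Returns None (and runs nothing) if the office binary is not on PATH.
    """
    if shutil.which(office_binary) is None:
        return None
    subprocess.run(
        build_convert_command(doc_path, out_dir, office_binary),
        check=True,  # raise if the conversion fails
    )
    return Path(out_dir) / (Path(doc_path).stem + ".pdf")
```

For example, `convert_to_pdf("newsletter_1985.doc", "pdf_out")` would, under those assumptions, leave `pdf_out/newsletter_1985.pdf` ready to burn to CD.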

To: ConservativeMind

Thank you. I looked at it, and because I am totally unfamiliar with it I shy away. Are you familiar with it, and is it easy to use? I think the Pro version was $200 and the lesser version was $99. It said I would get a reduced price because of the MS stuff I have.


5 posted on 06/16/2006 1:18:16 PM PDT by Snoopers-868th (Send-a-Brick.com. Send a brick to Washington and cash to Minutemen for a wall.)
[ Post Reply | Private Reply | To 2 | View Replies]

To: Snoopers-868th
The Abbyy OCR software included free with some scanners can output reasonably good WYSIWYG-style OCR results in RTF format. MS Word can open the RTF file and display text and images with the original layout. I've been using it to help Mrs. HAL9000 convert her recipes to text format.

Macs have built-in PDF file generation capability in all software applications, through the Print command.

6 posted on 06/16/2006 1:21:44 PM PDT by HAL9000 (Get a Mac - The Ultimate FReeping Machine)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Snoopers-868th
About printing PDF alone from Office, you can download the free open source program called PDFCreator. Get the version with GhostScript. The current version is 0.9.1.

http://sourceforge.net/projects/pdfcreator/
7 posted on 06/16/2006 1:22:05 PM PDT by ConservativeMind
[ Post Reply | Private Reply | To 1 | View Replies]

To: Snoopers-868th
I have used earlier versions over the years. It is the easiest and most accurate of what is available.

If you have the originals as Word files, run them through PDFCreator. If you have to do any paper conversion, use Omnipage and have it export to PDF.
8 posted on 06/16/2006 1:24:34 PM PDT by ConservativeMind
[ Post Reply | Private Reply | To 5 | View Replies]

To: HAL9000

I have a Lexmark printer. I can't find what OCR I am using. How do I look for it? As I stated, .rtf results in a huge file. 13 pages saved from Word came to 20,000+ KB, and I need to be able to save a file that is at least 40 pages long. Maybe I don't understand something?


9 posted on 06/16/2006 1:27:13 PM PDT by Snoopers-868th (Send-a-Brick.com. Send a brick to Washington and cash to Minutemen for a wall.)
[ Post Reply | Private Reply | To 6 | View Replies]
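
Editor's sketch for post 9: a rough size estimate from the figures given (13 pages, 20,000+ KB, 40 pages needed). It assumes file size grows linearly with page count and that the CD holds 700 MB; neither assumption comes from the thread, and "20,000+" is a lower bound, so treat the result as a floor.

```python
def estimate_size_kb(sample_pages, sample_kb, target_pages):
    """Scale a measured file size linearly to a different page count."""
    per_page_kb = sample_kb / sample_pages
    return per_page_kb * target_pages


# Assumption: a standard 700 MB CD-R, counted in KB.
CD_CAPACITY_KB = 700 * 1024

# Figures from the post: 13 pages saved as .rtf came to 20,000+ KB.
per_page_kb = 20000 / 13
forty_page_kb = estimate_size_kb(13, 20000, 40)

# How many 40-page newsletters of that size would fit on one CD.
issues_per_cd = int(CD_CAPACITY_KB // forty_page_kb)

print(f"per page: {per_page_kb:.0f} KB")
print(f"40 pages: {forty_page_kb:.0f} KB (about {forty_page_kb / 1024:.0f} MB)")
print(f"40-page issues per CD: {issues_per_cd}")
```

At roughly 1.5 MB per page, most of the bulk is almost certainly the embedded pictures rather than the text, which is why an export format that compresses images matters more here than the choice of RTF versus Word.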

To: ConservativeMind

O.K. but will it be searchable after the PDF conversion? Do you know?


10 posted on 06/16/2006 1:28:38 PM PDT by Snoopers-868th (Send-a-Brick.com. Send a brick to Washington and cash to Minutemen for a wall.)
[ Post Reply | Private Reply | To 7 | View Replies]

To: ConservativeMind

What do you mean paper conversion?


11 posted on 06/16/2006 1:30:18 PM PDT by Snoopers-868th (Send-a-Brick.com. Send a brick to Washington and cash to Minutemen for a wall.)
[ Post Reply | Private Reply | To 8 | View Replies]

To: Snoopers-868th
If you only have a paper version of your newsletter, you will need to scan and OCR it.

If you have an electronic version (such as a Word file), you simply need to use PDFCreator, the Adobe online routine, OpenOffice, or Adobe Distiller (which is what Adobe uses to output PDF from all Windows programs).
12 posted on 06/16/2006 1:33:20 PM PDT by ConservativeMind
[ Post Reply | Private Reply | To 11 | View Replies]

To: ShadowAce

Will they be searchable? I do not know where in the process that option is available. Is it set on the original file, or is it an option in PDF?


13 posted on 06/16/2006 1:34:28 PM PDT by Snoopers-868th (Send-a-Brick.com. Send a brick to Washington and cash to Minutemen for a wall.)
[ Post Reply | Private Reply | To 4 | View Replies]

To: Snoopers-868th
Will they be searchable?

To be honest, I don't know. I've never tried to make a PDF searchable, either with OOo, or Adobe--and I've used both at least a little.

For a free Office suite, it's worth a look.

14 posted on 06/16/2006 1:36:15 PM PDT by ShadowAce (Linux -- The Ultimate Windows Service Pack)
[ Post Reply | Private Reply | To 13 | View Replies]

To: Snoopers-868th
If you have only paper copies, let Omnipage rescan them with its own settings, as your TIF scans may not be of optimal resolution and/or color/gray scale depth.

Remember, OCR is not 100% accurate. What can be converted will be searchable; what can't will remain a picture of the text, which you can still view but not search.

OCR programs do what they can to retain your original format, but they aren't perfect.
15 posted on 06/16/2006 1:37:14 PM PDT by ConservativeMind
[ Post Reply | Private Reply | To 1 | View Replies]

To: Snoopers-868th
Whatever is editable in Word is searchable.

However, I have no idea how viable MS Office Document Imaging is for OCR. I would imagine it is highly inaccurate.

It's bad enough among the purchased OCR programs, and I've used them all.
16 posted on 06/16/2006 1:41:02 PM PDT by ConservativeMind
[ Post Reply | Private Reply | To 13 | View Replies]

To: ConservativeMind

I understand. The OCR program (and I don't know what it is or how to find out) is working very well with very little error. Because this is for Vets, the pictures have crew members' names that I want to be able to search. That is why I cut out the pictures: so I can type in the names.

I understand the OCR process. What I do not understand is the conversion to PDF, and whether PDF has some special option that makes the file searchable or not. It seems so from what I have read on their web page. And I don't know where to go to ask. The people at CompUSA, etc. are not users. Thank you for your input.


17 posted on 06/16/2006 1:44:20 PM PDT by Snoopers-868th (Send-a-Brick.com. Send a brick to Washington and cash to Minutemen for a wall.)
[ Post Reply | Private Reply | To 15 | View Replies]

To: ConservativeMind

When I opened Document Imaging, it calibrated to my scanner. I discovered that it was set to scan at 150 pixels, and I was getting way too many errors. I upped it to, I think, around 600-800. It takes a bit of time to scan, but it is near perfect. I couldn't believe it.


18 posted on 06/16/2006 1:47:43 PM PDT by Snoopers-868th (Send-a-Brick.com. Send a brick to Washington and cash to Minutemen for a wall.)
[ Post Reply | Private Reply | To 16 | View Replies]
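
Editor's sketch for post 18: why the higher setting takes longer and reads better. This assumes the "150" and "600-800" figures are dots per inch and that the pages are US Letter (8.5 x 11 inches); the post states neither.

```python
def scan_pixels(dpi, width_in=8.5, height_in=11.0):
    """Pixel dimensions of a scanned page (default page size is an assumption)."""
    return round(width_in * dpi), round(height_in * dpi)


def pixel_ratio(dpi_high, dpi_low):
    """How many times more pixels the higher resolution captures."""
    return (dpi_high / dpi_low) ** 2


low = scan_pixels(150)   # the default setting mentioned in the post
high = scan_pixels(600)  # low end of the "600-800" range

print(f"150 dpi page: {low[0]} x {low[1]} pixels")
print(f"600 dpi page: {high[0]} x {high[1]} pixels")
print(f"600 dpi captures {pixel_ratio(600, 150):.0f}x the pixels of 150 dpi")
```

Each letter on the page gets four times as many pixels in each direction at 600 dpi, which gives the OCR engine much more to work with, at the cost of slower scans and larger image files.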

To: ConservativeMind
Whatever is editable in Word is searchable.

I want to be sure of exactly what you mean:

Whatever is editable in Word is searchable IN PDF.

Is that what you mean? It could make a huge $$$ difference to this project.

19 posted on 06/16/2006 1:51:27 PM PDT by Snoopers-868th (Send-a-Brick.com. Send a brick to Washington and cash to Minutemen for a wall.)
[ Post Reply | Private Reply | To 16 | View Replies]

To: Snoopers-868th
As you print to PDF, there is an option (a checkbox) that can make the file searchable or unsearchable.

When viewing the newsletter in Acrobat Reader, a user can click in the search box (if they download a version that supports it; older versions with search ability were separate, larger downloads) and find text within.

Also, to search all PDF files (and other files) at once, download Yahoo!'s Desktop Search program. It will find all occurrences of a term on your hard drive.
20 posted on 06/16/2006 1:51:55 PM PDT by ConservativeMind
[ Post Reply | Private Reply | To 17 | View Replies]


Disclaimer: Opinions posted on Free Republic are those of the individual posters and do not necessarily represent the opinion of Free Republic or its management. All materials posted herein are protected by copyright law and the exemption for fair use of copyrighted works.

FreeRepublic, LLC, PO BOX 9771, FRESNO, CA 93794
FreeRepublic.com is powered by software copyright 2000-2008 John Robinson