Can I automatically set all pictures to be enabled for text search?

TonyG

When I import a picture (I am using my scanner), the picture is not
automatically enabled for searching. I have to open the popup menu and
select "enable for English". Is there a setting to have this scanning
done automatically?

Tony
 
Patrick Schmid

You are referring to ON 2007?
Pictures should be automatically searchable when inserted. Have you left
ON open for a night after you inserted a picture and then tried to
actually search for a term in it?

Patrick Schmid
 
Alex

Are you inserting your scans as PDF? If so, there is a bug in Beta 2 (fixed
now) where scanned PDFs will have no OCR data unless you run OCR manually.
If there is a way to scan to a different format, just while you have Beta 2,
then OneNote should OCR them automatically.

-Alex
 
Jonathan

Alex,
Will any scans placed in OneNote between now and a "refresh" that addresses
this be OCR'ed *later* to catch up? That is, if I place them now, will
OneNote see them as *needing OCR and indexing* when the feature is fixed in
the code that I have on my machine? I use Beta 2 as released in late May.

I have to do a lot of scanning this summer before school starts and I was
planning to use PDF. Or will this have to wait until February 2007?
 
TonyG

I am using "Insert/Pictures.../From Camera or Scanner". I don't know if
that is scanning as PDF. All the scans (embedded pictures) that I had in
OneNote 2003 were indexed, but new scans are not. So this is probably the
case.

Can you tell me how to scan as PDF or another format (where is the option)?

Thanks.
 
Jonathan

Tony,
Apologies for such a late answer; you did not show an e-mail address I could
use for a direct response. Perhaps you have found the answer already:
scanning to PDF depends on the scanner interface software and the particular
feature set it offers. Some scanner software can take the raw scan straight
to PDF; some converts later from the native TIFF scan generated by the
hardware-software interface default.

So you could buy software after the fact that would convert all the existing
scans. Of course Adobe will sell you this as a package, but it's pricey
(although it has many tools for managing PDFs). Check out the forums on
www.pdfplanet.com (I hope I have remembered the URL correctly; search for
"pdfplanet" if I have not).

I hope this is helpful.

Jonathan

 
Rainald Taesler

Alex said:
Are you inserting your scans as PDF? If so, there is a bug in
Beta 2 (fixed now) where scanned PDFs will have no OCR data
unless you run OCR manually. If there is a way to scan to a
different format, just while you have Beta 2, then OneNote should
OCR them automatically.

How about PDFs having undergone OCR in Acrobat (Paper Capture)
already?
How does OneNote handle that?

Rainald
 
Stick

I, too, would really like to know the answer to this question. Alex, could
you please respond when you get a chance. Thanks. -PBW
 
Grant Robertson

I, too, would really like to know the answer to this question. Alex, could
you please respond when you get a chance. Thanks. -PBW

I would imagine that you could force it to. This is because it is not
OneNote that is doing the indexing. The indexing work is being done by
the Windows Desktop Search. WDS has an option to force it to reindex
everything from scratch. So you could always force WDS to reindex
everything and this would include any documents you had previously
printed to OneNote 07.
 
Patrick Schmid

I would actually assume that it won't do the OCR again. I think OCR is done
by OneNote, and it is only run when you insert a picture. WDS only indexes
the text OCRed by ON.

Patrick Schmid
 
Alex

Sorry to be late in responding. The answer unfortunately is No. The Beta2
build thinks it has correctly OCR'd those scanned pdf's, so unfortunately
newer builds will not re-OCR them automatically. When you receive a newer
build, you will either have to re-insert the pdf's, or manually run OCR on
the older images.
 
Alex

OneNote should be able to correctly OCR those.

Rainald Taesler said:
How about PDFs having undergone OCR in Acrobat (Paper Capture)
already?
How does OneNote handle that?

Rainald
 
Rainald Taesler

... I think OCR is OneNote and it is only run when you insert a
picture.

How about things *printed* to ON?
WDS only indexes the text OCRed by ON.

Isn't stuff sent to the ON printing engine "OCRed" by ON?

On my desktop (which, unlike my Tablet PC, has no problems with WDS) I
imported a number of things (e.g. lists from PDF files) into ON by
sending them to the ON printer. I gave it enough time to index the
stuff (several hours).
The search results were simply horrible.

What might be wrong here?

Rainald
 
Rainald Taesler

Thanks, Alex!

BTW: How can one run OCR "manually" in ON?
I could not find an item in the menus.
And the Online Help does not even know the acronym OCR. [siiigh]

Rainald
 
Patrick Schmid

... I think OCR is OneNote and it is only run when you insert a
picture.

How about things *printed* to ON?

Printouts are pictures :) So yes, they are of course OCRed.

Isn't stuff sent to the ON printing engine "OCRed" by ON?

On my desktop (which, unlike my Tablet PC, has no problems with WDS) I
imported a number of things (e.g. lists from PDF files) into ON by
sending them to the ON printer. I gave it enough time to index the
stuff (several hours).
The search results were simply horrible.

What might be wrong here?

Right-click on one of those printouts and select copy all text. Then
paste the text somewhere else and take a look.
You can then see for yourself whether the issue is the quality of the
OCR or WDS.

Patrick Schmid
 
Patrick Schmid

Right-click, Make Text in image searchable.

Patrick Schmid
--------------
http://pschmid.net

Thanks, Alex!

BTW: How can one run OCR "manually" in ON?
I could not find an item in the menus.
And the Online Help does not even know the acronym OCR. [siiigh]

Rainald

Alex said:
OneNote should be able to correctly OCR those.
 
Rainald Taesler

Printouts are pictures :)

Unfortunately, yes.
AFAICS this construction was not the best of all possible solutions.
An *Import* feature working on known file formats would IMO have been
a preferable solution. In the case of PDFs, e.g., a tool like
"ABBYY PDF Transformer" (which produces really well-formatted
output for WinWord [AFAICS based on ABBYY's expertise in OCR software])
would have been far better than sending text through a printer and
then re-creating the text by OCR. This seems a bit crazy to me.
So yes, they are of course OCRed.

But to which result?
A really bad one! (see below)

Right-click on one of those printouts and select copy all text.
Then paste the text somewhere else and take a look.
You can see for yourself then whether the quality of the OCR is
the issue or WDS.

Thanks for the suggestion!
It reveals how badly OCR is implemented in ON.

ON's OCR is the culprit, not WDS.

I.
1.) Sorry to say so: the OCR produces output that is hardly usable for a
search.
Unfortunately I cannot attach files, so please permit a longer
excerpt here:

a) Result of Copy+Paste in Acrobat:
0-110 Polizei 367 E106 Blessing, Peter, Dr.
0-112 Feuerwehr KÜN-190/156 C206 Bleyel, Bernd
0-19222 Rettungsleitstelle 318/467 D040 Bluthardt, Christian
A 221 A214 Bochert, Ralf, Dr.
367 E106 Ahrens, Uwe, Prof. 230/281/285 A304 Boelke, Klaus, Dr.
0-579796 A014 AISEC 393 E141 Boese, Jürgen, Dr.
263/264 B026 Akademisches Auslandsamt 280 C040 Böhm, Hugo
375 Y104 Albrecht, Tobias 326 C009 Bossack, Sandra
KÜN-137 A406 Albrecht, Wolfgang, Dr. 202 B007 Böttcher, Michael
432 F015 Asche, Gerd 449 Y006 Bouché, Daniel
207 A011 Asta (0-251460) 90 A013 Braner, Hannelore
0-506348 A012 Asta HN (Fax) 554 B001b Bräsel, Martina
KÜN-155 C105a Asta KÜN (KÜN-544756) 430 Z005 Bray, Laurent, Dr.
KÜN-53078 C105a Asta KÜN (Fax) KÜN-218 D110.1 Brazel, Christa
KÜN-208 A117 Auerbach, Achim 218 A204 Brecht, Ulrich, Dr.
288 C035 Aufenthaltsraum KÜN-211 D013.1 Breitenbacher, Manuel
640 A Aufzug 1-3 KÜN-166/167 C016 Breitkreuz, Ehrenfried
641 B Behindertenaufzug 260 B023 Brnic, Sonja
644 D Aufzug 321 D110 Brückner, Hans
646 E Aufzug 384 F129 Bucher, Georg, Dr.
645 F Aufzug 221 A214 Buer, Christian, Dr.
403 F222 Auth, Werner, Dr. KÜN-252 D219 Burk, Uwe
<<
Words are separated by blanks, so they are easy to index and to find in a
search.

b) Copy+Paste from ON (input from PDF via ON printer):
0-110Polizei 367E106Blessing, Peter, Dr.
0-112Feuerwehr KÜN-190/156C206Bleyel, Bernd
0-19222Rettungsleitstelle 318/467D040Bluthardt, Christian
A 221A214Bochert, Ralf, Dr.
367E106Ahrens, Uwe, Prof.230/281/285A304Boelke, Klaus, Dr.
0-579796A014AISEC393E141Boese, Jürgen, Dr.
263/264B026Akademisches Auslandsamt280C040Böhm, Hugo
375Y104Albrecht, Tobias 326C009 Bossack, Sandra
KÜN-137A406Albrecht, Wolfgang, Dr.202B007Böttcher, Michael
432F015Asche, Gerd 449Y006Bouché, Daniel
207A011Asta (0-251460)90A013Braner, Hannelore
0-506348A012Asta HN (Fax)554B001bBräsel, Martina
KÜN-155C105aAsta KÜN (KÜN-544756)430Z005Bray, Laurent, Dr.
KÜN-53078C105aAsta KÜN (Fax)KÜN-218D110.1Brazel, Christa
KÜN-208A117Auerbach, Achim218A204Brecht, Ulrich, Dr.
288C035Aufenthaltsraum KÜN-211D013.1Breitenbacher, Manuel
640AAufzug 1-3KÜN-166/167C016Breitkreuz, Ehrenfried
641BBehindertenaufzug260B023Brnic, Sonja
644DAufzug321D110Brückner, Hans
646EAufzug384F129Bucher, Georg, Dr.
645FAufzug221A214Buer, Christian, Dr.
403F222Auth, Werner, Dr. KÜN-252D219Burk, Uwe
<<
Words are separated only by comma plus blank (", ").

2.) I'm sure you'll agree that a search cannot work at all with
text material like that.
MOST URGENT fix needed.
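
To illustrate why the second paste is so hard to search, here is a minimal Python sketch. It assumes a naive word-based index that splits only on whitespace and commas; the real word breaker used by OneNote or WDS is not described in this thread and may well behave differently. The function name `tokens` is hypothetical, and the two sample strings are simply the first line of each paste above.

```python
import re

def tokens(text):
    """Split text on whitespace and commas (assumed, simplified indexer)."""
    return {t for t in re.split(r"[\s,]+", text) if t}

# First line of each paste shown above.
acrobat_line = "0-110 Polizei 367 E106 Blessing, Peter, Dr."
onenote_line = "0-110Polizei 367E106Blessing, Peter, Dr."

# For each term, report whether it is a standalone word in each paste.
for term in ("Polizei", "Blessing", "Peter"):
    in_acrobat = term in tokens(acrobat_line)
    in_onenote = term in tokens(onenote_line)
    print(f"{term}: Acrobat={in_acrobat}, OneNote={in_onenote}")
```

Under this assumption, only "Peter" survives as a standalone word in the OneNote paste, because it is the only one of the three terms with a separator on both sides.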

II.
Words separated by comma+blank are found in the search.
If there are multiple hits on a page the hits are not shown on the
list.

III.
While we are at it:
The search engine implemented in ON could be at least a bit better.
There are no options at all: neither truncated search
(wildcards) nor combined search using Boolean operators.
I would have expected at least an "expert mode" to be provided,
with at least something like what Acrobat offers available in ON
(not to mention askSam's features).

Although I would prefer to have things from PDFs in ON, I guess that
in order to be able to perform intelligent searches I will have to
stick with Acrobat for PDF material and askSam for other material
[siiiigh]

Rainald
(who is seriously disappointed)
 
Rainald Taesler

Patrick Schmid said:
Right-click, Make Text in image searchable.

Thanks!
Had found it myself in the meantime.
Sometimes things are rather easy but not obvious [siiigh].

I really wonder why there is nothing in the menus.
Could it be due to the new philosophy of the Office design about which
you complained in an older thread?
It's a basic rule of GUI design that every operation can be
invoked from the keyboard too (not only by a mouse click). There are quite
a few people still using a keyboard without the "Windows key". What are
they supposed to do?

Rainald
 
Patrick Schmid

Does it use German for you? Right-click and select make searchable. Is
German selected? OCR is in general extremely language dependent and if
English is selected there I am not surprised about this outcome.

Patrick Schmid
--------------
http://pschmid.net

 
