

file formats for document scanning.


Postby dodecahedron on Sat Dec 18, 2004 9:44 pm

i was wondering if anyone knows what's the best format for scanning documents.

i want to scan some documents, basically i'm scanning them as graphics (i.e. no OCR), black & white (1 bit depth).
what's the most recommended format for doing this?
i've been using PCX till now (that's what i was told to use when i first started scanning a few years back) but i'm going to do a lot of documents soon and was wondering if there's anything better.

is PCX a compressed format? lossy or not?

i've heard of TIFF. how does it compare with PCX?

any other options that are as good / better than PCX? in terms of compression without loss of quality.



any recommendations welcome.
if anyone has a link to some page i can read up a bit about this, much appreciated
(weighed down by a lot of work so i've no time to do proper research, or search for that matter, myself... :( )

TIA
One Ring to rule them all, One Ring to find them,
One Ring to bring them all and in the darkness bind them
In the land of Mordor, where the Shadows lie
-- JRRT
M.C. Escher - Reptilien
dodecahedron
DVD Polygon
 
Posts: 6865
Joined: Sat Mar 09, 2002 12:04 am
Location: Israel

Postby Justin42 on Sat Dec 18, 2004 10:24 pm

TIFs can be (losslessly) compressed. I wouldn't use PCX anymore. TIF is a pretty universal standard and can contain metadata. It's a lot more flexible in general...
Justin42
CD-RW Player
 
Posts: 723
Joined: Sat Jun 29, 2002 10:30 pm

Postby dodecahedron on Sun Dec 19, 2004 1:12 am

thanks.
do you have any idea how the sizes of a PCX vs. a TIFF (lossless compression) compare?

BTW what's metadata?

Postby MediumRare on Sun Dec 19, 2004 8:11 am

I used TIFF for B&W scanning until you mentioned PCX in an earlier post and have used that since. The reason is fairly simple: I didn't find the compression setting in my scanner software, so my TIFFs were uncompressed. TIFF is a rather complex format and offers many compression schemes. PCX uses simple RLE encoding. Corel Photo-Paint, for example, has various compression options for TIFF, and these reduced an 18 KiB sample to as little as 2 KiB (CCITT4) vs. 7 KiB for PCX.
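
To make the "simple RLE encoding" remark concrete, here is a minimal Python sketch of PCX-style run-length coding for a single scanline. It is only an illustration, not code from any scanner or image package: the function names and the sample scanline are made up, and a real PCX file also has a header and per-plane padding that are ignored here. The scheme is that a byte with its top two bits set is a count (1 to 63) for the byte that follows; anything else is a literal.

```python
def pcx_rle_encode(row: bytes) -> bytes:
    """Encode one scanline with PCX-style run-length encoding (sketch)."""
    out = bytearray()
    i = 0
    while i < len(row):
        value = row[i]
        run = 1
        # A count byte keeps the run length in its low six bits, so max 63.
        while i + run < len(row) and row[i + run] == value and run < 63:
            run += 1
        if run > 1 or value >= 0xC0:
            # Count byte (top two bits set) followed by the repeated value.
            # Single bytes >= 0xC0 must also be escaped this way.
            out.append(0xC0 | run)
            out.append(value)
        else:
            # A single byte below 0xC0 is stored as-is.
            out.append(value)
        i += run
    return bytes(out)


def pcx_rle_decode(data: bytes) -> bytes:
    """Reverse pcx_rle_encode."""
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        if byte >= 0xC0:
            count = byte & 0x3F
            out.extend(bytes([data[i + 1]]) * count)
            i += 2
        else:
            out.append(byte)
            i += 1
    return bytes(out)


# Hypothetical 1-bit scanline: mostly white (0xFF) with a little "ink".
row = b"\xff" * 100 + b"\x3c\x00\x18" + b"\xff" * 207
encoded = pcx_rle_encode(row)
print(len(row), "->", len(encoded), "bytes")
```

Long white runs collapse well, but every run is capped at 63 bytes and nothing is shared between scanlines, which is one plausible reason PCX falls behind the TIFF options on mostly blank pages.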

G
MediumRare
CD-RW Translator
 
Posts: 1768
Joined: Sun Jan 19, 2003 3:08 pm
Location: ffm

Postby dodecahedron on Thu Dec 23, 2004 6:12 pm

OK, here's what i found out by reading the help of my scanner drivers, software and Adobe Photoshop, and a little playing around:

an A4 page, handwritten (roughly 2/3 of the page), scanned as B&W at 300 dpi.

my scanner is Epson 3170.
i used the following:
the scanner driver; the Epson software bundled with the scanner; Adobe Photoshop Elements 2.0

scanner driver:
uncompressed TIFF: 1063K
compressed TIFF (CCITT group 4 compression): 28K

Epson software:
uncompressed TIFF: 1063K
PCX: 118K

Adobe Photoshop Elements 2.0:
uncompressed TIFF: 1082K
compressed TIFF (LZW compression): 90K
compressed TIFF (ZIP compression): 76K
PCX: 123K

uncompressed TIFF is huge, but the compressed versions are smaller than PCX.
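
As a sanity check on the figures above, the uncompressed size can be estimated from the page geometry alone. The sketch below assumes the scan covers a full A4 page (210 x 297 mm) at exactly 300 dpi, 1 bit per pixel, with each row padded to a whole byte; the actual scan area the driver used is not stated in the thread, so treat the result as an approximation. The compression ratios simply divide each tool's uncompressed figure by its compressed one, using the sizes reported in the post.

```python
import math

MM_PER_INCH = 25.4


def uncompressed_bilevel_bytes(width_mm, height_mm, dpi):
    """Pixel dimensions and raw size of a 1-bit scan (rows padded to bytes)."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    bytes_per_row = math.ceil(width_px / 8)
    return width_px, height_px, bytes_per_row * height_px


# Assumption: full A4 page at 300 dpi.
width_px, height_px, raw_bytes = uncompressed_bilevel_bytes(210, 297, 300)
print(f"{width_px} x {height_px} px, about {raw_bytes / 1024:.0f} KiB raw")

# Sizes in K as reported in the post: (uncompressed, compressed).
reported = {
    "driver, TIFF CCITT group 4": (1063, 28),
    "Epson software, PCX": (1063, 118),
    "PE, TIFF LZW": (1082, 90),
    "PE, TIFF ZIP": (1082, 76),
    "PE, PCX": (1082, 123),
}
ratios = {name: raw / packed for name, (raw, packed) in reported.items()}
for name, ratio in ratios.items():
    print(f"{name}: about {ratio:.0f}:1")
```

The estimate lands within a kilobyte or so of the 1063K reported for the uncompressed TIFF, which suggests the file is essentially the raw bitmap plus a small header.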



a quote from the help file of Photoshop Elements 2.0:

RLE (Run Length Encoding) is a lossless compression technique that will compress the transparent portions of each layer in images with multiple layers containing transparency.

LZW (Lemple-Zif-Welch [i.e. Lempel-Ziv-Welch]) is a lossless compression technique that provides the best results in compressing images that contain large areas of single color, such as screenshots or simple paint images.

JPEG (Joint Photographic Experts Group) is a lossy compression technique that provides the best results with continuous-tone images, such as photographs.

CCITT is a family of lossless compression techniques for black-and-white images. CCITT is an abbreviation for the French spelling of International Telegraph and Telekeyed Consultive Committee [i.e. International Telegraph and Telephone Consultative Committee].

ZIP encoding is a lossless compression technique. Like LZW, ZIP compression is most effective for images that contain large areas of a single color.



at this point i see no reason to go on using PCX. compressed TIFFs are smaller.
i've no idea what the differences between the various compressions are (RLE for PCX; ZIP, LZW, CCITT for TIFF) but i don't think it really matters, they're all lossless. Photoshop has a few more options for TIFFs but they're not relevant to me for such simple document scans.
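
What "lossless" means in practice can be checked with a round trip: compress, decompress, and compare byte for byte. The Python sketch below uses the standard `zlib` module, which implements Deflate, the same family of algorithm behind the "ZIP" option for TIFF. The page data is synthetic (an assumed layout of white rows with some dark bytes), not a real scan, so the printed sizes say nothing about the actual files in this thread.

```python
import zlib


def synthetic_page(rows=200, bytes_per_row=310):
    """Build a fake 1-bit page: white rows with a few dark 'strokes'."""
    page = bytearray()
    for y in range(rows):
        row = bytearray(b"\xff" * bytes_per_row)
        if y % 10 < 3:  # three rows out of every ten carry some "ink"
            for x in range(20, 280, 7):
                row[x] = 0x00
        page.extend(row)
    return bytes(page)


page = synthetic_page()
packed = zlib.compress(page, 9)      # Deflate at maximum compression level
restored = zlib.decompress(packed)

# Lossless: the decompressed bytes are identical to the input.
print("identical after round trip:", restored == page)
print(len(page), "->", len(packed), "bytes")
```

The same check applies to any of the lossless options: whichever one is picked, the decoded image is pixel-identical, so the choice only affects file size and which programs can open the result.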

@G:
you should scan using the driver (Epson Scan), not the bundled software (Epson Smart Panel -> Scan and Save). use Professional Mode. the File Save Settings give you an options window when you select TIFF; there you can choose uncompressed or CCITT group 4 compression (for B&W documents).

Postby MediumRare on Fri Dec 24, 2004 5:10 pm

dodecahedron wrote:my scanner is Epson 3170.

Congratulations! :D :D So you bought one. Hope it does well for you!
dodecahedron wrote:@G:
you should scan using the driver (Epson Scan), not the bundled software (Epson Smart Panel -> Scan and Save). use Professional Mode. the File Save Settings give you an options window when you select TIFF; there you can choose uncompressed or CCITT group 4 compression (for B&W documents).

I do that. But I haven't done much B&W scanning and didn't see the option when I did. I'll watch for it next time (I won't be home until next year :D )

G

Postby MediumRare on Fri Jan 14, 2005 6:46 pm

MediumRare wrote:
dodecahedron wrote:@G:
you should scan using the driver (Epson Scan), not the bundled software (Epson Smart Panel -> Scan and Save). use Professional Mode. the File Save Settings give you an options window when you select TIFF; there you can choose uncompressed or CCITT group 4 compression (for B&W documents).

I do that.

Well I misinterpreted what you said (digesting holiday food slows the brain down :roll:).

I have generally used the Smart Panel, which only offers uncompressed TIFFs (I didn't read your post properly and thought you were referring to scanning via TWAIN from an external program). Tonight I tried calling Epson Scan directly as you suggested, and sure enough: there's the CCITT-4 option. :D The direct call also has a lot less overhead than the Smart Panel: you get results fast without hand-holding.

Thanks again for the info.

BTW- how do you like the scanner?

G

Postby dodecahedron on Sat Jan 15, 2005 7:07 am

i like it very much.

till now i've done most of the scanning from within Photoshop Elements.
i think that's going to end though.
if you want to see the file you just scanned you need to close the driver (PE opens the driver); only then can you see the scanned document, edit it, or check whether it needs to be rescanned (i often play around with the Threshold setting of the Professional mode). then you have to re-open the driver for the next scan - very tiring.
i think i should just run Epson Scan on its own, scan to the hard drive, and then open the file in PE while the driver is still open (since it wasn't opened from within PE). i think what i'll do is scan directly from Epson Scan to TIFF without compression, open it in PE and save from there with whatever compression PE offers.



All Content is Copyright (c) 2001-2024 CDRLabs Inc.