Sakhr OCR - Gold Edition for Arabic - Pashto, Dari, Persian & Jawi Add-On

Buy now for $1495.00

We ship worldwide. Bulk and academic discounts available -- contact us for pricing.
Ships in 5 - 7 business days.

Sakhr OCR is widely considered to be the best product available for Arabic and other Middle Eastern script languages.
Automatic Reader Features

Supported Image Formats
Automatic Reader uses the ScanSoft engine for loading and handling images. The engine supports black-and-white (B/W), 4-bit and 8-bit grayscale, 24-bit true-color, and 8-bit palette-color images. Image resolution must be in the range of 75-2400 dpi, and neither the width nor the height of the image should exceed 6600 pixels. For the best OCR results, a B/W image at 300 dpi is recommended.
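The limits above can be sketched as a simple pre-flight check run before an image is sent to OCR. This is only an illustration: the `ImageInfo` class, the color mode labels, and the function names are assumptions made for the example and are not part of Automatic Reader or its engine.

```python
from dataclasses import dataclass
from typing import List

# Limits quoted in the product description.
MIN_DPI, MAX_DPI = 75, 2400
MAX_SIDE_PX = 6600
RECOMMENDED_DPI = 300

# Hypothetical labels for the supported color modes.
SUPPORTED_MODES = {"bw", "gray4", "gray8", "rgb24", "palette8"}


@dataclass
class ImageInfo:
    """Basic properties of a scanned image (hypothetical helper type)."""
    width_px: int
    height_px: int
    dpi: int
    mode: str


def check_image(info: ImageInfo) -> List[str]:
    """Return a list of problems; an empty list means the image is within the stated limits."""
    problems = []
    if info.mode not in SUPPORTED_MODES:
        problems.append(f"unsupported color mode: {info.mode}")
    if not MIN_DPI <= info.dpi <= MAX_DPI:
        problems.append(f"resolution {info.dpi} dpi is outside {MIN_DPI}-{MAX_DPI} dpi")
    if max(info.width_px, info.height_px) > MAX_SIDE_PX:
        problems.append(f"image side exceeds {MAX_SIDE_PX} pixels")
    return problems


def is_recommended(info: ImageInfo) -> bool:
    """True when the image matches the recommended B/W, 300 dpi setting."""
    return info.mode == "bw" and info.dpi == RECOMMENDED_DPI
```

For example, a US Letter page scanned at 300 dpi is 2550 x 3300 pixels and passes the check, while an A4 page scanned at 600 dpi is roughly 4961 x 7016 pixels and would exceed the 6600-pixel limit.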
All images (*.png; *.pdf; *.tif; *.tiff; *.pcx; *.bmp; *.dcx; *.jpg; *.jpeg; *.gif)

Supported Document Saving Formats
Automatic Reader saves the recognition results in the following formats: ART (Automatic Reader format), HTML, XML, TXT, RTF and searchable PDF. Results can also be sent via e-mail.
Multi-Language Support
Automatic Reader supports a large number of languages and recognizes them through its recognition tools.
Supported languages include Arabic and the following European languages: English, Danish, Dutch, Finnish, French, German, Italian, Norwegian, Portuguese, Spanish, Swedish, Hungarian, Greek, Russian, Polish, Turkish and Czech.

Simple Elegant Interface
Automatic Reader 11 has a convenient, intuitive interface and is easy to use. Even inexperienced users can navigate the program and perform its functions quickly.

Enhancements in Image Preprocessing
The program comes with powerful features that improve image preprocessing. This particularly enhances recognition accuracy for low-quality documents.

Simplified Font Learning Process
You can easily train the program on non-standard fonts so that they can be used in recognition. Automatic Reader 11 presents these operations in a new, much simpler design.

Increased Accuracy for Fax Documents
The program offers enhancements that ensure improved reading of low-quality fax documents.

Auto Detection for Recognition Engine
Automatic Reader 11 can detect the recognition engine automatically. It reads the document internally and decides on the proper engine to use. This makes the choice easy for users who are unsure which recognition engine is suitable.

Recognizing Bilingual Images
Automatic Reader can recognize bilingual image files (Arabic/English, Farsi/English and Arabic/French), even when both languages appear on the same page.

Recognizing Images with Diacritics
Image files in which the characters carry diacritics can be recognized, using either the recognition engines or the fonts library.

Recognizing Images with Broken Words
Automatic Reader can recognize an image file containing broken words.

Recognizing Images with Stuck Characters
Automatic Reader can recognize an image file containing stuck characters, whether they are Arabic or English.

Enhanced Corrector
Automatic Reader uses an enhanced bilingual corrector to spell check the recognized text contents.

Enhanced Accuracy for Modern Word Fonts
The program adds learning for new fonts used in Microsoft Word, which increases the recognition quality by 15% for these fonts.

Enhanced Text Post-Processing
The program comes with a new NLP technology that improves recognition accuracy. The technology is based on a larger database of the most popular Arabic words, which helps in correcting the user's mistakes.

Enhanced Accuracy for Recognition
This new version offers unmatched OCR accuracy, achieved through a new quality validation algorithm that improves recognition depending on the character shape.

Enhanced Auto Detection for Recognition Engine
In this version, auto detection has been improved for the recognition engine.

Engines and languages supported
Automatic Reader has the three engines listed below:
1. Arabic Engine: If the content is only Arabic, use the Arabic engine.
2. Latin Engine: If the content is only Latin, use the Latin engine.
3. Bilingual Engine: If the content is mixed (Arabic/English or Arabic/French), use the Bilingual engine.
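
The selection rule in the list above can be sketched in a few lines of Python. This is only an illustration of the rule, not how Automatic Reader's own auto detection works: the product inspects the scanned image, while this sketch assumes a short text sample of the document is already available. The function name `pick_engine` and the returned labels are made up for the example.

```python
import unicodedata


def pick_engine(sample_text: str) -> str:
    """Pick an engine label from the scripts found in a text sample.

    Returns "arabic", "latin", "bilingual", or "unknown" (hypothetical labels).
    """
    has_arabic = False
    has_latin = False
    for ch in sample_text:
        if not ch.isalpha():
            continue  # skip digits, spaces and punctuation
        # Unicode character names start with the script name,
        # e.g. "ARABIC LETTER ALEF" or "LATIN SMALL LETTER A".
        name = unicodedata.name(ch, "")
        if name.startswith("ARABIC"):
            has_arabic = True
        elif name.startswith("LATIN"):
            has_latin = True
    if has_arabic and has_latin:
        return "bilingual"
    if has_arabic:
        return "arabic"
    if has_latin:
        return "latin"
    return "unknown"
```

Persian, Dari, Pashto and Jawi letters also belong to the Arabic script in Unicode, so this sketch would label them "arabic"; whether that matches the engine choice for the add-on languages is not stated in this description.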

Product ID: 502917

Category: OCR

Supported languages: Dari, Farsi (Persian), Jawi, Pashto

Platforms/media types: Windows

Specifications: Installation

Hardware Requirements (Minimum)
* Intel(R) (4) Core(TM) i5 CPU 2.53GHz.
* 8 GB RAM.
* CD-ROM drive.
* 4 GB of free space on the hard disk.
* USB slot for Wibu dongle connectivity.
* Scanner: any scanner supporting the TWAIN or ISIS protocol.

Protection
The dongle supplied with the Automatic Reader V11.0 package must be properly plugged into your machine for the installation to proceed correctly and for the engine to run.
In a virtualized environment, dongle support depends on the virtualization software. VMware Workstation supports the dongle directly. Microsoft Hyper-V, VMware vSphere, and any other software that does not support dongles need a workaround (a third-party application or hardware device) to emulate the dongle.
Sakhr recently developed a version that supports CryptKey protection (serial number), but this version is provided only to the government sector.

Basic information on OCR:
The basic idea of OCR can be shown by comparing scanned images with electronic text. Scanned images are obtained through scanners, which work much like copy machines. A scanner translates the scanned page into a grid, or map, of millions of dots. The scanner software assigns each dot a value, known in computing as a "bit". For black-and-white scanners, the value is either "0", representing an empty dot, or "1", representing a full dot. The number of dots forming the page map depends on the scanner resolution. This map of dots resembles a photograph, and it can be modified only at the dot level, for example by changing colors.

Text characters, by contrast, are assigned identity codes, commonly known as ASCII codes. Different code sets can be assigned to characters and are referred to as code pages. All word processors, spreadsheets, databases, and other text processing systems basically manipulate these text codes. Recognized text can therefore be modified at the character level rather than at the dot level, as in images.

OCR is thus the process of converting the bitmap of a scanned page containing text to text codes (ASCII). At first glance, the OCR process might seem simple when compared to human reading. In the computer domain, however, OCR is a sophisticated, heavy application.
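
The difference between a map of dots and text codes can be made concrete with a toy example. The sketch below counts the dots on a page at a given resolution and then matches a tiny 5 x 7 black-and-white bitmap against two hand-drawn templates to obtain a character code. The templates and function names are invented for illustration, and real OCR engines, including Sakhr's, are far more sophisticated than this nearest-template match.

```python
from typing import Dict, List

# Each row is a string of "0" (empty dot) and "1" (full dot).
Bitmap = List[str]


def page_dots(width_in: float, height_in: float, dpi: int) -> int:
    """Number of dots in the page map at the given scanner resolution."""
    return round(width_in * dpi) * round(height_in * dpi)


# Hand-drawn 5 x 7 templates (made up for this example).
TEMPLATES: Dict[str, Bitmap] = {
    "A": ["01110", "10001", "10001", "11111", "10001", "10001", "10001"],
    "L": ["10000", "10000", "10000", "10000", "10000", "10000", "11111"],
}


def differing_dots(a: Bitmap, b: Bitmap) -> int:
    """Count the dots that differ between two bitmaps of the same size."""
    return sum(x != y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def recognize(bitmap: Bitmap) -> str:
    """Return the template character whose bitmap differs in the fewest dots."""
    return min(TEMPLATES, key=lambda ch: differing_dots(bitmap, TEMPLATES[ch]))


# A scanned "A" with one noisy dot in the last row.
scanned = ["01110", "10001", "10001", "11111", "10001", "10001", "10011"]
char = recognize(scanned)
print(char, ord(char))               # the character and its ASCII code
print(page_dots(8.5, 11, 300))       # dots on a US Letter page at 300 dpi
```

A US Letter page at 300 dpi already contains 2550 x 3300 = 8,415,000 dots, which is why the page map is described as millions of dots, while the recognized text for the same page is only a few thousand character codes.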