The government organization responsible for the IT support of many large-scale government projects was looking for a new, efficient forms processing solution that could not only process tens of thousands of forms in a short time frame, but also automate the validation step and thereby speed up the whole process. In this specific project, the forms were partly typed, partly handwritten, multi-page data summaries created at regional locations. Previously, these forms were sent to regional offices, where they were scanned and forwarded to the central office in Budapest, where the forms and the results went through a fully manual, full-page validation (all numbers and calculations had to match and add up correctly to the regional totals and the central total). This type of validation was slow, error-prone and required massive human resources. This is why our client was looking for a solution that could automate the validation of tens of thousands of forms, reducing both the chance of errors and the need for human resources (and therefore lowering operational costs).
The main obstacles of the project, besides the accurate automated validation (processing using character recognition) of typed and handwritten forms containing sensitive information, were the large number of forms and the very short time frames for preparation, fine-tuning of the solution and the actual forms processing.
- Speed up and automate form processing and data validation
- Only a few days to plan, prepare and fine-tune the solution for the form templates
- 48 hours to process approx. 25,000 forms (55,000 pages)
- 7 form types with over 35 different form variations
- Up to 120 fields/page
The project was delivered using MPS IntelliVector, our award-winning product that combines automated recognition (OCR/ICR) with microtasking data entry to process forms and documents, with massive reductions in the necessary human resources and operating costs, while maintaining a high level of data accuracy and data confidentiality.
Initially there were a total of 35 variations of 7 different types of forms containing letters and numbers: some typed, some handwritten, and some a combination of both. MPS received the form templates for fine-tuning MPS IntelliVector only a few days before the election. It took approximately 2-3 hours to “teach” the solution to process each form template, including setting up the rules for breaking down each specific form into the small pieces of information (microtasks) that needed to be processed (up to 120 per page). During processing, these microtasks were automatically processed by a recognition engine and also sent out to Data Entry (DE) users to type in. The results were then cross-checked for validation, leaving only a few microtasks for Quality Control. The Data Entry users worked in a simplified web interface, where they received only the small microtasks instead of the full scanned images. This makes manual processing faster and much more secure, since users receive only small, anonymous pieces of information taken out of their original context.
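The splitting step described above can be illustrated with a minimal sketch. It is not MPS IntelliVector's actual implementation: the names `FieldRegion`, `Microtask` and `split_into_microtasks` are hypothetical, a page is modelled as a plain grid of pixel values, and each field is assumed to be a fixed rectangle defined in the form template. The point it shows is that a Data Entry user only ever sees a cropped snippet with an opaque ID, while the link back to the form and field stays on the server.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldRegion:
    """Hypothetical template entry: a named rectangle on the page."""
    name: str
    box: tuple  # (left, top, right, bottom) in pixels


@dataclass(frozen=True)
class Microtask:
    """What a Data Entry user receives: an opaque ID and a small image snippet."""
    task_id: str
    snippet: list


def crop(page, box):
    """Cut the rectangle `box` out of a page given as a list of pixel rows."""
    left, top, right, bottom = box
    return [row[left:right] for row in page[top:bottom]]


def split_into_microtasks(page, template, form_id):
    """Split one scanned page into anonymous microtasks.

    Returns the tasks plus a server-side index that maps each task ID back
    to (form_id, field name). The index is never sent to Data Entry users.
    """
    tasks = []
    index = {}
    for region in template:
        task_id = uuid.uuid4().hex  # random ID, reveals nothing about the form
        tasks.append(Microtask(task_id, crop(page, region.box)))
        index[task_id] = (form_id, region.name)
    return tasks, index
```

In this sketch a page with 120 template fields would yield 120 microtasks, matching the upper bound mentioned above.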
For the verification, a team of Data Entry and Quality Control users was set up at the Budapest HQ, where all the scanned forms from the regional centers were coming in. Microtasking allows for faster manual data entry, so a much smaller team was assembled at the HQ than the previous solution would have required: 15 Data Entry users and 6 Quality Control users, compared to the previously planned team of 100 people. MPS IntelliVector usually requires fewer Quality Control users (1-2 for every 100 Data Entry users), but in this case the client specifically asked for 2 additional levels of Quality Control.
Workflow with MPS IntelliVector:
The forms were manually processed (all necessary data calculated) at the local stations and sent to the regional offices. There they were scanned using portable, high-volume scanners, and the scanned images were securely uploaded to the central server in Budapest for result verification. At the HQ, MPS IntelliVector broke these images up into small microtasks, which were first automatically processed (OCR/ICR) and then manually typed in by two separate Data Entry users (DE1 and DE2). The OCR/ICR, DE1 and DE2 results were then cross-checked to see if they matched (depending on the confidentiality level, 2 or 3 matching results were needed), and then matched against the results of the first count at the local stations, filtering out calculation errors. In case of a mismatch, the results were sent to Quality Control users, who could decide based on the full-page scanned images or request a re-examination of the forms at the regional centers.
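The cross-check in this workflow can be sketched as a simple agreement vote. This is an illustrative assumption about the logic, not the product's code: the function names `cross_check` and `totals_add_up` are hypothetical, values are compared as plain strings, and a missing OCR/ICR reading is represented as `None`.

```python
from collections import Counter


def cross_check(ocr, de1, de2, required_matches=2):
    """Return the agreed value, or None if the microtask must go to Quality Control.

    `required_matches` is 2 or 3, mirroring the "2 or 3 matching results"
    rule described in the text.
    """
    votes = Counter(v.strip() for v in (ocr, de1, de2) if v is not None)
    if not votes:
        return None
    value, count = votes.most_common(1)[0]
    return value if count >= required_matches else None


def totals_add_up(parts, reported_total):
    """Check that the validated part values sum to the total reported on the form."""
    return sum(parts) == reported_total


# Two of three sources agree: accepted when 2 matches are enough.
print(cross_check("123", "123", "128"))                      # 123
# The same input fails when all 3 sources must match.
print(cross_check("123", "123", "128", required_matches=3))  # None
# A calculation error on the form is caught by the sum check.
print(totals_add_up([120, 45, 10], 176))                     # False
```

Any microtask for which `cross_check` returns `None`, or any form whose totals do not add up, would be routed to a Quality Control user in this model.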
All in all, MPS IntelliVector was able to automate the previously fully manual validation process, improve processing times and significantly reduce human resources and operating costs, while guaranteeing a high level of data accuracy and data security.
- 27,000+ forms (55,000 pages, approx. 1,170,000 microtasks) processed
- All forms processed 25% faster (36 hours, instead of the available 48h)
- 79% less human resources used (21 instead of the planned 100)
- Improved level of security, no data leakage
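The percentages in the list above follow directly from the reported figures. The short calculation below only re-derives them; the variable names are illustrative, and the microtasks-per-page average is a derived value, not one stated by the project.

```python
available_hours = 48
actual_hours = 36
planned_staff = 100
actual_staff = 15 + 6          # Data Entry users + Quality Control users
pages = 55_000
microtasks = 1_170_000         # approximate figure from the results list

time_saved = (available_hours - actual_hours) / available_hours
staff_saved = (planned_staff - actual_staff) / planned_staff
microtasks_per_page = microtasks / pages

print(f"Time saved: {time_saved:.0%}")                    # Time saved: 25%
print(f"Staff saved: {staff_saved:.0%}")                  # Staff saved: 79%
print(f"Microtasks per page: {microtasks_per_page:.1f}")  # Microtasks per page: 21.3
```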