A Semi-Automated Method of Generating Ground Truth for Invoices

School Name

Governor's School for Science and Mathematics

Grade Level

12th Grade

Presentation Topic

Computer Science

Presentation Type

Mentored

Abstract

This project arose from a problem in a larger project, in which an insurance company contracted the German Research Center for Artificial Intelligence (DFKI) to automate the creation of medical invoices. DFKI was given access to the database needed to recreate a set of 60 previously made invoices, along with scanned images of those original invoices: the former to create an output and the latter to verify it. The need for this project emerged during verification. To check the values produced by the code, researchers would have to spend valuable time looking through all of the original invoices, which vary in format and content, finding the required information, and typing those values into a document where they could be used directly to assess the accuracy of the code. Making this ground truth extraction process semi-automated therefore became vital to the progress of the larger project. The result of the research is code that successfully performs half of the human’s part in collecting ground truth and extracts the desired data from the set of images into a document with fair accuracy.

Location

Neville 206

Start Date

4-14-2018 8:45 AM

Presentation Format

Oral and Written
