Abstract
This experience paper describes a repeatable model developed to address a class of data quality problems encountered when converting text data to ERPs. Users often devise their own means of implementing features that their systems do not directly support, frequently relying on what are known as clear-text, free-text, or "comment" fields. Moving data from these fields into an ERP first requires extracting atomic data items. Unlike most data, free text is not subject to structural or practice-oriented data quality measures when it is created. The result is a range of data quality challenges, from typing errors to structural errors such as primary key mismatches and duplication. In our experience with one large government system, the challenges encountered differed enough from one another to require a more generic framework for addressing this type of problem. The specifics of the issues confronted are less interesting than the lessons that can be learned from the general approach to problems of this type. The solution developed demonstrated a positive return on investment for the government. We discuss the challenges, the costs associated with continuing along the original path, the solution developed, and its applicability to other organizations and situations.
Reference
M. David Allen, Peter Aiken, Susan Carter, Mary Kay Cyrus, Kathy Wade, and Sid McCormac, "Extracting Data from Free Text Fields: Assuring Data Quality for ERP Implementation," ICIQ-03 Proceedings.