Grooper Consultant Training: Structured Data Extraction Training
August 25 - August 27
$5,000
Structured Data Extraction Training
Data on structured documents generally appears in a predictable format from one document to the next. While the information itself may change from document to document, its presentation and labeling are generally consistent. That does not mean extracting the data is always simple: poor form design, differences in format, inconsistent data formatting, and other idiosyncrasies all pose challenges when extracting data from structured and semi-structured documents.
This course teaches users different methods of configuring data extraction for structured and semi-structured documents. It focuses heavily on data modeling of document sets, using the Data Type extractor to target, collate, and populate results.
- Intermediate regular expressions to pattern match data structures
- Hierarchical data modeling and inheritance
- Collation methods to take advantage of spatial relationships and data manipulation
- Data sectioning to target repeating sections of similar information on a single document
- Students will analyze extraction methods to determine how best to target the data of structured and semi-structured documents
- Students will configure data models to demonstrate an understanding of hierarchical data modeling