Human-machine interaction to disambiguate entities in unstructured text and structured datasets
3 May 2017
Kevin Ward, Jack Davenport
Abstract
Creating entity network graphs is a manual, time-consuming process for an intelligence analyst. Beyond the traditional big data problems of information overload, individuals are often referred to by multiple names and by titles that shift as they advance in their organizations, which quickly makes simple string or phonetic alignment methods insufficient for matching entities. Conversely, automated methods for relationship extraction and entity disambiguation typically produce questionable results and give users no way to vet those results, correct mistakes, or influence the algorithm’s future output. We present an entity disambiguation tool, DRADIS, which aims to bridge the gap between human-centric and machine-centric methods. DRADIS automatically extracts entities from multi-source datasets and models them as a complex set of attributes and relationships. Entities are disambiguated across the corpus using a hierarchical model executed in Spark, allowing it to scale to operationally sized data. Resolution results are presented to the analyst complete with sourcing information for each mention and relationship, allowing analysts to quickly vet the correctness of results and correct mistakes. Corrected results are used by the system to refine the underlying model, allowing analysts to optimize the general model to better handle their operational data. Giving analysts the ability to validate and correct the model yields a system they can trust and lets them focus their time on producing higher-quality analysis products.
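The workflow the abstract describes (attribute-based matching, sourced results, analyst corrections fed back into resolution) can be illustrated with a small sketch. The Python below is not the DRADIS implementation, which the abstract characterizes only as a hierarchical model executed in Spark; it is a toy, single-machine stand-in. The names `Mention`, `score`, and `resolve`, the equal weighting of name and attribute overlap, the 0.4 threshold, and the `must_link` / `cannot_link` correction lists are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass(frozen=True)
class Mention:
    """One occurrence of an entity, kept with its source for analyst vetting."""
    mention_id: str
    name: str
    source: str  # document or dataset the mention was extracted from
    attributes: frozenset = field(default_factory=frozenset)


def jaccard(a, b):
    """Set overlap in [0, 1]; two empty sets count as no evidence (0.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score(m1, m2):
    """Assumed similarity: equal parts name-token and attribute overlap."""
    tokens1 = frozenset(m1.name.lower().split())
    tokens2 = frozenset(m2.name.lower().split())
    return 0.5 * jaccard(tokens1, tokens2) + 0.5 * jaccard(m1.attributes, m2.attributes)


def resolve(mentions, threshold=0.4, must_link=(), cannot_link=()):
    """Cluster mentions into entities with union-find.

    Analyst corrections override the score: must_link pairs are always
    merged, cannot_link pairs are never merged directly. (A transitive
    merge through a third mention is not prevented in this sketch.)
    """
    parent = {m.mention_id: m.mention_id for m in mentions}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    blocked = {frozenset(pair) for pair in cannot_link}
    for a, b in must_link:
        union(a, b)
    for m1, m2 in combinations(mentions, 2):
        pair = frozenset((m1.mention_id, m2.mention_id))
        if pair not in blocked and score(m1, m2) >= threshold:
            union(m1.mention_id, m2.mention_id)

    clusters = {}
    for m in mentions:
        clusters.setdefault(find(m.mention_id), []).append(m.mention_id)
    return sorted(sorted(ids) for ids in clusters.values())


# Invented example data: the same person under a changed title, plus a namesake.
EXAMPLE = [
    Mention("m1", "Col. John Smith", "report_A", frozenset({"org:3rd Brigade", "city:Kabul"})),
    Mention("m2", "Gen. John Smith", "report_B", frozenset({"org:3rd Brigade", "city:Kabul"})),
    Mention("m3", "John Smith", "ledger_C", frozenset({"org:Acme Bank", "city:Leeds"})),
]

if __name__ == "__main__":
    print(resolve(EXAMPLE))                               # automatic pass
    print(resolve(EXAMPLE, cannot_link=[("m1", "m2")]))   # analyst rejects a merge
```

In this invented example the automatic pass groups `m1` with `m2` despite the title change, because their attributes agree, and leaves the namesake `m3` separate even though its name tokens overlap more closely with each. Passing a `cannot_link` pair splits a merge the analyst rejects, which is a crude stand-in for the model refinement the abstract describes.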
© (2017) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Kevin Ward and Jack Davenport "Human-machine interaction to disambiguate entities in unstructured text and structured datasets", Proc. SPIE 10207, Next-Generation Analyst V, 102070I (3 May 2017); https://doi.org/10.1117/12.2265825
CITATIONS
Cited by 2 scholarly publications.
KEYWORDS
Data modeling
Databases
Analytical research
Data processing
Gold
Analytics
Crystals
