Named Entity Recognition assignment

I tried out a few of the Europe-DH texts in NER (Netherlands, Spain) and found that it worked pretty well: apart from missing a few organization acronyms and location abbreviations or variants (e.g. “U.S.” or “Italian”), it does capture and categorize most of the entity types that it promises. From a visualization perspective, though, it’s a little limited: there doesn’t seem to be an easy way to have the tool display certain entity-types in isolation or clusters of entities that are close to each other.

I wouldn’t say that I would be able to get a “basic understanding” of these articles merely by looking at the highlighted text. At best, I’d get a quick glance at different kinds of DH projects underway in these various countries, associated with particular organizers and cities.

OlympicBios-NER.png

When I entered the text from our Olympic Artist bios, to see what NER could contribute to our project, I found it similarly effective at delivering on its stated objectives but it left me wanting more. I wonder whether another form of interactivity for a program like this could be to allow users to define their own parameters for different kinds of entities and assign those new entities their own names and colors. In the Olympic Bios example, it would be especially great to get the NER to recognize terms relating to artistic periods and movements: terms (toward the bottom of the image above) like “gothic,” “Renaissance,” “Primitivism,” “African influences.” At the very least, I could imagine programming–or letting users add a function–to capture words that end in “-ism.”

That said, in this case, the tagging of organizational names (different Academies and Schools) could provide similar insights about artistic affiliations and styles.

Enter text in Markdown. Use the toolbar above, or click the ? button for formatting help.