Problem To Extract Ner Subject + Verb With Spacy And Matcher

I am working on an NLP project and I have to use spaCy and the spaCy Matcher to extract all named entities that are nsubj (subjects), together with the verb each one relates to: the governor verb of my

Solution 1:

This is a perfect use case for the DependencyMatcher. It also makes things easier if you merge entities into single tokens before running it, so that a multi-word name like "John Smith" is matched as one token. This code should do what you need:

import spacy
from spacy.matcher import DependencyMatcher

nlp = spacy.load("en_core_web_sm")

# merge entities to simplify this
nlp.add_pipe("merge_entities")


pattern = [
        {
            "RIGHT_ID": "person",
            "RIGHT_ATTRS": {"ENT_TYPE": "PERSON", "DEP": "nsubj"},
        },
        {
            "LEFT_ID": "person",
            "REL_OP": "<",
            "RIGHT_ID": "verb",
            "RIGHT_ATTRS": {"POS": "VERB"},
        }
        ]

matcher = DependencyMatcher(nlp.vocab)
matcher.add("PERVERB", [pattern])

texts = [
        "John Smith and some other guy live there",
        '"Hello!", says Mary.',
        ]

for text in texts:
    doc = nlp(text)
    matches = matcher(doc)

    for match in matches:
        # each match is (match_id, token_ids); the order of token_ids is
        # defined by the pattern, so the nsubj will be first and the verb second
        match_id, (subj, verb) = match
        print(doc[subj], "::", doc[verb])
    print()

Check out the docs for the DependencyMatcher.
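
The pattern above only matches PERSON entities, while the question asks for all named entities that are subjects. Here is a minimal sketch of one way to widen it, using the `IN` set operator on `ENT_TYPE`. The label list, the pattern name `ENT_SUBJ_VERB`, and the toy sentence are my own assumptions for illustration. The Doc is annotated by hand, so the result does not depend on what a particular model predicts, and the entities are already single tokens, so `merge_entities` is not needed here.

```python
from spacy.matcher import DependencyMatcher
from spacy.tokens import Doc
from spacy.vocab import Vocab

vocab = Vocab()

# Hand-annotated toy sentence (illustrative values, not model output).
doc = Doc(
    vocab,
    words=["Google", "acquired", "YouTube"],
    heads=[1, 1, 1],                  # absolute index of each token's head
    deps=["nsubj", "ROOT", "dobj"],
    pos=["PROPN", "VERB", "PROPN"],
    ents=["B-ORG", "O", "B-ORG"],     # token-level IOB entity tags
)

# Same shape as the PERSON pattern, but ENT_TYPE accepts several labels.
# The label list is an assumption; use whichever labels your model emits.
pattern = [
    {
        "RIGHT_ID": "subject",
        "RIGHT_ATTRS": {
            "ENT_TYPE": {"IN": ["PERSON", "ORG", "GPE"]},
            "DEP": "nsubj",
        },
    },
    {
        "LEFT_ID": "subject",
        "REL_OP": "<",                # subject is a direct dependent of verb
        "RIGHT_ID": "verb",
        "RIGHT_ATTRS": {"POS": "VERB"},
    },
]

matcher = DependencyMatcher(vocab)
matcher.add("ENT_SUBJ_VERB", [pattern])

pairs = []
for match_id, token_ids in matcher(doc):
    subj_i, verb_i = token_ids        # order follows the pattern
    pairs.append((doc[subj_i].text, doc[subj_i].ent_type_, doc[verb_i].text))

print(pairs)
```

Only "Google" should be paired with "acquired" here, since "YouTube" is an entity but its dependency label is dobj rather than nsubj. With a real pipeline you would keep `nlp.add_pipe("merge_entities")` and pass `nlp.vocab` to the matcher, as in the code above.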
