Problem To Extract NER Subject + Verb With spaCy And Matcher
I am working on an NLP project and I have to use spaCy and the spaCy Matcher to extract all named entities that are nsubj (subjects), together with the verb each one relates to: the governor verb of my
Solution 1:
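This is a perfect use case for the DependencyMatcher. It also makes things easier if you merge entities into single tokens before running it; otherwise, in a multi-token entity like "John Smith", only one of the tokens carries the nsubj label. This code should do what you need: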
import spacy
from spacy.matcher import DependencyMatcher
nlp = spacy.load("en_core_web_sm")
# merge entities to simplify this
nlp.add_pipe("merge_entities")
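pattern = [
    # anchor token: a PERSON entity that is a nominal subject
    {
        "RIGHT_ID": "person",
        "RIGHT_ATTRS": {"ENT_TYPE": "PERSON", "DEP": "nsubj"},
    },
    # "person < verb": the person token is a direct dependent of the verb
    {
        "LEFT_ID": "person",
        "REL_OP": "<",
        "RIGHT_ID": "verb",
        "RIGHT_ATTRS": {"POS": "VERB"},
    },
]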
matcher = DependencyMatcher(nlp.vocab)
matcher.add("PERVERB", [pattern])
texts = [
    "John Smith and some other guy live there",
    '"Hello!", says Mary.',
]
for text in texts:
    doc = nlp(text)
    matches = matcher(doc)
    for match in matches:
        # each match is (match_id, token_ids); the IDs are token indices, not a span
        match_id, (subject_i, verb_i) = match
        # note order here is defined by the pattern, so the nsubj will be first
        print(doc[subject_i], "::", doc[verb_i])
    print()
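To make it clearer what the pattern above selects, here is a minimal sketch of the same logic in plain Python, without spaCy. The `Token` class, the `subject_verb_pairs` function and the hand-written parse are all hypothetical stand-ins for illustration; they are not spaCy's API, and the annotations are typed in by hand rather than produced by a model.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """Toy stand-in for a parsed token (hypothetical, not spaCy's Token)."""
    text: str
    dep: str       # dependency label
    pos: str       # coarse part-of-speech tag
    ent_type: str  # entity label, "" when the token is not an entity
    head: int      # index of the governing token; the root points to itself


# Hand-annotated parse of "John Smith lives there",
# with the entity already merged into a single token.
tokens = [
    Token("John Smith", "nsubj", "PROPN", "PERSON", 1),
    Token("lives", "ROOT", "VERB", "", 1),
    Token("there", "advmod", "ADV", "", 1),
]


def subject_verb_pairs(tokens):
    """Return (subject_index, verb_index) pairs, subject first."""
    pairs = []
    for i, tok in enumerate(tokens):
        # same conditions as the "person" node in the pattern
        if tok.dep == "nsubj" and tok.ent_type == "PERSON":
            governor = tokens[tok.head]
            # same condition as the "verb" node: the direct head must be a verb
            if governor.pos == "VERB":
                pairs.append((i, tok.head))
    return pairs


pairs = subject_verb_pairs(tokens)
for subject_i, verb_i in pairs:
    print(tokens[subject_i].text, "::", tokens[verb_i].text)
# prints: John Smith :: lives
```

The pairs are token indices in pattern order (subject, then verb), which is the same shape as the token ID list the DependencyMatcher returns for each match.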
Check out the docs for the DependencyMatcher.
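The question asks for all named entities, while the pattern above only covers PERSON. One possible way to widen it is to build the pattern from a list of labels using the `IN` set operator that token patterns support. The helper name `build_subject_verb_pattern` and the labels passed to it are assumptions for illustration; adjust the labels to whatever your model actually predicts.

```python
def build_subject_verb_pattern(entity_labels, subject_deps=("nsubj",)):
    """Build a DependencyMatcher pattern for entity subjects and their verb.

    Hypothetical helper, not part of spaCy. It only returns plain dicts.
    """
    return [
        {
            "RIGHT_ID": "subject",
            "RIGHT_ATTRS": {
                # match any of the given entity labels
                "ENT_TYPE": {"IN": list(entity_labels)},
                # match any of the given subject dependency labels
                "DEP": {"IN": list(subject_deps)},
            },
        },
        {
            "LEFT_ID": "subject",
            "REL_OP": "<",  # the subject is a direct dependent of the verb
            "RIGHT_ID": "verb",
            "RIGHT_ATTRS": {"POS": "VERB"},
        },
    ]


# Example labels only; extend or shorten the list as needed.
pattern = build_subject_verb_pattern(["PERSON", "ORG", "GPE"])
```

The result can be registered the same way as before, for example with matcher.add("ENTVERB", [pattern]). Passing subject_deps=("nsubj", "nsubjpass") should also pick up passive subjects, assuming your model uses that label.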