API Reference
Custom attributes
When you add spacytextblob to your spaCy pipeline, it exposes a custom attribute, ._.blob. This attribute is available on the Doc, Span, and Token classes from spaCy.
Doc._.blob
Span._.blob
Token._.blob
The section below outlines commonly accessed ._.blob attributes and methods. See the textblob docs for the complete listing of all attributes and methods that are available in ._.blob.
Attributes
Name | Type | Description |
---|---|---|
doc._.blob.polarity | Float | The polarity of the document. The polarity score is a float within the range [-1.0, 1.0]. |
doc._.blob.subjectivity | Float | The subjectivity of the document. The subjectivity is a float within the range [0.0, 1.0], where 0.0 is very objective and 1.0 is very subjective. |
doc._.blob.sentiment_assessments.assessments | list | The assessments field of doc._.blob.sentiment_assessments, a tuple of the form (polarity, subjectivity, assessments). Polarity is a float within the range [-1.0, 1.0]; subjectivity is a float within the range [0.0, 1.0], where 0.0 is very objective and 1.0 is very subjective; assessments is a list of polarity and subjectivity scores for the assessed tokens. |
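Because polarity is bounded to [-1.0, 1.0], a common follow-up step is to bucket it into a coarse label. The sketch below is one way to do that in plain Python. The helper name polarity_label and the neutral_band cutoff of 0.1 are assumptions made for illustration; neither is part of spacytextblob or TextBlob.

```python
def polarity_label(polarity: float, neutral_band: float = 0.1) -> str:
    """Map a polarity score in [-1.0, 1.0] to a coarse label.

    `neutral_band` is a hypothetical cutoff chosen for illustration only.
    """
    if not -1.0 <= polarity <= 1.0:
        raise ValueError(f"polarity out of range: {polarity}")
    if polarity > neutral_band:
        return "positive"
    if polarity < -neutral_band:
        return "negative"
    return "neutral"


# In real use the argument would be doc._.blob.polarity.
print(polarity_label(-0.125))  # negative
print(polarity_label(0.05))    # neutral
```

The cutoff is a judgment call; a wider band treats more text as neutral.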
Methods
doc._.blob.ngrams
Name | Type | Description |
---|---|---|
n | int | The number of words to include in each n-gram. Defaults to 3. |
RETURNS | List[WordList] | A list of WordList objects, one per n-gram. |
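To make the meaning of n concrete, here is a minimal sliding-window sketch in plain Python. It illustrates the idea only and is not TextBlob's implementation: the function name word_ngrams is hypothetical, it returns plain lists rather than WordList objects, and it expects text that has already been split into words.

```python
from typing import List


def word_ngrams(words: List[str], n: int = 3) -> List[List[str]]:
    """Return every run of n consecutive words, in order (illustrative helper)."""
    if n <= 0:
        return []
    # A negative range is empty, so inputs shorter than n yield no n-grams.
    return [words[i:i + n] for i in range(len(words) - n + 1)]


words = ["I", "had", "a", "really", "horrible", "day"]
print(word_ngrams(words))
# [['I', 'had', 'a'], ['had', 'a', 'really'], ['a', 'really', 'horrible'], ['really', 'horrible', 'day']]
print(word_ngrams(words, n=7))  # []
```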
Config
When adding spacytextblob to your spaCy pipeline you can optionally pass additional parameters into the config parameter:
Name | Type | Description |
---|---|---|
blob_only | bool | If True, spacytextblob will only expose ._.blob and will not attempt to expose ._.polarity, ._.subjectivity, or ._.assessments. This should always be set to True when using TextBlob extensions. By default False. |
custom_blob | Dict[str, str] | A dictionary that tells spaCy which function to use in place of textblob.TextBlob, for example one that returns TextBlobDE. The key of the dictionary is "@misc", which tells spaCy to look in the misc section of the spaCy registry. The value should be the string name of a function that you have registered with spaCy. See the TextBlob extensions section for more details. |
import spacy
from spacytextblob.spacytextblob import SpacyTextBlob

nlp = spacy.load("de_core_news_sm")
nlp.add_pipe("spacytextblob", config={
    "blob_only": ...,   # bool
    "custom_blob": ...  # Dict[str, str]
})
Examples
Using spacytextblob without an extension:
import spacy
from spacytextblob.spacytextblob import SpacyTextBlob
nlp = spacy.load('en_core_web_sm')
text = "I had a really horrible day. It was the worst day ever! But every now and then I have a really good day that makes me happy."
nlp.add_pipe("spacytextblob")
doc = nlp(text)
print(doc._.blob.polarity)
# -0.125
print(doc._.blob.subjectivity)
# 0.9
print(doc._.blob.sentiment_assessments.assessments)
# [(['really', 'horrible'], -1.0, 1.0, None), (['worst', '!'], -1.0, 1.0, None), (['really', 'good'], 0.7, 0.6000000000000001, None), (['happy'], 0.8, 1.0, None)]
print(doc._.blob.ngrams())
# [WordList(['I', 'had', 'a']), WordList(['had', 'a', 'really']), WordList(['a', 'really', 'horrible']), WordList(['really', 'horrible', 'day']), WordList(['horrible', 'day', 'It']), WordList(['day', 'It', 'was']), WordList(['It', 'was', 'the']), WordList(['was', 'the', 'worst']), WordList(['the', 'worst', 'day']), WordList(['worst', 'day', 'ever']), WordList(['day', 'ever', 'But']), WordList(['ever', 'But', 'every']), WordList(['But', 'every', 'now']), WordList(['every', 'now', 'and']), WordList(['now', 'and', 'then']), WordList(['and', 'then', 'I']), WordList(['then', 'I', 'have']), WordList(['I', 'have', 'a']), WordList(['have', 'a', 'really']), WordList(['a', 'really', 'good']), WordList(['really', 'good', 'day']), WordList(['good', 'day', 'that']), WordList(['day', 'that', 'makes']), WordList(['that', 'makes', 'me']), WordList(['makes', 'me', 'happy'])]
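The assessments list printed above can be processed with ordinary Python. The sketch below copies that output into a literal so it runs without spaCy installed; in real use the list would come from doc._.blob.sentiment_assessments.assessments. The helper split_by_polarity is hypothetical and exists only for this illustration.

```python
# Copied from the example output above. Each entry holds the assessed words,
# a polarity, a subjectivity, and a fourth field that is None here.
assessments = [
    (["really", "horrible"], -1.0, 1.0, None),
    (["worst", "!"], -1.0, 1.0, None),
    (["really", "good"], 0.7, 0.6000000000000001, None),
    (["happy"], 0.8, 1.0, None),
]


def split_by_polarity(items):
    """Group assessed phrases into negative and positive lists (illustrative helper)."""
    negative = [" ".join(w) for w, polarity, _subj, _extra in items if polarity < 0]
    positive = [" ".join(w) for w, polarity, _subj, _extra in items if polarity > 0]
    return negative, positive


negative, positive = split_by_polarity(assessments)
print(negative)  # ['really horrible', 'worst !']
print(positive)  # ['really good', 'happy']

# Mean of the per-phrase polarities.
mean_polarity = sum(p for _w, p, _s, _x in assessments) / len(assessments)
print(round(mean_polarity, 3))
```

For this input the mean of the per-phrase polarities is approximately -0.125, the same value as the document polarity shown above.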
Using spacytextblob with an extension:
import spacy
from spacytextblob.spacytextblob import SpacyTextBlob
from textblob_de import TextBlobDE
text = '''
Heute ist der 3. Mai 2014 und Dr. Meier feiert seinen 43. Geburtstag. Ich muss
unbedingt daran denken, Mehl, usw. für einen Kuchen einzukaufen. Aber leider
habe ich nur noch EUR 3.50 in meiner Brieftasche.
'''
@spacy.registry.misc("spacytextblob.de_blob")
def create_de_blob():
    return TextBlobDE

config = {
    "blob_only": True,
    "custom_blob": {"@misc": "spacytextblob.de_blob"}
}
nlp = spacy.load("de_core_news_sm")
nlp.add_pipe("spacytextblob", config=config)
doc = nlp(text)
print(doc._.blob.sentences)
# [Sentence("Heute ist der 3. Mai 2014 und Dr. Meier feiert seinen 43. Geburtstag."), Sentence("Ich muss unbedingt daran denken, Mehl, usw. für einen Kuchen einzukaufen."), Sentence("Aber leider habe ich nur noch EUR 3.50 in meiner Brieftasche.")]
print(doc._.blob.sentiment)
# Sentiment(polarity=0.0, subjectivity=0.0)
print(doc._.blob.tags)
# [('Heute', 'RB'), ('ist', 'VB'), ('der', 'DT'), ('3.', 'LS'), ('Mai', 'NN'), ('2014', 'CD'), ('und', 'CC'), ('Dr.', 'NN'), ('Meier', 'NN'), ('feiert', 'NN'), ('seinen', 'PRP$'), ('43.', 'CD'), ('Geburtstag', 'NN'), ('Ich', 'PRP'), ('muss', 'VB'), ('unbedingt', 'RB'), ('daran', 'RB'), ('denken', 'VB'), ('Mehl', 'NN'), ('usw.', 'IN'), ('für', 'IN'), ('einen', 'DT'), ('Kuchen', 'JJ'), ('einzukaufen', 'NN'), ('Aber', 'CC'), ('leider', 'VBN'), ('habe', 'VB'), ('ich', 'PRP'), ('nur', 'RB'), ('noch', 'IN'), ('EUR', 'NN'), ('3.50', 'CD'), ('in', 'IN'), ('meiner', 'JJ'), ('Brieftasche', 'NN')]