Apply NLP
An NLP program specifies the analysis type and options for each step. The Apply NLP step operates on IBOCR files, which are usually the output of a custom Apply Fetcher step.
How to use
In Flow Editor, add the Apply NLP step in a Flow after a custom Apply Fetcher step.
- Leave the default input and output folders.
- For NLP Program, click Choose File to select your NLP program.
NLP program
A typical NLP program is JSON-formatted and looks like this:
{
  "steps": [
    {
      "name": "Sentiment Analysis"
    },
    {
      "name": "Entity Action Analysis",
      "analysis_profiles": ["suspicion", "penalty", "lawsuit"]
    }
  ],
  "requested_services": [
    "app",
    "account",
    "files",
    "username"
  ]
}
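As an illustration, the program above could also be assembled and serialized from Python before being uploaded through Choose File. This is a minimal sketch: the helper name build_nlp_program is hypothetical, and the check that every step has a "name" key is an assumption drawn only from the example, not a documented requirement.

```python
import json


def build_nlp_program(steps, requested_services):
    """Return a dict shaped like the example NLP program (hypothetical helper)."""
    for step in steps:
        # Assumption: each step needs a "name", as in the example above.
        if "name" not in step:
            raise ValueError("every step needs a 'name' key")
    return {
        "steps": list(steps),
        "requested_services": list(requested_services),
    }


program = build_nlp_program(
    steps=[
        {"name": "Sentiment Analysis"},
        {
            "name": "Entity Action Analysis",
            "analysis_profiles": ["suspicion", "penalty", "lawsuit"],
        },
    ],
    requested_services=["app", "account", "files", "username"],
)

# Serialize to JSON text that can be saved as the NLP program file.
program_json = json.dumps(program, indent=2)
```

Building the program as a dict and serializing it with json.dumps avoids hand-written syntax errors such as trailing commas.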
Input format
Define input-specific parameters for the NLP pipeline in the nlp_parameters field of the IBOCR. In this example pipeline, Sentiment Analysis and Entity Action Analysis each require an entity definition, so an example IBOCR can contain the following:
{
  ...
  "nlp_parameters": {
    "action_entity": "Facebook",
    "sentiment_entity": "Facebook"
  },
  ...
}
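A custom Apply Fetcher step might attach these parameters along the following lines. This is only a sketch: it assumes an IBOCR record is available as a plain Python dict, the function name with_nlp_parameters is hypothetical, and reading or writing the actual IBOCR file is not shown.

```python
import copy


def with_nlp_parameters(record, action_entity, sentiment_entity):
    """Return a copy of an IBOCR-like dict with nlp_parameters set.

    Assumption: `record` is a dict representation of one IBOCR record.
    """
    updated = copy.deepcopy(record)  # leave the caller's record untouched
    updated["nlp_parameters"] = {
        "action_entity": action_entity,
        "sentiment_entity": sentiment_entity,
    }
    return updated


# Placeholder record; the real IBOCR fields are elided in the example above.
record = {"text": "placeholder document text"}
prepared = with_nlp_parameters(record, "Facebook", "Facebook")
```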
Output format
Running an Apply NLP step on an IBOCR attaches the NLP analysis results directly to the IBOCR and writes a new IBOCR file to the output directory. NLP results are stored in the nlp_analysis field. An NLP result consists of a set of span results (for example, information about a particular sentence, word, or span within the document) and information about the document as a whole (such as overall sentiment score). An example output is:
"nlp_analysis": {
  "spanResults": [
    {
      "content": "Sphero was part of the inaugural class in 2014; littleBits joined two years later.",
      "endIndex": 3114,
      "startIndex": 3032,
      "sentiment": {
        "mention": "Sphero",
        "magnitude": 0.10000000149011612,
        "score": 0.10000000149011612
      }
    },
    {
      "content": "Sphero",
      "endIndex": 329,
      "startIndex": 323,
      "entityAction": {
        "profile": "acquisitions",
        "type": "ENTITY"
      }
    },
    {
      "content": "acquired",
      "endIndex": 1190,
      "startIndex": 1182,
      "entityAction": {
        "profile": "acquisitions",
        "type": "ACTION"
      }
    },
    {
      "content": "acquired",
      "endIndex": 1190,
      "startIndex": 1182,
      "entityAction": {
        "profile": "acquisitions",
        "type": "APPROXIMATE_ACTION"
      }
    },
    {
      "content": "This deal marks Sphero\u2019s third acquisition, following its June 2018 purchase of Specdrums, a music technology startup that develops rings kids wear to make music.",
      "endIndex": 5055,
      "startIndex": 4893,
      "entityAction": {
        "profile": "acquisitions",
        "type": "ENTITY_ACTION"
      }
    },
    {
      "content": "Sphero has acquired littleBits, a New York City-based company best known for its electronic kits and instructional resources that introduce kids to building and programming by hand.",
      "endIndex": 504,
      "startIndex": 323,
      "entityAction": {
        "profile": "acquisitions",
        "type": "ENTITY_ACTION"
      }
    }
  ],
  "docResults": {
    "sentiment": {
      "magnitude": 1.5,
      "salience": 0.1903335303068161,
      "score": 0.0
    }
  }
}
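A downstream step might consume these results roughly as follows. This is a sketch under assumptions: it takes the value of the nlp_analysis field as an already-parsed dict, relies only on the keys visible in the example output, and the function name summarize_nlp_analysis is hypothetical.

```python
from collections import defaultdict


def summarize_nlp_analysis(nlp_analysis):
    """Group span results by kind and pull out the document-level sentiment."""
    actions_by_type = defaultdict(list)  # entityAction type -> span contents
    span_sentiments = []                 # (mention, score) pairs
    for span in nlp_analysis.get("spanResults", []):
        entity_action = span.get("entityAction")
        if entity_action is not None:
            actions_by_type[entity_action["type"]].append(span["content"])
        sentiment = span.get("sentiment")
        if sentiment is not None:
            span_sentiments.append((sentiment["mention"], sentiment["score"]))
    doc_sentiment = nlp_analysis.get("docResults", {}).get("sentiment")
    return dict(actions_by_type), span_sentiments, doc_sentiment


# A trimmed copy of the example output above.
sample = {
    "spanResults": [
        {"content": "Sphero", "startIndex": 323, "endIndex": 329,
         "entityAction": {"profile": "acquisitions", "type": "ENTITY"}},
        {"content": "acquired", "startIndex": 1182, "endIndex": 1190,
         "entityAction": {"profile": "acquisitions", "type": "ACTION"}},
    ],
    "docResults": {
        "sentiment": {"magnitude": 1.5, "salience": 0.1903335303068161, "score": 0.0}
    },
}

actions, sentiments, doc_sentiment = summarize_nlp_analysis(sample)
```

Using .get for the optional sentiment and entityAction keys lets the same loop handle both kinds of span result, since each span in the example carries only one of them.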