Sentiment Analysis

Overview

The Sentiment Analysis service of the Text Analysis Fact Extraction package is part of a series of enterprise-grade, natural-language products that find domain-specific relationships between entities in input texts. The package's Sentiment Analysis service identifies how customers feel about the quality of a product or service: What opinions do they have about different aspects of the asset? Are they happy or dissatisfied?

The Sentiment Analysis service processes text on a granular level, based on:

Identification of individual sentiments with associated topics by input sentence (not limited to a "summary sentiment" per document)
Classification on a five-level scale (strong positive, weak positive, neutral, weak negative, and strong negative)
Identification of problems, requests, emoticons, and ambiguous and unambiguous profanity

For example:

“While I love the lens on the new Nikon camera, I am not happy with the shutter speed settings”
- “love”: Strong positive sentiment with topic “lens”
- “not happy”: Weak negative sentiment with topic “shutter speed settings”
“The technical support guy was a total jerk.”
- “jerk”: Unambiguous profanity and strong negative sentiment with topic “technical support guy”
“My widget's thingamajig was broken upon arrival.”
- “broken”: Major problem with topic “widget's thingamajig”

Use case

As an example, suppose you work for a large online retailer that wants to analyze customer reviews of purchased products. The volume is far too great for human analysis, with millions of existing reviews and thousands of new reviews created each week. You want to write an application that gauges customer sentiment for specific products and tracks whether sentiment for each product moves positively or negatively over time. Each night, you analyze all reviews created the previous day and store attributes of each review, including product SKU, date, and opinions, in an SAP HANA database.

You use an API provided by the retailer to retrieve reviews created within a specific date range. The API returns metadata such as date, time, product SKU, price, and customer username, along with the text of each review in HTML. You pass the text to the Sentiment Analysis service, which removes the HTML markup and returns the opinions expressed by the reviewer, any problems they reported, and the objects, or topics, of those sentiments and problems. You store the metadata, sentiments, problems, and topics in HANA, linking each sentiment or problem to its topic so that you can later zero in on what customers specifically did and did not like.

A simple analysis your application can do is show how the favorability of a particular product trends over time. You would retrieve all of the sentiments expressed about a particular product, searching by SKU, within a particular week, and compute the sum of all sentiments using the following scores:

Entity Type	Score
StrongPositiveSentiment	+2
WeakPositiveSentiment	+1
NeutralSentiment	0
WeakNegativeSentiment	-1
StrongNegativeSentiment	-2

You plot the score weekly. When a user clicks on a plot point, you drill down and show all of the sentiments and topics expressed during that week.
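The weekly score described above could be computed along the following lines. This is a minimal sketch, not part of the service: the `weekly_scores` function and the `(review_date, label)` row shape are assumptions made for illustration, standing in for whatever your application reads back from the database for one SKU.

```python
from collections import defaultdict
from datetime import date

# Scores from the table above, keyed by entity label.
SCORES = {
    "StrongPositiveSentiment": 2,
    "WeakPositiveSentiment": 1,
    "NeutralSentiment": 0,
    "WeakNegativeSentiment": -1,
    "StrongNegativeSentiment": -2,
}


def weekly_scores(rows):
    """Sum sentiment scores per ISO (year, week).

    rows: iterable of (review_date, label) pairs (hypothetical shape).
    """
    totals = defaultdict(int)
    for review_date, label in rows:
        iso = review_date.isocalendar()
        # Labels that are not in the table (Topic, MajorProblem, ...) add 0.
        totals[(iso[0], iso[1])] += SCORES.get(label, 0)
    return dict(totals)


rows = [
    (date(2024, 1, 1), "StrongPositiveSentiment"),
    (date(2024, 1, 3), "WeakNegativeSentiment"),
    (date(2024, 1, 9), "StrongNegativeSentiment"),
]
print(weekly_scores(rows))  # {(2024, 1): 1, (2024, 2): -2}
```

Grouping by ISO week is one possible choice; any consistent weekly bucket works for plotting.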

Another simple analysis is to count the number of major or minor problems reported during time intervals and plot those over months or years as a rough indication of reliability or quality.

The service supports 11 languages: Arabic, Chinese (Simplified), Chinese (Traditional), Dutch (emoticons only), English, French, German, Italian, Portuguese, Russian, and Spanish.

The service accepts input in a wide variety of formats:

Adobe PDF
Generic email messages (.eml)
HTML
Microsoft Excel
Microsoft Outlook email messages (.msg)
Microsoft PowerPoint
Microsoft Word
Open Document Presentation
Open Document Spreadsheet
Open Document Text
Plain Text
Rich Text Format (RTF)
WordPerfect
XML

The size of each input file is limited to 100 kB.

The tenant parameter is not required because the service is stateless and no data is persisted.

API Reference

/

post

Extract sentiments such as customer opinions and requests, and identify use of emoticons and profanities, in input documents. An in-depth description of sentiment analysis and the languages for which it is available is in the Sentiment Analysis Fact Extraction section of the SAP HANA Text Analysis Language Reference Guide.

Request
Response

Headers

Authorization: required (string)
Used to send a valid OAuth2 access token.
Example:
```
Bearer access_token
```

Query Parameters

languageCodes: (string - default: All language codes supported by sentiment analysis.)
Comma-separated list of 2-letter language codes. Include one or more languages in which the input might be written. If multiple languages are listed, languageIdentification is implicitly invoked, and sentiments in the document's most dominant language are extracted.
Supported language codes for sentiment analysis:
ar - Arabic
de - German
en - English
es - Spanish
fr - French
it - Italian
nl - Dutch
pt - Portuguese
ru - Russian
zh - Simplified Chinese
zf - Traditional Chinese
Example:
```
it
```

Body

Type: application/json

Example:

{
  "text": "Sono assolutamente innamorato dell’accelerazione della mia Maserati."
}

Type: application/octet-stream

Example:

file-as-binary-stream
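A request with a JSON body could be assembled as sketched below, using only the Python standard library. The service URL is a placeholder and `build_request` is a hypothetical helper; only the Authorization header, the languageCodes query parameter, and the JSON body shape come from this reference. The sketch builds the request without sending it.

```python
import json
import urllib.parse
import urllib.request

# Placeholder endpoint (assumption): substitute the URL of your subscription.
SERVICE_URL = "https://example.invalid/sentiment-analysis"


def build_request(text, access_token, language_codes=None):
    """Build (but do not send) a POST request with an application/json body."""
    url = SERVICE_URL
    if language_codes:
        # languageCodes is a comma-separated list of 2-letter codes.
        query = urllib.parse.urlencode(
            {"languageCodes": ",".join(language_codes)}, safe=","
        )
        url = f"{url}?{query}"
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


sample = "Sono assolutamente innamorato dell’accelerazione della mia Maserati."
req = build_request(sample, "access_token", ["it"])
# Sending it would look like: urllib.request.urlopen(req, timeout=30)
```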

HTTP status code 200

Body

Type: application/json

Schema:

{
    "$schema": "http://json-schema.org/draft-04/schema#",
    "id": "http://www.sap.com/schemas/json/taaas",
    "title": "Text Analysis as a Service (TAaaS) results",
    "definitions": {
        "output_token": {
            "type": "object",
            "properties": {
                "normalizedToken": {
                    "description": "A normalized representation of the token. Normalization includes converting words to lower case, converting umlauts (ä to ae, for example), and removing diacritics. This value is empty when the partOfSpeech property is \"punctuation\".",
                    "type": "string"
                },
                "offset": {
                    "description": "The offset in characters relative to the beginning of the document. If the document's MIME type is other than text/plain, offset is relative to the document after text analysis converted it to plain text.",
                    "type": "number",
                    "minimum": 0
                },
                "paragraph": {
                    "description": "The relative paragraph number containing the token (indicates that the nth paragraph contains this token).",
                    "type": "number",
                    "minimum": 1
                },
                "partOfSpeech": {
                    "description": "",
                    "type": "string"
                },
                "sentence": {
                    "description": "The relative sentence number containing the token (indicates that the nth sentence contains this token).",
                    "type": "number",
                    "minimum": 1
                },
                "stems": {
                    "description": "The token's base form(s); i.e., the forms referenced in a dictionary. For example, the singular nominative for nouns or the infinitive for verbs. This property is empty unless the token has a stem that differs from the token.",
                    "type": "array",
                    "minItems": 0,
                    "items": [
                        {
                            "type": "string"
                        }
                    ]
                },
                "token": {
                    "description": "The original, non-normalized form of the word as it appeared in the input.",
                    "type": "string"
                }
            }
        },
        "entity": {
            "type": "object",
            "properties": {
                "id": {
                    "description": "The ordinal position of this entity among other entities found in the input.",
                    "type": "number",
                    "minimum": 1
                },
                "label": {
                    "description": "The linguistic or semantic type of the entity, for instance \"PERSON\" or \"StrongPositiveSentiment\".",
                    "type": "string"
                },
                "labelPath": {
                    "description": "Identical to the label property unless the type is hierarchical; e.g., \"SOCIAL_MEDIA/TOPIC_TWITTER\". In this example, the label property would be TOPIC_TWITTER.",
                    "type": "string"
                },
                "normalizedForm": {
                    "description": "A normalized representation of the entity. For more information, see the description of normalizedToken in this schema.",
                    "type": "string",
                    "minLength": 0
                },
                "offset": {
                    "description": "The offset in characters relative to the beginning of the document. If the document's MIME type is other than text/plain, offset is relative to the document after text analysis converts it to plain text.",
                    "type": "number",
                    "minimum": 0
                },
                "paragraph": {
                    "description": "The relative paragraph number containing the entity (indicates that the nth paragraph contains this entity).",
                    "type": "number",
                    "minimum": 1
                },
                "parent": {
                    "description": "The value of the parent entity's \"id\" property. This property is not included if the entity has no parent. Used to indicate that there is a linguistic relationship between two entities. For example, it is used by sentimentAnalysis to relate topics to their enclosing sentiments.",
                    "type": "number",
                    "minimum": 1
                },
                "sentence": {
                    "description": "The relative sentence number containing the entity (indicates that the nth sentence contains this entity).",
                    "type": "number",
                    "minimum": 1
                },
                "text": {
                    "description": "The original, non-normalized form of the entity as it appeared in the input.",
                    "type": "string"
                }
            }
        }
    },
    "type": "object",
    "properties": {
        "language": {
            "description": "2-letter code indicating the primary language of the input text.",
            "type": "string",
            "minLength": 2,
            "maxLength": 2
        },
        "mimeType": {
            "description": "MIME type of the input.",
            "type": "string"
        },
        "textSize": {
            "description": "Number of characters in the input (after conversion to plain text if mimeType other than text/plain).",
            "type": "number"
        },
        "tokens": {
            "description": "",
            "type": "array",
            "items": [
                {
                    "$ref": "#/definitions/output_token"
                }
            ]
        },
        "entities": {
            "description": "Entities extracted from the input when entityExtraction or a variety of factExtraction is invoked, otherwise empty.",
            "type": "array",
            "minItems": 0,
            "items": [
                {
                    "$ref": "#/definitions/entity"
                }
            ]
        }
    },
    "required": ["language","mimeType","textSize"]
}

Example:

{
  "entities": [
    {
      "id": 1,
      "label": "Sentiment",
      "labelPath": "Sentiment",
      "normalizedForm": "",
      "offset": 0,
      "paragraph": 1,
      "sentence": 1,
      "text": "Sono assolutamente innamorato dell\\\u00e2\u20ac\u2122accelerazione"
    },
    {
      "id": 2,
      "label": "StrongPositiveSentiment",
      "labelPath": "StrongPositiveSentiment",
      "normalizedForm": "",
      "offset": 19,
      "paragraph": 1,
      "parent": 1,
      "sentence": 1,
      "text": "innamorato"
    },
    {
      "id": 3,
      "label": "Topic",
      "labelPath": "Topic",
      "normalizedForm": "",
      "offset": 30,
      "paragraph": 1,
      "parent": 1,
      "sentence": 1,
      "text": "dell\\\u00e2\u20ac\u2122accelerazione"
    },
    {
      "id": 4,
      "label": "PERSON",
      "labelPath": "PERSON",
      "normalizedForm": "",
      "offset": 62,
      "paragraph": 1,
      "sentence": 1,
      "text": "Maserati"
    }
  ],
  "language": "it",
  "mimeType": "text/plain",
  "textSize": 76
}

HTTP status code 400

Request syntactically incorrect. Any details will be provided within the response payload.

Body

Type: application/json

Schema:

{
  "$schema":"http://json-schema.org/draft-04/schema#",
  "title":"error",
  "description":"Schema for API specified errors.",
  "type":"object",
  "properties":
  {
    "status":
    {
      "type":"integer",
      "description":"original HTTP error code, should be consistent with the response HTTP code",
      "minimum":100,
      "maximum":599
    },
    "type":
    {
      "type":"string",
      "description":"classification of the error type, lower case with underscore eg validation_failure",
      "pattern":"[a-z]+[a-z_]*[a-z]+"
    },
    "message":
    {
      "type":"string",
      "description":"descriptive error message for debugging"
    },
    "moreInfo":
    {
      "type":"string",
      "format":"uri",
      "description":"link to documentation to investigate further and finding support"
    },
    "details":
    {
      "type":"array",
      "description":"list of problems causing this error",
      "items":
      {
        "$schema":"http://json-schema.org/draft-04/schema#",
        "title":"errorDetail",
        "description":"schema for specific error cause",
        "type":"object",
        "properties":
        {
          "field":
          {
            "type":"string",
            "description":"a bean notation expression specifying the element in request data causing the error, eg product.variants[3].name, this can be empty if violation was not field specific"
          },
          "type":
          {
            "type":"string",
            "description":"classification of the error detail type, lower case with underscore eg missing_value, this value must be always interpreted in context of the general error type.",
            "pattern":"[a-z]+[a-z_]*[a-z]+"
          },
          "message":
          {
            "type":"string",
            "description":"descriptive error detail message for debugging"
          },
          "moreInfo":
          {
            "type":"string",
            "format":"uri",
            "description":"link to documentation to investigate further and finding support for error detail"
          }
        },
        "required":["type"]
      }
    }
  },
  "required":["status" , "type" ]
}

Example:

{
  "status": 400,
  "message": "There are validation problems, see details section for more information",
  "moreInfo": "https://api.yaas.io/patterns/errortypes.html",
  "type": "validation_violation",
  "details": [
    {
      "field": "hybris-tenant",
      "message": "size must be between 1 and 36",
      "type": "invalid_header"
    }
  ]
}

HTTP status code 401

Given request is unauthorized. Bad or expired token. Reauthenticate the user. Any details will be provided within the response payload.

Body

Type: application/json

Schema:

{
  "$schema":"http://json-schema.org/draft-04/schema#",
  "title":"error",
  "description":"Schema for API specified errors.",
  "type":"object",
  "properties":
  {
    "status":
    {
      "type":"integer",
      "description":"original HTTP error code, should be consistent with the response HTTP code",
      "minimum":100,
      "maximum":599
    },
    "type":
    {
      "type":"string",
      "description":"classification of the error type, lower case with underscore eg validation_failure",
      "pattern":"[a-z]+[a-z_]*[a-z]+"
    },
    "message":
    {
      "type":"string",
      "description":"descriptive error message for debugging"
    },
    "moreInfo":
    {
      "type":"string",
      "format":"uri",
      "description":"link to documentation to investigate further and finding support"
    },
    "details":
    {
      "type":"array",
      "description":"list of problems causing this error",
      "items":
      {
        "$schema":"http://json-schema.org/draft-04/schema#",
        "title":"errorDetail",
        "description":"schema for specific error cause",
        "type":"object",
        "properties":
        {
          "field":
          {
            "type":"string",
            "description":"a bean notation expression specifying the element in request data causing the error, eg product.variants[3].name, this can be empty if violation was not field specific"
          },
          "type":
          {
            "type":"string",
            "description":"classification of the error detail type, lower case with underscore eg missing_value, this value must be always interpreted in context of the general error type.",
            "pattern":"[a-z]+[a-z_]*[a-z]+"
          },
          "message":
          {
            "type":"string",
            "description":"descriptive error detail message for debugging"
          },
          "moreInfo":
          {
            "type":"string",
            "format":"uri",
            "description":"link to documentation to investigate further and finding support for error detail"
          }
        },
        "required":["type"]
      }
    }
  },
  "required":["status" , "type" ]
}

Example:

{
  "status" : 401,
  "message" : "Authorization: Unauthorized. Bearer TOKEN is invalid",
  "type" : "insufficient_credentials",
  "moreInfo" : "https://api.yaas.io/patterns/errortypes.html"
}

HTTP status code 403

Given authorization scopes are not sufficient and do not match required scopes.

Body

Type: application/json

Schema:

{
  "$schema":"http://json-schema.org/draft-04/schema#",
  "title":"error",
  "description":"Schema for API specified errors.",
  "type":"object",
  "properties":
  {
    "status":
    {
      "type":"integer",
      "description":"original HTTP error code, should be consistent with the response HTTP code",
      "minimum":100,
      "maximum":599
    },
    "type":
    {
      "type":"string",
      "description":"classification of the error type, lower case with underscore eg validation_failure",
      "pattern":"[a-z]+[a-z_]*[a-z]+"
    },
    "message":
    {
      "type":"string",
      "description":"descriptive error message for debugging"
    },
    "moreInfo":
    {
      "type":"string",
      "format":"uri",
      "description":"link to documentation to investigate further and finding support"
    },
    "details":
    {
      "type":"array",
      "description":"list of problems causing this error",
      "items":
      {
        "$schema":"http://json-schema.org/draft-04/schema#",
        "title":"errorDetail",
        "description":"schema for specific error cause",
        "type":"object",
        "properties":
        {
          "field":
          {
            "type":"string",
            "description":"a bean notation expression specifying the element in request data causing the error, eg product.variants[3].name, this can be empty if violation was not field specific"
          },
          "type":
          {
            "type":"string",
            "description":"classification of the error detail type, lower case with underscore eg missing_value, this value must be always interpreted in context of the general error type.",
            "pattern":"[a-z]+[a-z_]*[a-z]+"
          },
          "message":
          {
            "type":"string",
            "description":"descriptive error detail message for debugging"
          },
          "moreInfo":
          {
            "type":"string",
            "format":"uri",
            "description":"link to documentation to investigate further and finding support for error detail"
          }
        },
        "required":["type"]
      }
    }
  },
  "required":["status" , "type" ]
}

Example:

{
  "status": 403,
  "message": "Given request does not have required scopes. It is not authorized to perform this operation.",
  "type": "insufficient_permissions"
}

HTTP status code 500

Body

Type: application/json

Schema:

{
  "$schema":"http://json-schema.org/draft-04/schema#",
  "title":"error",
  "description":"Schema for API specified errors.",
  "type":"object",
  "properties":
  {
    "status":
    {
      "type":"integer",
      "description":"original HTTP error code, should be consistent with the response HTTP code",
      "minimum":100,
      "maximum":599
    },
    "type":
    {
      "type":"string",
      "description":"classification of the error type, lower case with underscore eg validation_failure",
      "pattern":"[a-z]+[a-z_]*[a-z]+"
    },
    "message":
    {
      "type":"string",
      "description":"descriptive error message for debugging"
    },
    "moreInfo":
    {
      "type":"string",
      "format":"uri",
      "description":"link to documentation to investigate further and finding support"
    },
    "details":
    {
      "type":"array",
      "description":"list of problems causing this error",
      "items":
      {
        "$schema":"http://json-schema.org/draft-04/schema#",
        "title":"errorDetail",
        "description":"schema for specific error cause",
        "type":"object",
        "properties":
        {
          "field":
          {
            "type":"string",
            "description":"a bean notation expression specifying the element in request data causing the error, eg product.variants[3].name, this can be empty if violation was not field specific"
          },
          "type":
          {
            "type":"string",
            "description":"classification of the error detail type, lower case with underscore eg missing_value, this value must be always interpreted in context of the general error type.",
            "pattern":"[a-z]+[a-z_]*[a-z]+"
          },
          "message":
          {
            "type":"string",
            "description":"descriptive error detail message for debugging"
          },
          "moreInfo":
          {
            "type":"string",
            "format":"uri",
            "description":"link to documentation to investigate further and finding support for error detail"
          }
        },
        "required":["type"]
      }
    }
  },
  "required":["status" , "type" ]
}

Example:

{
  "status" : 500,
  "message" : "Invalid server settings. Please contact administrator.",
  "type" : "internal_service_error",
  "moreInfo" : "https://api.yaas.io/patterns/errortypes.html"
}

HTTP status code 503

Occasionally, this error occurs during processing. Please consider implementing a retry mechanism in your client application for stable processing.

Body

Type: application/json

Schema:

{
  "$schema":"http://json-schema.org/draft-04/schema#",
  "title":"error",
  "description":"Schema for API specified errors.",
  "type":"object",
  "properties":
  {
    "status":
    {
      "type":"integer",
      "description":"original HTTP error code, should be consistent with the response HTTP code",
      "minimum":100,
      "maximum":599
    },
    "type":
    {
      "type":"string",
      "description":"classification of the error type, lower case with underscore eg validation_failure",
      "pattern":"[a-z]+[a-z_]*[a-z]+"
    },
    "message":
    {
      "type":"string",
      "description":"descriptive error message for debugging"
    },
    "moreInfo":
    {
      "type":"string",
      "format":"uri",
      "description":"link to documentation to investigate further and finding support"
    },
    "details":
    {
      "type":"array",
      "description":"list of problems causing this error",
      "items":
      {
        "$schema":"http://json-schema.org/draft-04/schema#",
        "title":"errorDetail",
        "description":"schema for specific error cause",
        "type":"object",
        "properties":
        {
          "field":
          {
            "type":"string",
            "description":"a bean notation expression specifying the element in request data causing the error, eg product.variants[3].name, this can be empty if violation was not field specific"
          },
          "type":
          {
            "type":"string",
            "description":"classification of the error detail type, lower case with underscore eg missing_value, this value must be always interpreted in context of the general error type.",
            "pattern":"[a-z]+[a-z_]*[a-z]+"
          },
          "message":
          {
            "type":"string",
            "description":"descriptive error detail message for debugging"
          },
          "moreInfo":
          {
            "type":"string",
            "format":"uri",
            "description":"link to documentation to investigate further and finding support for error detail"
          }
        },
        "required":["type"]
      }
    }
  },
  "required":["status" , "type" ]
}

Example:

{
  "status": 503,
  "message": "A temporary service unavailability was detected. Refer to the error details response for a re-attempt strategy.",
  "type": "service_temporarily_unavailable",
  "moreInfo" : "https://api.yaas.io/patterns/errortypes.html"
}

HTTP status code 504

This error occurs if text analysis takes longer than 20 seconds. There could be two reasons for this error:

Processing takes longer because the current service load is high. Please try again later. You could also consider implementing a retry mechanism in your client application for more stable processing.
The text to be processed is too big or too complex. Sometimes, even small texts take a long time to process. Please split your text into smaller chunks and send them to the service separately.

Body

Type: application/json

Schema:

{
  "$schema":"http://json-schema.org/draft-04/schema#",
  "title":"error",
  "description":"Schema for API specified errors.",
  "type":"object",
  "properties":
  {
    "status":
    {
      "type":"integer",
      "description":"original HTTP error code, should be consistent with the response HTTP code",
      "minimum":100,
      "maximum":599
    },
    "type":
    {
      "type":"string",
      "description":"classification of the error type, lower case with underscore eg validation_failure",
      "pattern":"[a-z]+[a-z_]*[a-z]+"
    },
    "message":
    {
      "type":"string",
      "description":"descriptive error message for debugging"
    },
    "moreInfo":
    {
      "type":"string",
      "format":"uri",
      "description":"link to documentation to investigate further and finding support"
    },
    "details":
    {
      "type":"array",
      "description":"list of problems causing this error",
      "items":
      {
        "$schema":"http://json-schema.org/draft-04/schema#",
        "title":"errorDetail",
        "description":"schema for specific error cause",
        "type":"object",
        "properties":
        {
          "field":
          {
            "type":"string",
            "description":"a bean notation expression specifying the element in request data causing the error, eg product.variants[3].name, this can be empty if violation was not field specific"
          },
          "type":
          {
            "type":"string",
            "description":"classification of the error detail type, lower case with underscore eg missing_value, this value must be always interpreted in context of the general error type.",
            "pattern":"[a-z]+[a-z_]*[a-z]+"
          },
          "message":
          {
            "type":"string",
            "description":"descriptive error detail message for debugging"
          },
          "moreInfo":
          {
            "type":"string",
            "format":"uri",
            "description":"link to documentation to investigate further and finding support for error detail"
          }
        },
        "required":["type"]
      }
    }
  },
  "required":["status" , "type" ]
}

Example:

{
  "status": 504,
  "message": "Service is not reachable: Upstream service connection timeout.",
  "type": "service_temporarily_unavailable",
  "moreInfo" : "https://api.yaas.io/patterns/errortypes.html"
}
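The retry mechanism suggested for status codes 503 and 504 might look like the following sketch. The `call_with_retry` helper, the `send` callable, the attempt limit, and the exponential backoff schedule are all assumptions chosen for illustration; this reference does not prescribe a particular retry strategy.

```python
import time

# Status codes this reference describes as worth retrying.
RETRYABLE = {503, 504}


def call_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    send: hypothetical zero-argument callable returning (status_code, payload).
    """
    for attempt in range(1, max_attempts + 1):
        status, payload = send()
        if status not in RETRYABLE or attempt == max_attempts:
            return status, payload
        # Exponential backoff: 1 s, 2 s, 4 s, ... (an illustrative policy).
        sleep(base_delay * 2 ** (attempt - 1))
```

A 504 caused by text that is too large or too complex will not go away on retry, so a client may also want to fall back to splitting the text after repeated timeouts.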

An empty entities array is normal

It is not an error if the service returns an empty entities array. Not all text contains entities as defined and recognized by the service. For example, the English sentence "It's the end of the world as we know it, and I feel fine" contains no entities, nor do any of these translations:

Spanish: Es el fin del mundo tal como lo conocemos, y me siento bien.

German: Es ist das Ende der Welt, wie wir es kennen, und ich fühle mich gut.

Korean: 우리가 알고있는대로 그것은 세상의 종말이며, 나는 기분이 좋아집니다.

Russian: Это конец света, как мы его знаем, и я чувствую себя прекрасно.

The label and labelPath members

Some of the entity types that the service identifies are given general categories and then subdivided into more specific classifications. For example, URI is a general category with the subtypes EMAIL, IP, and URL. The label member of the entities array is the most specific entity type of an extracted entity. The labelPath member includes the general category and the subtype, separated by a forward slash ("/").
That means, if a document contains the web address "http://www.sap.com" in its text, that string is extracted as an entity with its label attribute's value set to URL and its labelPath set to URI/URL.
If the entity type does not have subtypes, for example, PERSON, the label and labelPath values are identical.
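As a small illustration of the relationship between the two members, the general category can be recovered from labelPath as sketched here. The `split_label_path` helper is hypothetical, and the sketch assumes at most the two-level hierarchy described above.

```python
def split_label_path(entity):
    """Return (general_category, label); category is None if there are no subtypes."""
    parts = entity["labelPath"].split("/")
    category = parts[0] if len(parts) > 1 else None
    return category, entity["label"]


print(split_label_path({"label": "URL", "labelPath": "URI/URL"}))       # ('URI', 'URL')
print(split_label_path({"label": "PERSON", "labelPath": "PERSON"}))     # (None, 'PERSON')
```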

How Sentiments and Topics Are Linked in the JSON Response

A sentiment is composed of a topic and an opinion or problem report. There is a hierarchical relationship between sentiments and their constituents. They are all returned in separate elements in the entities array of the service's JSON response. Use the value of the label attribute to determine the role an entity plays in the sentiment relationship:

label value	Role in Sentiment	Hierarchical Position
Sentiment	Sentiment	parent
Topic	Topic	child
StrongPositiveSentiment, WeakPositiveSentiment, NeutralSentiment, WeakNegativeSentiment, StrongNegativeSentiment	Opinion	child
MajorProblem, MinorProblem	Problem Report	child

The parent entity always appears immediately before its children in the entities array. The order in which children appear is not guaranteed; i.e., a Topic might appear before or after its sibling Opinion or Problem Report.
All entities in a child position have an extra attribute parent whose value is the id value of the parent Sentiment. Note: It is possible for an opinion or problem report to exist without an associated topic. For example, if the input text is nothing more than the word "junk", two entities will be returned: "junk" as a Sentiment and "junk" as a WeakNegativeSentiment.
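The linkage described above can be followed in client code by grouping child entities under the id of their parent Sentiment. The sketch below is one way to do that; the `group_sentiments` function and the shape of its result are assumptions for illustration, and the sample entities are a simplified version of the example response in the API Reference.

```python
from collections import defaultdict


def group_sentiments(entities):
    """Return one dict per Sentiment parent with its topics and opinions/problems."""
    children = defaultdict(list)
    for entity in entities:
        if "parent" in entity:
            children[entity["parent"]].append(entity)

    groups = []
    for entity in entities:
        if entity["label"] != "Sentiment":
            continue  # skip children and unrelated entities such as PERSON
        kids = children.get(entity["id"], [])
        groups.append({
            "sentiment": entity["text"],
            # A sentiment may have no topic at all, so this list can be empty.
            "topics": [k["text"] for k in kids if k["label"] == "Topic"],
            "judgments": [(k["label"], k["text"]) for k in kids if k["label"] != "Topic"],
        })
    return groups


entities = [
    {"id": 1, "label": "Sentiment", "text": "Sono assolutamente innamorato dell'accelerazione"},
    {"id": 2, "label": "StrongPositiveSentiment", "parent": 1, "text": "innamorato"},
    {"id": 3, "label": "Topic", "parent": 1, "text": "dell'accelerazione"},
    {"id": 4, "label": "PERSON", "text": "Maserati"},
]
print(group_sentiments(entities))
```

Because the order of children is not guaranteed, the sketch looks children up by their parent value instead of relying on their position in the array.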

Qualifiers Affect Sentiment Strength and Problem Severity

Qualifiers such as "very" (e.g., "very disappointed") and "total" (e.g., "total jerk") affect the strength of sentiments and severity of problems, but are not included in the extracted entities. For example, the phrase "I was disappointed with the Widget." produces "disappointed" as a WeakNegativeSentiment, whereas "I was very disappointed with the Widget." produces "disappointed" as a StrongNegativeSentiment. The word "very" is excluded from the extracted entity but causes the change to a stronger sentiment. As another example, "The customer support guy was a jerk." yields "jerk" as a MinorProblem, but "The customer support guy was a total jerk." yields "jerk" as a MajorProblem; "total" is not included in the text of the extracted entity.

How parent and child entities are linked in the JSON response

Some entities are composed of other entities. This hierarchical relationship is indicated in the JSON response by an extra attribute, parent, that appears only on child entities. It contains an integer value that matches the id value of the parent.

The parent entity always appears before its children in the entities array. The order in which children appear is not guaranteed.
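Because the order of children is not guaranteed, it can be simpler to link entities through the id and parent values than through their position in the array. The following is a minimal sketch of that idea; the function name children_by_parent is hypothetical, and the sample entities are abbreviated to the attributes the function uses.

```python
def children_by_parent(entities):
    """Map each parent id to the list of entities whose parent attribute
    refers to it. Entities without a parent attribute are not children."""
    known_ids = {e["id"] for e in entities}
    children = {}
    for e in entities:
        parent_id = e.get("parent")
        if parent_id is not None and parent_id in known_ids:
            children.setdefault(parent_id, []).append(e)
    return children


# Abbreviated sample entities (only the attributes used here).
sample = [
    {"id": 1, "label": "Sentiment", "text": "junk is junk"},
    {"id": 2, "label": "Topic", "text": "junk", "parent": 1},
    {"id": 3, "label": "WeakNegativeSentiment", "text": "junk", "parent": 1},
    {"id": 4, "label": "TIME_PERIOD", "text": "five minutes"},
]
linked = children_by_parent(sample)
labels_under_1 = sorted(child["label"] for child in linked[1])
```

The lookup does not depend on where the children sit in the array, so it keeps working if a Topic comes before or after its sibling.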

Default Language

The default language is either the first value listed in the languageCodes input parameter (see Setting a subset of languages in this topic) or English if the languageCodes input parameter is not specified.

Setting a subset of languages

You can instruct the service to choose from a specific, reduced set of languages by setting the languageCodes input parameter. This forces the service to choose one of the languages you supply.
Use this setting with caution. If, for example, you set languageCodes to Danish, German, and Dutch and the input text is in Russian, the service cannot return Russian. Instead, it returns the default language.
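The rules described in Default Language and in this section can be modeled in a few lines. This is only an illustration of the documented behavior, not the service's implementation; the function names are hypothetical, and the two-letter codes are an assumption based on the "language": "en" value that the service returns.

```python
def default_language(language_codes=None):
    """First entry of languageCodes if supplied, otherwise English."""
    return language_codes[0] if language_codes else "en"


def reported_language(detected, language_codes=None):
    """Language reported when the text's actual language is `detected`
    and the caller restricted the choice to `language_codes`."""
    if language_codes and detected not in language_codes:
        # The actual language is not in the allowed subset, so the
        # default (the first listed code) is returned instead.
        return default_language(language_codes)
    return detected


# Russian input with the subset Danish, German, Dutch.
result = reported_language("ru", ["da", "de", "nl"])
```

In this model the Russian text is reported as Danish simply because Danish is listed first, which is why the subset should be chosen with care.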

Meaning of the textSize value

The returned attribute textSize represents the amount of character data in the input, not the number of bytes. If the input is a plain-text file without accented characters, textSize equals the input file's size. However, if the input is a binary file such as a PDF or Microsoft Word document, textSize will probably be much smaller than the file size, especially if the file contains a lot of non-textual data such as an embedded image.
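The difference between characters and bytes can be seen with a short sketch. It assumes the input file is encoded as UTF-8; the function name is illustrative.

```python
def char_and_byte_counts(text, encoding="utf-8"):
    """Return (number of characters, number of bytes when encoded)."""
    return len(text), len(text.encode(encoding))


plain_counts = char_and_byte_counts("cafe")      # unaccented text
accented_counts = char_and_byte_counts("café")   # "é" takes two bytes in UTF-8
```

For the unaccented string both counts are equal, while the accented string has four characters but occupies five bytes.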

Annotated JSON schema

Descriptions of the objects and members of the JSON response returned by the Sentiment Analysis service are contained in the JSON schema. To read the schema, click the POST link in the API Reference then switch to the RESPONSE tab.

Further references

Extensive details on the capabilities and behavior of SAP's sentiment analysis technology can be found in the Sentiment Analysis Fact Extraction chapter of the SAP HANA Text Analysis Language Reference Guide (PDF).

Python Tutorial

This tutorial mirrors the use case described in the Overview. The application described in the associated use case charts how user opinions and problem reports change over time. This tutorial shows how you could retrieve and store sentiments. It does not illustrate the charting portion of the application.

Customer opinions over time

In this tutorial, you are using the Sentiment Analysis service to extract customer sentiments about various aspects of products, and later you will show how those sentiments track over time.

Get an access token

To use the service, you must pass an access token in each call. Get the token from the OAuth2 service.

import requests
import json

# Replace the two following values with your client secret and client ID.
client_secret = 'clientSecretPlaceholder'
client_id = 'clientIDPlaceholder'

s = requests.Session()

# Get the access token from the OAuth2 service.
auth_url = 'https://api.beta.yaas.io/hybris/oauth2/v1/token'
r = s.post(auth_url, data={'client_secret': client_secret,
                           'client_id': client_id,
                           'grant_type': 'client_credentials'})
access_token = r.json()['access_token']
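The token request above does not check for failures. A more defensive variant might look like the following sketch. The function name get_access_token and the 30-second timeout are assumptions made for illustration; the session argument is expected to behave like requests.Session.

```python
def get_access_token(session, auth_url, client_id, client_secret):
    """Request an OAuth2 token using the client-credentials grant and
    return it, raising an exception if the request fails."""
    response = session.post(
        auth_url,
        data={
            'client_secret': client_secret,
            'client_id': client_id,
            'grant_type': 'client_credentials',
        },
        timeout=30,  # assumed value; avoids waiting indefinitely
    )
    # Raise an exception for 4xx/5xx replies instead of failing later.
    response.raise_for_status()
    token = response.json().get('access_token')
    if not token:
        raise ValueError('The reply did not contain an access_token.')
    return token
```

With the variables defined earlier, the call would be access_token = get_access_token(s, auth_url, client_id, client_secret).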

Call the service

The POST request body for this service includes a single value: the text upon which to perform sentiment analysis. The variable customer_review is an object your application obtained from an online retailer's API. It has the following members:

date_time - the timestamp when the customer submitted the review
prod_SKU - the product's SKU, or stock keeping unit, a unique identifier assigned by the retailer
price - the product's selling price at the time the review was submitted
author - the user ID of the customer who submitted the review
text - the customer's written review, formatted in HTML

You pass customer_review.text to the service, which returns sentiments and problem reports expressed in the text, as well as the objects of those sentiments or problems.
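For experimenting without access to a retailer's API, the customer_review object described above could be stood in for by a simple named tuple. This is a sketch only: the type name CustomerReview and all field values are hypothetical placeholders.

```python
from collections import namedtuple

# Field names follow the members listed above.
CustomerReview = namedtuple(
    'CustomerReview',
    ['date_time', 'prod_SKU', 'price', 'author', 'text'])

# Placeholder values for illustration only.
customer_review = CustomerReview(
    date_time='2016-01-01T12:00:00Z',
    prod_SKU='SKU-PLACEHOLDER',
    price=19.99,
    author='user-placeholder',
    text='<html><body>I love the lens.</body></html>',
)
```

Any object that exposes the same five members would work equally well with the rest of the tutorial code.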

# The Sentiment Analysis service's URL
service_url = 'https://api.beta.yaas.io/sap/ta-sentiments/v1/'

# HTTP request headers
req_headers = {}

# Set content-type to application/octet-stream. The service automatically detects that the input is HTML and removes the markup before performing sentiment analysis.
req_headers['content-type'] = 'application/octet-stream'
req_headers['Cache-Control'] = 'no-cache'
req_headers['Connection'] = 'keep-alive'
req_headers['Accept-Encoding'] = 'gzip'
req_headers['Authorization'] = 'Bearer {}'.format(access_token)

# Make the REST call to the Sentiment Analysis service. Pass the data in raw form. Do not base64-encode the data.
response = s.post(url=service_url, headers=req_headers, data=customer_review.text)

Here is a sample, HTML-formatted customer review:

<!DOCTYPE html>
  <head><meta charset="utf-8"></head>
<body>
I'm extremely disappointed with Crappo Corporation's Widgetron. Its power button broke within the first five minutes of use.
</body>
</html>

The first few lines of the JSON response this service returns for that post are:

{
    "mimeType": "text/html",
    "entities": [
        {
            "sentence": 1,
            "text": "disappointed with Crappo Corporation's Widgetron",
            "label": "Sentiment",
            "paragraph": 1,
            "offset": 75,
            "normalizedForm": "",
            "id": 1,
            "labelPath": "Sentiment"
        },
        {
            "parent": 1,
            "sentence": 1,
            "text": "disappointed",
            "label": "StrongNegativeSentiment",
            "paragraph": 1,
            "offset": 75,
            "normalizedForm": "",
            "id": 2,
            "labelPath": "StrongNegativeSentiment"
        },
        {
            "parent": 1,
            "sentence": 1,
            "text": "Crappo Corporation's Widgetron",
            "label": "Topic",
            "paragraph": 1,
            "offset": 93,
            "normalizedForm": "",
            "id": 3,
            "labelPath": "Topic"
        },

The last few lines are:

        {
            "sentence": 2, 
            "text": "Its power button broke", 
            "label": "Sentiment", 
            "paragraph": 1, 
            "offset": 125, 
            "normalizedForm": "", 
            "id": 6, 
            "labelPath": "Sentiment"
        }, 
        {
            "parent": 6, 
            "sentence": 2, 
            "text": "power button", 
            "label": "Topic", 
            "paragraph": 1, 
            "offset": 129, 
            "normalizedForm": "", 
            "id": 7, 
            "labelPath": "Topic"
        }, 
        {
            "parent": 6, 
            "sentence": 2, 
            "text": "broke", 
            "label": "MajorProblem", 
            "paragraph": 1, 
            "offset": 142, 
            "normalizedForm": "", 
            "id": 8, 
            "labelPath": "MajorProblem"
        }, 
        {
            "sentence": 2, 
            "text": "five minutes", 
            "label": "TIME_PERIOD", 
            "paragraph": 1, 
            "offset": 165, 
            "normalizedForm": "", 
            "id": 9, 
            "labelPath": "TIME_PERIOD"
        }
    ], 
    "textSize": 202, 
    "language": "en"
}
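In this sample, the offset values are consistent with character positions in the raw HTML input, markup included, and textSize matches the length of the input. That is an observation about this one sample rather than a documented guarantee. The following sketch checks it, assuming each line of the sample ends with a single newline character; the function name offsets_match is illustrative.

```python
# The sample review, reproduced with explicit newline characters.
sample_html = (
    '<!DOCTYPE html>\n'
    '  <head><meta charset="utf-8"></head>\n'
    '<body>\n'
    "I'm extremely disappointed with Crappo Corporation's Widgetron. "
    "Its power button broke within the first five minutes of use.\n"
    '</body>\n'
    '</html>\n'
)

# Text and offset pairs copied from the sample response.
sample_entities = [
    {"text": "disappointed with Crappo Corporation's Widgetron", "offset": 75},
    {"text": "disappointed", "offset": 75},
    {"text": "Crappo Corporation's Widgetron", "offset": 93},
    {"text": "Its power button broke", "offset": 125},
    {"text": "power button", "offset": 129},
    {"text": "broke", "offset": 142},
    {"text": "five minutes", "offset": 165},
]


def offsets_match(raw, entities):
    """True if every entity's text is found at its offset in raw."""
    return all(
        raw[e["offset"]:e["offset"] + len(e["text"])] == e["text"]
        for e in entities)


all_match = offsets_match(sample_html, sample_entities)
```

A check like this can help when you need to highlight an extracted entity in the original review text.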

Each sentiment extracted from the input text appears in the response's entities array in order of appearance. A sentiment typically has three entities associated with it (an opinion or problem report can also occur without a topic):

The entire phrase that spans the opinion or problem and the topic.
The value of this entity's label attribute is Sentiment. In the preceding JSON output, there are two examples: "disappointed with Crappo Corporation's Widgetron" and "Its power button broke". Sentiments always occur in the entities array before the topic and opinion or problem contained within them.

The topic, object, or target of the expressed opinion or reported problem.
This entity's label value is Topic. The two examples in the preceding JSON are "Crappo Corporation's Widgetron" and "power button".

The opinion expressed or problem reported.
An opinion's label value is one of StrongPositiveSentiment, WeakPositiveSentiment, NeutralSentiment, WeakNegativeSentiment, or StrongNegativeSentiment. The entity "disappointed" is an example of a StrongNegativeSentiment. The label value for problems is either MajorProblem or MinorProblem. The entity "broke" is a MajorProblem.
The entity "disappointed" is a StrongNegativeSentiment in this case because it is preceded by the qualifier "extremely". If that qualifier were missing, Sentiment Analysis would mark "disappointed" as a WeakNegativeSentiment.

Every entity returned has at least eight attributes:

sentence
text
label
paragraph
offset
normalizedForm
id
labelPath
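A small helper can confirm that an entity carries the eight attributes listed above before your code relies on them. This is a minimal sketch; the names REQUIRED_ATTRIBUTES and missing_attributes are illustrative.

```python
REQUIRED_ATTRIBUTES = (
    "sentence", "text", "label", "paragraph",
    "offset", "normalizedForm", "id", "labelPath",
)


def missing_attributes(entity):
    """Return the names of expected attributes absent from the entity."""
    return [name for name in REQUIRED_ATTRIBUTES if name not in entity]


complete = {
    "sentence": 2, "text": "broke", "label": "MajorProblem",
    "paragraph": 1, "offset": 142, "normalizedForm": "",
    "id": 8, "labelPath": "MajorProblem", "parent": 6,
}
incomplete = {"text": "broke", "label": "MajorProblem"}
```

An empty list means the entity has every expected attribute; extra attributes such as parent are ignored.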

For a detailed description of each attribute in the response, see the link to the JSON schema in the Details section of this service.

Some entities have a ninth attribute, parent, which links a topic, opinion, or problem report to the Sentiment that contains it. See How Sentiments and Topics Are Linked in the JSON Response in the Details section for an explanation of the parent attribute.

Because your application does different calculations on opinions than it does on problem reports, you store the two types of sentiments differently. The store_opinion() and store_problem() functions are defined in your example application (this tutorial does not include that code). Both take the customer_review object because you store the date, time, SKU, and customer ID with each sentiment. Presumably, you would convert the strings contained in curr_opinion_weight and curr_problem_severity to numeric values, at storage time, for quicker computation of scores later on.
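The conversion from label strings to numeric values mentioned above could be a pair of lookup tables. The specific numbers below are hypothetical; choose a scale that suits your own scoring, since the service itself returns only the labels.

```python
# Hypothetical weights: sign carries polarity, magnitude carries strength.
OPINION_WEIGHTS = {
    'StrongPositiveSentiment': 2,
    'WeakPositiveSentiment': 1,
    'NeutralSentiment': 0,
    'WeakNegativeSentiment': -1,
    'StrongNegativeSentiment': -2,
}

# Hypothetical severities: larger means more severe.
PROBLEM_SEVERITIES = {
    'MajorProblem': 2,
    'MinorProblem': 1,
}


def opinion_weight(label):
    """Numeric weight for an opinion label; raises KeyError if unknown."""
    return OPINION_WEIGHTS[label]


def problem_severity(label):
    """Numeric severity for a problem label; raises KeyError if unknown."""
    return PROBLEM_SEVERITIES[label]
```

Raising KeyError for an unknown label makes an unexpected value visible at storage time rather than silently skewing later scores.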

This code omits some safety checks, such as making sure entity_type = e.get('label') doesn't return an error, to make the sample code easier for you to read. An actual, robust application should always include thorough error checking and handling.
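As one example of the kind of checking a robust application might add, the following sketch parses the response body defensively. The function name parse_entities is hypothetical, and treating a malformed reply as "no entities" is a design choice made for brevity; you may prefer to log or raise instead.

```python
import json


def parse_entities(response_text):
    """Return the list of well-formed entities in a response body, or an
    empty list if the body is not the expected JSON structure."""
    try:
        payload = json.loads(response_text)
    except ValueError:
        # Not valid JSON.
        return []
    if not isinstance(payload, dict):
        return []
    entities = payload.get('entities')
    if not isinstance(entities, list):
        return []
    # Keep only entries that have the members the processing loop reads.
    return [
        e for e in entities
        if isinstance(e, dict) and 'label' in e and 'text' in e
    ]
```

The list it returns can be iterated directly, so the processing loop does not need to guard against a missing label or text member.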

# Process the result
if response.status_code == 200:
    # De-serialize the JSON reply and get the entities list.
    response_dict = json.loads(response.text)
    # If the service returns no entities, it's not an error; it just means no
    # sentiments were expressed. For example, "This is the third Widget I bought
    # for my home." contains no sentiments. Fall back to an empty list so the
    # loop below is simply skipped if the entities member is missing.
    entities = response_dict.get('entities') or []

    found_one_sentiment = None  # found at least 1 sentiment in the response
    curr_sentiment_is_opinion = None # saving an opinion or a problem?

    curr_topic = ''
    curr_problem = ''
    curr_problem_severity = ''
    curr_opinion = ''
    curr_opinion_weight = ''

    for e in entities:
        entity_type = e.get('label')
        text_of_entity = e.get('text')
        # At a parent Sentiment? They always precede topics and 
        # opinions/problems in the entity array.
        if (entity_type == 'Sentiment'):
            # If already encountered another Sentiment in the array, now's
            # the time to save it. This application is only interested in
            # opinions and problems that have a topic associated with them; 
            # e.g., if customer review was simply "JUNK!!!", there's no topic
            # so make sure there's a topic before saving.
            if (found_one_sentiment and curr_topic != ''):
                if (curr_sentiment_is_opinion):
                    store_opinion(customer_review, curr_topic, curr_opinion,
                        curr_opinion_weight)
                else:
                    store_problem(customer_review, curr_topic, curr_problem,
                        curr_problem_severity)

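            found_one_sentiment = True
            curr_sentiment_is_opinion = None
            # Reset the per-sentiment state so that values from the previous
            # Sentiment are not reused if this one lacks a topic.
            curr_topic = curr_problem = curr_problem_severity = ''
            curr_opinion = curr_opinion_weight = ''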
        elif (entity_type == 'Topic'):
            # This is the topic/object/target of the opinion or problem.
            curr_topic = text_of_entity
        elif (entity_type == 'StrongPositiveSentiment' or 
              entity_type == 'WeakPositiveSentiment' or
              entity_type == 'NeutralSentiment' or
              entity_type == 'WeakNegativeSentiment' or
              entity_type == 'StrongNegativeSentiment' 
        ):
            # This is an opinion.
            curr_opinion = text_of_entity
            # Need to remember its strength and polarity so can compute
            # the favorability score described in the use case.
            curr_opinion_weight = entity_type
            curr_sentiment_is_opinion = True
        elif (entity_type == 'MajorProblem' or
              entity_type == 'MinorProblem' 
        ):
            # This is a problem report.
            curr_problem = text_of_entity
            # Remember its severity so can compute the reliability/quality
            # score described in the use case.
            curr_problem_severity = entity_type

    # End of entities array. If found any, the last one still needs to be
    # stored, but only if it has a topic associated with it.
    if found_one_sentiment and curr_topic != '':
        if (curr_sentiment_is_opinion):
            store_opinion(customer_review, curr_topic, curr_opinion,
                curr_opinion_weight)
        else:
            store_problem(customer_review, curr_topic, curr_problem,
                curr_problem_severity)

else:
    print('Error', response.status_code)
    print(response.text)
