Beginner’s Guide to Text Analytics

When customers are ecstatic or disappointed by interactions and a brand’s customer experience (CX), it’s likely they’ve provided customer feedback filled with data-rich insights. Feedback data — whether direct, indirect, structured, or unstructured — is everywhere.

From surveys to reviews on social media, an organization has the opportunity to tap into customer signals that drive decision-making and the overall success of the business.

Brands often struggle with this wealth of data, though. It’s overwhelming (and virtually impossible) for human analysts to manually examine thousands of pieces of feedback across a wide variety of channels on a regular basis.

But there’s a solution to this challenge: Text analytics, which draws insights from data by taking sentiment-rich comments and sorting them into business-relevant categories.

What is Text Analytics in Big Data?

Within the omnichannel ecosystem, there are countless touchpoints between a brand and its customers. Vast amounts of data are generated each day, often referred to as big data.

Text analytics is one of the methods used to gain insights from big data by converting unstructured text into structured data.

Several procedures are needed to analyze and understand unstructured text data. As such, text analytics includes processes such as data cleansing, pre-processing, feature extraction, and machine learning (ML).

Difference Between Text Analytics & Text Mining

Text analytics and text mining are often used interchangeably, but they are not the same. Text mining extracts information from unstructured text, whereas text analytics applies statistical and machine learning methods to that text to derive insights.

Benefits of Text Analytics

Text analytics is rising in popularity. Leading brands in many industries are investing in customer experience management (CEM) software platforms that offer text analytics as one of several top features.

Here’s an overview of the text analytics benefits that enhance a feedback program:

  • Increase insights with fewer questions: Long, drawn-out surveys may cause customers to shy away from providing feedback, but text analytics digs into even short responses to reveal the meaning behind customers’ words.
  • Get to the root cause: Numerical scores don’t always explain the ‘why’ behind the feedback. Text analytics fills that gap by surfacing what’s working and the root cause of issues customers frequently face.
  • Get timely insights: Employees already have a list of time-consuming tasks to tackle, and asking them to probe every word from customer feedback is unrealistic. Text analytics handles the entire lift, and it does so with much greater reliability.
  • Identify emerging trends: Humans need data to make informed, smart decisions. By tapping into the words and phrases customers are using, text analytics puts the spotlight on trends a business can’t afford to ignore and needs to take advantage of.
  • Understand customer needs: Customers will tell you what they want, need, and expect. You just need to listen to them, and text analytics brings keywords, themes, and sentiment to the forefront.
  • Make data-driven decisions: In order to better serve customers, let the insights derived from text analytics offer a path forward for customer experience strategy.
  • Improve customer and employee experience: As it does for CX, text analytics also improves employee experience (EX). Text analytics dives into employee data such as employee effort score, engagement, satisfaction, and sentiment.

As the volume of feedback organizations collect grows, text analytics is the only option to keep up.

Basic Text Analytics

Text analysis ranges from basic to advanced, and the insights you get depend on the techniques you use.

At the basic level, text analytics involves the following.

Word frequency analysis

Word frequency analysis counts how often each word appears in the text. This helps you find the text’s most common terms and topics.
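As a rough illustration, word counting can be sketched in a few lines of Python using only the standard library. The function name and the sample comments are invented for this example, and the simple letters-only word pattern is an assumption rather than a standard.

```python
import re
from collections import Counter

def word_frequencies(comments):
    """Count how often each lowercase word appears across all comments."""
    counts = Counter()
    for comment in comments:
        # Treat runs of letters and apostrophes as words; everything else separates them.
        counts.update(re.findall(r"[a-z']+", comment.lower()))
    return counts

# Hypothetical feedback comments, invented for illustration.
comments = [
    "The staff was friendly and the checkout was fast.",
    "Checkout was slow, but the staff was helpful.",
]
top_words = word_frequencies(comments).most_common(2)
print(top_words)
```

Here the most frequent terms are filler words such as “was” and “the,” which is one reason word counts are usually combined with stop word removal.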

Phrase detection

Beyond individual words, your audience uses phrases that have a significant bearing on sentiment and other key elements. Phrase detection finds frequently occurring phrases in the text, which helps you identify themes.
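One simple way to approximate phrase detection is to count two-word sequences (bigrams) and keep the ones that repeat. The sketch below assumes that approach; the function name, the `min_count` threshold, and the sample comments are hypothetical.

```python
import re
from collections import Counter

def frequent_bigrams(comments, min_count=2):
    """Return two-word phrases that occur at least min_count times."""
    counts = Counter()
    for comment in comments:
        words = re.findall(r"[a-z']+", comment.lower())
        # Pair each word with its successor to form bigrams within one comment.
        counts.update(zip(words, words[1:]))
    return {" ".join(pair): n for pair, n in counts.items() if n >= min_count}

# Hypothetical feedback comments, invented for illustration.
comments = [
    "Customer service was slow today.",
    "I love the customer service here.",
    "Slow delivery, but great customer service.",
]
phrases = frequent_bigrams(comments)
print(phrases)
```

Only “customer service” repeats across these comments, so it is the one phrase that surfaces as a candidate theme.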

Sentiment analysis

On a superficial level, words can be misleading if you do not have context. One way to gain the right perspective is by determining the associated emotions. This is where sentiment analysis comes into play. It determines the emotion expressed in a text, which helps you identify areas for improvement.

Topic modeling

When words, phrases, and sentiments recur, there’s usually an underlying theme behind them. Topic modeling identifies those themes, which helps you pin down a text’s primary ideas.

Advanced Text Analytics

Depending on the size and nature of your company, basic text analytics may not offer sufficient insight. Advanced text analytics might be necessary.

Named entity recognition

Named entity recognition (NER) identifies and categorizes persons, organizations, and places in the text. This method also helps identify connections between the entities mentioned.

Text classification

Text classification assigns text to predefined categories. Along with helping you organize large volumes of text data, this method reveals patterns.


Clustering

Another way to identify patterns in unstructured data is through clustering. This method groups text based on its content, making it easier to identify patterns.
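To make the idea of grouping by content concrete, here is a minimal sketch of a greedy, single-pass grouping based on word overlap (Jaccard similarity). This is a deliberately simple stand-in, not the algorithm any particular product uses; the function names, the 0.3 threshold, and the sample comments are all assumptions.

```python
import re

def tokens(text):
    """Return the set of lowercase words in a comment."""
    return set(re.findall(r"[a-z']+", text.lower()))

def jaccard(a, b):
    """Share of words two sets have in common (0.0 to 1.0)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster_comments(comments, threshold=0.3):
    """Greedily group comment indices whose word overlap meets the threshold."""
    clusters = []
    for i, comment in enumerate(comments):
        words = tokens(comment)
        for cluster in clusters:
            # Compare against the cluster's first comment as its representative.
            if jaccard(words, tokens(comments[cluster[0]])) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])  # nothing similar enough: start a new cluster
    return clusters

# Hypothetical feedback comments, invented for illustration.
comments = [
    "delivery was late",
    "late delivery again",
    "app keeps crashing",
    "the app keeps crashing on login",
]
print(cluster_comments(comments))
```

With this sample, the two delivery comments land in one group and the two app comments in another, without anyone defining those categories in advance.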

Relationship extraction

The people, organizations, and places mentioned in your feedback are often connected to one another. Relationship extraction determines how these entities are related in the text. Such insight adds context to customer conversations, allowing you to act accordingly.

Network analysis

Going beyond relationships, you’ll realize that there are specific groups with a common link. Network analysis examines textual links to find patterns and trends which help explain how things interact.

Text Analytics Techniques & Applications

Text analytics involves various techniques for analyzing unstructured, text-based data. In addition to topic analysis and sentiment analysis, there are several other techniques businesses can use to gain insights from their text data.

Let’s review the techniques and applications of text analytics.

Topic analysis

Topic analysis categorizes phrases within customer feedback into business-relevant topics. For example, “the sales associate was nice” would be categorized under “Staff Friendliness.” There are generally two ways to accomplish this: a manual, rules-based approach and machine learning techniques.

Analysts and linguists manually build rules for the rules-based method. For instance, a clause containing two words like “friendly” and “employee” might be placed under a “Staff Friendliness” subject.

Such rules might also consider word order and the grammatical relationships between key words. The setup procedure is time-consuming, but the classification is precise because each rule is individually constructed.
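A rules-based tagger of the kind described above can be sketched as a lookup of word combinations. The rule set, topic names other than “Staff Friendliness,” and the function name below are hypothetical, and this sketch ignores word order and grammar for brevity.

```python
import re

# Hypothetical rules: a topic matches when a clause contains every word in any one rule.
TOPIC_RULES = {
    "Staff Friendliness": [{"friendly", "employee"}, {"nice", "associate"}],
    "Checkout Speed": [{"checkout", "slow"}, {"checkout", "fast"}],
}

def tag_topics(clause):
    """Return every topic whose rules are satisfied by the clause."""
    words = set(re.findall(r"[a-z']+", clause.lower()))
    return [
        topic
        for topic, rules in TOPIC_RULES.items()
        if any(rule <= words for rule in rules)  # rule is a subset of the clause's words
    ]

print(tag_topics("The sales associate was nice"))
print(tag_topics("Great prices"))
```

The second call returns an empty list, which shows the trade-off: hand-built rules are precise, but anything the rule authors did not anticipate goes untagged.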

Machine learning, which uses supervised classification and clustering, is also a key component of topic analysis. For supervised classification, an analyst manually assigns topics to a sample of comments. From there, the annotated dataset trains the classifier to automatically tag new comments.

While annotating data is easier than developing rules, supervised classifiers tend to work well only with a small number of topics, typically fewer than ten.

Sentiment analysis

Sentiment analysis tags phrases as having positive or negative sentiments. “The sales associate was really nice” would be tagged as positive.

Dictionary-based sentiment analysis is simple to set up. It’s similar to pulling all the words out of a dictionary and assigning positive or negative sentiment to each word. The sentiment of words changes, however, depending on the context.

You would usually think of swear words as conveying negative sentiment, but in the gaming community, for example, things may be fuzzier. Positive words are often used ironically, and negative words actually have positive sentiments when put into context.
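The dictionary-based approach, and its blind spot, can be sketched as follows. The six-word lexicon, the scoring scheme, and the function name are invented for illustration; real sentiment dictionaries are far larger.

```python
import re

# Hypothetical lexicon: +1 for positive words, -1 for negative words.
LEXICON = {"nice": 1, "great": 1, "friendly": 1, "rude": -1, "slow": -1, "broken": -1}

def lexicon_sentiment(phrase):
    """Sum per-word scores and map the total to a sentiment label."""
    words = re.findall(r"[a-z']+", phrase.lower())
    score = sum(LEXICON.get(word, 0) for word in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(lexicon_sentiment("The sales associate was really nice"))
print(lexicon_sentiment("Oh great, the app is broken again"))
```

The second comment is clearly a complaint, yet the ironic “great” cancels out “broken” and the result is neutral. That is the context problem in miniature.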

To allow for context, supervised machine learning techniques provide a much better way of assigning sentiment. Similar to the supervised classification described for topic analysis, supervised machine learning for sentiment analysis involves taking a sample set of clauses for the context you’re interested in and manually assigning each clause a positive or negative sentiment. From this annotated data set, the algorithm can then assign sentiment to new clauses based on what it has learned from the sample of comments.
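The annotate-then-train workflow can be illustrated with a tiny naive Bayes classifier written from scratch. Naive Bayes is only one of many possible algorithms and is chosen here as an assumption because it fits in a few lines; the function names and the four training clauses are hypothetical, and a real model would need far more annotated data.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def train(labeled_clauses):
    """Collect per-label word counts and label counts from annotated clauses."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in labeled_clauses:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, label_counts

def predict(text, word_counts, label_counts):
    """Return the label with the highest log-probability under naive Bayes."""
    vocab = {word for counts in word_counts.values() for word in counts}
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)  # prior: share of training clauses with this label
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            # Add-one smoothing keeps unseen words from zeroing out a label.
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical manually annotated sample.
training = [
    ("the associate was really nice", "positive"),
    ("friendly and helpful staff", "positive"),
    ("the associate was rude", "negative"),
    ("slow and unhelpful staff", "negative"),
]
model = train(training)
print(predict("really helpful associate", *model))
```

Because the model learns from examples drawn from your own context, the same code trained on gaming-community comments would pick up that community’s usage rather than a generic dictionary’s.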

Named entity recognition

Named entity recognition (NER) extracts persons, organizations, and locations from unstructured text data. NER can detect influential persons and organizations in consumer feedback and social media data. In addition, NER can also recognize text themes and topics.

Part-of-speech tagging

Text analytics uses part-of-speech (POS) tagging to label each word in a sentence with its grammatical role, such as noun, verb, or adjective. This method helps analyze sentence grammar and interpret the text.

Dependency parsing

Dependency parsing helps companies discover the grammatical connections between the words in a sentence. In addition, this method helps analyze sentence structure and comprehend content.

Text classification

Text classification uses content to classify text into predetermined categories. This method helps identify popular subjects in consumer feedback and social media. Moreover, text classification can also reveal key ideas.

How to Conduct a Text Analysis

Text analysis consists of four steps: data collection, data processing, text analysis, and visualization.

Here’s a bit more information on how each step functions.

#1. Data collection

Text analysis starts with data from social media, consumer feedback forms, and online reviews. Make sure your data is relevant to your business challenge.

#2. Data processing

After data collection, the data is processed, cleaned, and prepared for analysis. Data processing involves deleting extraneous material, standardizing the format, and structuring unstructured data for analysis.

#3. Text analysis

After processing data, you’ll need to analyze it to draw insight. This involves techniques such as sentiment analysis, topic modeling, and named entity recognition.

#4. Visualization

Lastly, you’ll need to show your stakeholders the findings from the text analysis. You can achieve this through word clouds, bar charts, and heat maps.

How to Prepare Text Data for Analysis

Preparing data for text analysis ensures reliable and understandable outcomes.

Here’s how to go about preparing text data for analysis.

#1. Clean data

Text data is cleaned by removing HTML elements, URLs, and special characters, leaving only the content you want to analyze.
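A cleaning pass along these lines might look like the sketch below. Which characters count as “special” is a judgment call; the set kept here (letters, digits, whitespace, and basic punctuation) is an assumption, as are the function name and the sample input.

```python
import html
import re

def clean_text(raw):
    """Strip HTML tags, entities, URLs, and special characters from raw text."""
    text = re.sub(r"<[^>]+>", " ", raw)                 # drop HTML tags
    text = html.unescape(text)                          # decode entities such as &amp;
    text = re.sub(r"https?://\S+", " ", text)           # drop URLs
    text = re.sub(r"[^A-Za-z0-9\s.,!?'-]", " ", text)   # drop other special characters
    return re.sub(r"\s+", " ", text).strip()            # collapse leftover whitespace

# Hypothetical raw review, invented for illustration.
raw = "<p>Great store &amp; staff!! See https://example.com/review #happy</p>"
print(clean_text(raw))
```

Tags are removed before entities are decoded so that an encoded “&lt;” in the review text is not mistaken for markup.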

#2. Pre-process text

Text pre-processing converts text data into an analysis-ready format. It involves removing numerals and punctuation and converting text to lowercase.
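These three operations can be sketched with Python’s built-in string tools. The function name and sample sentence are hypothetical, and the example assumes ASCII digits and punctuation only.

```python
import string

def preprocess(text):
    """Lowercase the text and delete ASCII digits and punctuation."""
    text = text.lower()
    # Build a translation table that deletes every digit and punctuation character.
    drop = str.maketrans("", "", string.digits + string.punctuation)
    # split() and join() collapse the gaps left behind by deleted characters.
    return " ".join(text.translate(drop).split())

print(preprocess("Order #4521 arrived LATE, twice!"))
```

Note that this also strips apostrophes, so “don’t” written with a straight apostrophe becomes “dont.” Whether that matters depends on the analysis that follows.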

#3. Tokenize text

Tokenization divides the text into words and phrases. By doing so, text data analysis becomes easier.
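A basic tokenizer can be written with a single regular expression. The pattern below is one possible choice, not a standard: it keeps contractions together and treats each punctuation mark as its own token. The function name and sample sentence are hypothetical.

```python
import re

def tokenize(text):
    """Split text into word tokens and single-character punctuation tokens."""
    # A word is a run of letters or digits, optionally followed by an apostrophe suffix.
    return re.findall(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?|[^\w\s]", text)

print(tokenize("The staff wasn't rude, just busy."))
```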

#4. Remove stop words

Stop word removal filters out frequent words like “and,” “the,” and “is” from the text. Because these terms carry little meaning on their own, they can distort analyses.
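In code, stop word removal is a simple filter over the tokens. The eight-word list below is a hypothetical stand-in; practical lists contain a hundred or more words and are often tuned per project.

```python
# A tiny, hypothetical stop word list for illustration.
STOP_WORDS = {"a", "an", "and", "is", "the", "was", "to", "of"}

def remove_stop_words(tokens):
    """Keep only tokens that are not in the stop word list (case-insensitive)."""
    return [token for token in tokens if token.lower() not in STOP_WORDS]

tokens = ["The", "checkout", "is", "slow", "and", "the", "staff", "is", "rude"]
print(remove_stop_words(tokens))
```

What remains (“checkout,” “slow,” “staff,” “rude”) is the part of the comment that carries its meaning.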

#5. Simplify data with stemming and lemmatization

Stemming and lemmatization both reduce words to a root form, which simplifies text data by treating variants of a word as one term. Stemming strips suffixes from words, whereas lemmatization reduces words to their dictionary form (lemma).
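The difference between the two is easiest to see side by side. The sketch below is a toy: the suffix list is not a real stemming algorithm such as Porter’s, and the lemma table is a hypothetical four-entry lookup standing in for a full dictionary.

```python
SUFFIXES = ("ing", "ed", "es", "s")  # checked in this order
LEMMAS = {"was": "be", "were": "be", "better": "good", "mice": "mouse"}  # hypothetical lookup

def stem(word):
    """Chop off the first matching suffix, leaving at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def lemmatize(word):
    """Map a word to its dictionary form if the lookup table knows it."""
    return LEMMAS.get(word, word)

print(stem("waiting"), stem("stores"))
print(lemmatize("better"), lemmatize("was"))
```

Stemming can produce non-words (“stores” becomes “stor”) because it only trims characters, while lemmatization returns real words but depends on having a dictionary to consult.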

A Smarter Approach to Text Analytics

Text analytics typically feels like a foreign, complex concept when first exploring its capabilities and benefits. But now you should know the basics of text analytics, and your next step is to partner with a software provider that brings specialized expertise to the table.

Medallia’s real-time, human-centric text analytics ensures that you uncover high-impact insights and drive action. It uses artificial intelligence (AI) and natural language processing (NLP) to quickly identify emerging trends and key insights at scale. And because we started building our native text analytics more than a decade ago, it’s the most comprehensive, connected, and accessible text analytics available.