In the current flood of Business Intelligence and insight tools, there is a phrase causing users to abandon the fanciest tools and leading to serious self-doubt for the provider – the “so what?” question. Indeed, your high-quality analytics application might spit out accurate, statistically valid data and package them into intuitive visualisations – but if you stop there, your data has not yet become a basis for decision and action. Most users will be lost or depend on the help and expertise of a business translator, thus creating additional bumps on their journey to data-driven action.

In this article, we focus on applications of Web-based Text Analytics – not “under-the-hood” technological details, but the practical use of Text Analytics and Natural Language Processing (NLP) to answer central business questions. Equipped with this knowledge, you will be able to tap into the full power of Text Analytics and fully benefit from large-scale data coverage and machine intelligence. A real-time mastery of the oceans of data floating on the Web will allow you to make your market decisions and moves with ease and confidence.


1. The basics

Before diving into details, let’s first get an understanding of how Text Analytics works. Text Analytics starts out with raw, semi-structured data – text combined with some metadata. The metadata have a custom format, although some fields, such as dates and authors, are pretty consistent across different data sources. The first step is a one-by-one analysis of these datapoints, resulting in a structured data basis with a unified schema. Even more important than the structuring is the transformation of the data from qualitative to quantitative. This transformation enables the second step – aggregation, which condenses a huge number of structured representations into a small number of consolidated and meaningful analyses, ready for visualization and interpretation by the end user.
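The two steps can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation of any particular tool: the sample posts, the metadata field names, the `BRANDS` lexicon and the `Record` schema are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date

# Hypothetical raw datapoints: text plus source-specific metadata.
RAW_POSTS = [
    {"text": "Love the new Audi design", "meta": {"posted": "2018-02-03", "user": "li"}},
    {"text": "Audi service was slow", "meta": {"date": "2018-02-10", "author": "wang"}},
    {"text": "BMW pricing is too high", "meta": {"date": "2018-02-11", "author": "zhao"}},
]

BRANDS = ("Audi", "BMW")  # illustrative lexicon, not taken from the article


@dataclass
class Record:
    """Unified schema shared by all data sources (assumed for this sketch)."""
    day: date
    author: str
    brands: tuple


def structure(post):
    """Step 1: map one raw datapoint onto the unified schema."""
    meta = post["meta"]
    day = date.fromisoformat(meta.get("date") or meta.get("posted"))
    author = meta.get("author") or meta.get("user")
    brands = tuple(b for b in BRANDS if b.lower() in post["text"].lower())
    return Record(day, author, brands)


def aggregate(records):
    """Step 2: condense many structured records into a small table of counts."""
    return Counter(b for r in records for b in r.brands)


records = [structure(p) for p in RAW_POSTS]
mention_counts = aggregate(records)
print(mention_counts)
```

The point of the sketch is the division of labour: `structure` turns qualitative text into countable fields one datapoint at a time, and only then can `aggregate` summarise across the whole dataset.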

2. Answering questions with Text Analytics

A number of questions can be answered with Text Analytics and NLP. Let’s start with the basics – what do users talk about and how do they talk about it? We’ll be providing examples from the Chinese social media landscape on the way.

First, the what – what is relevant, popular or even hot? This question can be answered with two algorithms:

  • Text categorisation classifies a text into one or multiple predefined categories. The category doesn’t need to be named explicitly in the text – instead, the algorithm takes words and their combinations as cues (so-called features) to recognise the category of the text. Text categorisation is a coarse-grained algorithm and thus well suited for initial filtering or for getting an overview of the dataset. For example, the following chart shows the categorisation of blog articles around the topic of automotive connectivity:
  • Concept extraction digs deeper and identifies concepts such as brands, companies, locations and people that are directly mentioned in the text. Thus, it can identify multiple concepts of different types, and each concept can occur multiple times in the text. For example, the following chart shows mention frequencies for the most common automotive brands in the Chinese social web in February 2018:
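To make the difference between the two algorithms concrete, here is a deliberately simplified Python sketch. It assumes hand-written cue words and a tiny concept lexicon (`CATEGORY_CUES` and `CONCEPTS` are hypothetical); production systems would typically learn such features from labelled data rather than hard-code them.

```python
import re
from collections import Counter

# Hypothetical cue words (features) per category.
CATEGORY_CUES = {
    "navigation": {"map", "route", "gps"},
    "entertainment": {"music", "video", "streaming"},
}
# Hypothetical concept lexicon: surface form -> concept type.
CONCEPTS = {"volkswagen": "brand", "audi": "brand", "shanghai": "location"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def categorise(text):
    """Return every category that has at least one cue word in the text."""
    tokens = set(tokenize(text))
    return sorted(c for c, cues in CATEGORY_CUES.items() if tokens & cues)


def extract_concepts(text):
    """Count each directly mentioned concept, keyed by (name, type)."""
    return Counter((t, CONCEPTS[t]) for t in tokenize(text) if t in CONCEPTS)


post = "Audi updated the GPS route planner; Audi drivers in Shanghai like it."
categories = categorise(post)
concepts = extract_concepts(post)
print(categories)
print(concepts)
```

Note that the word "navigation" never appears in the sample post: the category is inferred from cues, whereas the concepts are counted only where they are mentioned literally.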

Using time series analysis in the aggregation, text categorisation and concept extraction can be used to identify upcoming trends and topics. Let’s look into the time development for Volkswagen, the most frequent auto brand:
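As a rough illustration of time series aggregation, the following Python sketch buckets mentions by month and flags months that clearly exceed the preceding average. The dates are invented, and the threshold rule (`factor=1.5` over the mean of earlier months) is an assumption chosen for simplicity, not a recommended trend-detection method.

```python
from collections import Counter
from datetime import date

# Hypothetical mention dates for one brand (illustrative only).
MENTION_DATES = [
    date(2018, 1, 5), date(2018, 1, 20),
    date(2018, 2, 2), date(2018, 2, 14),
    date(2018, 3, 1), date(2018, 3, 3), date(2018, 3, 9),
    date(2018, 3, 15), date(2018, 3, 28),
]


def monthly_counts(dates):
    """Aggregate individual mentions into a sorted (year, month) -> count series."""
    counts = Counter((d.year, d.month) for d in dates)
    return sorted(counts.items())


def rising_months(series, factor=1.5):
    """Flag months whose count exceeds `factor` times the mean of all earlier months."""
    flagged = []
    for i in range(1, len(series)):
        baseline = sum(count for _, count in series[:i]) / i
        if series[i][1] > factor * baseline:
            flagged.append(series[i][0])
    return flagged


series = monthly_counts(MENTION_DATES)
print(series)
print(rising_months(series))
```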

Once we have identified what people talk about, it is time to dig deeper and understand how they talk about it. Sentiment analysis allows you to analyse how the topics and concepts are perceived by customers and other stakeholders. Again, sentiment analysis can be applied at different levels: whole texts can be analysed for an initial overview. At an advanced stage, sentiment analysis can be applied to specific concepts to answer more detailed questions. Thus, competitor brands can be analysed for sentiment to determine the current rank of one’s own brand. Products can be analysed to find out where to focus improvement efforts. And finally, product features can be analysed for sentiment to understand how to actually make improvements. As an example, the following chart shows the most positively perceived models for Audi in the Chinese web:
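Concept-level sentiment can be sketched as follows. This is a toy, lexicon-based approach under strong assumptions: the `POLARITY` word list, the `MODELS` list and the sample posts are hypothetical, and sentiment is simply attributed to every model mentioned in the same sentence. Real systems usually rely on trained models and finer-grained attribution.

```python
import re
from collections import defaultdict

# Hypothetical polarity lexicon and model list, for illustration only.
POLARITY = {"great": 1, "comfortable": 1, "love": 1, "noisy": -1, "expensive": -1}
MODELS = ("a4", "a6", "q5")


def concept_sentiment(texts):
    """Average the lexicon polarity of the sentences that mention each model."""
    scores = defaultdict(list)
    for text in texts:
        for sentence in re.split(r"[.!?]+", text.lower()):
            tokens = re.findall(r"[a-z0-9]+", sentence)
            polarity = sum(POLARITY.get(t, 0) for t in tokens)
            for model in MODELS:
                if model in tokens:
                    scores[model].append(polarity)
    return {m: sum(v) / len(v) for m, v in scores.items()}


posts = [
    "The A6 is great and comfortable. The Q5 is noisy.",
    "I love the A6. The A4 is expensive!",
]
# Rank models from most positively to most negatively perceived.
ranking = sorted(concept_sentiment(posts).items(), key=lambda kv: kv[1], reverse=True)
print(ranking)
```

Scoring per sentence rather than per post is what keeps the praise for one model from leaking onto another model mentioned later in the same text.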

3. From insights to actions

Insights from Web-based Text Analytics can be directly integrated into marketing activities, product development and competitive strategy.

Marketing intelligence

By analysing the contexts in which your products are discussed, you learn the “soft” facts which are central for marketing success, such as less tangible connotations of your offering – these can be used as hints to optimise your communication. You can also understand the interest profile of your target audience and use it to improve your story and wording. Finally, Text Analytics allows you to monitor the response to your marketing efforts in terms of awareness, attitude and sentiment.

Product intelligence

With Text Analytics, you can zoom in on awareness of and attitudes towards your own products and find out their most relevant aspects with concept extraction. Using sentiment analysis, you can compare how different products are perceived and drill down to their fine-grained features. Once you place your products and features on a positive-negative scale, you know where to focus your efforts to maximise your strengths and neutralise your weaknesses.

Competitive intelligence

Your brand doesn’t exist in a vacuum – let’s broaden our research scope. Text Analytics allows you to answer the above questions not only for your own brand, but also for your competitors. Thus, you will learn about the marketing, positioning and branding of your competitors to better differentiate yourself and present your USPs in a sharp and convincing manner. You can also analyse competitor products to learn what they did right – especially on those features where your own company went wrong. And, in a more strategic perspective, Text Analytics allows you to monitor technological trends to respond early to market developments.

So what?

How do you show that your findings are not only accurate and correct, but also relevant to business success? Using continuous real-time monitoring, you can track your online KPIs and validate your actions based on the response of the market. Concept extraction can be used to measure the changes in brand awareness and relevance, whereas sentiment analysis shows how perceptions of your brand, products and product features have changed in response to your efforts.

With the right tools, Text Analytics can be efficiently used in stand-alone mode or as a complement to traditional field research. In the age of digitalisation, it allows you to listen to the voice of your market on the Web and turns your insight journey into an engaging path to actionable, transparent market insights.


Get in touch with Anacode’s specialists to learn how your business challenges can turn into opportunities with Text Analytics.

Just like the rest of China’s financial system, the Chinese stock market is subject to rather strict government regulations. However, in recent years, it has offered more and more opportunities to risk-tolerant foreign investors.

This report sample provides an overview of the Chinese stock market based on data from the Chinese finance portal 金融界 (Finance World).


Download the report sample here.

In this white paper, you will learn how we use text and data analytics to extract actionable, statistically relevant insights from Web data. The paper shows how AI and Machine Learning technology can be used to build competitive advantage with a crystal-clear, up-to-date understanding of customer needs.

Please download the white paper here.

On September 20th, we open the event series “Doing Business in China”, a cooperation between Anacode, TechCode, kleef&co and Portus Corporate Finance GmbH. The series will provide talks, workshops and case studies on different aspects of China market entry, incl. market research, local business development, funding and legal topics.


Please download the program here.

In the ideal business world, market and consumer research precedes any marketing activity. The world is not ideal, but when it comes to capricious emerging markets such as China, the need for solid research turns into an acute necessity: sound and specific knowledge of these markets makes it possible to minimize the business risks which go hand in hand with their complexity and volatility. By contrast, an insufficient understanding of market context, customers and competitors can lead to failure, as we have seen at large scale in examples such as Barbie, eBay and BestBuy.

Not surprisingly, market research in China presents a challenge in itself. Multiple factors come into play. First, the Chinese market is inherently difficult to structure and systematize due to its heterogeneity and quick change. Its developments are conditioned by a unique mix of social, political, ethnic and cultural variables. Therefore, they cannot be anticipated by analogies in the familiar context of developed Western markets.

Second, China’s market research industry is relatively young and thus immature: whereas the discipline of market research was introduced in the West at the beginning of the 20th century, it was not until the 1980s that the first market research unit, a subsidiary of Procter&Gamble, was established in China. Since then, Chinese marketers have come a long way in mastering Western methods of market research and adapting them to the Chinese reality. However, as of now, the industry is still fragmented and lacks a unified quality standard.

Finally, as a foreign company, you will not only witness the “inherent” challenges of China, but also bump into linguistic, cultural and legal access barriers. Bringing in additional intermediaries to overcome these – be they consultants, local providers or native employees recruited for that purpose – often does not lead to the expected results. Instead, it further complicates the information flow and pulls the company into a vicious circle of dropping quality at an increased cost.

The potential of social media for market research

Where there is a problem, there is a solution – in the case of China market research, one solution, intriguing and challenging at the same time, is to step out of the comfort zone of familiar methods such as surveys and interviews, and “ride the wave” with social media and advanced analytics technology. More than in any other region of the world, social media in China have developed into a powerful and ubiquitous digital infrastructure. WeChat, the uncontested leader among Chinese social networks, counts 1.1 billion accounts and 517 million daily users; other national platforms such as Weibo and Zhihu, as well as endless topic-specific or regional resources, complete the picture and cover almost all conceivable communication topics – thus contributing to a self-sufficient ecosystem which flourishes hand in hand with the informational liberation of the country after decades of strict censorship.

Chinese social media contains a wealth of information about consumers and markets. This is due, on the one hand, to the strong orientation towards consumption of the Chinese society, and especially of the younger, online-savvy generations. On the other hand, digital channels for sales and service are gaining in popularity, which also contributes to the creation of market-relevant data. Beyond the availability of the relevant data, social media has some additional advantages when compared to traditional, “old-school” research:

  • It is big – millions of posts and comments are posted daily. By contrast, field data projects normally range in the thousands of samples.
  • It is up-to-date – with the appropriate technologies, social media data can be harvested and analysed in near real-time. Traditional field research produces static data for one point in time, with a high cost for subsequent updates and follow-ups.
  • It is to the point – users talk about what is directly relevant to them and invite the researcher to discover and explore. By contrast, market research surveys and interviews prime respondents to specific topics, thus preshaping and limiting the information they provide.
  • It is authentic – the lack of personal contact often neutralises culture-specific communication barriers. For example, whereas Chinese respondents normally remain polite in face-to-face communication, the Web 2.0 encourages uninhibited, authentic self-expression, often leading to frank negative statements which uncover important opportunities for improvement.
  • Last but not least, it is free – as opposed to data solicitation which comes with a high price per sample and creates a trade-off between cost and data quantity.

Integrating social insights in the organization

Obviously, these advantages come at a price – social media data is not available in the familiar, structured and well-focussed format of market research data. It is online and cannot be directly “imported” into common analytics programs. Besides, most of the data is unstructured and has a high degree of noise. Inside a company, three ingredients should be mingled to successfully generate insights from social media:

  1. Technology and tools
  2. Data science expertise
  3. Mastery of the business context

Tools ensure the feasibility of the research – the right technology makes it possible to collect data that contains the relevant information and to actually extract it. In most cases, there will be no single “one-stop shop” that can do the job. Instead, multiple tools are combined into a pipeline that produces detailed findings and is customized to the specific business circumstances. Special attention should be paid to the technical details behind unstructured data analytics. While applications in this domain are often marketed based on alluring concepts such as Artificial Intelligence, Machine Learning and Natural Language Processing, the algorithms don’t always produce high-accuracy results and thus can devalue even the smartest data strategy.

Data science expertise is needed to pick and mix the right tools so as to produce relevant and correct output. The data scientist makes sure that the right tools are correctly integrated into an analysis pipeline. The main requirement is that the pipeline produces results that are as close as possible to concrete actions and decisions. A point which is often neglected here is the cleaning and preprocessing of the data: as noted above, social media data comes with high levels of noise in the form of spam, advertising and the like. The “garbage in, garbage out” principle applies at full scale – thus, before going into analysis, the data should undergo a carefully designed cleaning and filtering process.
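A cleaning and filtering step of this kind might look like the following minimal Python sketch. The spam patterns, the minimum length and the sample posts are assumptions made for illustration; a real filter would be tuned to the data source and validated against labelled examples.

```python
import re

# Hypothetical spam cues; a production filter would be tuned on labelled data.
SPAM_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (r"https?://", r"\bdiscount\b", r"\bbuy now\b")
]
MIN_TOKENS = 3  # assumed threshold below which a post carries too little content


def normalise(text):
    """Collapse whitespace and lowercase so near-identical posts compare equal."""
    return re.sub(r"\s+", " ", text).strip().lower()


def clean(posts):
    """Drop spam-like, too-short and duplicate posts, keeping first occurrences."""
    seen, kept = set(), []
    for post in posts:
        key = normalise(post)
        if any(p.search(key) for p in SPAM_PATTERNS):
            continue  # advertising or link spam
        if len(key.split()) < MIN_TOKENS:
            continue  # too short to analyse
        if key in seen:
            continue  # duplicate or repost
        seen.add(key)
        kept.append(post)
    return kept


raw = [
    "The cabin is quiet on the highway",
    "Buy now!!! Huge discount http://example.com",
    "the cabin is  quiet on the highway",
    "Nice",
]
cleaned = clean(raw)
print(cleaned)
```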

Finally, mastery of the business context is required to use the social media tool set with maximum benefit. On the input side, this means translating business issues and reframing information needs into the query framework of the used data and applications. On the output side, the analysis results are fed back into the real-world business context and translated into concrete and actionable insights.

Adopting social media for market research is a challenge that requires the right tools, skills and judgment. However, efforts put into designing a customized social insight strategy will pay off and solve many a productivity issue associated with traditional market research. Especially in a market as volatile and diverse as China, leapfrogging over familiar research methods to leverage advanced analytics and the wealth of available online data appears to be a promising, future-proof strategy for an up-to-date and actionable understanding of the market.


Janna Lipenkova, CEO Anacode GmbH

This report provides a descriptive overview of the Chinese Web 2.0 landscape for automotive feedback, focussing on BMW 7 Series and comparing it with Audi A8 and Mercedes-Benz S-Class. The feedback is analysed both from a qualitative and a quantitative perspective. The main observations and findings are as follows:

  • Popular topics and concepts: We find that users are most concerned about the price and optical aspects (design, visual appearance) of the three considered series. Competitor brands that are discussed in a comparative perspective are mostly high-end or consumer-oriented foreign brands from Germany, US and Japan, whereas native Chinese brands are much less frequent. Geographically, users concentrate in the big cities and more affluent regions along the East coast.
  • Temporal evolutions: The quantity of buzz grows relatively evenly for all three series before 2015, with BMW 7 and S-Class leading. In 2015 – 2016, there is a burst in the quantity of data for BMW 7, which correlates with the introduction of the new generation of the series.
  • User satisfaction and sentiment: Users are generally satisfied with the frequently mentioned major product features of BMW 7. There are, however, some categories that are perceived negatively – specifically, components related to the front part of the car, the fuel consumption and aspects related to acoustic quality and insulation.
  • Social influencers: Among the key influencers on WeChat, China’s leading social network, we mostly find media accounts posting on general automotive topics. There are no accounts with a wide social reach that would specialize on the BMW brand. Thus, influencer marketing is an opportunity yet to be explored by BMW’s marketing and branding strategy.

Download the social report.

Connectivity and the “Internet of Vehicles” is one of the main current technological trends in the automotive area. With its rapid digitalization, China is a major “testbed” as well as a source of innovations in this field.

The present report uses data from the Chinese social web to find concepts and topics that dominate the Chinese discussion around connectivity and Internet of Vehicles (IoV).

Our key observations are as follows:

• Whereas connectivity already took off around 2011-12 in OECD markets, its China journey started relatively late (2014). Thus, the current progress and penetration are all the more impressive and point to an even greater potential for future development, backed by numerous supporting measures by the Chinese government.

• Foreign and domestic brands and services are equally present in the data. Foreign brands dominate the car manufacturer area, whereas domestic Chinese brands are more present in the IT context. This balanced mix conveys the strength of local providers as well as the agitated landscape of international deals between car manufacturers and connectivity service providers.

• Major product aspects discussed in the area revolve around smart and assisted functionality, safety and advancements in mobile internet technology.

• From a more strategic perspective, autonomous driving and its implications for the overall traffic system seem to be the leading concern. There are also numerous discussions about the underlying technologies such as Artificial Intelligence, Big Data and Internet of Things.

Download the connectivity report.

On May 3rd, we are participating in the Junges Wissenschaftsforum Dahlem to share our experience in technology transfer and founding out of academia with postgraduates and young researchers.

On May 25th, Dr. Janna Lipenkova will be giving a talk at Asia-Pacific Weeks Berlin (APW 2016) on different approaches to market research and market entry in China. Check out the presentation abstract.

Participation @ Mobile World Congress Shanghai (#MWC Shanghai)