As the sharing economy booms in China, shared mobility services are radically transforming the dynamics of Chinese cities. This means a fundamental shift for automotive players: “owning” a car is no longer considered the ultimate status symbol in China.[1,2] Consumers are becoming more pragmatic, while the urban landscape grows ever more congested and difficult to navigate. The Chinese government supports shared mobility solutions as a way to ease congestion and promote the large-scale use of electric vehicles. Thus, the opportunity for solution providers is large. Traditional car manufacturers are establishing themselves in this new market by initiating partnerships and product developments adjusted to the new sharing reality.

This article outlines the services, players and pain points that are often discussed in the context of shared mobility. The provided data was collected from Chinese social media and analysed using Anacode’s text analytics technology.[3,4]

The Five Types of Shared Mobility

Amount of discussion for the five shared mobility services

The landscape of shared services can be segmented into five types[5], each of them addressing different transportation needs:

Ride sharing mostly targets urban travellers. It is relatively new in China but was quickly picked up by Didi and integrated into its service range, followed by new players such as DiDa pinche. Didi eventually suspended its ride sharing service in August 2018, freeing up space for a range of smaller competitors.

Car rental, the traditional mobility service, targets long-distance and cross-city travellers. It does not make the headlines when it comes to cutting-edge technology and giant funding rounds, but it is still the go-to alternative for consumers who seek a safe, time-tested mobility solution.

Bike sharing aims to solve the “last-mile” problem, helping consumers reach the next subway or bus station from their office or home. Bike sharing has seen extensive growth in China for some years. In 2018, however, this segment experienced a slow-down due to an oversupply of bikes and flaws in the payment processes.

Ride hailing targets a similar demographic as ride sharing and is also widely dominated by Didi. Currently, ride hailing is moving its focus towards lower-tier cities. Benefiting from their knowledge of local specifics, a large number of regional players are already active in the market.

Car sharing is similar to car rental, with the main difference being the duration: whereas traditional rental is optimized for intervals of multiple hours or even days, shared cars can be used for minutes. This flexibility makes them a welcome alternative for ad-hoc city travellers.

Newcomers and Traditional Car Manufacturers Need to Work Together

Multiple interest groups come together in the shared mobility market: car manufacturers in general are interested in cooperating with service providers and customising their products for the specific requirements of shared vehicles. In addition, the service-oriented transformation pushes the development of new automotive technology: a large number of shared solutions employ new energy vehicle (NEV) fleets. Recently, with Pony.ai launching its fleet of autonomous cars, autonomous driving has also entered the stage. Finally, shared mobility is an integral part of the infrastructure transformation in China and is thus generously subsidized by the Chinese government.

Popular shared mobility brands

OEMs frequently discussed in the shared mobility context

Shared Mobility Services – Far From Perfect

Sentiment about shared mobility solutions is generally positive: consumers love the new flexibility. Still, five pain points recur across social discussions, as illustrated in the following chart:

Five shared mobility pain points and their share-of-voice on the social Web

Cleanliness and vehicle condition in general

有一个朋友来深圳找我,日租了一辆ponycar, 一整天只花了150块!我们跑了一整天也不用充电,玩的很开心。挺好,唯一的一点就是希望车内卫生能够做好!

A friend of mine came to visit me in Shenzhen and we rented a Ponycar for a whole day. It cost only 150 yuan! We drove for an entire day without having to recharge the car and had a really good time. I only wish the interior was cleaner!

– @一个傻子东西

Traffic violations

共享汽车都可以乱停车的吗?我的车被堵住了,叫客服快一个小时了,到现在都还没人来挪车?

So shared cars can just be parked anywhere now? My car is blocked in. I’ve been calling customer service for almost an hour and no one has come to move the car.

– @蔡_三岁

Deposit issues

@一步用车什么情况,退个押金能退俩月?谁能解决我的问题?

What happened @Ibuyongche? Why can I still not withdraw my deposit after two months? Who can solve my problem?

– @WLH_呜啦啦

Charging point availability

好多Evcard的充电桩都有问题导致我没办法充电,真的太不方便了。

So many of EVCARD’s charging points are broken that I have no way to charge the car. This is really inconvenient!

– @dy_crab

Service location availability

Ponycar开起来还行,就是停车点太少了,取车还车要走好几公里。

Ponycar is fine to drive, but there are too few parking spots. I need to walk several kilometers to pick up and return the car.

– @supertramp

 

The Dark Side of the Sharing Economy

Peer-to-peer lending, shared mobility, home sharing… whatever the application, it seems that Chinese society is quick to establish the collective trust needed to make sharing models work. But the risk of individual abuse persists. In 2018, two passenger murder cases related to Didi’s ride sharing service were recorded. The brand strengthened its security measures and eventually closed down the service. However, a more recent case from December, in which a driver was killed by his passenger, clearly demonstrates that the root of the security problem persists.

 

While the shared mobility space has evolved rapidly over the last years, Chinese consumers are becoming aware of its current drawbacks. They are disillusioned by pain points relating to usability, infrastructure and security, which show that the industry is still in major need of improvement. Going forward, it is important to address these issues systematically. Multiple interest groups need to come together to lay a solid foundation and create the infrastructure for a safe and seamless user experience.

 

References

[1] Wouter Baan, Paul Gao, Arthur Wang, Daniel Zipser (2017). Savvy and sophisticated: Meet China’s evolving car buyers. Retrieved from https://www.mckinsey.com/industries/automotive-and-assembly/our-insights/savvy-and-sophisticated-meet-chinas-evolving-car-buyers

[2] Raymond Tsang, Pierre-Henri Boutot and Dorothy Cai (2018). China’s Mobility Industry Picks Up Speed. Retrieved from https://www.bain.com/insights/chinas-mobility-industry-picks-up-speed/

[3] Weibo data 2018. Retrieved from https://www.weibo.com

[4] Anacode GmbH (2018). Anacode MarketMiner: Web-based Text Analytics for International Market Intelligence. Retrieved from http://anacode.de/wordpress/wp-content/uploads/2017/11/Anacode_Technology_Whitepaper_v1.pdf

[5] Marco Hecker, Quan Zhou, Zoe Wu (2017). The Future of Shared Mobility in China. Retrieved from https://www2.deloitte.com/content/dam/Deloitte/cn/Documents/about-deloitte/dttp/deloitte-cn-dttp-vol7-ch3-future-of-shared-travel-en.pdf

Authors: Janna Lipenkova, Stephan Telschow (G-I-M)

Market research is inherently biased: each method comes with its own limitations on authenticity and representativeness, which are rooted in our psychology. For example, survey research can suffer from acquiescence bias, where the respondent automatically says “yes” to any question that comes their way. As another example, research based on product reviews is often polarized: users are more motivated to post extremely negative or extremely positive opinions. The “middle ground” of the average, neutral opinion is not deemed worth mentioning and stays underrepresented.

One way to alleviate bias is to combine multiple data sources and methods to address the same research question. In the following case study, we apply a combination of two methods – a classical survey and a social media analysis using Natural Language Processing – to analyze eating behavior in China. Our goal is to understand the complementarity of the two methods and to show how they can be combined to generate synergies.

The format of the classical closed-question survey is well understood in market research. Why do we opt to complement it with insights from social media? Beyond constituting a dataset with radically different characteristics, social media is omnipresent and thus highly relevant in the life of the modern consumer; this is especially true in China, where consumers heavily rely on social media when making purchasing decisions. As described in this article, a brand can significantly improve its customer centricity by following and responding to the social conversation.

Characterization of the two datasets

Overview of the two datasets

The two methods rely on datasets with different underlying properties. The survey uses a demographically representative sample of 2000 respondents. The leading theme is “Please describe your last meal”, with more detailed questions covering aspects such as type of food, location and motivation. It can be assumed that the described situations converge to a typical set of meal situations that reflect daily-life routines. All questions are closed, so the data is structured. Since it is solicited explicitly, the respondents are motivated by external incentives. The questions are determined directly by the research goals and thus cover the information that is relevant to the researcher.

The social study uses a sample of 5M unstructured posts from Weibo related to food and eating behavior. There is no way to control demographics on social media, so the sample is not demographically representative and is, furthermore, biased towards the dominant user group of the platform. Users have no external incentive to post on social media and are thus intrinsically motivated. This increases the authenticity and customer-centricity of the data: only topics that are actually salient get discussed. Contrary to survey answers, which refer to the “average” meal situation, social media covers the cognitively prominent situations that are worth recalling, describing and sharing publicly. This dataset therefore contains a larger share of exceptional, non-routine situations.

Figure 1: Cognitive levels addressed by the two datasets

Methodology

The size and structure of the two datasets call for different methods of analysis. The survey is structured and controlled in terms of content and can be evaluated using standard analytical methods. By contrast, the social data is noisy, unstructured and very large, so it requires an additional effort of cleaning, filtering and structuring. We apply two Natural Language Processing algorithms – concept extraction and sentiment analysis – to structure the dataset. The algorithms are built on top of Anacode’s ontology, which classifies all relevant concepts such as ingredients, brands and common food locations; it also contains psychological universals, including emotions and motivations. Figure 2 summarizes the setup of the two methods.
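To make the structuring step concrete, here is a minimal Python sketch of the general pattern: each post is mapped to the ontology categories it mentions and annotated with a polarity score. The toy ontology and the `extract_concepts` and `score_sentiment` functions are hypothetical stand-ins, not Anacode’s actual implementation.

```python
# A rough sketch of the structuring step, with hypothetical stand-ins
# for the proprietary concept-extraction and sentiment models.
from typing import Dict, List

# Toy excerpt of an ontology: category -> surface terms (invented).
ONTOLOGY: Dict[str, List[str]] = {
    "ingredient/noodles": ["noodles", "ramen"],
    "location/restaurant": ["restaurant", "canteen"],
    "emotion/joy": ["delicious", "happy"],
}

def extract_concepts(post: str) -> List[str]:
    """Return ontology categories whose terms occur in the post."""
    post = post.lower()
    return [cat for cat, terms in ONTOLOGY.items()
            if any(term in post for term in terms)]

def score_sentiment(post: str) -> float:
    """Crude keyword polarity in [-1, 1]; a real model goes here."""
    pos, neg = ["delicious", "happy"], ["awful", "dirty"]
    score = sum(w in post.lower() for w in pos) - sum(w in post.lower() for w in neg)
    return max(-1.0, min(1.0, float(score)))

posts = ["Happy ramen night at the canteen!", "The restaurant was dirty."]
records = [{"text": p,
            "concepts": extract_concepts(p),
            "sentiment": score_sentiment(p)} for p in posts]
print(records)
```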

Results and insights

  • Granularity of parameter values

Within the individual parameters, social data allows for a much larger variety. In the survey data, the number of possible responses has to be limited to control survey length and avoid fatigue bias. For example, “Cooking Oil” is one generic ingredient without further variations. By contrast, in social media the number of possible parameter values is virtually infinite, covering any aspect that is deemed relevant by the users. For example, the ontology we use contains 25 variations of cooking oil, such as peanut oil, coconut oil and soybean oil.

  • Completeness of the dataset

The survey allows all parameters of the considered situations to be filled in, whereas social data is inherently incomplete. The following table shows some posts and their analyses:

It can be seen that many variables remain unknown in the social dataset. As a consequence of this sparsity, it is difficult to represent the whole complexity of eating situations. By contrast, survey respondents are required to fill in all parameters, thus producing a matrix without empty cells.
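The contrast can be made concrete in a few lines of pandas. The columns and values below are invented for illustration; only the sparsity pattern matters.

```python
# Illustrative comparison of dataset completeness; data is invented.
import numpy as np
import pandas as pd

survey = pd.DataFrame({              # every cell filled by design
    "food": ["noodles", "rice"],
    "location": ["home", "office"],
    "company": ["family", "colleagues"],
})

social = pd.DataFrame({              # unmentioned parameters stay NaN
    "food": ["hotpot", np.nan],
    "location": [np.nan, "restaurant"],
    "company": ["friends", np.nan],
})

print(survey.isna().mean().mean())   # 0.0 -> complete matrix
print(social.isna().mean().mean())   # 0.5 -> sparse matrix
```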

  • Analysis results

When comparing the results of the two methods, the individual parameters show different distributions. For example, the following chart illustrates the social setting of eating situations:

The distributions are clearly different: for instance, meals with colleagues are frequently mentioned in the survey, whereas the social posts favor friends. We can speculate about the possible causes – for instance, the daily lunch situation with colleagues has a higher probability of being “caught” in the survey sample than the occasional, non-routine get-together with friends. The two types of situation also differ in their salience and emotional engagement.

Remarkably, the distributions become very similar once contextual filters are set as an additional “control” on the social dataset. Once restricted by the location of the meal – i.e., whether it takes place at home or outside – the relative order of the social settings shows a striking resemblance:

How can the two methods be combined efficiently?

Our findings can be summarized as follows:

  • The two methods operate on different cognitive levels: survey data measures awareness, whereas social data addresses salience, relevance and judgment.
  • Appropriate context filters on the social dataset lead to comparable results.
  • Surveys can be used to cover complex, differentiated questions with relatively few possible parameter values; the complexity of questions that can be covered by social data is limited.
  • The “answers” obtained from social data for individual questions and parameters are much more differentiated and exhaustively cover all aspects that actually appear relevant to the user.

How can the two methods be combined beyond a simple comparison? One natural synergy results from applying the social media analysis in a first step to prepare the survey design. Social data can be explored to trace the questions, parameters and values that are relevant to consumers. With the relevant topics covered, the survey can then be used to get structured, clean and complete answers to even complex questions. This approach increases customer centricity and helps reduce subjectivity biases on the part of the researcher. A relevance-driven survey design potentially also reinforces the intrinsic engagement of the respondents and motivates them to provide maximally authentic and truthful answers to the survey questions.

 

About: This study is a joint project with G-I-M. You can download the full presentation here.

 

Author: Adrian De Riz

In China, you are what you drive

It is not unusual for Chinese grooms and brides to be chauffeured in a Bentley, Maybach or Rolls-Royce, while their entourage follows in a uniform fleet of upper-class vehicles. The same holds true for Chinese business executives, who expect and are expected to be driven in higher-class cars. In many aspects of Chinese life, the car reflects a person’s “face”. This cultural importance of cars, together with the growth of the Chinese economy, creates strong demand in the Chinese luxury car market.

The Chinese car market: different and too big to miss

Historically, the Chinese luxury car segment has been served by non-Chinese players from Europe, the US and Japan. Built in and for Western markets, these luxury cars were often not designed with Chinese customers in mind. However, China has been the biggest car market for the past 10 years and will remain so for at least another decade. The Chinese car market has become a crucial battleground that these brands are not willing to give up.

Catering to the needs of Chinese customers means winning the market

For global car brands, product localisation can make the difference between success and failure in China. Audi serves as a great example of a brand that recognized this opportunity and acted on it. In Europe, the majority of executives drive themselves to work. Therefore, the driving experience behind the wheel often dictates the purchasing decision. Chinese executives, on the other hand, are driven to work by their chauffeurs. Aspects of driving the car are secondary to the perceived comfort in the back of the car. The following chart shows the relevance of various interior components in executive cars and sedans, as distilled from discussions in Chinese social media:

The backseat is clearly the most relevant component for executive cars, while front seats are more relevant for sedans. Additionally, maneuvering constraints and parking problems arising from excessive car length are less of a concern than they would be in Western markets.

Understanding these differences in customer needs, Audi focused its product development on the customer experience in the back of the car. In 2005, it introduced its first products designed exclusively for the Chinese market: the Audi A6L and A8L. The two models explicitly target the Chinese executive segment with wheelbases extended by up to 30 cm. This additional length goes into the back of the car, allowing for more leg space and room for movement. Additionally, Audi took the finest materials and accessories, normally found in the front row, and moved them to the back.

The result: a Chinese champion in the executive car segment was born. In the first quarter, A6L sales exceeded those of the base model by 27%. It took competitors half a decade to close this product development gap. The following chart shows sentiment for the A6L’s seats and overall model perception, compared to sentiment for the competing products by BMW and Mercedes-Benz:

The A6L shows the best sentiment, both for the back seat and for the overall perception of the product.

In the end, customer centricity wins

By recognizing the cultural context and tailoring its product accordingly, Audi was able to design a car that perfectly addresses the requirements of the Chinese executive car market. This insight into the relative importance of the front and back seats made Audi the trendsetter. By listening closely to the needs and wants of the local target group, the brand became the #1 choice among Chinese executives and gained a competitive advantage of several years.

Author: Janna Lipenkova

Market research surveys typically consist of two types of questions, namely “closed” and “open” questions. Closed questions limit the possible range of responses and result in structured data. By contrast, “open” questions allow the respondent to reply with free text, expressing their full-fledged, authentic opinion. This flexibility makes open questions attractive in market research: given the right wording, an open question can trigger responses that are broader, deeper and provide more authentic insight than the rigid, black-and-white multiple-choice question. 

Challenges with the analysis of open-ended questions

Why, then, are open-ended questions not widely adopted in market research? One of the reasons is that they are difficult to analyze due to their unstructured character. Most researchers use manual coding, where open questions are manually structured into a classification scheme (the “coding frame”) according to the topics they represent. The following table shows some examples:

Manual coding comes with several issues:

  • High cost: manual coding is labor-intensive and thus expensive.
  • Errors: depending on their mental and physical state, human coders make mistakes or provide inconsistent judgments at different points in time.
  • Subjectivity: due to the inherent ambiguity and the subtleties involved in human language, different people might code the same response in different ways. 

Last but not least, coding hundreds or even thousands of open-ended responses can be a frustrating endeavor. In today’s world, where AI is used to automate and optimize virtually every repetitive task, it seems natural to turn to automated processing to eliminate the monotonous parts of coding. Beyond the several other benefits of automation, this also frees up time for more involved challenges that require the creativity, experience and intellectual versatility of the human brain.

Using Natural Language Processing as a solution

Natural Language Processing (NLP) automates the manual work researchers do when they code open-ended questions. It structures a text according to the discussed topics and concepts as well as other relevant metrics, such as the frequency, relevance and sentiment. Beyond speeding up the coding process, NLP can be used to discover additional insights in the data and enrich the end result. The capacity of a machine to look at a large dataset as a whole and discover associations, regularities and outliers is larger than that of the human brain. 

Three algorithms – topic modeling and classification, concept extraction and sentiment analysis – are particularly useful in the coding process. 

Topic modeling and classification

Topic modeling detects abstract topics in a text. It is an unsupervised learning method similar to clustering: it learns lexical similarities between texts without a predefined set of classes. Thus, it is particularly useful in the initial stage of constructing a coding frame. The following word cloud shows words that are frequently mentioned in texts about comfort:

Topic classification is similar to topic modeling. However, it works with a given coding frame and classifies each text into one of the predefined classes. This means that it can be used for coding once the coding frame has been constructed.
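The following scikit-learn sketch illustrates both steps on a handful of invented responses: unsupervised LDA to explore candidate topics, then a supervised classifier once a small coding frame exists. It is a minimal illustration, not a production setup.

```python
# Toy illustration: topic modeling (unsupervised) vs. topic
# classification (supervised) with scikit-learn.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

responses = ["the seats are very comfortable on long trips",
             "engine is noisy and the ride feels rough",
             "soft leather seats, quiet and cozy cabin",
             "the motor rattles at high speed"]

# 1) Topic modeling: discover lexical clusters without predefined codes.
vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(responses)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
terms = vec.get_feature_names_out()
for topic in lda.components_:
    print([terms[i] for i in topic.argsort()[-4:]])  # top words per topic

# 2) Topic classification: assign texts to an existing coding frame.
codes = ["comfort", "noise", "comfort", "noise"]     # human-coded sample
clf = make_pipeline(CountVectorizer(stop_words="english"),
                    LogisticRegression()).fit(responses, codes)
print(clf.predict(["quiet cabin and comfy seats"]))
```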

Concept extraction

Concept extraction matches concrete strings in the text. Whereas topic modeling and classification work with – often implicit – lexical information distributed everywhere in the text, concept extraction matches the exact words and phrases that occur in the text. On a more advanced level, concept extraction also uses the structure of the lexicon and can deal with lexical relations, such as:

  • Synonymy: EQUALS relationship, e.g. VW EQUALS Volkswagen
  • Hypernymy: IS-A relationship, e.g. Sedan IS-A Vehicle
  • Meronymy: PART-OF relationship, e.g. Engine PART-OF Car

Concept extraction usually focuses on nouns and noun phrases (engine, Volkswagen). In the context of evaluations (open-ended questions), it is also useful to extract concepts that are “hidden” in adjectives (fast ➤ Speed, cozy ➤ Comfort) and verbs (overpay ➤ Price, fail ➤ Reliability).

In terms of implementation, there are two main approaches to concept extraction: the dictionary-based approach and the machine-learning approach. The dictionary-based approach works with predefined lists of terms for each category (also called “gazetteers”). The machine-learning approach, on the other hand, learns concepts of specific types from large quantities of annotated data. As a rule of thumb, the smaller and more specific the available dataset, the more efficient the use of predefined lists of concepts and linguistic expressions.
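A minimal sketch of the dictionary-based approach is shown below. The gazetteer entries, including the hypernym roll-up and the adjective and verb mappings, are invented examples rather than a real resource.

```python
# Toy dictionary-based concept extraction with a hand-crafted gazetteer.
GAZETTEER = {
    # surface form -> (concept, hypernym)
    "peanut oil":  ("PeanutOil", "CookingOil"),
    "coconut oil": ("CoconutOil", "CookingOil"),
    "volkswagen":  ("Volkswagen", "Brand"),
    "vw":          ("Volkswagen", "Brand"),   # synonymy: VW EQUALS Volkswagen
    "fast":        ("Speed", "Performance"),  # adjective -> concept
    "overpay":     ("Price", "Cost"),         # verb -> concept
}

def extract(text: str) -> list:
    """Return the sorted set of concepts whose terms occur in the text."""
    text = text.lower()
    return sorted({concept for term, (concept, _) in GAZETTEER.items()
                   if term in text})

print(extract("My VW is fast but I overpay for peanut oil"))
# -> ['PeanutOil', 'Price', 'Speed', 'Volkswagen']
```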

Sentiment analysis

Sentiment analysis detects whether a given text has a positive or a negative connotation. It can be further detailed to the level of individual aspects mentioned in a text, making it possible to detect mixed sentiments at the phrase level:

“Classy and reliable, but expensive.”

Sentiment analysis operates on an emotional, subjective and often implicit linguistic level. This subtlety raises several challenges for automation. For example, sentiment analysis is highly context-dependent: a vacuum cleaner that sucks would probably get a neutral-to-positive sentiment; by contrast, the internet connection in a car normally shouldn’t “suck”. Another complication is irony and sarcasm: on the lexical level, ironic statements often use vocabulary with a clear polarity orientation. However, when put into the surrounding context, this polarity is inverted:

“Really great engineering… the engine broke after only three weeks!”

Irony is mostly detected from anomalies in the polarity contrasts between neighboring text segments. For instance, in the example above, “really great engineering” gets a strong positive sentiment which radically clashes with the negative feel of “the engine broke after only three weeks”. Since the two phrases are directly juxtaposed without a conjunction such as “but” or “although”, the machine is able to recognize the overall negative connotation. 
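As a toy illustration, the sketch below splits a response on contrastive markers and scores each segment against a tiny polarity lexicon; a strong polarity clash between adjacent segments without such a marker, as in the irony example above, would then surface as a signal worth flagging. The lexicon and splitting rule are deliberately simplistic.

```python
# Minimal lexicon-based phrase-level sentiment; lexicon is illustrative.
import re

POLARITY = {"classy": 1, "reliable": 1, "great": 1,
            "expensive": -1, "broke": -1}

def phrase_sentiments(text: str):
    """Split on contrastive markers and score each segment separately."""
    segments = re.split(r"\bbut\b|\balthough\b|,", text.lower())
    return [(seg.strip(), sum(POLARITY.get(w, 0)
                              for w in re.findall(r"\w+", seg)))
            for seg in segments if seg.strip()]

print(phrase_sentiments("Classy and reliable, but expensive."))
# -> [('classy and reliable', 2), ('expensive.', -1)]
```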

Combining Human and Artificial Intelligence

Summing up, using NLP for the coding of open-ended questions leverages several benefits of automation: it speeds up the process and saves human labor on the “frustrating” side of things. It achieves better consistency and objectivity, mitigating the effects of human factors such as fatigue and inconsistent judgment. Finally, the ability of the machine to process large data quantities at a high level of detail allows a level of granularity that might be inaccessible to the human brain. 

While it is beyond question that NLP automation increases the efficiency of verbatim coding, keep in mind that current AI technology is not perfect and should always have a human in the driving seat. Methods such as NLP can process large quantities of data in no time, but they do not yet capture the full complexity of language. A combination of high-quality NLP with a carefully engineered process for continuous optimization will ensure a rewarding journey towards an in-depth understanding of the opinions, needs and wants of end consumers.

Authors: Xiaoqiao Yu, Daryna Konstantinova, Sonja Anton

Comparing Alessandro Michele and Virgil Abloh

Louis Vuitton and Gucci are quickly climbing the ranks of the most valuable international fashion brands of 2018. We compared the public image in China of Gucci’s design director Alessandro Michele with that of Louis Vuitton’s menswear designer Virgil Abloh:

So what?

As the founder of the successful urban lifestyle brand Off-White, Virgil Abloh is a breath of fresh air in high fashion. His designs are perceived as fresh and original against Louis Vuitton’s luxurious backdrop. Alessandro Michele incorporates Renaissance elements into his designs, giving them a romantic, vintage feel.

Heated discussion online

Author: Janna Lipenkova

In the past years, the tech world has seen a surge of Natural Language Processing (NLP) applications in various areas, including adtech, publishing, customer service and market intelligence. According to Gartner’s hype cycle, NLP reached the peak of inflated expectations in 2018. Many businesses see it as a “go-to” solution to generate value from the 80% of business-relevant data that comes in unstructured form. To put it simply: NLP is widely adopted, with wildly variable success.

In this article, I share some practical advice for the smooth integration of NLP into your tech stack. The advice summarizes the experience I have accumulated on my journey with NLP — through academia, a number of industry projects, and my own company, which develops NLP-driven applications for international market intelligence. The article does not provide technical details but focuses on organisational factors including hiring, communication and expectation management.

Before starting out on NLP, you should meditate on two questions:

1. Is a unique NLP component critical for the core business of our company?

Example: Imagine you are a hosting company. You want to optimise your customer service by analysing incoming customer requests with NLP. Most likely, this enhancement will not be part of your critical path activities. By contrast, a business in targeted advertising should try to make sure it does not fall behind on NLP — this could significantly weaken its competitive position.

2. Do we have the internal competence to develop IP-relevant NLP technology?

Example: You hired and successfully integrated a PhD in Computational Linguistics with the freedom to design new solutions. She will likely be motivated to enrich the IP portfolio of your company. However, if you are hiring mid-level data scientists without a clear focus on language who need to split their time between data science and engineering tasks, don’t expect a unique IP contribution. Most likely, they will fall back on ready-made algorithms for lack of time and mastery of the underlying details.

Hint 1: if your answers are “yes” and “no” — you are in trouble! You’d better identify technological differentiators that do match your core competence.

Hint 2: if your answers are “yes” and “yes” — stop reading and get to work. Your NLP roadmap should already be defined by your specialists to achieve the business-specific objectives.

If you are still here, don’t worry – the rest will soon fall into place. There are three levels at which you can “do NLP”:

  1. Black belt level, reaching deep into mathematical and linguistic subtleties
  2. Training & tuning level, mostly plugging in existing NLP/ML libraries
  3. Blackbox level, relying on “buying” third-party NLP

The black belt level

Let’s elaborate: the first, fundamental level is our “black belt”.  This level comes close to computational linguistics, the academic counterpart of NLP. The folks here often split into two camps — the mathematicians and the linguists. The camps might well befriend each other, but the mindsets and the way of doing things will still differ.

The math guys are not afraid of things like matrix calculus and will thrive on the details of the newest optimisation and evaluation methods. At the risk of leaving out linguistic details, they will generally take the lead on improving the recall of your algorithms. The linguists were raised either on highly complex generative or constraint-based grammar formalisms, or on alternative frameworks such as cognitive grammar. These give more room to imagination but also allow for formal vagueness. They will gravitate towards writing syntactic and semantic rules and compiling lexica, often needing their own sandbox and taking care of the precision part. Depending on how you handle communication and integration between the two camps, their collaboration can either block productivity or open up exciting opportunities.

In general, if you can inject a dose of pragmatism into the academic perfectionism you can create a unique competitive advantage. If you can efficiently combine mathematicians and linguists on your team — even better! But be aware that you have to sell them on an honest vision — and then, follow through. Doing hard fundamental work without seeing its impact on the business would be a frustrating and demotivating experience for your team.

The training & tuning level

The second level involves the training and tuning of models using existing algorithms. In practice, most of the time will be spent on data preparation, training data creation and feature engineering. The core tasks — training and tuning — do not require much effort. At this level, your people will be data scientists pushing the boundaries of open-source packages, such as nltk, scikit-learn, spacy and tensorflow, for NLP and/or machine learning. They will invent new and not always academically justified ways of extending training data, engineering features and applying their intuition for surface-side tweaking. The goal is to train well-understood algorithms such as NER, categorisation and sentiment analysis, customized to the specific data at your company.
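To make this level concrete, here is a minimal scikit-learn sketch of the kind of training work described above. The four example texts and labels are placeholders for a company’s annotated data.

```python
# Typical "level 2" work: training a text classifier on your own labels.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

texts  = ["love this car", "terrible service", "great engine", "awful seats"]
labels = ["pos", "neg", "pos", "neg"]   # placeholder annotated data

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.5, random_state=0)

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression())
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```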

The good thing here is that there are plenty of great open-source packages out there. Most of them will still leave you with enough flexibility to optimize for your specific use case. The risk is on the HR side — many roads lead to data science. Data scientists are often self-taught and have a rather interdisciplinary background. Thus, they will not always have the innate academic rigour of level 1 scientists. As deadlines or budgets tighten, your team might cut corners on training and evaluation methods, thus accumulating significant technical debt.

The blackbox level

The third level is the “blackbox” level, where you buy NLP. Your developers will mostly consume paid APIs that provide standard algorithm outputs out of the box, such as Rosette, Semantria and Bitext (cf. this post for an extensive review of existing APIs). Ideally, your data scientists will work alongside business analysts or subject-matter experts. For example, if you are doing competitive intelligence, your business analysts will be the ones to design a model which contains your competitors, their technologies and products.
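In code, this level often boils down to a thin HTTP client around the vendor’s endpoint. The URL, authentication scheme and response shape below are hypothetical placeholders; the real contract comes from the vendor’s documentation.

```python
# Hedged sketch of consuming a third-party NLP API; all endpoint details
# are placeholders, not a real vendor's interface.
import requests

API_URL = "https://api.example-nlp-vendor.com/v1/sentiment"  # placeholder
API_KEY = "YOUR_API_KEY"                                     # placeholder

def analyze(text: str) -> dict:
    resp = requests.post(API_URL,
                         headers={"Authorization": f"Bearer {API_KEY}"},
                         json={"text": text},
                         timeout=10)
    resp.raise_for_status()
    return resp.json()  # e.g. {"label": "positive", "score": 0.93}

# Example call (works only against a real endpoint):
# print(analyze("The new model handles beautifully."))
```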

At the blackbox level, make sure you buy NLP only from black belts! With this secured, one of the obvious advantages of outsourcing NLP is that you avoid the risk of diluting your technological focus. The risk is a lack of flexibility — with time, your requirements will get more and more specific, and the risk grows that the API will stop satisfying them. It is also advisable to invest in manual quality assurance to make sure the API outputs deliver high quality.

Final Thoughts

So, where do you start? Of course, it depends — some practical advice:

  • Talk to your tech folks about your business objectives. Let them research and prototype and start out on level 2 or 3.
  • Make sure your team doesn’t get stuck in low-level details of level 1 too early. This might lead to significant slips in time and budget since a huge amount of knowledge and training is required.
  • Don’t hesitate — you can always consider a transition between levels 2 and 3 further down the path (by the way, this works in both directions). The transition can be efficiently combined with the generally unavoidable refactoring of your system.
  • If you manage to build up a compelling business case with NLP — welcome to the club, you can use it to attract first-class specialists and add to your uniqueness by working on level 1!

About the author: Janna Lipenkova holds a PhD in Computational Linguistics and is the CEO of Anacode, a provider of tech-based solutions for international market intelligence. Find out more about our solution here

We are excited to start our cooperation with GIM Gesellschaft für innovative Marktforschung mbH, which gives us the opportunity to benefit from the long-standing expertise of GIM in the area of market research. Together, we are going to work on innovative approaches to consumer insight and produce the best blend of “traditional” and technology-based methods. Read more…