In our work, we have the pleasure of meeting many entrepreneurs who want to “do something with NLP”, and organizations that want to include NLP in their digital stack. Indeed, NLP is cool and trendy – especially since it deals with language, a natural and powerful capacity of our brains. But when it comes to business, using NLP for the sake of NLP is not a good idea – it is a tool and should be implemented with a specific use case in mind. In the right contexts, NLP can increase productivity or enhance existing knowledge. And, when married with specific expertise and intuition from the actual business domain, it can activate your creative muscles and trigger disruptive ideas and completely new ways of doing things. The following chart shows the variety of business functions that are affected by NLP, based on our analysis of related online news and discussions:
In the following, we explain three overarching goals for implementing NLP inside a business and share our experience about priorities and common pitfalls for each category.
Streamline existing tasks
You can save costs with NLP by increasing the productivity of your work with existing text data. NLP can be used to streamline routines that involve texts. This scenario is not only easy to understand, but also easy to measure in terms of ROI – you simply count the cost in man-hours used for a specific task “before” and “after” the implementation of NLP.
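The “before” and “after” comparison can be sketched as a small calculation. This is a minimal illustration, assuming a flat hourly cost and a one-off implementation cost; the function name and all figures are hypothetical and not taken from a real project.

```python
def nlp_roi(hours_before, hours_after, hourly_cost, implementation_cost):
    """Return (savings, roi) for one accounting period.

    savings: cost of the man-hours no longer needed for the task
    roi:     net gain relative to the implementation cost
    """
    saved_hours = hours_before - hours_after
    savings = saved_hours * hourly_cost
    roi = (savings - implementation_cost) / implementation_cost
    return savings, roi


# Hypothetical figures: 1,200 hours per year before NLP, 700 hours after.
savings, roi = nlp_roi(
    hours_before=1200,
    hours_after=700,
    hourly_cost=40.0,
    implementation_cost=15000.0,
)
print(f"Savings: {savings:.0f}, ROI: {roi:.0%}")
```

In practice, the “after” figure should also include the time employees spend verifying and correcting NLP output.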
A prominent and familiar example is customer support. Tasks in customer support often revolve around a small number of variables such as products and processes. These are perfectly known inside the business, but may not be familiar to external customers. On the receiving side, NLP can be applied to analyse and classify standard requests such as calls and complaint emails. Responses can be automated with conversational AI, as implemented in chatbots and virtual assistants. Algorithms such as sentiment analysis make it possible to evaluate large quantities of customer feedback, thus giving the business the opportunity to react and win on customer centricity.
There are many other examples for NLP automation, such as the use of machine translation software and intelligent document analysis. The nice thing about most of these applications is that training data is already available in the organization – the challenge is to set up a stable, sustainable supply of clean data and a feedback loop via your employees. The actual NLP implementation can be based on standard algorithms, such as sentiment analysis and entity recognition, customized with business-specific knowledge. This customization is often lexical. For example, to tailor an entity extraction algorithm to parse product failure complaints, it will be useful to “teach” the entity recognizer which products and features are likely to fail. Finally, in production mode, human verification will still be needed for those cases where the NLP algorithms are not confident about their output.
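To make the idea of lexical customization and human verification more concrete, here is a minimal sketch. It assumes a simple dictionary lookup instead of a trained entity recognizer, and a classifier that returns a confidence score per prediction; the lexicon entries, function names and the threshold of 0.8 are hypothetical.

```python
import re

# Hypothetical business lexicon: products and features that are likely to fail.
LEXICON = {
    "router x200": "PRODUCT",
    "smart plug": "PRODUCT",
    "power supply": "FEATURE",
    "wifi": "FEATURE",
    "firmware": "FEATURE",
}


def extract_entities(text):
    """Return (surface form, label, start offset) for each lexicon term in text."""
    lowered = text.lower()
    found = []
    for term, label in LEXICON.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, lowered):
            surface = text[match.start():match.end()]
            found.append((surface, label, match.start()))
    return sorted(found, key=lambda item: item[2])


def route_predictions(predictions, threshold=0.8):
    """Split (label, confidence) pairs into automated and human-review queues."""
    automated, review = [], []
    for label, confidence in predictions:
        target = automated if confidence >= threshold else review
        target.append((label, confidence))
    return automated, review


complaint = "The WiFi on my Router X200 drops after the last firmware update."
entities = extract_entities(complaint)
automated, review = route_predictions([("complaint", 0.95), ("refund", 0.55)])
```

A production system would typically feed such a lexicon into a statistical entity recognizer rather than rely on exact matching, but the division of labour stays the same: the business supplies the vocabulary, and low-confidence cases go to a person.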
Support your decisions with better information
This second area makes it possible to enhance existing data analytics use cases with the power of text data. Most of us have heard that 80% of business-relevant data exists in unstructured form. However, with NLP just entering the “live” business arena, mainstream analytical techniques still focus on statistical and quantitative analysis of structured data – i.e., the “other” 20%. By adding unstructured data sources to the equation, a business can improve the quality and granularity of the generated results. Ultimately, NLP generates a unique information advantage and paves the way to better decisions.
An example where this scenario applies is the screening of target companies for M&A transactions. The “traditional” target screening process is highly structured and quantitative. It focusses on basic factors such as location and area of the business, formal criteria (legal form, shareholder structure) and, of course, financial indicators such as revenue and profitability. Many of the less tangible, but central aspects of a business – for example, its intellectual property, public image and the quality of the team – don’t surface in the result and have to be manually investigated on the basis of additional data sources. NLP makes it possible to leverage a variety of text data that contains information about a company – social media, business news, patents etc. – and to efficiently extract this information for a large number of companies.
NLP can enhance decision making in all areas of market intelligence, such as trend monitoring, consumer insight and competitive intelligence. In general, use cases in this category require a more involved layer of business logic. While NLP is used to structure large quantities of data, additional knowledge of the business and its context has to be applied to make sense of this data. The M&A screening example first requires a definition of the goals of an M&A transaction and, from there, the relevant parameters: if the goal is to expand B2C sales into a different geography, the perception of the target company by consumers is crucial. On the other hand, the acquisition of a complementary technology will direct the focus on the intellectual assets of the target. The relevant parameters for each goal have to be formulated by the business-side team and can then be coded into a machine-usable form.
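One way to picture the “machine-usable form” is a mapping from each transaction goal to weighted screening parameters. The sketch below is only an illustration: the goal names, parameters and weights are hypothetical, and the signal values are assumed to be scores between 0 and 1 produced by upstream NLP analysis.

```python
# Hypothetical screening profiles: one set of weighted parameters per M&A goal.
SCREENING_PROFILES = {
    "expand_b2c_sales": {
        "consumer_perception": 0.6,
        "brand_awareness": 0.3,
        "intellectual_property": 0.1,
    },
    "acquire_technology": {
        "intellectual_property": 0.7,
        "team_quality": 0.2,
        "consumer_perception": 0.1,
    },
}


def score_target(goal, signals):
    """Weighted sum of the signals that matter for the given goal."""
    weights = SCREENING_PROFILES[goal]
    return sum(weight * signals.get(name, 0.0) for name, weight in weights.items())


# Hypothetical NLP-derived signals for one target company (0 = weak, 1 = strong).
signals = {
    "consumer_perception": 0.8,
    "brand_awareness": 0.5,
    "intellectual_property": 0.2,
    "team_quality": 0.9,
}
b2c_score = score_target("expand_b2c_sales", signals)
tech_score = score_target("acquire_technology", signals)
```

The same company scores differently depending on the goal, which is why the business-side team has to define the profiles before any text is analysed.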
Conquering the greenfields
So far, we have seen relatively defensive approaches to improving what is already being done. But NLP can also trigger bold new “ways of doing things” and lead to high-impact applications that might justify whole new businesses. This journey requires the right equipment – not only solid domain knowledge, but also market expertise and the ability to find sweet spots at the intersection of technology and market opportunity.
As an example, NLP can be applied in the mental health area to analyze the mental and emotional state of a person. This can be used to identify endangered individuals, such as individuals suffering from severe depression and suicide risk. Traditionally, these individuals are identified and treated upon a proactive doctor visit. Naturally, the more “passive” cases are rarely recognized in time. NLP techniques such as sentiment and emotion analysis can be applied on social media to screen the mental and emotional states of users, thus pointing out individuals that are in a high-risk state for further support.
Further examples of disruptive use cases can be found in various industries, such as drug discovery in healthcare and automatic broadcasting in media. Venturing into this space requires high confidence and competence in one’s own industry. As everywhere else, disruptions are often pioneered by start-ups whose flexibility and innovation focus give rise to fruitful intersections between business model and technology. However, with the right amount of technological competence and a technology-driven mindset, incumbents can also thrive in these areas, massively capitalizing on existing assets such as data, market expertise and customer relations.
In the end – things don’t come for free. NLP has the potential to save costs, improve decision making and disrupt any language-intensive area. To get on this path, businesses need crystal-clear formulations of their goals and use cases and the willingness to customize out-of-the-box NLP offerings to their specific knowledge base. Those who get it right will not only reap the benefits of specific projects down the road, but also uncover new use cases and strategic potentials for NLP throughout the whole organization.
Earlier this year, Gartner published its new hype cycle for Artificial Intelligence [1]. Business adoption of AI is experiencing significant growth: according to Gartner’s 2019 CIO Agenda survey, the share of organizations that have deployed artificial intelligence (AI) grew from 4% to 14% between 2018 and 2019 [2]. However, until now, only two AI technologies – GPU Accelerators and Speech Recognition – have reached the plateau of productivity. The great majority is situated in the first half of the cycle, with a lot of excitement and experimentation, but also disillusionment and steep learning curves to be expected in the years to come.
To dive into the details of AI adoption, we mined a large-scale dataset of Web data for industries and use cases which are affected by each trend. The following overview shows the amount of discussion for the top 15 trends in our dataset in 2019 (excluding umbrella terms such as Machine Learning and Deep Learning):
Figure 1: Amount of discussion per technology
As differentiated in this chart, our analysis integrates three types of Web resources with different degrees of specialization on AI:
- General news are business and economic news without an explicit specialization, such as businessinsider.com and independent.co.uk.
- Technology blogs are focused on technological and digital topics, both from the technology and the business perspective. Examples are techcrunch.com and theverge.com.
- Blogs with AI focus are specialized AI and machine learning resources for practitioners, such as machinelearningmastery.com and aitrends.com.
As can be seen from the chart, the bulk of discussion goes on in technology blogs and blogs with AI focus. Topics with significant ethical, infrastructural and regulatory stakes, such as Autonomous Driving, Quantum Computing and Artificial General Intelligence, also attract considerable attention in general news.
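The breakdown by resource type can be sketched as a simple counting step. This is a simplified illustration, assuming that each mention has already been reduced to a (domain, trend) pair; the function name and the sample mentions are hypothetical, and only the six example domains named above are mapped.

```python
from collections import Counter

# Resource types for the example domains named in the text.
SOURCE_TYPES = {
    "businessinsider.com": "general news",
    "independent.co.uk": "general news",
    "techcrunch.com": "technology blog",
    "theverge.com": "technology blog",
    "machinelearningmastery.com": "AI blog",
    "aitrends.com": "AI blog",
}


def count_discussion(mentions):
    """Count mentions per (trend, source type); unknown domains are skipped."""
    counts = Counter()
    for domain, trend in mentions:
        source_type = SOURCE_TYPES.get(domain)
        if source_type is not None:
            counts[(trend, source_type)] += 1
    return counts


# Hypothetical sample of extracted mentions.
mentions = [
    ("techcrunch.com", "Quantum Computing"),
    ("businessinsider.com", "Quantum Computing"),
    ("aitrends.com", "AutoML"),
    ("theverge.com", "Quantum Computing"),
    ("example.org", "AutoML"),
]
counts = count_discussion(mentions)
```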
It is worth noting that the considered trends belong to different conceptual classes and levels of AI. Some of them, for example Computer Vision and Natural Language Processing, are whole subdomains of AI that can be relevant to a multitude of use cases and industries. Others, like AutoML, simplify and scale the integration of AI in the enterprise context. Autonomous Vehicles, Chatbots, Virtual Assistants and Conversational UIs are situated at the application level. Finally, concepts such as Augmented Intelligence relate to the human-machine interface and are mainly meant to spur the user acceptance of new AI technologies.
Artificial Intelligence across industries
The following chart shows how tightly the considered trends are associated with various B2B and B2C industries:
Figure 2: AI trends by industry
Let’s look at some highlights:
- Computer Vision is highly relevant in Automotive since it is an important component of Autonomous and Assisted Driving technology.
- Autonomous Vehicles are tightly associated with Automotive – not really a surprise. In our analysis, Autonomous Vehicles are not necessarily cars – they also include Aerospace devices such as drones and space robots. Finally, the Construction industry is developing highly specialized autonomous utility vehicles. Operating on a well-delimited, well-understood terrain, the adoption of these narrow-focus vehicles faces lower hurdles in terms of infrastructure, ethics and regulation.
- Quantum Computing is gaining attention as traditional computers approach their limits in terms of processing power. Quantum Computing is widely researched in the domain of Aerospace, where it is used to address the complex challenges of flight physics. Its high relevance to the Energy industry is due less to its possible applications than to the fact that it is itself a solution to a major anticipated energy problem: by 2040, the energy needs of classical computing will exceed the capacity that can be delivered by the worldwide grid [3]. Alternatives such as the quantum computer are meant to prevent this bottleneck.
- Robotic Process Automation (RPA) is performed by virtual “software” robots as opposed to physical robots which are used in manufacturing etc. RPA is essential in the Finance industry, where it enhances and automates routine tasks such as verification, credit scoring and fraud detection.
- Conversational UIs are especially relevant to B2C industries such as Entertainment, Fashion and Retail. Their distribution in our data roughly correlates with Chatbots, Virtual Assistants and Speech Recognition, a core technology behind voice interfaces. Most Big Data companies are pushing their own conversational solutions such as Amazon Alexa, Google Assistant etc. According to Gartner, the conversational interface is one of the technologies with the biggest transformational impact in a short-term perspective.
Artificial Intelligence across use cases
The following chart illustrates the relevance of the considered trends to industry-independent use cases and business functions:
Figure 3: AI trends by use case
Let’s consider some of the data points in detail:
- As most of us have experienced, AI experiments are particularly popular in the domain of Customer Service. Especially in B2C, most customer requests can be roughly classified into a finite number of buckets, thus providing a fertile ground for training data creation and automation. Most customer interactions use language, and all trends related to language processing – NLP, Chatbots, Speech Recognition, Virtual Assistants and Conversational UIs – are highly salient in this domain. To a smaller degree, the relevance of these technologies is also visible for Customer Acquisition and Marketing in general.
- Quantum Computing is strongly associated with Manufacturing. According to IBM, Manufacturing is one of the most promising early beneficiaries of this technology. Quantum Computing can significantly scale up and optimize tasks such as chemical discovery, simulations for product development and supply chain optimization [4].
- Augmented Intelligence is strongly present across most of the use cases. It should be kept in mind that Augmented Intelligence is a rather abstract concept, mainly used to communicate and also reassure that in the foreseeable future, AI will be “enhancing” (rather than substituting) the intelligence of human beings. The concept clearly demonstrates and helps to build awareness of the limits of AI on the application level. On the opposite end, Artificial General Intelligence – the intelligence of machines that can learn to perform any intellectual tasks formerly performed by humans – shows a loose association to most of the considered use cases.
A great majority of the considered trends aim to increase the efficiency of existing tasks by reproducing fundamental functions of the human brain such as language and vision. They are necessary building blocks of the overarching vision behind AI, as reflected in the concept of Artificial General Intelligence. For the present moment, the popularity of Augmented Intelligence shows that AI has passed a “reality check” and that deployment is smoothed by more realistic expectations about the cooperation between humans and machines. Finally, Quantum Computing is an active area of research which could make it possible to harness the combined potential of ever-growing data quantities and sophisticated algorithms, thus allowing for a “quantum leap” towards the general application of AI.
[1] Gartner (2019). Hype Cycle for Artificial Intelligence, 2019. Retrieved from https://www.gartner.com/en/documents/3953603/hype-cycle-for-artificial-intelligence-2019.
[2] Gartner (2018). Gartner Survey of More Than 3,000 CIOs Reveals That Enterprises Are Entering the Third Era of IT. Retrieved from https://www.gartner.com/en/newsroom/press-releases/2018-10-16-gartner-survey-of-more-than-3000-cios-reveals-that-enterprises-are-entering-the-third-era-of-it.
[3] SIA / SRA (2015). Rebooting the IT Revolution: A Call to Action. Retrieved from https://www.semiconductors.org/wp-content/uploads/2018/06/RITR-WEB-version-FINAL.pdf.
[4] IBM (2019). Exploring quantum computing use cases for manufacturing. Retrieved from https://www.ibm.com/thought-leadership/institute-business-value/report/quantum-manufacturing.
When it comes to the adoption of new technologies, the construction industry is on the conservative end of the spectrum. However, in the past years, buzzwords such as 3D printing, Augmented Reality and Big Data have also penetrated architecture and construction. Technologies that were perceived as toys or entertainment a few years ago now show serious global potential to disrupt construction and, thus, our urban landscape.
From inflated expectations to implementation
The following chart displays the amount of online discussion for four key technological trends in construction since the beginning of 2016:
All curves show relatively high values until mid-2017, which is reminiscent of the phase of inflated expectations in Gartner’s Hype Cycle [1]. This enthusiasm is followed by an overall slump around the end of 2017. Starting in 2018, all trends show a growing tendency and are frequently discussed in the context of implementation and production use. Drones have the most pronounced curve, which might be an indicator of their huge transformative potential.
The analysis is based on a range of major acknowledged English-language Web blogs and portals on construction and architecture. The following chart lists the resources and shows the average number of daily articles analysed for each resource:
Mention numbers are normalised by the quantity of data available for each time span. The analysis is conducted with Concept Extraction, an algorithm of Natural Language Processing. Anacode’s Concept Extraction uses a self-learning ontology which is updated daily from a continuous stream of new Web data.
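The normalisation step can be sketched as follows. The exact formula is not specified here, so this example assumes the simplest variant: mentions per 1,000 analysed articles in each time span. The function name and the sample counts are hypothetical.

```python
def normalise_mentions(mention_counts, article_counts, per=1000):
    """Return mentions per `per` articles for each time span.

    Time spans without any analysed articles are reported as 0.0.
    """
    normalised = {}
    for period, mentions in mention_counts.items():
        articles = article_counts.get(period, 0)
        normalised[period] = per * mentions / articles if articles else 0.0
    return normalised


# Hypothetical counts: more raw mentions in 2018, but far more data as well.
mention_counts = {"2016-Q1": 30, "2018-Q1": 45}
article_counts = {"2016-Q1": 1500, "2018-Q1": 4500}
normalised = normalise_mentions(mention_counts, article_counts)
print(normalised)
```

Without this step, a growing data supply alone would make every trend look as if it were on the rise.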
[1] Jackie Fenn, Marcus Blosch. Understanding Gartner’s Hype Cycles. Gartner, 2018.
In the past years, the tech world has seen a surge of Natural Language Processing (NLP) applications in various areas, including adtech, publishing, customer service and market intelligence. According to Gartner’s hype cycle, NLP reached the peak of inflated expectations in 2018. Many businesses see it as a “go-to” solution to generate value from the 80% of business-relevant data that comes in unstructured form. To put it simply – NLP is widely adopted, with wildly variable success.
In this article, I share some practical advice for the smooth integration of NLP into your tech stack. The advice summarizes the experience I have accumulated on my journey with NLP — through academia, a number of industry projects, and my own company which develops NLP-driven applications for international market intelligence. The article does not provide technical details but focusses on organisational factors including hiring, communication and expectation management.
Before starting out on NLP, you should meditate on two questions:
1. Is a unique NLP component critical for the core business of our company?
Example: Imagine you are a hosting company. You want to optimise your customer service by analysing incoming customer requests with NLP. Most likely, this enhancement will not be part of your critical path activities. By contrast, a business in targeted advertising should try to make sure it does not fall behind on NLP — this could significantly weaken its competitive position.
2. Do we have the internal competence to develop IP-relevant NLP technology?
Example: You hired and successfully integrated a PhD in Computational Linguistics with the freedom to design new solutions. She will likely be motivated to enrich the IP portfolio of your company. However, if you are hiring middle-level data scientists without a clear focus on language that need to split their time between data science and engineering tasks, don’t expect a unique IP contribution. Most likely, they will fall back on ready-made algorithms due to lack of time and mastery of the underlying details.
Hint 1: if your answers are “yes” and “no” — you are in trouble! You’d better identify technological differentiators that do match your core competence.
Hint 2: if your answers are “yes” and “yes”, stop reading and get to work. Your NLP roadmap should already be defined by your specialists to achieve the business-specific objectives.
If you are still here, don’t worry – the rest will soon fall into place. There are three levels at which you can “do NLP”:
- Black belt level, reaching deep into mathematical and linguistic subtleties
- Training & tuning level, mostly plugging in existing NLP/ML libraries
- Blackbox level, relying on “buying” third-party NLP
The black belt level
Let’s elaborate: the first, fundamental level is our “black belt”. This level comes close to computational linguistics, the academic counterpart of NLP. The folks here often split into two camps — the mathematicians and the linguists. The camps might well befriend each other, but the mindsets and the way of doing things will still differ.
The math guys are not afraid of things like matrix calculus and will thrive on the details of the newest methods of optimisation and evaluation. At the risk of leaving out linguistic details, they will generally take the lead on improving the recall of your algorithms. The linguists were raised either on highly complex generative or constraint-based grammar formalisms, or alternative frameworks such as cognitive grammar. These give more room to imagination but also allow for formal vagueness. They will gravitate towards writing syntactic and semantic rules and compiling lexica, often needing their own sandbox and taking care of the precision part. Depending on how you handle communication and integration between the two camps, their collaboration can either block productivity or open up exciting opportunities.
In general, if you can inject a dose of pragmatism into the academic perfectionism you can create a unique competitive advantage. If you can efficiently combine mathematicians and linguists on your team — even better! But be aware that you have to sell them on an honest vision — and then, follow through. Doing hard fundamental work without seeing its impact on the business would be a frustrating and demotivating experience for your team.
The training & tuning level
The second level involves the training and tuning of models using existing algorithms. In practice, most of the time will be spent on data preparation, training data creation and feature engineering. The core tasks — training and tuning — do not require much effort. At this level, your people will be data scientists pushing the boundaries of open-source packages, such as nltk, scikit-learn, spacy and tensorflow, for NLP and/or machine learning. They will invent new and not always academically justified ways of extending training data, engineering features and applying their intuition for surface-side tweaking. The goal is to train well-understood algorithms such as NER, categorisation and sentiment analysis, customized to the specific data at your company.
The good thing here is that there are plenty of great open-source packages out there. Most of them will still leave you with enough flexibility to optimize them to your specific use case. The risk is on the side of HR — many roads lead to data science. Data scientists are often self-taught and have a rather interdisciplinary background. Thus, they will not always have the innate academic rigour of level 1 scientists. As deadlines or budgets tighten, your team might get loose on training and evaluation methods, thus accumulating significant technical debt.
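One inexpensive safeguard against loose evaluation is to score every model change against a fixed, held-out set of gold annotations. The sketch below computes precision, recall and F1 for extracted entities; the function name and the sample annotations are hypothetical, and an entity counts as correct only if document and text match exactly.

```python
def precision_recall_f1(gold, predicted):
    """Compare two collections of (document id, entity) pairs."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical held-out annotations and model output.
gold = [("doc1", "Acme"), ("doc1", "Berlin"), ("doc2", "Globex"), ("doc3", "Initech")]
predicted = [("doc1", "Acme"), ("doc2", "Globex"), ("doc2", "Paris")]

precision, recall, f1 = precision_recall_f1(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Keeping the held-out set untouched during training and tuning is what makes these numbers comparable from one release to the next.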
The blackbox level
On the third level is a “blackbox” where you buy NLP. Your developers will mostly consume paid APIs that provide the standard algorithm outputs out-of-the-box, such as Rosette, Semantria and Bitext (cf. this post for an extensive review of existing APIs). Ideally, your data scientists will be working alongside business analysts or subject matter experts. For example, if you are doing competitive intelligence, your business analysts will be the ones to design a model which contains your competitors, their technologies and products.
At the blackbox level, make sure you buy NLP only from black belts! With this secured, one of the obvious advantages of outsourcing NLP is that you avoid the risk of diluting your technological focus. The risk is a lack of flexibility: with time, your requirements will get more and more specific. The better your integration policy, the higher the risk that your API will stop satisfying your requirements. It is also advisable to invest in manual quality assurance to make sure the API outputs deliver high quality.
So, where do you start? Of course, it depends — some practical advice:
- Talk to your tech folks about your business objectives. Let them research and prototype and start out on level 2 or 3.
- Make sure your team doesn’t get stuck in low-level details of level 1 too early. This might lead to significant slips in time and budget since a huge amount of knowledge and training is required.
- Don’t hesitate — you can always consider a transition between 2 and 3 further down the path (by the way, this works in any direction). The transition can be efficiently combined with the generally unavoidable refactoring of your system.
- If you manage to build up a compelling business case with NLP — welcome to the club, you can use it to attract first-class specialists and add to your uniqueness by working on level 1!
About the author: Janna Lipenkova holds a PhD in Computational Linguistics and is the CEO of Anacode, a provider of tech-based solutions for international market intelligence.
This report presents a snapshot on AI in Chinese social media Dec 2018 – Jan 2019, focussing on related technologies, use cases and startups.
This report presents consumer feedback on connected car brands and features, with a special focus on Audi, BMW, Tesla, Geely, Chery and BYD.
With over 860,000 new-energy vehicles (NEVs) sold in the first 10 months of 2018, China is currently at the forefront of electrification. Made in China 2025, China’s strategic plan tracing the energy transition and the internal development into a tech superpower, includes a significant increase of the EV proportion by 2025. The government is generously incentivizing producers and consumers to reach this goal.
Taking a qualitative perspective, how does supply match demand in this highly regulated segment? In this article, we analyze the main players in the industry and shed light on awareness, acceptance and confidence on the side of real-world consumers. The provided data was collected from Chinese social media in 2018 and analysed using Anacode’s text analytics technology.[2,3]
A vivid playing field for automotive producers
Both Chinese and international OEMs are motivated to compete for market share and pioneering technology in the EV race. The following chart shows the frequently mentioned players along with their sentiment:
As expected for an industry with a strong vision and a favourable funding environment, startups were fast to pick up on the NEV wave. In terms of media attention and awareness, these dynamic lightweights compete on a par with the OEM incumbents:
Sophisticated PR strategies, fancy concept cars and huge funding rounds generate a lot of buzz around startups. However, when it comes to actual products on the market, the discussion is dominated by NIO along with a range of OEM-produced models:
The ambivalent perception of Chinese consumers
Putting aside the famous Chinese entrepreneurial spirit, where are down-to-earth consumers on their journey of acceptance for the new technology and its long-term benefits? Are they willing to serve as a test bed for technological experiments, pay higher prices and buy into – even temporary – trade-offs in terms of quality and convenience? And, most importantly, do they actually trust the technology, or do they sense another bubble coming? To dig into these topics, we created and mined a comparative dataset of random samples of equal size (50k posts each) relating to NEVs and internal combustion engine (ICE) vehicles. The following chart depicts the general image of NEVs and ICE vehicles:
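Drawing two random samples of equal size can be sketched as below. This is a simplified illustration, assuming that the posts for each vehicle type are already available as lists; the function name, the fixed seed and the toy data are hypothetical, and the real samples contained 50k posts each.

```python
import random


def comparable_samples(nev_posts, ice_posts, sample_size, seed=42):
    """Draw one random sample of `sample_size` posts from each collection."""
    if sample_size > min(len(nev_posts), len(ice_posts)):
        raise ValueError("sample_size exceeds the smaller collection")
    rng = random.Random(seed)  # fixed seed keeps the samples reproducible
    return rng.sample(nev_posts, sample_size), rng.sample(ice_posts, sample_size)


# Toy data standing in for the collected social media posts.
nev_posts = [f"nev post {i}" for i in range(1000)]
ice_posts = [f"ice post {i}" for i in range(5000)]

nev_sample, ice_sample = comparable_samples(nev_posts, ice_posts, sample_size=200)
```

Equal sample sizes make it possible to compare discussion quantities directly, without correcting for the fact that one topic is discussed more often overall.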
Product quality is the main concern for NEVs, as opposed to ICE vehicles where design is more prominent. In terms of sentiment, NEVs score lower on central aspects such as quality, design and price. These trade-offs can still be acceptable if there is high awareness of the long-term environmental benefits of NEVs. The following chart shows the discussion quantities and sentiments for environmental aspects in the comparative dataset:
Clearly, environment topics are more relevant to the NEV discussion. The opinions are not always optimistic and, more often than not, critical towards the domestic providers:
你说讽刺不讽刺，宣传“节能”的玩具车，还能呼叫“污染”的燃油车过来给它充电这是传说中的 #蔚来# 产品的移动燃油车充电宝吗？
Isn’t it funny that, in order to push their NEV toys, NIO offers a charging service where a non-electric car comes by to charge your “environment-friendly” NEV?
I will not buy a domestic NEV. The two options I consider are a Toyota PHEV or a petrol car. Domestic OEMs jumped on the NEV train since they failed to produce high-quality gasoline engines and didn’t really have a choice. The actual benefits of NEVs for the environment are currently far below expectation. They are just cheating on subsidies and consumers to move the money around.
Finally, consumer trust is also undermined on the financial level – excessive subsidies, subsidy fraud and the “burning” of large funding amounts are common topics in the discussions:
Domestic OEMs are not able to develop high-quality internal combustion engines and transmissions, so they had to switch to electric cars. But after many subsidies, consumers realized that the top technologies for NEV batteries, engines and electronic controls are still not from China.
This country has no future for NEVs. The policy has failed – it has subsidized a bunch of so-called environment-friendly NEVs that will have no market after three years.
10 billion RMB of funding is still not enough for these manufacturers! Can the NEV startup Xpeng win the battle against NIO?
China has set highly ambitious goals for the energy transition and its internal technological development which are highly stimulating for players in the automotive industry. However, to create a sustainable business environment, consumer trust and acceptance have to match up to these ambitions. Once government subsidies decrease and gradually turn into “soft”, non-financial incentives, industry players should be prepared to assume responsibility for product-market fit and convince their customers based on reputation, quality and long-term trust and loyalty.
[1] CAAM (2018). 2018年10月汽车工业经济运行情况. Retrieved from http://www.caam.org.cn/xiehuidongtai/20181109/1505220056.html
[2] Weibo data 2018 on e-mobility topic. Retrieved from https://www.weibo.com
[3] Anacode GmbH (2018). Anacode MarketMiner: Web-based Text Analytics for International Market Intelligence. Retrieved from http://anacode.de/wordpress/wp-content/uploads/2017/11/Anacode_Technology_Whitepaper_v1.pdf