Uncovering Disruptive Forces to Stocks using NLP

Highlighting the power of natural language processing

September 2021. Reading Time: 10 Minutes. Authors: Wachi Bandara, PhD, and Rodolfo Martell, PhD

This research note is a guest post from Wachi Bandara, PhD, Chief Investment Officer, and Rodolfo Martell, PhD, Head of Portfolio Strategy, of Pluribus Labs LLC, a San Francisco-based systematic active equity manager.


  • Traditional equity factors are not useful for identifying disruption in public equities
  • Natural language processing (NLP) can be applied to uncovering the forces of disruption
  • This holds especially when NLP is applied to unstructured data from private markets


Every investor has wished at some point, whether they admit to it or not, to have access to a crystal ball that could foresee the future. For all Tolkien fans out there, a palantír. For all Marvel fans, the Eye of Agamotto.

FactorResearch normally publishes work on factors (duh!) because we can for the most part agree on their importance as drivers of excess returns, in large part due to the growing body of published, unpublished, and self-published work in this field of finance. Factors have been used to explain the past, and through somewhat heroic assumptions, to extrapolate into the future (try Finominal’s Know Your Factors for a factor exposure analysis).

The problem is, of course, that the latter use assumes the future will look like the past (which invites decades-old debates about how efficient markets are). This creates a catch-22: either we assume that factors are good at forecasting the future (which implies strong intellectual hubris), or we assume factors are not well equipped to forecast the future (in which case we accept that making future investment decisions based on factors is not a good idea).

So, if traditional factors are neither efficient nor useful for understanding the forces that are driving disruption in public equities, what can we do?

Let’s break this problem into smaller parts. The first step towards uncovering the forces shaping change (disruptive or continuous) in public equities is to measure innovation.


In quantitative asset management, innovation has until recently been the Moby Dick of firm characteristics: that elusive item everyone knows is important, but no one could reliably model in a way that captured its desired properties, one of which is the well-established connection between innovation and future stock returns. It has been difficult to capture firms’ exposure to innovation consistently, both over time and across the cross-section, for several reasons:

  1. The first place where investors look is Research and Development (R&D). This is a bad proxy because, among other reasons, it is a discretionary item that is treated as a short-term cost and is difficult to amortize.
  2. There is a strong paradox when it comes to a firm’s decision to disclose the details of its R&D: on one hand, companies want to let investors know about it, but on the other hand, they don’t want to give up a competitive advantage by disclosing too much.
  3. Sell-side analysts face career concerns that make them averse to making calls based on innovation, given the uncertain long-term nature and low hit rate of these calls.
  4. Public equity markets provide a very poor backdrop to measure innovation given the agency issues associated with the financing of innovative ideas and concepts.

What to do then? One solution is to look at alternative markets where successful innovation is identified and understand how it is rewarded. The challenge is that we need more tools than we have traditionally used in factor research. Specifically, we need to use natural language processing.


Applications built on natural language processing (NLP) are present in almost every aspect of our lives, from Google searches to Netflix recommendations. These techniques, when applied to all publicly available information on alternative markets, allow us to create a model of innovation based on the support (financing), attention (news), and protection (patent activity) surrounding funding events in the non-public space. The result is the identification of the innovative concepts embedded in swaths of unstructured data.
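To make the three ingredients concrete, here is a minimal sketch of how support, attention, and protection could be blended into a single score per concept. It is not the Pluribus Labs model: the counts are invented toy numbers, the equal weighting is an assumption, and the names `SIGNALS`, `zscores`, and `innovation_scores` are hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical toy counts per concept, invented for illustration only:
# (funding rounds = support, news mentions = attention, patent filings = protection)
SIGNALS = {
    "cryptocurrency": (40, 900, 12),
    "hydrogen fuel": (15, 200, 30),
    "DeFi": (5, 50, 8),
}

def zscores(values):
    """Standardize a list of numbers; return zeros if there is no variation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def innovation_scores(signals):
    """Equal-weighted average of the z-scored signals (an assumed weighting)."""
    names = list(signals)
    # Transpose so that each column holds one signal across all concepts.
    columns = list(zip(*(signals[name] for name in names)))
    standardized = [zscores(list(column)) for column in columns]
    return {
        name: mean(column[i] for column in standardized)
        for i, name in enumerate(names)
    }

scores = innovation_scores(SIGNALS)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Standardizing each signal before averaging keeps a high-volume signal like news mentions from swamping the others; a real model would presumably weight and time-decay the signals rather than average them equally.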

The chart below shows a very small (emphasis on small) section of the data, where the cream and yellow-colored diamonds represent concepts like “cryptocurrency” or “hydrogen fuel”, the circles represent conventional industries like “banks” or “diversified metals”, and the gray lines represent the connectivity between concepts and industries. For instance, notice how “bitcoin” is connected to “gold” – readers who are into crypto will immediately recognize this as the disruptive threat that bitcoin represents to gold as a store of value.

NLP Example

Source: Pluribus Labs LLC
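For readers who want a feel for how a concept-to-industry network like the one above can emerge from raw text, the sketch below links a concept to an industry whenever both appear in the same document. This is a deliberately crude stand-in, not the method behind the chart: the three sentences are made up, the term lists are hand-picked, and matching is by simple substring search.

```python
from collections import Counter
from itertools import product

# Hand-picked term lists (assumptions for this toy example).
CONCEPTS = {"bitcoin", "hydrogen fuel"}
INDUSTRIES = {"gold", "banks", "utilities"}

# Made-up sentences standing in for news, funding, and patent text.
DOCS = [
    "Investors debate whether bitcoin can replace gold as a store of value.",
    "A startup raised funding to supply hydrogen fuel to utilities.",
    "Some banks now offer custody for bitcoin alongside gold.",
]

def mentions(text, terms):
    """Return the terms that occur in the text (case-insensitive substring match)."""
    lowered = text.lower()
    return {term for term in terms if term in lowered}

def build_edges(docs):
    """Count, per (concept, industry) pair, the documents mentioning both."""
    edges = Counter()
    for doc in docs:
        pairs = product(mentions(doc, CONCEPTS), mentions(doc, INDUSTRIES))
        for concept, industry in pairs:
            edges[(concept, industry)] += 1
    return edges

edges = build_edges(DOCS)
for (concept, industry), weight in edges.most_common():
    print(f"{concept} -- {industry}: {weight}")
```

In this toy corpus the bitcoin and gold pair gets the heaviest edge because it co-occurs in two of the three sentences. Co-occurrence alone cannot tell a threat from a partnership, which is one reason a production system would need far richer language models than this.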


More broadly, our analysis of innovation indicates that in the medium term some industries likely to experience disruptive innovation are precious metals, regional banks, airlines, agrochemicals, and utilities. It is important to remember that disruption is a process, not a single event.

Because disruption takes time, incumbents frequently overlook potential disruptors and their transformational impact. Most importantly, some disruptive innovations succeed while others don’t. With that in mind, let’s look at the top concepts likely to disrupt two industries: consumer financial services and metals and mining.

Concepts Likely To Disrupt Consumer Financial Services

Source: Pluribus Labs LLC

The most important disruptive trend in consumer financial services is the one represented by the nascent adoption of DeFi (decentralized finance) technologies. Growth is already steep and could still accelerate in the near term. Sell-side institutions have reported that DeFi has generated $2.0 trillion in total investment interest, as of April 2021, and that this figure has doubled in the first third of 2021 alone.

In an environment where political tensions run high and economic uncertainty abounds, there is obvious appeal to a technology that is global, permissionless, flexible, transparent, and interoperable. Some of the most promising DeFi projects are native lending tokens, which allow lenders to passively farm income while borrowers get access to attractively priced capital to use in numerous traditional capacities.

Concepts Likely To Disrupt Metals & Mining

Source: Pluribus Labs LLC

Looking at metals and mining, it is interesting that the two most disruptive concepts are Impact Investing and Venture Capital. Within metals, steel is one of the most integral components of modern civilization, serving as the skeleton for buildings, roads, railways, and other components of contemporary infrastructure.

On the surface, many would assume that there is little technological innovation impacting steel production. However, “smart” plants and other environmentally friendly innovations are displacing and replacing traditional production environments with highly automated, digitalization-enabled facilities that unlock economic and ecological efficiencies throughout the production process.

Climate-driven innovation is aiming directly at steel production, given that it is a massive source of pollution, generating 7% to 9% of all direct emissions from fossil fuels globally. We should expect to hear more about projects aiming to create fossil-free processes in the steel industry soon (read ESG vs Low Carbon Investing).

Technological innovation is also enhancing the viability of recycling in the industry. Steel production generates high volumes of waste materials like dust, fines, and mill scale. Innovations in by-product recycling are allowing proactive producers to convert these residual materials into useful and profitable resources.


These are just glimpses of the power and potential of natural language processing when used to answer a well-defined question. Potential extensions and applications of this methodology create exciting opportunities for asset managers and allocators alike (read Quant Strategies: Theory vs Reality).




Wachi Bandara is the Chief Investment Officer of Pluribus Labs LLC. Prior to Pluribus Labs, Wachi was the co-founder and Chief Research Officer of the original Pluribus Labs entity, a data science and machine learning start-up that was acquired by Golden Gate Capital in 2017. Previously, he was a Senior Quantitative Researcher at Mellon Capital, focused on fixed income and equity risk modeling. He holds a PhD and an MS in finance from The George Washington University, an MS in applied mathematics from the Florida Institute of Technology, and a BS in pure mathematics from the University of Colombo, Sri Lanka. Wachi also holds the Financial Risk Manager certification from the Global Association of Risk Professionals.



Rodolfo Martell is the Head of Portfolio Strategy at Pluribus Labs LLC. He previously worked at AQR as a Managing Director in the Global Stock Selection Group. Before that, he was employed by QMA, a PGIM company, as a global strategist and co-chair of the ESG Committee. He started his investment career at BlackRock, where he ultimately worked as a senior portfolio manager in the Scientific Active Equities Group. He has been a lecturer at UC Berkeley and an Assistant Professor at Purdue University. He holds a PhD in finance from Ohio State University.
