To appreciate the edge that artificial intelligence can bring to the financial markets, it’s worth understanding how fast the technological landscape has changed for investors. That change has been propelled by research incorporating advanced techniques from AI, particularly from a handful of subfields that have played a crucial role.

One is machine learning, which involves training algorithms to learn patterns and make predictions from data. The term dates back to 1959, but the area of study began to receive far more attention starting in the early 2000s, as computational power increased and the internet generated troves of data on which to train ML models.

In the past five years, researchers have embraced ML to solve finance problems. In 2020, Booth PhD student Shihao Gu, Yale’s Bryan T. Kelly, and Booth’s Dacheng Xiu compared a diverse set of ML models at predicting stock returns, evaluating them on efficiency and accuracy. The best performers were trees and neural networks, statistical methods modeled on decisions and outcomes, and on the human brain, respectively. The paper has been widely cited in research, racking up more than 1,800 citations so far.
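
To make that kind of horse race concrete, here is a minimal sketch, not the paper’s actual setup, that pits a tree ensemble against a small neural network on simulated return data using the scikit-learn library. The predictors, the return-generating rule, and the model settings are all illustrative assumptions.

```python
# A toy comparison (not the paper's setup): a tree ensemble vs. a small
# neural network predicting simulated "returns" from simulated predictors.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 10))  # hypothetical firm characteristics
y = 0.1 * X[:, 0] - 0.05 * X[:, 1] ** 2 + rng.normal(scale=1.0, size=5000)  # noisy "returns"

X_train, X_test, y_train, y_test = X[:4000], X[4000:], y[:4000], y[4000:]

for name, model in [
    ("random forest", RandomForestRegressor(n_estimators=200, random_state=0)),
    ("neural net", MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=1000, random_state=0)),
]:
    model.fit(X_train, y_train)
    print(name, "out-of-sample R^2:", round(r2_score(y_test, model.predict(X_test)), 4))
```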

That same year, City University of Hong Kong’s Guanhao Feng, Yale’s Stefano Giglio, and Booth’s Xiu created an ML method to evaluate factors and identify those most relevant for asset prices. In 2021, Booth’s Stefan Nagel published a book, Machine Learning in Asset Pricing, to explain how ML tools, which were not originally developed for finance, could be applied to empirical research in pricing and theoretical modeling of financial markets. Researchers have since used ML to predict prices and construct portfolios, among other tasks.

Meanwhile, finance research has progressed in the subfield of natural language processing, an area in which ML techniques are turned on language itself to mine information from text. Early adopters of language tools included Shanghai Jiao Tong University’s Feng Li (a graduate of Booth’s PhD program), who in 2008 studied the relationship between the readability of 10-K filings and corporate performance. He found that companies with longer and more difficult-to-read reports tended to have poorer earnings.
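
As an illustration, here is a minimal sketch of the Gunning fog index, a standard readability measure in this literature: roughly, longer sentences and more multisyllable words mean a higher score and a harder read. The syllable counter below is a crude vowel-group heuristic, and the sample sentence is invented.

```python
# A minimal sketch of the Gunning fog index: 0.4 * (average sentence length
# + percentage of "complex" words, i.e., words with three or more syllables).
import re

def count_syllables(word: str) -> int:
    # Approximate syllables as runs of vowels (a rough but common heuristic).
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fog_index(text: str) -> float:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    complex_words = [w for w in words if count_syllables(w) >= 3]
    avg_sentence_len = len(words) / len(sentences)
    pct_complex = 100 * len(complex_words) / len(words)
    return 0.4 * (avg_sentence_len + pct_complex)

print(fog_index("We recognized revenue. Amortization of capitalized costs increased materially."))
```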

[Sidebar: On the level. AI can analyze the complexity of written material, which research has found to be meaningful information to investors.]

In separate but related work, University of Notre Dame’s Tim Loughran and Bill McDonald explored sentiment analysis of 10-Ks, finding in a 2011 paper that the existing dictionary of words used to determine the sentiment of a text was not well suited to the financial domain. For example, words such as liability, cost, and tax were scored as negative for sentiment using the traditional dictionary, but these words are not necessarily negative when used in a financial context. Loughran and McDonald in turn created a dictionary tailored to finance.
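
The mechanics of dictionary-based scoring are simple to sketch. The handful of words below is an invented stand-in; the actual Loughran-McDonald lists, which are publicly available, contain thousands of terms.

```python
# A toy dictionary-based tone score. The mini word lists are illustrative
# stand-ins, not the actual Loughran-McDonald lists.
import re

FIN_NEGATIVE = {"loss", "impairment", "litigation", "restated", "weak"}
FIN_POSITIVE = {"strong", "improved", "exceeded", "record"}

def tone(text: str) -> float:
    words = re.findall(r"[a-z']+", text.lower())
    neg = sum(w in FIN_NEGATIVE for w in words)
    pos = sum(w in FIN_POSITIVE for w in words)
    return (pos - neg) / max(1, len(words))  # net tone per word

print(tone("Despite litigation costs, segment revenue improved and margins were strong."))
```

Note that a generic sentiment dictionary would also flag words such as liability, cost, and tax as negative; a finance-tailored list does not.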

Other researchers have developed new techniques for analyzing textual data. Boston University’s Tarek Alexander Hassan, Tilburg University’s Stephan Hollander, Frankfurt School of Finance and Management’s Laurence van Lent, and London Business School’s Ahmed Tahoun (then a research scholar at Booth) published research in 2019 that used a simple algorithm for assessing political risk in earnings call transcripts. It counted bigrams (two-word combinations such as “the constitution” or “public opinion”) that appeared in conjunction with the words “risk” and “uncertainty,” or their synonyms, to identify potential risks to companies. The higher the count, the greater the political risk for the company, the research finds. Subsequent papers resulted in a startup, NL Analytics, that works with central banks and international organizations to use these methods for economic surveillance.
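
A toy version of the counting idea is sketched below: tally political bigrams that appear within a short window of a risk word, scaled by transcript length. The word lists and window size here are made up for illustration; the actual study derived its bigram list from training text rather than hand-picking it.

```python
# Count political bigrams that occur near a risk word, per word of transcript.
# POLITICAL_BIGRAMS, RISK_WORDS, and WINDOW are illustrative assumptions.
import re

POLITICAL_BIGRAMS = {("the", "constitution"), ("public", "opinion"), ("trade", "policy")}
RISK_WORDS = {"risk", "risks", "uncertainty", "uncertain"}
WINDOW = 10  # look this many words on either side of each bigram

def political_risk_score(transcript: str) -> float:
    words = re.findall(r"[a-z']+", transcript.lower())
    hits = 0
    for i in range(len(words) - 1):
        if (words[i], words[i + 1]) in POLITICAL_BIGRAMS:
            context = words[max(0, i - WINDOW): i + 2 + WINDOW]
            if RISK_WORDS & set(context):
                hits += 1
    return hits / max(1, len(words))  # scale by transcript length

print(political_risk_score(
    "Shifts in trade policy create uncertainty for our supply chain next year."
))
```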

The jumps that led to deeper understanding

Finance and accounting have long sought to learn from text. Economists have, too, and originally used a “bag-of-words” model. This relies on counting word frequency in a text—for example, how many times does a document include the words capital and spending? In this case, the more frequently these words occur, the more likely it is that the document discusses corporate policies.
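
In code, the counting is as simple as it sounds. A minimal sketch, with an invented sentence standing in for a document:

```python
# A minimal bag-of-words count: how often do "capital" and "spending" appear?
import re
from collections import Counter

doc = "The board approved higher capital spending; capital projects will expand."
counts = Counter(re.findall(r"[a-z]+", doc.lower()))
print(counts["capital"], counts["spending"])  # 2 1
```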

This method is straightforward: in 1963, the late Frederick Mosteller and the late David L. Wallace used it to argue that James Madison, not Alexander Hamilton, had written the 12 essays, out of the Federalist Papers’ 85, whose authorship had been in dispute. By counting commonly used words in Madison’s and Hamilton’s known texts, they could compare those frequencies with the counts of the same words in the disputed essays.

The method is also limited, however. It doesn’t take into account potentially important information such as grammar or the order in which words appear. As a result, it’s unable to capture much of a document’s context. A company’s 10-K filing might report that “Increased transportation costs have offset our revenue gains,” and a bag-of-words model may read this as a positive statement: after all, the word increased and the phrase revenue gains might seem upbeat on their own. But the model misses that increased paired with costs is negative and that offset reverses the meaning of revenue gains.

Researchers at Google took a big step toward incorporating this context in 2013 when the company introduced word2vec, a neural network–based model that learns vector representations of words and captures the semantic relationships between them. Vectorization enabled ML models to process and understand text in a more meaningful way. Given three related words, such as man, king, and woman, word2vec can find the word that completes the analogy, queen, through simple arithmetic on the vectors assigned to each word: the vector for king, minus man, plus woman, lands near the vector for queen.
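
That arithmetic can be reproduced with the gensim library and a pretrained set of word2vec vectors. The sketch below loads Google’s publicly released News vectors (a large download); any pretrained word-vector set works similarly.

```python
# Reproduce the king - man + woman analogy with pretrained word2vec vectors.
import gensim.downloader as api

vectors = api.load("word2vec-google-news-300")  # pretrained Google News vectors

# Vector arithmetic: add "king" and "woman", subtract "man".
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
# Expected top match: "queen"
```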

And in a 2017 paper, a team of researchers led by Ashish Vaswani, who was then at Google Brain, introduced what practitioners of deep learning know as the transformer architecture. Transformers form the basis of the large language models we know today and represent a significant improvement over previous architectures in their ability to understand and generate human language, something word-level models such as word2vec could not do.
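
At the heart of the transformer is scaled dot-product attention, which lets every word in a sequence weigh every other word when building its representation. A minimal numpy sketch of the operation from the 2017 paper, with made-up embeddings:

```python
# Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.
import numpy as np

def attention(Q, K, V):
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # how much each token attends to every other token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over tokens
    return weights @ V                              # weighted mix of value vectors

rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(4, 8))  # 4 tokens, 8-dimensional embeddings
print(attention(Q, K, V).shape)       # (4, 8)
```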

One prominent LLM, BERT (bidirectional encoder representations from transformers), is used to understand the context of words but was not designed to generate text. It works by considering the words that appear before and after a particular word to decipher its meaning.
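
A small sketch with the Hugging Face transformers library shows this in action: a fill-mask pipeline asks BERT to restore a hidden word using the context on both sides of it. The model weights download on first use, and the exact suggestions will depend on the model.

```python
# BERT fills in a masked word using context on both sides of the blank.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
for guess in fill("Higher transportation costs [MASK] our revenue gains.")[:3]:
    print(guess["token_str"], round(guess["score"], 3))
```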

Meanwhile, GPT (generative pretrained transformer) is able to predict the most likely next word in a sequence based on the text leading up to it. For example, finish this sentence: “Why did the chicken cross the _____?” Your brain automatically fills in the blank with the word road as the most probable next word, even though many other words would work here, including street, highway, or maybe even yard. GPT does the same thing. Its parameters can be set, however, so that it doesn’t always choose the highest-probability word. This allows more creativity in the text it generates.
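
That knob is commonly called temperature. A small sketch shows how it reshapes the probabilities before a word is drawn; the candidate words and scores below are invented for illustration.

```python
# Temperature sampling over hypothetical next-word scores for
# "Why did the chicken cross the ___?". Lower temperature -> safer picks.
import numpy as np

words = ["road", "street", "highway", "yard"]
logits = np.array([4.0, 2.5, 2.0, 0.5])  # made-up model scores
rng = np.random.default_rng(0)

def sample(temperature: float) -> str:
    probs = np.exp(logits / temperature)
    probs /= probs.sum()                  # softmax with a temperature knob
    return rng.choice(words, p=probs)

print(sample(0.2))  # near-greedy: almost always "road"
print(sample(1.5))  # flatter distribution: other words become plausible
```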

Now these LLMs, too, are being applied to finance, enabling researchers and practitioners in the field to extract increasingly valuable insights from data of all kinds.
