Artificial intelligence will inevitably transform legal practice, but it will take time to realise its full potential. Enormous benefits, and dangerous pitfalls, await the tentative legal user. And careful regulation is also required, writes prominent in-house lawyer and legal transformation adviser, Sharyn Ch’ang
In a world where art imitates life, Hollywood movies often envision a future where artificially intelligent computers and robots seamlessly integrate into society, blurring the line between human and machine.
From the sentient HAL 9000 computer in Arthur C Clarke’s 1968 novel 2001: A Space Odyssey to Pixar’s lovable animated robot WALL-E, these cinematic representations reflect our collective fascination with the possibilities of artificial intelligence (AI).
Twenty-first century real-world advancements in AI are already used in consumer and business applications, even those from the relatively new field of generative AI.
OpenAI’s publicly available ChatGPT broke records with more than 100 million users in the first two months of its November 2022 release. Industry heavyweights like Amazon, Alibaba, Baidu, Google, Meta, Microsoft and Tencent have their own generative AI products.
We’re witnessing a democratisation of AI in a manner not previously seen – you don’t need a degree in any subject to use these tools. You just need to type a question.
So, ready or not, generative AI is here. It is impacting our daily lives and will reshape various industries, including the legal profession.
If you need any convincing, just listen to today’s big tech gurus like Google CEO Sundar Pichai, who described AI generally as “the most profound technology humanity is working on. More profound than fire, electricity or anything that we have done in the past,” in an interview with The Verge in May. Pichai is not alone.
There are different types of generative AI models, each quite complex to explain. The generative pre-trained transformer (GPT) model is commonly known due to the popularity of ChatGPT.
However, as ChatGPT is a general-use AI tool, this legal industry-focused article instead references Harvey: a generative AI platform built on OpenAI’s GPT-4, and the best example of a multifunctional, generative AI platform designed specifically for lawyers.
Similar legal industry generative AI tools include CoCounsel, LawDroid Copilot and Lexis+AI, and there are other tools with more limited functionality.
Brief history of AI
The foundations of AI can be traced back to the 1940s and computing pioneers like Alan Turing. Turing is credited with early visions of modern computing, including the hypothetical Turing machine, a theoretical device capable of simulating any computer algorithm. In 1950, Turing’s seminal article titled Computing Machinery and Intelligence described how to create intelligent machines and, in particular, how to test their intelligence.
This Turing test is still considered a benchmark for identifying the intelligence of an artificial system: if a human is interacting with another human and a machine, and is unable to distinguish the machine from the human, then the machine is said to be intelligent.
“Artificial intelligence” was coined as a term in 1956, at the Rockefeller Foundation-funded Dartmouth Summer Research Project on Artificial Intelligence workshop hosted by computer scientists John McCarthy and Marvin Minsky. McCarthy and Minsky, together with computer scientist Nathaniel Rochester (who designed the IBM 701, the first commercial scientific computer) and mathematician Claude Shannon (who founded information theory), are considered the four founding fathers of AI.
The 1960s saw advancements in machine learning and expert systems, and substantial US and UK government funding for AI research. Outputs like Arthur Samuel’s checkers playing program were among the world’s first successful self-learning programs and a very early demonstration of AI fundamentals.
Expert systems were a prominent area of research for the following two decades, culminating in the likes of IBM’s Deep Blue chess-playing program, which beat world chess champion Garry Kasparov in 1997. However, expert systems are arguably not true AI because they require a formal set of rules, constructed in a top-down approach as a series of “if-then” statements.
The 1980s and 1990s are often referred to as the AI winter, with high expectations by early AI researchers not reflecting the actual capabilities of the AI models being developed. Nevertheless, this period brought significant developments in subfields like neural networks – important because generative AI models use neural networks and deep learning algorithms to identify patterns and generate new outcomes.
AI research saw a resurgence from the turn of the millennium, driven by advancements in computing power and the availability of vast amounts of data. In 2011, IBM Watson, a computer system capable of answering questions posed in natural language, captured popular attention by defeating two human champions on the quiz show, Jeopardy!
IBM’s broader goal for Watson was to create a new generation of deep learning technology that could find answers in unstructured data more effectively than standard search technology. Deep learning, a subset of machine learning focused on artificial neural networks with multiple layers, is also a key technology underpinning generative AI.
AI has boomed since 2011, spurred by breakthroughs in computing power, algorithmic advancement and more systematic access to massive volumes of data. AI applications have flourished across various domains including natural language processing, robotics and autonomous vehicles.
AI continues to evolve rapidly, with ongoing research in explainable AI, ethics and the development of generative AI systems that are more transparent, accurate and accountable. While there is immense potential for further innovation, there are also serious industry concerns about the unexpected acceleration in AI systems development and its direction.
This is best exemplified by comments from computer scientist Geoffrey Hinton. Hinton, widely recognised as the “godfather of AI”, quit his long-time AI role at Google so he could speak freely without his views reflecting on the company.
His apocalyptic warning is that while machines are not yet as intelligent as humans, within perhaps five to 20 years they may become super intelligent and pose a threat to humanity.
His overarching concern is the lack of global and national legal and regulatory guardrails to control the misuse and abuse of AI, especially from bad actors proliferating deep fakes, harmful content, and mis- and disinformation. He believes it’s unrealistic to halt or pause AI innovation, so governments worldwide must take control of how AI can be used.
Generative AI is a subfield of AI where the system or algorithmic model is trained, with human help, to produce original content like text, images, videos, audio, voice and software code.
One form of generative AI is a large language model (LLM), a neural network trained on vast amounts of text drawn from the internet and other sources. It generates outputs in response to inputs (prompts), based on inferences over the statistical patterns it has learned through training.
ChatGPT and OpenAI’s GPT-4, on which Harvey is built, are popular examples of LLMs. The model can process prompts at a speed, volume and accuracy that exceed average human capability.
Unlike conventional AI systems that primarily classify or predict data (think Google search), generative models learn the patterns and structure of the input training data, then generate new content that has similar characteristics to the training data. However, responses are based on inferences about language patterns rather than what is known to be true or arithmetically correct.
Put another way, LLMs have two central abilities:
- Taking a question and working out what patterns need to be matched to answer the question from a vast sea of data; and
- Taking a vast sea of data and reversing the pattern-matching process to become a pattern-creation process.
Both functions are statistical, so there is some chance the engine will not correctly understand a given question. There is a separate probability that its response will be fictitious: a hallucination.
Given the scale of data on which the LLM has been trained, and the fine-tuning it receives, it can seem it knows a lot. However, the reality is, it is not truly “intelligent”. It is only processing patterns to produce coherent and contextually relevant text. There is no thinking or reasoning.
The accuracy and relevance of an output will often depend on the prompt engineering: how the question is asked and contextualised. A prompt might include:
- Particular instructions, like asking the LLM to adopt a particular role, or requesting a particular course of action if it does not know the answer;
- Context like demonstration examples of answers;
- Instructions about the response format; and
- An actual question.
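The four components above can be illustrated with a short sketch. This is purely illustrative: the role, example answer and question are hypothetical, and in practice a user of a tool like ChatGPT or Harvey would simply type such a prompt as ordinary text.

```python
# Illustrative sketch of a prompt assembled from the four components
# described above. All content here is hypothetical.

# 1. Particular instructions: a role, plus what to do when unsure.
instructions = (
    "You are an experienced commercial contracts lawyer. "
    "If you do not know the answer, say so rather than guessing."
)

# 2. Context: a demonstration example of the kind of answer expected.
context = (
    "Example of the style expected:\n"
    "Q: What is a force majeure clause?\n"
    "A: A clause excusing performance when extraordinary events beyond "
    "the parties' control prevent it."
)

# 3. Instructions about the response format.
response_format = "Answer in no more than three bullet points."

# 4. The actual question.
question = "What is an indemnity clause and why does it matter?"

# Combine the components into a single prompt, separated by blank lines.
prompt = "\n\n".join([instructions, context, response_format, question])
print(prompt)
```

In practice, each component simply narrows the statistical patterns the model draws on, which is why a well-structured prompt tends to produce more accurate and relevant output than a bare question.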
Even with well-crafted prompts, answers can be wrong, biased and include completely fictitious information and data, with sometimes harmful or offensive content.
However, the potential benefits of generative AI easily outweigh such shortcomings, which major AI market makers like OpenAI are actively working to address.
Who and what is Harvey?
Harvey, the generative AI technology built on OpenAI’s GPT-4, is a multi-tool software-as-a-service AI platform that is specifically designed to assist lawyers in their day-to-day work.
It comes from a start-up company founded by two roommates – Winston Weinberg, a former securities lawyer and antitrust litigator from O’Melveny & Myers, and Gabriel Pereyra, previously a research scientist at DeepMind, Google Brain (one of Google’s AI groups) and Meta AI.
The story goes that Pereyra showed Weinberg OpenAI’s GPT-3 text-generating system and Weinberg quickly realised its potential to improve legal process workflows.
In late 2022, Harvey emerged, with USD5 million in funding led by OpenAI’s startup fund. In April 2023, Harvey successfully secured USD21 million series A funding led by Sequoia, with participation again from OpenAI, together with Conviction, SV Angel, Elad Gil and Mixer Labs.
Typical of generative AI models, Harvey is an LLM that uses natural language processing, machine learning and data analytics to automate and enhance legal work, from research to analysing and producing legal documents like contracts. Sequoia states: “Legal work is the ultimate text-in, text-out business – a bull’s-eye for language models.”
As at May 2023, only two organisations were licensed to use Harvey: PwC, with exclusive access among the Big Four, and Allen & Overy, the first law firm user. More than 15,000 law firms were on the waiting list.
Harvey is similar to ChatGPT but with more functions, specifically for lawyers. Like ChatGPT, users simply type instructions about the task they wish to accomplish and Harvey generates a text-based result. The prompt’s degree of detail is user-defined.