Large language models (LLMs) are machine learning models that can comprehend and generate human language text. They work by analyzing massive data sets of language.
A large language model (LLM) is a type of artificial intelligence (AI) program that can recognize and generate text, among other tasks. LLMs are trained on huge sets of data — hence the name "large." LLMs are built on machine learning: specifically, a type of neural network called a transformer model.
In simpler terms, an LLM is simply a computer program that's been fed enough examples so it can recognize and interpret human language, or other types of complex data. Many LLMs are trained on data that's been gathered from the Internet: thousands or even millions of gigabytes of text. But the quality of the samples impacts how well LLMs will learn natural language, so an LLM's programmers may use a more curated data set.
LLMs use a type of machine learning called deep learning to understand how characters, words, and sentences function together. Deep learning involves the probabilistic analysis of unstructured data, which eventually enables the deep learning model to recognize distinctions between pieces of content without human intervention.
LLMs are then further trained via tuning: they are fine-tuned or prompt-tuned to the particular task that the programmer wants them to do, such as interpreting questions and generating responses, or translating text from one language to another.
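As a rough illustration of what fine-tuning involves, here is a minimal sketch using the open source Hugging Face transformers library. The library choice, the small "gpt2" base model, and the training file name are assumptions made for illustration, not part of any particular vendor's workflow:

```python
# Minimal fine-tuning sketch: adapt a small pretrained language model to a
# custom task. "gpt2" and "my_task_data.txt" are illustrative placeholders.
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)
from datasets import load_dataset

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # gpt2 defines no padding token
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Hypothetical training file: one example of the target task per line
dataset = load_dataset("text", data_files={"train": "my_task_data.txt"})
tokenized = dataset["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned-model",
                           num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # nudges the pretrained weights toward the new task
```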
What are LLMs used for?
A number of tasks can be trained into LLMs. One of the most well-known applications is their use as generative AI: given a prompt or asked a question, they can produce text in reply. The publicly available LLM ChatGPT, for example, can generate essays, poems, and other textual forms in response to user inputs.
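For a hands-on sense of text generation, a small sketch using the Hugging Face transformers library is shown below. The tiny GPT-2 model stands in for a production LLM here; ChatGPT itself is accessed through OpenAI's service, not this library:

```python
# Minimal text-generation sketch: a small open model (GPT-2) standing in
# for a production LLM. Assumes `pip install transformers` plus a backend
# such as PyTorch.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("Write a short poem about the ocean:",
                   max_new_tokens=50, num_return_sequences=1)
print(result[0]["generated_text"])
```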
Any large, complex data set can be used to train LLMs, including programming languages. Some LLMs can help programmers write code: they can write functions on request, or, given some code as a starting point, they can finish writing a program. LLMs may also be used in areas like the following (a brief sentiment analysis sketch appears after the list):
Sentiment analysis
DNA research
Customer service
Chatbots
Online search
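Taking the first item above as an example, sentiment analysis can be sketched in a few lines with the same transformers library. The pipeline below downloads a small default classifier, which stands in here for an LLM-based approach:

```python
# Minimal sentiment analysis sketch: a small pretrained classifier from
# the transformers library, standing in for an LLM-based approach.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")  # downloads a default model
reviews = [
    "The support team resolved my issue in minutes. Fantastic!",
    "I waited two weeks and never got a reply.",
]
for review, verdict in zip(reviews, classifier(reviews)):
    print(f"{verdict['label']:>8} ({verdict['score']:.2f}): {review}")
```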
Examples of real-world LLMs include ChatGPT (from OpenAI), Bard (Google), Llama (Meta), and Bing Chat (Microsoft). GitHub's Copilot is another example, though for coding rather than natural human language.
What are some advantages and limitations of LLMs?
One of the defining features of LLMs is their ability to respond to unpredictable queries. A traditional computer program receives commands in its accepted syntax, or from a defined set of user inputs. A video game has a finite set of buttons, an application has a finite set of things a user can click or type, and a programming language is composed of precise if/then statements.
On the other hand, an LLM can respond to natural human language and use data analysis to answer an unstructured question or prompt in a way that makes sense. While a typical computer program would not recognize a prompt like "What are the four greatest funk bands in history?", an LLM might reply with a list of four such bands, and a reasonably cogent defense of why they are the best.
As far as information goes, however, LLMs can only be as reliable as the data they ingest. If fed false information, they will give false information in response to user queries. LLMs also sometimes "hallucinate": they fabricate information when they are unable to produce an accurate answer. For example, in 2022 the news outlet Fast Company asked ChatGPT about Tesla's previous financial quarter; while ChatGPT responded with a coherent news article, much of the information within was invented.
In terms of security, user-facing applications based on LLMs are as prone to bugs as any other application. LLMs can also be manipulated via malicious inputs to provide certain types of responses over others, including responses that are dangerous or unethical. Finally, one of the security concerns with LLMs is that users may upload secure, confidential data into them in order to increase their own productivity. But LLMs use the inputs they receive to further train their models, and they are not designed to be secure vaults; they may expose confidential data in response to queries from other users.
How do LLMs work?
Machine learning and deep learning
In short, LLMs are built on machine learning. Machine learning is a subset of AI, and it refers to the practice of feeding a program vast amounts of data in order to train the program to identify features of that data without human intervention.
LLMs rely on a type of machine learning called deep learning. A deep learning model can essentially train itself to recognize distinctions without explicit instruction, although some human fine-tuning is typically still necessary.
Deep learning uses probability in order to "learn." For example, in the sentence "The quick brown fox jumped over the lazy dog," the letters "e" and "o" are the most common, appearing four times each. From this, a deep learning model could conclude (correctly) that these characters are among the most likely to appear in English-language text.
In practice, a deep learning model cannot conclude anything from a single sentence. But after analyzing trillions of sentences, it learns enough to predict how to logically finish an incomplete sentence, or even to generate sentences of its own.
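That letter-counting intuition is easy to reproduce directly. A few lines of Python show the kind of raw frequency statistic such a model accumulates across enormous amounts of text:

```python
# Count letter frequencies in a sentence -- the kind of raw statistic a
# probabilistic model accumulates over trillions of sentences.
from collections import Counter

sentence = "The quick brown fox jumped over the lazy dog"
counts = Counter(c for c in sentence.lower() if c.isalpha())
for letter, count in counts.most_common(5):
    print(letter, count)
# "e" and "o" top the list with 4 occurrences each
```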
Neural networks
To enable this type of deep learning, LLMs are built on neural networks. Just as the human brain is constructed of neurons that connect and send signals to each other, an artificial neural network (typically shortened to "neural network") is constructed of nodes that connect with each other. They are composed of several "layers": an input layer, an output layer, and one or more layers in between. The layers only pass information on to each other if their own outputs cross a certain threshold.
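As a toy illustration, a single layer might look like the sketch below in NumPy. Real LLM layers are vastly wider and use smoother activation functions than this hard threshold:

```python
# Toy sketch of a single neural network layer. Each node sums its weighted
# inputs and passes a signal onward only when the sum crosses a threshold.
import numpy as np

rng = np.random.default_rng(0)
inputs = rng.random(4)        # signals arriving from the previous layer
weights = rng.random((3, 4))  # 3 nodes, each connected to all 4 inputs
threshold = 1.0

activations = weights @ inputs               # weighted sum at each node
outputs = np.where(activations > threshold,  # fire only above the threshold
                   activations, 0.0)
print(outputs)  # signals passed on to the next layer
```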
Transformer models
The specific kind of neural network used for LLMs is called a transformer model. Transformer models are able to learn context, which is especially important for human language, a highly context-dependent form of communication. Transformer models use a mathematical technique called self-attention to detect subtle ways in which elements in a sequence relate to each other. This makes them better at understanding context than other types of machine learning: they can understand, for instance, how the end of a sentence connects to the beginning, and how the sentences in a paragraph relate to each other.
This allows LLMs to understand human language, even when that language is ambiguous or poorly defined, grouped together in combinations they have not seen before, or placed in new contexts. On some level they "understand" semantics in that they can associate words and concepts by their meaning, having seen them grouped together in that way millions or billions of times.
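To make self-attention concrete, here is a stripped-down sketch in NumPy. It omits the learned projections, multiple attention heads, and stacked layers of a real transformer, keeping only the core idea: each position in a sequence scores its relevance to every other position, then blends the sequence accordingly:

```python
# Stripped-down self-attention: every position weighs every other position
# and mixes their vectors, producing context-aware representations.
import numpy as np

def self_attention(x):
    """x: (sequence_length, model_dim) array of token vectors."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)  # relevance of each pair of positions
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
    return weights @ x             # context-aware blend of the sequence

tokens = np.random.default_rng(1).random((5, 8))  # 5 tokens, 8-dim vectors
print(self_attention(tokens).shape)  # (5, 8): same shape, new context
```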
How developers can rapidly get started building their own LLMs
In building LLM applications, developers require easy access to several data sets and need places for these data sets to live. Infrastructure investments in cloud and on-premises storage may be outside the scope of what developers can budget for these purposes. Training data sets are typically stored in several locations, and moving all that data to a central location will incur huge egress fees.
Fortunately, Cloudflare offers several services that allow developers to start spinning up LLM applications and other forms of AI quickly. Vectorize is a globally distributed vector database for querying data stored in no-egress-fee object storage (R2) or documents stored in Workers KV. Combined with Workers AI, Cloudflare's platform for running AI models, these services give developers a fast path to experimenting with their own LLM applications.
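As a taste of what getting started can look like, the sketch below calls a text-generation model through the Workers AI REST API from Python. The account ID, API token, and model name are placeholders; consult Cloudflare's documentation for current model names and request formats:

```python
# Minimal sketch: running a text-generation model via the Workers AI REST
# API. ACCOUNT_ID, API_TOKEN, and the model name are placeholders.
import requests

ACCOUNT_ID = "your-account-id"
API_TOKEN = "your-api-token"
MODEL = "@cf/meta/llama-3-8b-instruct"  # example model; see Cloudflare docs

response = requests.post(
    f"https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/{MODEL}",
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json={"messages": [
        {"role": "user", "content": "Explain what an LLM is in one sentence."},
    ]},
)
print(response.json())
```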