

GPT-4: The Good, the Bad, and the Ugly about OpenAI's Latest

The Wild West

The good news: GPT-4 is here! The bad news: It doesn’t quite live up to the hype.

The versions of GPT-4 currently available to the public are refined and improved versions of their predecessors, sure. But the much-touted multimodal capabilities are more limited than was widely expected, and the ability for users to upload images is not yet ready for public roll-out. Also disappointing to many: OpenAI is keeping mum on the specifics of GPT-4's size and training data.

[Image: A girl looking at her phone. She seems surprised.]

What is GPT-4?

GPT-4, short for Generative Pre-trained Transformer 4, is the latest of OpenAI’s AI language models. (A variation, GPT-4-32K, is being rolled out separately, but for the sake of simplicity, we will refer to both as GPT-4.)

GPT-4 follows in the footsteps of GPT-3.5, the technology behind the now-famous ChatGPT.

"Generative" refers to the fact that GPT models can produce human-like text. It does this by predicting the next word in a sequence of words. "Pre-training" refers to the fact that GPT models are first trained to understand language. Afterward, they receive more specialized training on tasks like answering questions. Finally, "Transformer" refers to the type of neural network architecture used in GPT models.

[Image: Someone holding a robot that is being built.]

How was it developed?

GPT was originally described in a research paper in 2018. GPT-2 followed in 2019, and GPT-3 in 2020. But it was with the launch of ChatGPT in late 2022 that this technology really entered the public sphere. ChatGPT is a freely accessible chatbot with many business use cases.

The GPT-4 release date was 14 March 2023. Like previous models, GPT-4 works by predicting the next word in a sequence. It was trained on a huge amount of text data from the internet, from which it learned to recognize and emulate statistical patterns.

[Image: Two robots: a smaller, older one and a larger, newer one.]

What do we know about the earlier GPT models?

All the models were based on the Transformer architecture. This is a type of neural network designed for natural language processing tasks. The models were trained on a large amount of text data using so-called “unsupervised” pre-training. This involves training the model on a large corpus of text to learn a general representation of language. It is “unsupervised” because the program figures out patterns and relationships in the text on its own.
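
To make the idea of unsupervised pre-training concrete, here is a minimal sketch of the underlying objective, again using the openly available GPT-2 weights via the Hugging Face transformers package (an illustration of the general technique, not OpenAI’s actual training code): the model predicts each next token in raw text and is scored on how wrong it was.

```python
# A sketch of the unsupervised pre-training objective: predict each next token
# in plain text and measure the cross-entropy loss. Assumes `transformers` and `torch`.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

text = "Language models learn by predicting the next token in a sequence."
input_ids = tokenizer(text, return_tensors="pt").input_ids

# Passing the same ids as labels makes the model score its own next-token predictions.
with torch.no_grad():
    loss = model(input_ids, labels=input_ids).loss

print(float(loss))  # training adjusts the parameters to push this number down
```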

Steerability

In the context of machine learning, steerability refers to the ability to control or steer the output of a model in a specific direction. This is done by manipulating certain input parameters (see section below).

OpenAI strongly emphasizes steerability in its research and development of AI models. Steerability is a key factor in creating AI systems that are flexible and adaptable to real-world applications. It enables developers or users to control and manipulate the outputs of AI models. This can enhance their interpretability, fairness, and robustness.

The development of steerability in GPT has been a gradual process that has evolved with the different versions of the model. In GPT-1, the first version, the focus was mainly on language modeling. The model did not have any explicit control mechanisms for steering the output toward a specific task.

With the release of GPT-2, there was a significant improvement in the quality of the generated text. The model introduced a few control mechanisms that allowed for some level of steerability.

GPT-3 introduced a more comprehensive set of control mechanisms. This includes, for example, the ability to control the generation length and temperature value (see section below).
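
At the API level, that control looks roughly like the sketch below. This is a minimal illustration against OpenAI’s completion endpoint in the openai Python package; max_tokens and temperature are the real parameter names, while the model name, prompt, and key are just placeholders.

```python
import openai  # assumes the `openai` Python package and a valid API key

openai.api_key = "sk-..."  # placeholder key

response = openai.Completion.create(
    model="text-davinci-003",  # a GPT-3.5-family completion model
    prompt="Summarize the transformer architecture in one sentence.",
    max_tokens=60,    # steer the length of the generated text
    temperature=0.2,  # steer randomness: low = conservative, high = diverse
)
print(response.choices[0].text.strip())
```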

Parameters

"Parameters" are numerical values that determine how the network processes and generates text. They are learned by the neural network during training.

The parameters can be thought of as the settings or knobs that control how the GPT model works. They are the numbers that the computer adjusts and fine-tunes during training to get better at understanding and generating text.

These parameters are what make the GPT models so powerful and versatile. Adjusting them during training changes how the model generates text: the length and complexity of its sentences, the style and tone of its language, the topics it focuses on, and so on. On top of these learned parameters, a few settings can be supplied at generation time to steer the output.

One such generation-time setting is the "temperature" value. (Strictly speaking, temperature is a sampling setting rather than a learned parameter, but it is adjusted in the same spirit.) It controls the randomness and creativity of the model's text generation. A higher temperature value will result in more unpredictable and diverse output. A lower value will generate output that is more conservative and predictable.
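
To see what temperature actually does, here is a minimal sketch in plain Python with NumPy (not the real GPT code): the model's raw scores are divided by the temperature before being turned into probabilities, so a low temperature sharpens the distribution and a high temperature flattens it.

```python
import numpy as np

def sample_next_token(logits, temperature=1.0):
    """Sample a token index from raw model scores, rescaled by temperature."""
    scaled = np.asarray(logits, dtype=float) / temperature
    probs = np.exp(scaled - scaled.max())  # softmax, shifted for numerical stability
    probs /= probs.sum()
    return np.random.choice(len(probs), p=probs)

logits = [3.0, 1.0, 0.2]                  # toy scores for three candidate tokens
print(sample_next_token(logits, 0.2))     # low temperature: almost always token 0
print(sample_next_token(logits, 2.0))     # high temperature: noticeably more varied
```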

  • GPT-1: The first version of the GPT model was released by OpenAI in 2018. GPT-1 had 117 million parameters.
  • GPT-2: Released in 2019, GPT-2 was a larger and more powerful version of the GPT model, with 1.5 billion parameters.
  • GPT-2 "small": OpenAI also released a smaller version of GPT-2, which had only 117 million parameters, the same as GPT-1. This smaller model was designed to be more accessible and efficient than the larger version.
  • GPT-3: Released in 2020, GPT-3 had 175 billion parameters. It introduced several new features and capabilities, including dynamic control of context length, sparse attention patterns, and few-shot learning. Due to its larger size and improved architecture, GPT-3 achieved state-of-the-art performance on several natural language processing benchmarks and tasks. OpenAI also trained several smaller variations of the GPT-3 model, including GPT-3 "small" (125 million parameters), GPT-3 "medium" (350 million parameters), GPT-3 "large" (760 million parameters), and GPT-3 "extra large" (1.3 billion parameters).

Dynamic context control

While the temperature value is used to control the randomness and creativity of the generated text, dynamic context control is used to control the relevance and coherence of the generated text with respect to the previous context.

The context length refers to the number of preceding words, or “tokens”, that the model considers when generating the next word or token in the sequence.

  • GPT-1: The first version of the GPT model used a fixed context length of 512 tokens. This means it considered up to the previous 512 tokens when generating the next word or token in the sequence.
  • GPT-2: The second version of the GPT model introduced the concept of dynamic control of context length. It can adjust the context length based on the input prompt and the desired length of the generated text. This way, the model can adapt its output to better suit the prompt or task at hand.
  • GPT-3: The third version of the GPT model further improved the dynamic control of context length. It introduced new sampling methods that allow for more precise control over the amount of context used during text generation. GPT-3's context window covers up to 2,048 tokens (prompt plus generated text), but the model can also work with shorter contexts by adjusting the context length dynamically.
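
As a rough illustration of what a fixed context window means, here is a minimal sketch (a hypothetical helper, not OpenAI code) that keeps only the most recent tokens that fit in the window, which is why very long conversations eventually "forget" their beginnings.

```python
def fit_to_context_window(token_ids, max_context=2048):
    """Keep only the most recent token ids that fit in the model's context window."""
    return token_ids[-max_context:]

conversation = list(range(5000))               # stand-in token ids for a long chat
visible = fit_to_context_window(conversation)  # the model only "sees" the last 2048
print(len(visible))                            # 2048
```
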
[Image: A woman looking at her phone with her hand outstretched in a questioning gesture.]

So, what about GPT-4?

We know that, like GPT-3, the newest model can produce text that is often indistinguishable from text written by a human. It can also summarize, complete, or translate text, and it can write poetry, prose, or lyrics.

We were told that GPT-4 would be the first model to accept both textual and visual input, although it would still only provide textual output. But users were disappointed to realize after the launch that they are not able to provide visual input just yet. This feature is currently in the research preview stage. In an update on the OpenAI website posted on 15 March, Joshua J. wrote, “We aren’t offering [this] as a service right now. We’re happy to hear that you’re excited about our services and when we have anything to release, we’ll announce this to the community.”

How is it different from the previous models?

OpenAI enlisted more than 50 experts to test GPT-4, to ensure it refuses dangerous requests, and to check that it handles sensitive subjects better.

As a bigger and improved model, GPT-4 is, predictably, better able to handle nuanced instructions. Some sensitive topics, such as medical advice, are handled better, too.

OpenAI also put GPT-4 through its paces by having it take several exams designed for humans, such as the Uniform Bar Exam and the LSAT. The reported result was that it performed better than any other large language model created so far.

The standard GPT-4 model offers a context of about 8,000 tokens. GPT-4-32K, an extended 32,000-token context-length model, will be rolled out separately. Among other things, this means that GPT-4 can now generate longer responses. That said, initial user reports seem less than impressed with GPT-4's attempts at producing long-form content.

As for steerability, OpenAI explained on its website that developers who use GPT-4 to power another chatbot will now be better able to prescribe the AI's style and tone. With GPT-3.5, users were already able to specify a certain style or tone, for example, “Please respond in the way an angry human might respond if I asked them...”

What is different is that the model can now distinguish between user and system input. This means that someone creating a new chatbot powered by GPT-4 can specify the style and tone beforehand. Users won’t be able to override this by, for example, specifying a different tone. This adds a layer of security.

Earlier, the model did not distinguish between user and system input; it handled all text input equally. Now messages carry labels identifying who sent them. When user prompts conflict with system prompts, the model is designed to follow the system prompt and ignore the conflicting user instruction.
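
In practice, that separation looks roughly like the sketch below, using the chat endpoint of the openai Python package as it worked at launch; the model name is real, but the prompts and key are placeholders.

```python
import openai  # assumes the `openai` Python package and a valid API key

openai.api_key = "sk-..."  # placeholder key

response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[
        # The system message fixes style and tone; user messages cannot override it.
        {"role": "system", "content": "You are a formal, concise support assistant."},
        {"role": "user", "content": "hey can u explain my invoice lol"},
    ],
    temperature=0.3,
)
print(response.choices[0].message["content"])
```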

As for parameters, however, OpenAI has decided to keep the size of GPT-4 undisclosed. When asked by MIT Technology Review for details, the answer was telling: “It’s pretty competitive out there.” The new gold rush in technology is here, and it is clear that OpenAI is keeping its cards close to its chest.

The company has also not revealed any information about how GPT-4 was developed, including details about the data, computing power, or training techniques used. This has led to widespread criticism that OpenAI has now become…well, closed.

A remaining concern

According to OpenAI, the newest model still sometimes fabricates information, or “hallucinates”, as they put it. Initial attempts to use ChatGPT in journalism have been surrounded by controversy due to this tendency. Checking content and code for errors remains important.

[Image: A computer and a cup of coffee.]

Who is using GPT-4, and how can you try it?

According to OpenAI, the following companies have already integrated GPT-4 into their products: Duolingo, Stripe, and Khan Academy. (Khan Academy has introduced Khanmigo, a chatbot “tutor” powered by GPT-4.)

It is also powering the new Bing search engine (and apparently has been doing so from the start). So, if you are simply waiting to chat with GPT-4, Bing is one way to access it for free.

As for ChatGPT, unfortunately the free version doesn’t run on this version of the model yet. It is still unclear whether this will ever be the case. However, the paid version (ChatGPT Plus) has the option to switch between GPT-4 and earlier default and legacy models.

As for the GPT-4 API, there is a waiting list.

[Image: A futuristic neural network.]

Will there be a GPT-5?

Probably. OpenAI has not indicated that it is capping its efforts with the release of GPT-4, but it has also not released any details on a possible GPT-5 just yet. Nextbigfuture.com speculates that we could expect GPT-5 at the end of 2024 or in 2025.

[Image: An exit.]

In short

Overall, GPT-4 is good news for AI language models, but for many users, it's not as impressive as expected. It's better than previous versions, but its multimodal abilities are more limited than anticipated, and uploading visuals is not ready for public use. OpenAI hasn't provided specific details about GPT-4's size and training data. Still, the model has improved steadily with each version, and it will be exciting to see how GPT keeps disrupting the world of content creation and customer service for the better. That concludes my AI-related post for the week. Gotta go!
