Description of LLMs available
Comparing models in different use cases
There is a long list of models available in Stack AI, but before getting too overwhelmed with the description of each model, let's discuss some of the key use cases and which models are preferred.
Formatting a prompt
Sometimes an LLM can improve the prompt given by the user before sending it to another LLM that will perform the task. In this case, a lighter, cheaper, and faster model is preferred.
gpt-3.5-turbo
davinci
claude-instant-v1
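This two-stage pattern can be sketched as follows. The `call_llm` function below is a hypothetical stub standing in for a real API call (for example, through Stack AI or a provider SDK); the model names are the ones recommended above.

```python
# Illustrative two-stage pipeline: a light, cheap model refines the user's
# prompt, then a stronger model performs the actual task.
# `call_llm` is a hypothetical placeholder, not a real SDK function.

def call_llm(model: str, prompt: str) -> str:
    # Placeholder: a real implementation would call the provider's API here.
    return f"[{model}] {prompt}"

def answer_with_refined_prompt(user_prompt: str) -> str:
    # Stage 1: a fast, low-cost model rewrites the user's prompt.
    refined = call_llm(
        "gpt-3.5-turbo",
        f"Rewrite this prompt to be clear and specific:\n{user_prompt}",
    )
    # Stage 2: the refined prompt is sent to the model that performs the task.
    return call_llm("gpt-4", refined)
```

The point of the split is cost and latency: the rewriting step runs on every request, so it should use the cheapest model that does the job.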
Summarizing a large document set or websites
In this case, receiving broader context is crucial for the LLM to understand the overall meaning of the document and to summarize it correctly. A model with larger context window is preferred.
claude-v1-100K
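Even with a 100K-token window, documents must be split so that each request fits within the model's context. A rough sketch of that chunking step is below; it uses the common approximation of about 4 characters per token, whereas a real pipeline would count tokens with the model's own tokenizer.

```python
# Rough sketch: split a long document into chunks that fit a model's
# context window. The ~4-characters-per-token ratio is an approximation,
# not an exact tokenizer.

def split_into_chunks(text: str, context_tokens: int = 100_000,
                      reserve_tokens: int = 2_000) -> list[str]:
    # Reserve room in the window for the instructions and the generated summary.
    max_chars = (context_tokens - reserve_tokens) * 4
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
```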
Performing complex tasks
Imagine an LLM that receives input from the user, data from a list of documents, and instructions on how to behave, and that needs to use an external tool to retrieve additional information before finally answering the user. In this case, the more powerful the model, the better.
gpt-4
claude-v1
Performing complex tasks requiring large context
Same as the use case above, but requiring a larger context window to read entire documents. To deploy this use case at scale, it is preferred to use a mature, well-tested model with a large context window.
gpt-4-32K
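The guidance above can be condensed into a small lookup helper. This is a minimal sketch: the use-case keys are illustrative names, and the model lists are the recommendations from this page.

```python
# Map each use case described above to the models this page recommends,
# ordered by preference. The dictionary keys are illustrative labels.

PREFERRED_MODELS = {
    "format_prompt": ["gpt-3.5-turbo", "davinci", "claude-instant-v1"],
    "summarize_large_docs": ["claude-v1-100K"],
    "complex_task": ["gpt-4", "claude-v1"],
    "complex_task_large_context": ["gpt-4-32K"],
}

def pick_model(use_case: str) -> str:
    # Return the first (most preferred) model for the given use case.
    return PREFERRED_MODELS[use_case][0]
```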
List of all available LLMs
GPT 4
GPT-4 is a large multimodal model (accepting text inputs and emitting text outputs today, with image inputs coming in the future) that can solve difficult problems with greater accuracy than any of OpenAI's previous models, thanks to its broader general knowledge and advanced reasoning capabilities.
gpt-4
More capable than any GPT-3.5 model, able to do more complex tasks, and optimized for chat. Will be updated with the latest model iteration 2 weeks after it is released.
8,192 tokens
Up to Sep 2021
gpt-4-0613
Snapshot of gpt-4 from June 13th 2023 with function calling data. Unlike gpt-4, this model will not receive updates, and will be deprecated 3 months after a new version is released.
8,192 tokens
Up to Sep 2021
gpt-4-32k
Same capabilities as the base gpt-4 model but with 4x the context length. Will be updated with the latest model iteration.
32,768 tokens
Up to Sep 2021
gpt-4-32k-0613
Snapshot of gpt-4-32k from June 13th 2023. Unlike gpt-4-32k, this model will not receive updates, and will be deprecated 3 months after a new version is released.
32,768 tokens
Up to Sep 2021
GPT-3.5
GPT-3.5 is a mid-generation upgrade of GPT-3 with fewer parameters. It includes a fine-tuning process that involves reinforcement learning with human feedback, which helps to improve the accuracy of the responses.
gpt-3.5-turbo
Most capable GPT-3.5 model, optimized for chat, at 1/10th the cost of text-davinci-003. Will be updated with the latest model iteration 2 weeks after it is released.
4,096 tokens
Up to Sep 2021
gpt-3.5-turbo-16k
Same capabilities as the standard gpt-3.5-turbo model but with 4 times the context.
16,384 tokens
Up to Sep 2021
gpt-3.5-turbo-0613
Snapshot of gpt-3.5-turbo from June 13th 2023 with function calling data. Unlike gpt-3.5-turbo, this model will not receive updates, and will be deprecated 3 months after a new version is released.
4,096 tokens
Up to Sep 2021
text-davinci-003
Can do any language task with better quality, longer output, and more consistent instruction-following than the curie, babbage, or ada models. Also supports some additional features, such as inserting text.
4,096 tokens
Up to Sep 2021
GPT-3
These models can understand and generate natural language and preceded the GPT-3.5 generation used by ChatGPT. They have been superseded by the more powerful GPT-3.5 models, but can still be used for simpler tasks to reduce cost and increase speed.
curie
Very capable, but faster and lower cost than Davinci.
2,049 tokens
Up to Oct 2019
babbage
Capable of straightforward tasks, very fast, and lower cost.
2,049 tokens
Up to Oct 2019
ada
Capable of very simple tasks, usually the fastest model in the GPT-3 series, and lowest cost.
2,049 tokens
Up to Oct 2019
Anthropic
Anthropic's model, called Claude, is a transformer-based LLM, much like GPT-3, that leverages large-scale machine learning techniques. The model is trained on a diverse range of internet text, giving it the ability to generate text that is coherent, contextually relevant, and remarkably human-like.
Below is a comparison of the different versions of Claude.
claude-2
Claude 2 offers improved performance and longer responses, with one of the largest context windows available in the market.
100,000 tokens
claude-1
Faster than OpenAI's GPT-4 and almost as good, with one of the largest context windows available in the market.
100,000 tokens
claude-instant-1
A lighter, less expensive, and much faster option.
100,000 tokens
Google
Stack AI has early access to PaLM 2, the large language model (LLM) released by Google. It is highly capable in advanced reasoning, coding, and mathematics. It is also multilingual and supports more than 100 languages. PaLM 2 is the successor to the earlier Pathways Language Model (PaLM) launched in 2022.
The two models available are the following.
text-bison-001
Fine-tuned to follow natural language instructions; suitable for a variety of language tasks.
8,192 tokens
Up to Feb 2023
chat-bison-001
Fine-tuned for multi-turn conversation use cases.
4,096 tokens
Up to Feb 2023