Quickstart Guide
To get started with Stack AI, let's build a simple application: an LLM assistant that answers questions about a website. In this example, we will:
Add one input node for the user to ask a question.
Load data from a website given its URL.
Store the website text in a vector database.
Find the most relevant section of the website in the database.
Return the result to an output node.
Let's go step by step:
First, create a new project from your dashboard.
Once you create a project, you will see an interface with 3 main components:
Flow: the main component of Stack AI. A flow is a diagram that connects LLMs, data sources, and other nodes to build an application.
Nodes: nodes represent the components in the flow where data is received, processed, and returned. The Nodes sidebar includes: inputs, outputs, LLMs, vector databases, and data sources.
Control Bar: a set of commands used to test, evaluate and explore the flow.
We start by adding an input and an output. Some quick facts:
Anything given in the input can be used as a parameter for an LLM prompt or for another node. Once you deploy the flow, the inputs become the body of the API request.
The contents of the output are displayed after every run. Outputs are the content returned by the API when you call it.
For our application, we will use inputs and outputs in the following way:
Input: receives the user question to the document.
Output: displays the answer generated by the LLM.
Once we have an input and an output in our application, we can add a Large Language Model to perform the inference we need. Language models are characterized by a prompt, which is the instruction sent to the model, and sometimes a system message, which specifies the behavior the model should adopt.
Inputs can be used as parameters inside the prompt by wrapping their node name in brackets (for example, the first input node, in-0).
LLM prompts have a limit on the amount of text (tokens) they can process. Because of this, we can rarely insert the entire text of a data source into the prompt.
The completion of the model can be streamed to the output or to another LLM.
Once we have inputs and outputs, we need to add a data source for the LLM to answer the question. Since we want to read a website, we will use the URL node.
Note that the URL node ends in a gray square, which indicates that its output is segments of text. Because of this, the node needs to connect to a vector database.
The URL node will read the HTML of the website and return segments of text with the content of the site.
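To make this concrete, here is a minimal Python sketch of the idea: download a page, strip the HTML tags, and split the remaining text into fixed-size segments. It is only an illustration of the concept, not the URL node's actual implementation, and the 500-character segment size is an arbitrary choice.

```python
# Conceptual sketch only: fetch a page, strip the HTML tags, and split the
# visible text into fixed-size segments (roughly what a URL loader produces).
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collects the text content of an HTML document, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        if data.strip():
            self.parts.append(data.strip())


def website_to_segments(url: str, segment_size: int = 500) -> list[str]:
    """Download a page and return its text in ~segment_size character pieces."""
    html = urlopen(url).read().decode("utf-8", errors="ignore")
    extractor = TextExtractor()
    extractor.feed(html)
    text = " ".join(extractor.parts)
    return [text[i:i + segment_size] for i in range(0, len(text), segment_size)]


segments = website_to_segments("https://example.com")
print(f"{len(segments)} segments extracted")
```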
Now that we have a data loader, we need to store and query its segments of text so the LLM can process them. We do this by adding a vector database, specifically the Basic vector database.
Vector databases receive segments of text from a document and store their embeddings.
A user query is used to determine which segments of text to retrieve.
The most relevant segments of text are combined into a string that can fit in the prompt of an LLM.
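Conceptually, the retrieval step looks something like the Python sketch below. The bag-of-words "embedding" is only a stand-in for a real embedding model, and the sample segments are invented for illustration; a real vector database also indexes the embeddings so the search stays fast at scale.

```python
# Conceptual sketch only: embed each segment, score it against the user's query,
# and join the best matches into a single context string for the LLM prompt.
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words vector keyed by lowercase tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, segments: list[str], k: int = 3) -> str:
    """Return the k most relevant segments, combined into one prompt-sized string."""
    query_vec = embed(query)
    ranked = sorted(segments, key=lambda seg: cosine(query_vec, embed(seg)), reverse=True)
    return "\n\n".join(ranked[:k])


# Invented sample content standing in for the segments produced by the URL node.
segments = [
    "Our store ships worldwide within five business days.",
    "Returns are accepted within 30 days of purchase.",
    "Customer support is available by email and chat.",
]
print(retrieve("When are returns accepted?", segments, k=1))
```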
In our application, we will connect our nodes in the following setup:
The URL node will be used as a data source.
The user input will be used as the input query to the vector database.
The output of the vector database will be used as a parameter to the LLM.
Now, we can combine all of our pieces to configure our LLM in the following way:
The LLM needs to know it is a web assistant that answers questions given some context on a website. We add this information as the system message.
We also specify that the LLM should not respond if the answer is not in the context.
The LLM prompt needs to:
Use the user input (in-0) as a question.
Use the output of the vector database (vec-0) as context to answer the question.
Leading to the following setup:
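As a concrete sketch of this setup (the exact wording, node names, and bracket syntax in the editor may differ):

System message: You are a web assistant. Answer the user's question using only the context extracted from the website. If the answer is not in the context, say that you cannot find it.

Prompt: Answer the following question using the context below. Context: {vec-0} Question: {in-0}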
Now that our flow is complete, we can write some text in the input node and click the play button in the control bar.
After the flow is executed, we will see the completion of the LLM in the output.
Now we have a complete LLM flow that can answer questions about a website. But that's not all! We can use this flow in a production-ready environment by deploying it as an API.
Simply go to the Deploy tab and:
Look for your project name
Select it from the list
Pick your programming language of preference
Then you will obtain a hosted API that will call your flow in production.
To quickly test this API, you can call it from your terminal with cURL or from a short script in the language you selected.
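For example, a sketch of such a call from Python might look like the following. The endpoint URL, API key, and exact payload shape below are placeholders; copy the real values from the Deploy tab of your project.

```python
# Placeholder values: copy the real endpoint, API key, and payload from the Deploy tab.
import requests

API_URL = "https://YOUR_DEPLOYMENT_URL"  # placeholder for your hosted flow's endpoint
API_KEY = "YOUR_API_KEY"                 # placeholder for your project's API key

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"in-0": "What is this website about?"},  # the input node receives the question
    timeout=60,
)
print(response.json())  # the output node's completion comes back in the response
```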
Now that we have built and deployed a production-ready API, we can continue maintaining and improving it. Some things to try next are:
Evaluate: Test your API with several inputs and parameters in parallel to evaluate its performance at scale.
Collect data: Collect examples of good LLM completions to use as ground truth for evaluation and further testing.
Fine-tune: Use your collected data to fine-tune your own custom LLM that will perform optimally on your task.