
Use Continue, Ollama, Codestral, and Koyeb GPUs to Build a Custom AI Code Assistant



Continue is an open-source AI code assistant that connects any model and any context to build custom autocomplete and chat experiences inside IDEs such as VS Code and JetBrains.

Ollama is a self-hosted AI solution for running open-source large language models on your own infrastructure, and Codestral is Mistral AI's first code model, designed for code generation tasks.

In this guide, we will demonstrate how to use Continue with Ollama, the Mistral Codestral model, and Koyeb GPUs to build a custom, self-hosted AI code assistant.

When complete, you will have a private AI code assistant for autocomplete prompts and chat available within VS Code and JetBrains.


To successfully follow and complete this guide, you need a Koyeb account to deploy Ollama on a GPU Instance, and VS Code installed on your machine.


To build a custom AI code assistant using Continue, Ollama, Codestral, and Koyeb GPUs, follow these steps:

  1. Deploy Ollama on Koyeb's GPUs
  2. Install and configure the Continue package in VS Code
  3. Get started with your custom AI code assistant

Deploy Ollama on Koyeb's GPUs

To get started, we will deploy Ollama on Koyeb's GPUs. Ollama will run the Mistral Codestral model on a Koyeb RTX 4000 SFF ADA GPU, which is well suited to cost-effective AI inference and running open-source large language models.

To create and deploy Ollama on Koyeb, we will use the Deploy to Koyeb button below:

Deploy to Koyeb

On the service configuration page, you can customize the Service name, Instance type, and other settings to match your requirements.

When you are ready, click the Deploy button to create the service and start the deployment process.

After a few seconds, your Ollama service will be deployed and running on Koyeb.
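
Before going further, it can help to confirm that the service answers requests. The sketch below assumes the Service URL is exported as OLLAMA_URL (a hypothetical variable name chosen for this example, not something Koyeb sets for you) and relies on Ollama replying on its root path with a short plain-text status message.

```shell
# Assumption: OLLAMA_URL is a hypothetical variable holding your Koyeb Service URL.
OLLAMA_URL="${OLLAMA_URL:-https://<YOUR_SUBDOMAIN>}"

# Strip a single trailing slash so paths can be appended safely.
ollama_base_url() {
  printf '%s\n' "${1%/}"
}

# Request the root path of the service; --fail makes curl exit non-zero on HTTP errors.
ollama_health() {
  curl --fail --silent --show-error "$(ollama_base_url "$OLLAMA_URL")/"
}
```

Running `ollama_health` against a healthy deployment should print "Ollama is running". An error or a timeout usually means the deployment has not finished starting yet.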

The next step is to pull the Mistral Codestral model to use it with Ollama. To do so, retrieve the Service URL from the Koyeb dashboard and run the following command in your terminal:

curl https://<YOUR_SUBDOMAIN>/api/pull -d '{
  "name": "codestral"
}'

Take care to replace the base URL with your actual Service URL.

Ollama will pull the Mistral Codestral model and prepare it for use. This might take a few moments. Once it's done, we can move to the next step and configure Continue to use the ollama provider.
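
To check that the model is available before configuring the editor, you can query the Ollama API directly. The following is a minimal sketch, assuming the Service URL is stored in OLLAMA_URL (a hypothetical variable name) and using Ollama's `/api/tags` and `/api/generate` endpoints. The helper function names are illustrative only.

```shell
# Assumption: OLLAMA_URL is a hypothetical variable holding your Koyeb Service URL.
OLLAMA_URL="${OLLAMA_URL:-https://<YOUR_SUBDOMAIN>}"

# List the models available on the Ollama instance (GET /api/tags).
ollama_list_models() {
  curl --fail --silent --show-error "$OLLAMA_URL/api/tags"
}

# Build the JSON body for a single, non-streamed completion request.
# The prompt is inserted verbatim, so keep it free of double quotes and backslashes.
ollama_generate_payload() {
  printf '{"model": "%s", "prompt": "%s", "stream": false}\n' "$1" "$2"
}

# Ask the model for a completion (POST /api/generate).
ollama_generate() {
  curl --fail --silent --show-error "$OLLAMA_URL/api/generate" \
    -d "$(ollama_generate_payload "$1" "$2")"
}
```

For example, `ollama_list_models` should return a JSON document that includes codestral once the pull has finished, and `ollama_generate codestral 'Write a hello world program in Go'` should return a JSON response containing the generated text.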

Install and configure the Continue package in VS Code

With Ollama deployed, we will show how to configure Continue for VS Code to use ollama as a provider. For JetBrains, please refer to the Continue documentation.

Get started by installing the Continue VS Code extension: open the Continue extension page for VS Code and click the Install button.

Once the install has completed, open the ~/.continue/config.json file on your machine and edit it to match the format below:

{
  "models": [
    {
      "title": "Codestral on Koyeb",
      "apiBase": "https://<YOUR_SUBDOMAIN>",
      "provider": "ollama",
      "model": "codestral"
    }
  ]
}

The above configuration tells Continue to:

  1. use the ollama provider
  2. use the Mistral Codestral model
  3. use the Ollama Instance located at the Koyeb Service URL


Take care to replace the apiBase value with your Ollama Service URL.
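
A mistyped apiBase is an easy mistake to make, so it can be worth checking the value before restarting the editor. The sketch below is one possible approach, not part of Continue itself: it reads the first apiBase entry from the config file with a plain text match (not a full JSON parser) and requests the root path of that URL. Both function names are hypothetical.

```shell
# Print the first "apiBase" value found in a Continue config file.
# This is a simple text match and assumes the key and value sit on one line.
continue_api_base() {
  sed -n 's/.*"apiBase"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Request the root path of that URL to confirm the Ollama service answers.
check_continue_api_base() {
  curl --fail --silent --show-error "$(continue_api_base "$1")/"
}
```

Calling `check_continue_api_base ~/.continue/config.json` should print Ollama's status message if the URL in the config points at your running service.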

Restart VS Code to apply the changes and get started using the AI code assistant.

Get started with your custom AI code assistant

Use the following shortcuts to access Continue and interact with the AI code assistant:

  • cmd+L (macOS)
  • ctrl+L (Windows / Linux)

You can now start asking questions about your codebase, get autocomplete suggestions, and more.



In this guide, we demonstrated how to use Continue, Ollama, Mistral AI's Codestral, and Koyeb GPUs to build a custom autocomplete and chat experience inside VS Code. This tutorial covers only the basics of getting started with Continue. To go further, check out the Continue documentation.

