The Era of AI: Cloud vs. Local AI
Cristian Jay Duque
Chief Content Officer @ IOL Inc.
May 27, 2024
As we stand on the brink of the AI era, it's fascinating to observe how AI is integrating seamlessly into various aspects of our lives—whether at work, school, home, or even outdoors. However, most AI applications today rely on cloud computing, which provides on-demand access to computing resources, particularly for data storage and processing power. This model has revolutionized technology use by offering scalable and powerful computational resources. Yet, it comes with its drawbacks: dependency on a reliable internet connection, potential latency issues, and concerns over data privacy.
The Rise of Local AI
Imagine accessing all the AI capabilities directly on your local machine without relying on the internet. Enter local AI, also known as edge AI. This approach involves running AI algorithms on the device itself, offering significant advantages like lower latency, enhanced privacy, and reduced dependence on constant internet connectivity. Local AI shines in scenarios with poor or intermittent connectivity and in applications requiring real-time processing, such as self-driving cars, medical equipment, and industrial control systems. By keeping data processing local, it also enhances data security by minimizing the need to transmit sensitive information over the internet.
With advancements in technology, we are likely to see a hybrid model where local AI and cloud-based systems work together, creating a flexible and robust AI landscape. This integration will enable individuals and businesses to harness the full potential of AI, irrespective of internet availability or data privacy concerns.
—
Technical Stuff
How to Install Your Local AI on Your Machine
Hardware Requirements
CPU
Optimal Choice: 11th Gen Intel or Zen4-based AMD CPUs.
Reason: AVX512 support accelerates AI model operations, and DDR5 support boosts performance through increased memory bandwidth.
Key Point: CPU instruction-set support matters more than core count.
RAM
Minimum Requirement: 16GB.
Purpose: Sufficient for running 7B-parameter models effectively; smaller models run comfortably, while larger ones should be approached with caution.
Disk Space
Practical Minimum: 50GB.
Usage: Accommodates the Docker container (around 2GB for the Ollama-WebUI image) and model files; this covers the essentials without much extra buffer.
GPU
Recommendation: Not mandatory but beneficial for enhanced performance.
Model Inference: A GPU can significantly speed up performance, especially for large models.
VRAM Requirements for FP16 Models:
7B model: ~26GB VRAM
Quantized Models Support: Efficient handling with much less VRAM (a rough sizing rule is sketched after this list):
7B model: ~4GB VRAM
13B model: ~8GB VRAM
30B model: ~16GB VRAM
65B model: ~32GB VRAM
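These quantized figures follow a rough rule of thumb: a 4-bit quantized model needs roughly 0.5 GB of VRAM per billion parameters for its weights, plus a little extra for the KV cache and runtime buffers. As a sketch (the exact number depends on the quantization format and context length), you can estimate it from the shell:
params_b=13   # model size in billions of parameters (example value, not a requirement)
awk -v p="$params_b" 'BEGIN { printf "Approx. VRAM for 4-bit weights: ~%.1f GB (plus KV cache overhead)\n", p * 0.5 }'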
Larger Models
Running 13B+ and MoE Models: Recommended only with a Mac, a large GPU, or a GPU that handles quantized formats efficiently, due to their high memory and computational demands.
Overall Recommendation
For optimal performance with Ollama and Ollama-WebUI (a quick hardware check is sketched after this list):
CPU: Intel/AMD with AVX512 or DDR5 support.
RAM: At least 16GB.
Disk Space: Around 50GB.
GPU: Recommended for performance boosts, especially with models at the 7B parameter level or higher. Large or quant-supporting GPUs are essential for running larger models efficiently.
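Before installing anything, you can sanity-check your machine against these recommendations. The commands below are a minimal sketch for Linux (they assume /proc/cpuinfo, free, df, and, for NVIDIA cards, nvidia-smi are available); Windows and macOS users can check the same figures through Task Manager or About This Mac.
grep -o 'avx512[a-z]*' /proc/cpuinfo | sort -u        # CPU: lists AVX-512 extensions (empty output = not supported)
free -h | awk '/^Mem:/ {print "Total RAM:", $2}'      # RAM: total installed memory
df -h / | awk 'NR==2 {print "Free disk:", $4}'        # Disk: free space on the root filesystem
nvidia-smi --query-gpu=name,memory.total --format=csv  # GPU: name and VRAM (only if an NVIDIA GPU and driver are present)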
Software Requirements
Operating Systems: Windows 10/11 (64-bit), a recent Linux distribution, or macOS
Software: Docker (latest version); WSL (available in the Microsoft Store) for Windows installations
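Before proceeding, it is worth confirming that the prerequisites are actually in place. A quick check (the wsl command applies to Windows only and should be run in PowerShell):
docker --version                             # Docker CLI is installed
docker info --format '{{.ServerVersion}}'    # the Docker daemon is running and reachable
wsl --status                                 # Windows only: WSL is installed and configured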
How to Install
Note: For certain Docker environments, additional configuration might be needed. If you encounter any connection issues, refer to our detailed guide in the Open WebUI Documentation.
Quick Start with Docker
Warning: When using Docker to install Open WebUI, make sure to include `-v open-webui:/app/backend/data` in your Docker command. This step is crucial, as it ensures your database is properly mounted and prevents any loss of data.
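You can confirm the volume exists and see where Docker stores it on the host with the following commands (run after the container has been started at least once):
docker volume ls | grep open-webui    # the named volume should be listed
docker volume inspect open-webui      # shows the mount point of the persisted data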
Tip: If you wish to utilize Open WebUI with Ollama included or CUDA acceleration, we recommend utilizing our official tags with either `:cuda` or `:ollama`. To enable CUDA, you must install the Nvidia CUDA container toolkit on your Linux/WSL system.
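Before pulling the `:cuda` image, it also helps to verify that Docker can see your GPU. A quick test, assuming the Nvidia CUDA container toolkit is already installed (the nvidia/cuda tag below is only an example; any current base tag will do):
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi    # should print your GPU details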
—
Installation with Default Configuration
If Ollama is on Your Computer
Use this command:
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
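After running the command, you can verify that the container started correctly:
docker ps --filter name=open-webui    # the container should show a status of "Up"
docker logs -f open-webui             # follow the startup logs if the UI is not reachable yet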
If Ollama is on a Different Server
To connect to Ollama on another server, change the `OLLAMA_BASE_URL` to the server's URL:
docker run -d -p 3000:8080 -e OLLAMA_BASE_URL=https://example.com -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
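In this setup it is worth confirming that the remote Ollama instance is reachable from your machine before starting the container. Assuming the standard Ollama API is exposed at that address, a quick check is:
curl https://example.com/api/tags    # should return a JSON list of the models available on the remote server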
To Run Open WebUI with Nvidia GPU Support
Use this command:
docker run -d -p 3000:8080 --gpus all --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:cuda
Installation for OpenAI API Usage Only
If you're only using the OpenAI API, use this command:
docker run -d -p 3000:8080 -e OPENAI_API_KEY=your_secret_key -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
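To keep the key out of the command itself, you can pass it from an environment variable instead. A sketch of the same command, with the key read from your shell environment:
docker run -d -p 3000:8080 -e OPENAI_API_KEY="$OPENAI_API_KEY" -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main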
Installing Open WebUI with Bundled Ollama Support
This installation method uses a single container image that bundles Open WebUI with Ollama, allowing for a streamlined setup via a single command. Choose the appropriate command based on your hardware setup:
With GPU Support
Utilize GPU resources by running the following command:
docker run -d -p 3000:8080 --gpus=all -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama
For CPU Only
If you're not using a GPU, use this command instead:
docker run -d -p 3000:8080 -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama
Both commands provide a hassle-free, all-in-one installation of Open WebUI and Ollama, ensuring you can get everything up and running swiftly.
After installation, you can access Open WebUI at http://localhost:3000. Enjoy!
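With the bundled image you will also need at least one model before you can chat. You can pull models from the Open WebUI interface, or, assuming the bundled image exposes the ollama CLI inside the container, from the command line:
docker exec -it open-webui ollama pull llama3    # download a model into the ollama volume (model name is only an example)
docker exec -it open-webui ollama list           # confirm the model is available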
Other Installation Methods
We offer various installation alternatives, including non-Docker native installation methods, Docker Compose, Kustomize, and Helm. Visit our [Open WebUI Documentation](https://docs.openwebui.io) or join our Discord community for comprehensive guidance.
Troubleshooting
Open WebUI: Server Connection Error
If you're experiencing connection issues, it's often because the WebUI Docker container cannot reach the Ollama server at `127.0.0.1:11434` (`host.docker.internal:11434`) from inside the container. Use the `--network=host` flag in your Docker command to resolve this. Note that the port changes from `3000` to `8080`, so the link becomes http://localhost:8080.
Example Docker Command:
docker run -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main
Keeping Your Docker Installation Up-to-Date
If you want to update your local installation to the latest version, you can do it with Watchtower:
docker run --rm --volume /var/run/docker.sock:/var/run/docker.sock containrrr/watchtower --run-once open-webui
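If you prefer not to use Watchtower, a manual update works just as well; because your data lives in the open-webui volume, it survives recreating the container:
docker pull ghcr.io/open-webui/open-webui:main    # fetch the latest image
docker stop open-webui && docker rm open-webui    # remove the old container (the volume is kept)
# then re-run the same docker run command you used during installation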
This setup can benefit communities and organizations with limited internet access. Enjoy!
CeeJay