
How to Install & Run DeepSeek v3 locally?

By Ismail

DeepSeek is one of the newest and most talked-about open-source AI models, and it is currently disrupting the tech industry.

These open-source models can be accessed through Hugging Face or Ollama, while DeepSeek-R1 and DeepSeek-V3 can also be used directly for inference via DeepSeek Chat.

But installing and running DeepSeek-V3 on your local computer can be tricky: you need to be comfortable with the Ubuntu command line and know the specific commands required to run the model on your local GPU.

In this article, I will show you how you can install and run a quantized version of DeepSeek-V3 on a local computer with GPU on Linux Ubuntu.

Just follow my instructions!

How to Install DeepSeek v3 locally on GPU in Linux Ubuntu?

To properly install and run DeepSeek-V3, we will build the llama.cpp program from source with CUDA GPU support. We use llama.cpp because it lets us run many types of LLMs with minimal setup time.

DeepSeek-V3 is a powerful Mixture-of-Experts (MoE) language model. According to the benchmark results published by the team behind DeepSeek-V3, this model outperforms Qwen2.5-72B, Llama 3.1-405B, GPT-4o-0513, and Claude-3.5-Sonnet.

It is a versatile option capable of handling a wide range of queries, from casual conversations to complex content generation.

Consequently, it is important to test the performance of DeepSeek-V3 and potentially integrate it into your project.

Now, follow these steps to install and run DeepSeek-V3 locally on your GPU in Linux Ubuntu.

Install NVIDIA CUDA Toolkit and the NVCC compiler

First, install the NVIDIA CUDA Toolkit and the NVCC compiler, which are required to build llama.cpp with GPU support.

For that, visit https://developer.nvidia.com/cuda-toolkit and generate the installation instructions for the NVIDIA CUDA Toolkit, as shown in the image below.

Open a Linux Ubuntu terminal and run the generated commands:

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-6

Add CUDA Toolkit Binary Files

After installing the NVIDIA CUDA Toolkit on your computer, you need to add the CUDA toolkit binary files (executable files) to the system path.

The CUDA binary folder is located at:

/usr/local/cuda-12.6/bin

To add this folder to the path, you need to edit the .bashrc file in the home folder:

cd ~
nano .bashrc

and add the following line at the end of the file:

export PATH=/usr/local/cuda-12.6/bin${PATH:+:${PATH}}
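As a quick sketch, you can also test the same export line in your current shell session and confirm the CUDA folder really ends up on the path:

```shell
# Apply the export in the current shell session (same line as in .bashrc)
export PATH=/usr/local/cuda-12.6/bin${PATH:+:${PATH}}

# List the PATH entries one per line and confirm the CUDA bin folder is among them
echo "$PATH" | tr ':' '\n' | grep '^/usr/local/cuda-12.6/bin$'
```

This only affects the current terminal session; the .bashrc edit is what makes the change permanent.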

Save the file and restart the terminal (or run source ~/.bashrc to reload it). Then type:

nvcc --version

If everything is installed properly, this command prints the CUDA compiler version.

Install Git on Your PC

After adding the CUDA Toolkit binaries to the path, you have to install Git. Run this command in your terminal:

sudo apt install git-all
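Before cloning, it is worth checking that the tools the next steps rely on are actually available. Note that building llama.cpp also requires cmake (installable with sudo apt install cmake), which the commands above do not cover. A small sketch:

```shell
# Check that each tool the build steps rely on is available on the PATH
for tool in git cmake nvcc; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "found: $tool"
  else
    echo "missing: $tool (install it before continuing)"
  fi
done
```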

Next, go to the home folder, clone the remote llama.cpp repository, and change into the cloned folder called llama.cpp:

cd ~
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

Then, build the project with CUDA support enabled:

cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j $(nproc)
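Once the build finishes, the compiled binaries land in build/bin inside the repository. As a quick sanity check (the --version flag is a standard llama.cpp option that prints the build info):

```shell
# The executables are placed in build/bin inside the cloned repository
ls ~/llama.cpp/build/bin/llama-cli

# Print the build info to confirm the binary runs
~/llama.cpp/build/bin/llama-cli --version
```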

Download DeepSeek v3 Model Files

After building llama.cpp, you can download the DeepSeek-V3 model files to your local Linux system.

For that purpose, go to the Hugging Face website:

https://huggingface.co/unsloth/DeepSeek-V3-GGUF

and click on the desired quantized model, then download all of its files.

In our case, we select the Q2_K_XS model, which is split into 5 GGUF files.
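Alternatively, if you prefer the command line, the official huggingface_hub CLI can download the files directly. This is a sketch: it assumes the split GGUF files for this quantization live under a DeepSeek-V3-Q2_K_XS/ folder in the repository, so check the repo's file listing and adjust the --include pattern if the layout differs.

```shell
# Install the Hugging Face CLI (assumes pip is available)
pip install -U "huggingface_hub[cli]"

# Download only the Q2_K_XS split files straight into llama.cpp's bin folder.
# The --include pattern is an assumption; match it to the repo's actual layout.
huggingface-cli download unsloth/DeepSeek-V3-GGUF \
  --include "DeepSeek-V3-Q2_K_XS/*" \
  --local-dir ~/llama.cpp/build/bin
```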

After the files are downloaded, copy them to the folder

~/llama.cpp/build/bin

After that, navigate to this folder

cd ~/llama.cpp/build/bin

and run the model by typing

./llama-cli --model DeepSeek-V3-Q2_K_XS-00001-of-00005.gguf

This will run the model in interactive mode.
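By default, little or none of a model this large is offloaded to the GPU. As a sketch, the standard llama.cpp flags below offload a number of layers to the GPU and pass a one-shot prompt; how many layers fit depends on your VRAM, so the value 20 here is only an illustrative guess:

```shell
# Run with part of the model offloaded to the GPU.
# --n-gpu-layers: number of layers to keep on the GPU (tune for your VRAM)
# --ctx-size:     context window size in tokens
# --prompt:       a one-shot prompt instead of interactive mode
./llama-cli \
  --model DeepSeek-V3-Q2_K_XS-00001-of-00005.gguf \
  --n-gpu-layers 20 \
  --ctx-size 4096 \
  --prompt "Explain what a Mixture-of-Experts model is."
```

As long as all five split files sit in the same folder, llama.cpp should pick up the remaining parts automatically when you point it at the first one.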

Conclusion

DeepSeek is quickly becoming a major player in the world of AI, offering a variety of models for developers, researchers, and everyday users.

Competing with big names like OpenAI and Gemini, DeepSeek’s affordable and high-performance models are set to attract a large following.

DeepSeek’s models can be used for many purposes, from helping with coding to providing advanced problem-solving and multimodal abilities.

With options for smooth local execution through Ollama and cloud-based solutions, DeepSeek is ready to make a big impact on AI research and development.

If you have any questions or need help, feel free to drop a comment below!

Ismail

MD. Ismail is a writer at Scope On AI, here he shares the latest news, updates, and simple guides about artificial intelligence. He loves making AI easy to understand for everyone, whether you're a tech expert or just curious about AI. His articles break down complex topics into clear, straightforward language so readers can stay informed without the confusion. If you're interested in AI, his work is a great way to keep up with what's happening in the AI world.
