How to Use DeepSeek Janus-Pro Locally?

By Ismail

DeepSeek, a rising AI startup from China, has made a significant impact on the global AI industry.

Its rapid advancements rattled markets, coinciding with a sell-off that wiped roughly $1 trillion off U.S. stock market value and put pressure on tech giants like Nvidia and OpenAI.

DeepSeek has quickly established itself as a leading force in AI development, excelling in text generation, reasoning, vision models, and image creation.

Recently, it introduced cutting-edge models, including Janus, a powerful multimodal model that can understand visual data and generate images from text.

All DeepSeek Janus-Series Models

The DeepSeek Janus-Series consists of advanced multimodal AI models designed to improve both visual understanding and image generation.

These models include:

1) Janus: A groundbreaking AI model that uses an autoregressive framework to handle visual and text-based tasks efficiently.

It decouples visual encoding into separate pathways for understanding and for generation, which helps it perform well on both kinds of tasks.

2) JanusFlow: This model builds on Janus by integrating rectified flow, a generative modeling technique, into the autoregressive framework to improve the quality and efficiency of image generation.

It has a simple design, making it easy to integrate with other AI models.

3) Janus-Pro: The most advanced version in the series, Janus-Pro offers enhanced training methods, a larger dataset, and improved scaling capabilities.

These features significantly boost its ability to understand and generate high-quality text and images.

How to Set Up the Janus Project

Janus is still a relatively new AI model, and there are no ready-made applications available for local use.

However, DeepSeek provides a GitHub repository with a Gradio-based demo, which allows users to interact with the model.

The main challenge with the demo is that it often runs into package conflicts, making it difficult to use.

To resolve these issues, this guide will walk you through modifying the existing code, building a custom Docker container, and running the model locally.

Follow these steps:

Step 1: Install Docker Desktop

Docker is a tool for packaging and running software applications inside isolated containers.

To install Docker Desktop, download it from the official website and follow the installation instructions.

For Windows users: You will need to install the Windows Subsystem for Linux (WSL) before proceeding. Open your terminal and enter the following command:

wsl --install
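
Before moving on, it can help to confirm that the docker command is actually available. The following is a minimal Python sketch and not part of the Janus project; the helper name docker_version is made up for illustration, and it relies only on the standard docker --version command.

```python
import shutil
import subprocess
from typing import Optional


def docker_version() -> Optional[str]:
    """Return the output of `docker --version`, or None if Docker is unavailable."""
    # shutil.which looks the executable up on PATH without running it.
    if shutil.which("docker") is None:
        return None
    try:
        result = subprocess.run(
            ["docker", "--version"],
            capture_output=True,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip()


version = docker_version()
print(version or "Docker was not found on PATH; install Docker Desktop first.")
```

If this prints a version string, the Docker CLI is installed. It does not check that the Docker engine itself is running, so make sure Docker Desktop is started as well.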

Step 2: Clone the Janus Repository

Next, download the Janus project from GitHub by running the following commands in your terminal:

git clone https://github.com/deepseek-ai/Janus.git
cd Janus

These commands will create a copy of the Janus project on your computer and navigate to the project directory.
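
As a quick sanity check after cloning, you can confirm that the demo script used later in this guide is where the guide expects it. This is a small sketch under the assumption that you run it from inside the Janus directory; the function name find_missing is hypothetical, and the only path it checks is demo/app_januspro.py, which the Dockerfile in Step 4 refers to.

```python
from pathlib import Path
from typing import Iterable, List


def find_missing(project_root: str,
                 expected: Iterable[str] = ("demo/app_januspro.py",)) -> List[str]:
    """Return the expected files that are absent under project_root."""
    root = Path(project_root)
    return [rel for rel in expected if not (root / rel).is_file()]


# Run from inside the cloned Janus directory.
missing = find_missing(".")
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("The demo script is in place.")
```

An empty result means the clone looks as expected; otherwise, check that you are in the right directory.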

Step 3: Modify the Demo Code

To ensure smooth operation, open the demo/app_januspro.py file and make the following changes:

  • Change the model name from deepseek-ai/Janus-Pro-7B to deepseek-ai/Janus-Pro-1B. This smaller version of the model requires less computing power and is more suitable for local use.
  • Update the script’s last line to:
demo.queue(concurrency_count=1, max_size=10).launch(
    server_name="0.0.0.0", server_port=7860
)

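Binding the server to 0.0.0.0 makes the Gradio app reachable from outside the container through port 7860, which is what Docker needs in order to forward requests from your browser.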
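
If you prefer to script the two edits in this step instead of making them by hand, the sketch below shows one way to do it. It is an illustration only: patch_demo_source is a hypothetical helper, and it assumes the script's last non-empty line is a single-line launch call. The sample text is a stand-in, not the real contents of app_januspro.py.

```python
MODEL_OLD = "deepseek-ai/Janus-Pro-7B"
MODEL_NEW = "deepseek-ai/Janus-Pro-1B"
LAUNCH_LINE = (
    'demo.queue(concurrency_count=1, max_size=10).launch(\n'
    '    server_name="0.0.0.0", server_port=7860\n'
    ')'
)


def patch_demo_source(source: str) -> str:
    """Apply the two edits from this step to the text of the demo script."""
    # Edit 1: point the script at the smaller 1B checkpoint.
    patched = source.replace(MODEL_OLD, MODEL_NEW)

    # Edit 2: replace the last non-empty line with the launch call shown above.
    # Assumption: that line is a one-line launch statement.
    lines = patched.rstrip().split("\n")
    lines[-1] = LAUNCH_LINE
    return "\n".join(lines) + "\n"


# Stand-in for the real file contents, used only to show the effect.
sample = (
    'model_path = "deepseek-ai/Janus-Pro-7B"\n'
    'print("building the UI")\n'
    'demo.launch(share=True)\n'
)
print(patch_demo_source(sample))
```

To apply it to the real file, read demo/app_januspro.py with pathlib.Path.read_text, pass the text through patch_demo_source, and write the result back with write_text. Review the result before building the image, since the assumption about the last line may not hold for every version of the repository.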

Step 4: Create a Docker Image

A Docker image is like a blueprint for running software inside a container.

To build an image for Janus, create a file called Dockerfile in the project’s root directory and add the following content:

# Use the PyTorch base image
FROM pytorch/pytorch:latest

# Set the working directory inside the container
WORKDIR /app

# Copy the project files into the container
COPY . /app

# Install necessary Python packages
RUN pip install -e .[gradio]

# Launch the Gradio app
CMD ["python", "demo/app_januspro.py"]

This Dockerfile starts from a PyTorch base image, copies the project into the container, installs the project with its Gradio extras, and launches the demo app when the container starts.

Step 5: Build and Run the Docker Image

To build the Docker image, use the following command:

docker build -t janus .

This process may take 10 to 15 minutes, depending on your internet speed.

After building the image, run the container with this command:

docker run -it -p 7860:7860 -d -v huggingface:/root/.cache/huggingface -w /app --gpus all --name janus janus:latest

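This command publishes port 7860 to the host (-p), runs the container in the background (-d), mounts a named volume called huggingface so downloaded model files persist between runs (-v), and gives the container access to all available GPUs (--gpus all).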
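
To make the individual flags easier to see, here is the same docker run command assembled piece by piece in Python. This is only an illustrative sketch: build_run_command and its use_gpu switch are hypothetical and not part of Docker or Janus, and the option to leave out --gpus all is an assumption for machines without a supported GPU that this guide does not test.

```python
import shlex
from typing import List


def build_run_command(use_gpu: bool = True) -> List[str]:
    """Assemble the docker run command from this step as an argument list."""
    cmd = [
        "docker", "run",
        "-it",                                         # keep stdin open and allocate a terminal
        "-p", "7860:7860",                             # host port 7860 -> container port 7860
        "-d",                                          # run in the background
        "-v", "huggingface:/root/.cache/huggingface",  # named volume for downloaded models
        "-w", "/app",                                  # working directory inside the container
    ]
    if use_gpu:
        cmd += ["--gpus", "all"]                       # expose all available GPUs
    cmd += ["--name", "janus", "janus:latest"]         # container name, then image to run
    return cmd


print(shlex.join(build_run_command()))
```

The printed line can be pasted into a terminal, or the list can be passed to subprocess.run to start the container from a script.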

If you open the Docker Desktop application and navigate to the “Containers” tab, you will see that the janus container is running.

However, it is not yet ready to use.

Running the Janus Image in the container.

To check its progress, click on the janus container and then go to the “Logs” tab.

Here, you will notice that the container is downloading the model file from the Hugging Face Hub.

Logs on janus Container

Once the model has been successfully downloaded, the logs will display a message indicating that the application is running.
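
The same check can be done from a script by reading the container logs with the docker logs command. The sketch below is an assumption-laden example: the function names are hypothetical, and the marker text "Running on local URL" is what Gradio usually prints at startup, so adjust it if your logs show something different.

```python
import shutil
import subprocess

# Assumption: Gradio's usual startup message; change it if your logs differ.
READY_MARKER = "Running on local URL"


def app_is_ready(log_text: str, marker: str = READY_MARKER) -> bool:
    """Return True if the startup marker appears in the log text."""
    return marker in log_text


def read_container_logs(name: str = "janus") -> str:
    """Return the container's combined stdout and stderr, or '' on failure."""
    if shutil.which("docker") is None:
        return ""
    try:
        result = subprocess.run(
            ["docker", "logs", name],
            capture_output=True,
            text=True,
            errors="replace",  # progress bars may contain unusual characters
            timeout=60,
        )
    except (OSError, subprocess.TimeoutExpired):
        return ""
    return result.stdout + result.stderr


# Demonstration on sample text rather than a live container.
print(app_is_ready("Downloading model files..."))
print(app_is_ready("Running on local URL:  http://0.0.0.0:7860"))
```

With the container running, app_is_ready(read_container_logs()) tells you whether the app has finished starting.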

The model has been downloaded and the web application is running.

You can then access your application.

If you are experiencing issues, please check the updated version of the Janus project in the kingabzpro/Janus repository on GitHub.

Step 6: Testing the Janus Model

Once the model is running, open your browser and go to:

http://localhost:7860/
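
Because the model download can take a while, the page may not load on the first try. The sketch below polls the URL until the app answers. It uses only the Python standard library; wait_for_app and its default timeout and interval are illustrative choices, not values from the Janus project.

```python
import time
import urllib.error
import urllib.request


def wait_for_app(url: str = "http://localhost:7860/",
                 timeout: float = 600.0,
                 interval: float = 5.0) -> bool:
    """Poll url until a server answers; return False if timeout seconds pass."""
    # Bypass any system proxy settings, since the app runs on this machine.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + timeout
    while True:
        try:
            with opener.open(url, timeout=5):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status.
            return True
        except OSError:
            # Connection refused, reset, or timed out: not listening yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling wait_for_app() with the defaults waits up to ten minutes and returns True as soon as the Gradio app responds, at which point the page should open in the browser.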

Here, you can test the AI model’s capabilities, including text-to-image generation and visual data analysis.

Test the Janus Pro Model Locally

Multimodal Understanding

To evaluate the model’s multimodal understanding, we first load an image from the DataCamp tutorial and ask the model to explain it.

The results are impressive: even with the smaller 1B model, the response is accurate and detailed.

Janus Web application UI Multimodal Understanding

Next, we load another image and ask the model to summarize the content of an infographic.

The model successfully understands the text within the image and provides a highly accurate and coherent response.

This demonstrates the model’s strong ability to process and interpret both visual and textual elements.

Text-to-Image Generation

Scrolling down the app, you’ll find the “Text-to-Image Generation” section. Here, you can enter a prompt of your choice and click the “Generate Images” button. The model generates five variations of the image, which may take a few minutes to complete.

Janus Web application UI Text to image generations

The results are remarkable. In our tests, the outputs looked comparable to Stable Diffusion XL in quality and detail.

You can learn how to Fine-tune Stable Diffusion XL with DreamBooth and LoRA on your personal images. 

Janus Web application UI Text to image generations

Let’s try another prompt:

Prompt:

The image features an intricately designed eye set against a circular backdrop adorned with ornate swirl patterns that evoke both realism and surrealism. At the center of attention is a strikingly vivid blue iris surrounded by delicate veins radiating outward from the pupil to create depth and intensity. The eyelashes are long and dark, casting subtle shadows on the skin around them, which appears smooth yet slightly textured as if aged or weathered over time.

Above the eye, there’s a stone-like structure resembling part of classical architecture, adding layers of mystery and timeless elegance to the composition. This architectural element contrasts sharply but harmoniously with the organic curves surrounding it. Below the eye lies another decorative motif reminiscent of baroque artistry, further enhancing the overall sense of eternity encapsulated within each meticulously crafted detail.

Overall, the atmosphere exudes a mysterious aura intertwined seamlessly with elements suggesting timelessness, achieved through the juxtaposition of realistic textures and surreal artistic flourishes. Each component—from the intricate designs framing the eye to the ancient-looking stone piece above—contributes uniquely towards creating a visually captivating tableau imbued with enigmatic allure.

Janus Web application UI Text to image generations

Once again, the results are stunning. The generated images capture the intricate details and surreal artistic elements described in the prompt.

Conclusion

DeepSeek’s Janus series is a groundbreaking advancement in AI technology, combining text and image processing in a powerful and efficient way.

While still in its early stages, this model has already demonstrated significant potential for various applications.

By following this guide, you can experiment with the Janus Pro model locally and explore its capabilities firsthand.

For more AI-related tutorials, check out our guide on Fine-Tuning DeepSeek R1 for Advanced AI Applications.

Ismail

MD. Ismail is a writer at Scope On AI, where he shares the latest news, updates, and simple guides about artificial intelligence. He loves making AI easy to understand for everyone, whether you're a tech expert or just curious about AI. His articles break down complex topics into clear, straightforward language so readers can stay informed without the confusion. If you're interested in AI, his work is a great way to keep up with what's happening in the AI world.
