Don't Trust, Verify
In this tutorial, we will reproduce the results of Decentralized ChatGPT on our local machine, which has an RTX 4090 GPU (the EternalAI platform currently uses a cluster of 4090 machines to ensure determinism). This lets us verify that the response is genuinely generated by the INTELLECT-1 model. Let's proceed step by step.
Step 1
Install Docker (if you haven't already).
https://docs.docker.com/engine/install/
Step 2
Create the entrypoint.sh file.
#!/bin/bash
# Start Ollama in the background.
/bin/ollama serve &
# Record Process ID.
pid=$!
# Pause for Ollama to start.
sleep 5
echo "🔴 Retrieve hf.co/lmstudio-community/INTELLECT-1-Instruct-GGUF:Q8_0 model..."
ollama run hf.co/lmstudio-community/INTELLECT-1-Instruct-GGUF:Q8_0
echo "🟢 Done!"
# Wait for Ollama process to finish.
wait $pid
Step 3
Create the docker-compose.yml file.
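The exact compose file isn't shown here, so below is a minimal sketch. It assumes a service named ollama with the container name eternal-ollama-1 (matching the name used in Step 4), host port 11435 mapped to Ollama's default port 11434 (matching Step 5), and the entrypoint.sh from Step 2 mounted into the container; adjust paths and GPU settings for your setup.

services:
  ollama:
    image: ollama/ollama
    container_name: eternal-ollama-1
    entrypoint: ["/bin/bash", "/entrypoint.sh"]   # run the script from Step 2
    ports:
      - "11435:11434"   # expose Ollama's default port 11434 as 11435 on the host
    volumes:
      - ./entrypoint.sh:/entrypoint.sh
      - ollama:/root/.ollama   # persist downloaded models across restarts
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia   # requires the NVIDIA Container Toolkit
              count: all
              capabilities: [gpu]
volumes:
  ollama: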
Step 4
Run docker compose up -d to start Ollama.
It will download the hf.co/lmstudio-community/INTELLECT-1-Instruct-GGUF:Q8_0 model (if it hasn't been downloaded already) and then run it.
The download may take a while; you can check its progress by viewing the container's logs with docker logs -f eternal-ollama-1, where eternal-ollama-1 is the container name.
It should be complete once you see the "🟢 Done!" line from entrypoint.sh in the log output.

Step 5
Once the download is complete, a chat completion API is also available (on port 11435) on your local machine, ready for you to "chat" directly with the model and verify its responses against Decentralized ChatGPT's.
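For example, you can send a request to Ollama's standard /api/chat endpoint on the mapped port; the prompt and sampling options below are illustrative assumptions, not the exact values EternalAI uses:

curl http://localhost:11435/api/chat -d '{
  "model": "hf.co/lmstudio-community/INTELLECT-1-Instruct-GGUF:Q8_0",
  "messages": [{ "role": "user", "content": "Why is the sky blue?" }],
  "stream": false,
  "options": { "temperature": 0, "seed": 0 }
}'

Setting temperature to 0 and a fixed seed makes repeated local runs reproducible; to compare byte-for-byte with EternalAI's output, you would need to match its prompt and sampling parameters exactly.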
Step 6
The response should look like:
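(The sketch below follows the general shape of a non-streaming Ollama /api/chat response; all field values are placeholders, and the exact fields may vary by Ollama version.)

{
  "model": "hf.co/lmstudio-community/INTELLECT-1-Instruct-GGUF:Q8_0",
  "created_at": "<timestamp>",
  "message": {
    "role": "assistant",
    "content": "<the model's answer>"
  },
  "done": true
}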
The response's message content should be deterministically identical to the content returned by EternalAI's decentralized inference API (used by the code from the Decentralized ChatGPT post).