Quick update on my AI Rig. It has been a pretty steep learning curve over the last month, I must say, but I feel I have squeezed as much out of that hardware as I could. My 64GB of memory is pretty much filled up.
Kubernetes
I have installed pretty much the standard stack: Cilium, CSI NFS, Grafana, Prometheus. Everything is managed with GitOps using Flux. I decided on day 1 to deploy Vault for secrets management, which can come in handy in the future. The plan is to deploy vLLM on this stack and probably migrate my web server onto the cluster. I have Jupyter notebooks running, so it is ready for AI/ML sandboxing.
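For readers new to Flux, the GitOps setup boils down to two kinds of objects: a source pointing at a Git repo, and a Kustomization that reconciles a path from it. Here is a minimal sketch; the names, repo URL, and path are placeholders, not my actual setup.

```yaml
# Hypothetical Flux v2 setup: watch a Git repo and reconcile one directory.
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: homelab
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/example/homelab-gitops  # placeholder repo
  ref:
    branch: main
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infrastructure
  namespace: flux-system
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: homelab
  path: ./infrastructure
  prune: true  # delete cluster objects removed from Git
```

With `prune: true`, deleting a manifest from the repo also removes it from the cluster, which is what makes Git the single source of truth.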
vLLM VM
Speaking of vLLM, this was completely new to me; I had never tried to run it. I initially tried to run it on Kubernetes, but with my limited memory and no GPU (yet) I managed to kill my cluster a few times. I decided to take a step back and isolate vLLM from everything else, which turned out to be a better approach. I'm now running the small Qwen/Qwen3-1.7B model. It is more for testing purposes, but it is pretty cool that I can run a small LLM on CPU only.
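Once the server is up (started with something like `vllm serve Qwen/Qwen3-1.7B`), it exposes an OpenAI-compatible REST API, so you can talk to it with nothing but the standard library. A minimal sketch, assuming the default port 8000 on localhost (adjust for your setup):

```python
# Query a vLLM server through its OpenAI-compatible chat completions API.
# Host and port are assumptions; the model name matches the one I'm serving.
import json
import urllib.request

VLLM_URL = "http://localhost:8000/v1/chat/completions"  # assumed endpoint

def build_request(prompt: str, model: str = "Qwen/Qwen3-1.7B") -> dict:
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
        "temperature": 0.7,
    }

def ask(prompt: str) -> str:
    """POST the prompt to the vLLM server and return the reply text."""
    body = json.dumps(build_request(prompt)).encode()
    req = urllib.request.Request(
        VLLM_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        data = json.loads(resp.read())
    return data["choices"][0]["message"]["content"]
```

Calling `ask("Say hello in one sentence.")` returns the model's reply; on CPU with a 1.7B model this is slow but perfectly usable for smoke tests.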
Goose
I have started playing with open-source AI agents. This is a completely new space for me, so I began testing with the Qwen model running on vLLM. Does Goose support it? Yes. Is it useful with the small Qwen model? Not really. I will have to wait to use it with larger local models. I tested it very briefly with Gemini 3.0 Pro and it worked great.
I have also done lots of backend work outside of the AI stack: Portworx backups, web server, mail server, Obsidian, important Grafana dashboards, etc.
I think I now have a solid platform I can start building on. In the following weeks I want to spend more time on tuning vLLM and probably migrate it to Kubernetes. The plan is to buy a GPU in December to make more progress on AI and finally start focusing on that.