RetakeData / infrastructure practice

Take control of your data

RetakeData helps infrastructure teams regain control of sensitive systems — observability, automation, virtualization, private cloud, and local AI. Designed and operated by a senior SRE with 10+ years on high-stakes platforms.

Hardware to software · Full-stack infrastructure · Bare metal to private AI

FULL-STACK CONTROL

On-prem, private-cloud, and hybrid infrastructure engineering. Observability, automation, virtualization, and private AI — for teams that need to keep sensitive workloads under their own control.

OBS

Observability at scale

Grafana, Loki, Thanos, Vector, Elasticsearch. Stack audits, ingestion tuning, Fluentd-to-Vector migrations, recording rules, query performance, and SLO dashboarding.

  • Grafana
  • Loki
  • Thanos
  • Vector

IAC

Infrastructure automation & IaC

Terraform multi-provider modules, Ansible, GitOps delivery pipelines, Consul and NetBox inventory. The automation that lets small teams operate at fleet scale.

  • Terraform
  • Ansible
  • NetBox
  • Consul

ONP

Proxmox / Ceph / on-prem HA

100+ Proxmox nodes deployed with PXE automation. Storage backends including ZFS, NFS, SAN, NVMe-oF, and Ceph. Migrations from vSphere, HA cluster design.

  • Proxmox
  • Ceph
  • ZFS
  • HA

Some operational data should not leave your network: incidents, logs, runbooks, internal docs, and procedures. RetakeData builds local AI systems for those environments: vLLM model serving, private RAG pipelines, RBAC-aware assistants, and integrations with your existing Proxmox, observability, and documentation stack. No external API dependency for sensitive workflows, predictable costs, full data control.

LLM

OSS Models

  • vLLM
  • Llama
  • Qwen
  • Mistral

EMB

Embedder

  • BGE-M3
  • Nomic
  • sentence-transformers

VDB

Vector DB

  • pgvector
  • Qdrant
  • Chroma

A few receipts from 10 years operating infrastructure that can't quietly fail.

11 TB/day - 400 Loki pods - 50 TB Thanos

Observability at high-stakes scale

Operated multi-cluster observability handling 11 TB/day across Loki, Elasticsearch, and Thanos, in a business where minutes of downtime mean seven-figure impact. Tuned ingestion, migrated Fluentd to Vector, and built recording rules that convert TB/day of load-balancer logs into metrics.

Role: Lead SRE
Scale: 11 TB/day
Stack: Loki · Thanos · Vector

3000 VMs - 4 providers - Git to monitored

VM delivery pipeline

Built a 6-phase autonomous VM delivery pipeline: Git PR, Terraform across vSphere/Proxmox/OpenStack, Consul and NetBox auto-registration, Ansible configuration, HAProxy backend registration, Centreon monitoring. No Kubernetes needed.

Role: Infra engineer
Scale: 3000 VMs
Stack: Terraform · Ansible · Consul

6 GPUs - 1000+ PDFs - vLLM serving

On-prem AI infrastructure

Deployed GPU-equipped servers in the datacenter running vLLM with a private RAG pipeline over 1000+ documents. Built Graphia, an RBAC-aware SRE agent that abstracts Grafana complexity for engineering teams.

Role: Platform lead
Scale: 6 GPUs · 1000+ docs
Stack: vLLM · RAG · Graphia

See full CV →

Tools built around real operational pain points, kept practical, open, and useful beyond our own environment.

OPEN SOURCE

SSHplex

Modern SSH multiplexing with multi-source inventory and tmux or iTerm2 backends.

GitHub →
OPEN SOURCE

OpenClaw Audit TUI

Terminal audit UI for OpenClaw sessions with live events and real-time streaming.

GitHub →
OPEN SOURCE

terraform-provider-centreon

Terraform provider for Centreon API V2: monitoring configuration managed as infrastructure as code.

GitHub →
See all open source work →

Need to keep critical infrastructure under your control?

If your team is building on-prem, private-cloud, or hybrid infrastructure for sensitive workloads, let's talk.

Get in touch