●§ 5.4 · /ai-transformation/sovereign-ai
Agentic AI where your data can't leave the building.
For operators who can't — or won't — send their data to cloud AI providers, we build and run the same agents on local hardware. Ollama on Apple Silicon for low-volume workloads. On-prem GPUs for heavier ones. Same agents, same architecture, no data ever leaves your network.
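Ollama serves an OpenAI-compatible API on localhost, which is what makes "same agents, same architecture" practical: in the simplest case, moving from a cloud provider to local inference is a base-URL change. A minimal sketch, assuming Ollama is running and a model has already been pulled; the model tag and prompts are illustrative.

```python
# Minimal sketch: the same agent call, pointed at a local Ollama server
# instead of a cloud provider. Model tag and prompts are illustrative.
from openai import OpenAI

# Ollama exposes an OpenAI-compatible API on localhost:11434 by default.
# The api_key is required by the client library but ignored by Ollama.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

response = client.chat.completions.create(
    model="llama3.3:70b",  # any model pulled locally with `ollama pull`
    messages=[
        {"role": "system", "content": "You are an operations assistant."},
        {"role": "user", "content": "Summarise: patient called to reschedule Tuesday's appointment."},
    ],
)
print(response.choices[0].message.content)
```

Nothing in the request leaves the machine; an air-gapped variant looks identical, with the endpoint on an internal host instead of localhost.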
01 · Who this is for.
Healthcare operators under HSE, NHS, or HIPAA-equivalent data handling rules
Financial services firms where client data can't touch a third-party LLM
Regulated manufacturers and defence-adjacent sectors
Any operator whose customers contractually prohibit cloud AI processing of their data
Operators in jurisdictions with strict data residency requirements
02 · Models we run locally.
Qwen 2.5 72B
Strong general capability, good at structured extraction.
Llama 3.3 70B
Good reasoning, mature ecosystem.
Specialised models (7B–32B)
Smaller models for targeted tasks such as extraction, classification, and matching (sketch below).
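Structured extraction, the workload these models handle best, is where constrained decoding earns its keep. A minimal sketch, assuming the `ollama` Python package with structured-output support and a model already pulled; the Invoice schema, sample text, and model tag are illustrative.

```python
# Sketch of local structured extraction via Ollama's structured outputs.
# Schema, sample text, and model tag are illustrative assumptions.
from ollama import chat
from pydantic import BaseModel

raw_invoice_text = "Invoice INV-2041 from Acme Supplies Ltd, total EUR 1,240.50"

class Invoice(BaseModel):
    supplier: str
    invoice_number: str
    total_eur: float

resp = chat(
    model="qwen2.5:72b",  # any locally pulled, extraction-capable model works
    messages=[{
        "role": "user",
        "content": "Extract the invoice details:\n" + raw_invoice_text,
    }],
    # Constrains the model's output to valid JSON matching the schema.
    format=Invoice.model_json_schema(),
)
invoice = Invoice.model_validate_json(resp.message.content)
print(invoice)
```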
03 · Hardware options.
Apple Silicon (M-series Mac Studios / Pros)
For low-volume, high-quality inference with surprising cost-efficiency; unified memory lets a single workstation host a 70B-class model (see the sizing sketch below).
On-prem GPU servers
For higher throughput.
Air-gapped deployments
Where required.
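Which option fits is mostly memory arithmetic: quantised weights at b bits need roughly params × b/8 bytes, plus headroom for the KV cache and runtime. A back-of-envelope sketch; the 1.2× overhead factor is an assumption, and real footprints vary with context length and quantisation format.

```python
# Back-of-envelope memory sizing for local inference.
# The 1.2x overhead factor (KV cache, activations, runtime) is a
# rough assumption, not a measured figure.
def memory_gb(params_billion: float, bits: int, overhead: float = 1.2) -> float:
    """Approximate footprint of quantised weights plus runtime overhead."""
    weights_gb = params_billion * 1e9 * (bits / 8) / 1e9
    return weights_gb * overhead

for params, bits in [(70, 4), (70, 8), (32, 4)]:
    print(f"{params}B @ {bits}-bit ~ {memory_gb(params, bits):.0f} GB")
# 70B @ 4-bit ~ 42 GB  -> fits a 64 GB unified-memory Mac Studio
# 70B @ 8-bit ~ 84 GB  -> needs 128 GB unified memory or multi-GPU
# 32B @ 4-bit ~ 19 GB  -> comfortable on 32-64 GB machines
```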
04 · The honest caveat.
Local models are not as capable as frontier cloud models for the hardest reasoning tasks. For 80% of mid-market operational use cases — extraction, classification, matching, drafting — they're entirely sufficient. For the remaining 20%, we'll tell you honestly which tasks are not yet viable on local inference.