Staff AI Engineer
Aurelio AI
2023 - 2025
Created a B2C SaaS for unstructured document processing. Provisioned NVIDIA-runtime-capable bare-metal K8s clusters to reduce operational costs, enabling highly competitive service pricing.
Developed an in-house usage-based metering system handling 100,000+ usage events per second for our Saturn document extraction service.
Provisioned and optimised GPU infrastructure (cloud and on-prem) for serving open-source Transformer and OCR models, achieving a ~30% cost reduction compared to Hugging Face.
Others: implemented a vectorised BM25 retrieval algorithm with sub-100 ms response time under load, developed the Semantic Router OSS library, and set up payments and auth for Aurelio Saturn.