Staff AI Engineer
Aurelio AI
September 2023 - January 2025
Created Aurelio Saturn, a B2C AI document extraction SaaS that prepares unstructured documents for LLM use, and scaled it to over 100 active users.
Developed an in-house usage-based metering system for the Saturn document extraction service, handling 100,000+ usage events per second.
Provisioned and optimised GPU infrastructure (cloud and on-prem) for serving open-source Transformer and OCR models, achieving a ~30% cost reduction compared to Hugging Face Inference Endpoints.
Other work: built a vectorised BM25 retrieval algorithm with sub-100 ms response times under load, developed the Semantic Router OSS library, and set up payments and auth for Aurelio Saturn.