LLM Serving 101: Everything About LLM Deployment & Monitoring
Nicole
Jan 17, 2025
Large Model Acceleration Evangelist who declares "GPU minutes = enterprise goldmines." Bridges the gap between R&D and production deployment.
1. Deployment Engineer, Hugging Face (3 years)
2. Slashed AI financial chatbot latency from 3.2s to 0.8s
3. Inventor of WhaleFlux Real-time Monitoring System
1. MS Machine Learning, University of Toronto
2. BEng Computer Engineering, National University of Singapore