Nicole

Experience & Education

Speciality

Large Model Acceleration Evangelist who declares "GPU minutes = enterprise goldmines." Bridges the gap between R&D and production deployment.

Experience

1. Deployment Engineer, Hugging Face (3 years)
2. Slashed AI financial chatbot latency from 3.2s to 0.8s
3. Inventor of WhaleFlux Real-time Monitoring System

Education

1. MS Machine Learning, University of Toronto
2. BEng Computer Engineering, National University of Singapore

Posts

AI GPUs Decoded: Choosing, Scaling & Optimizing Hardware for Modern Workloads

Nicole Jul 3, 2025

Splitting LLMs Across GPUs: Advanced Techniques to Scale AI Economically

Nicole Jul 3, 2025

Renting GPUs for AI: Maximize Value While Avoiding Costly Pitfalls

Nicole Jul 3, 2025

How Does a GPU Work? How GPUs Power AI

Nicole Jul 3, 2025

How to Reduce AI Inference Latency: Optimizing Speed for Real-World AI Applications

Nicole May 30, 2025

Maximizing Efficiency in AI: The Role of LLM Serving Frameworks

Nicole Jan 17, 2025

The Future-Proofing of AI: Strategic Management of Computing Power and Predictions in Industry Advancements

Nicole Jan 17, 2025

LLM Serving 101: Everything About LLM Deployment & Monitoring

Nicole Jan 17, 2025