


Thu, Jan 22
Online



How do you choose the right serving strategy for your model? This presentation does more than introduce Dynamo: it takes a practical view, walking through techniques and real-world scenarios for serving AI workloads in production, such as disaggregated serving and optimizing deployments against memory-bound bottlenecks.
Key learning objectives: Attendees will gain an intuitive understanding of how Dynamo addresses memory-bound operations to get the best performance out of GPUs. They will also learn the architectural patterns for serving AI workloads in distributed environments, along with a framework for choosing a deployment based on technical and business constraints.
Target level: Beginner - Advanced







