UWC26 Optimizing AI Inference Performance

May 25, 2026

Media Summary:

Check out videos from Upperside Conference's recent World Congress (formerly known as MPLS World Congress): ...

Todd Muirhead talks with Uday Kurkure and Lan Vu about recent tests of ...

The provided text introduces LLM-D, an open-source project designed to ...

UWC26 Optimizing AI Inference Performance - Detailed Analysis & Overview

Talk: Everything You Need to Know About Reducing Voice-Agent Latency (by Philip Kiely @ Baseten). Rolling your own ...

Philip Kiely, Head of Developer Relations at Baseten, presents the “Golden Triangle” of ...

Learn how NVIDIA Dynamo and Kubernetes help scale high- ...

How do you go from state-of-the-art foundation model to a globally available usage-based API? This session provides an ...

Check out complete MWC Barcelona 2026 Showcase at: ...

Arrcus Unveils ...

Talk: Introductions and Meetup Updates by Chris Fregly and Antje Barth

Talk: Presented by Anton Kachatkou, Principal Software Engineer, Arm. Arm NPUs deliver high throughput and efficiency in ...

Master LLM core concepts! Explore MoE, RLHF, DPO alignment, FlashAttention, and LoRA fine-tuning. Learn about KV caching, ...

Summary: Victor Moreno, Product Manager for Cloud Networking at Google, discusses the critical role of networking in ...