LLMLingua2 Prompt Compression
Intro
Learn how to leverage Microsoft's LLMLingua2 for efficient prompt compression, enhancing your Voiceflow agent's performance, tokens usage and reducing latency as we also explore integrating latest OpenAI's GPT-4o model with a fallback to GPT-4 Turbo using Cloudflare Al Gateway.
LLMLingua2 API code example is available on our main repo:
https://github.com/voiceflow/demos-n-examples
Cloudflare AI Gateway API documentation:
https://developers.cloudflare.com/ai-gateway/providers/universal
Video
Updated 7 months ago