Launching Empower-Functions - A GPT-4-Level Function Calling Model Tailored for Real-World Use Cases

Introducing empower-functions, the industry-leading function calling LLM designed to power real-world function calling use cases.

Yulong Liu
March 27, 2024
3 mins read

TLDR: Empower-functions is a model that offers GPT-4-level function calling capabilities, focusing on real-world use cases such as multi-turn conversations and parallel calling, with 3 times lower latency and 10 times lower cost. Check out our doctor appointment booking bot live demo!

The Problem

The full potential of Large Language Models (LLMs) is realized not only through conversation but also through their integration with external APIs, enabling them to perform actions such as interacting with internal systems for identity verification, booking appointments, and processing checkouts. The capability to call functions is critical to empower a wide range of real-world use cases, including workflow automation and support agent tasks.

Currently, the predominant solution is to use OpenAI's models, where users must choose between GPT-4, which offers high response quality but is hindered by significant latency and high costs that limit its applicability, and GPT-3.5, which is faster and more affordable but more likely to generate inaccurate responses. There are few alternatives for those who need a more balanced solution: a model with higher response quality than GPT-3.5 and much better latency and cost than GPT-4. While the emergence of open-source software (OSS) models broadens possibilities and flexibility, none of the current major providers, such as fireworks.ai, anyscale, and together.ai, adequately addresses this need in real-world use cases. For instance, they generally underperform in multi-turn interactions, and few support parallel calling.

The Solution: Empower-functions, a Model Tailored for Real-World Function Calling Use Cases

Empower-functions is an LLM developed by empower.dev, focusing on the real-world function calling use case.

Below, we use a screenshot to showcase how the empower-functions model performs in a complex, multi-turn conversation that requires multiple function calls. For a more hands-on experience, please try our live demo.

Demo of empower-functions, as a doctor appointment booking agent

Under the hood, the empower-functions model is fine-tuned from the Mixtral-8X7B-Instruct model. We specifically collected data and tailored the model to support multi-turn conversations and to determine automatically whether to trigger functions. These efforts target the best performance in real-world use cases, which typically involve multi-turn conversations interleaved with function calls. Leveraging our proprietary inference engine, we have reduced the TTFT (time to first token) latency to under 400ms, a substantial improvement over GPT-4’s one-second latency. We are offering this model at a price point of $1.5 per million tokens.
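For readers who want to check time to first token themselves, here is a minimal sketch of how TTFT could be measured over any stream of text chunks. It is not our benchmarking harness: `measure_ttft` and `fake_stream` are hypothetical names introduced for illustration, and `fake_stream` only simulates a streaming response with a fixed delay.

```python
import time
from typing import Callable, Iterable, Iterator, Optional, Tuple


def measure_ttft(
    stream: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[Optional[float], str]:
    """Return (seconds until the first non-empty chunk, full concatenated text).

    The first element is None when the stream yields no non-empty chunk.
    """
    start = clock()
    ttft: Optional[float] = None
    parts = []
    for chunk in stream:
        if ttft is None and chunk:
            ttft = clock() - start  # first visible token has arrived
        parts.append(chunk)
    return ttft, "".join(parts)


def fake_stream(delay_s: float = 0.05) -> Iterator[str]:
    """Hypothetical stand-in for a streaming completion response."""
    time.sleep(delay_s)  # simulated wait before the first token
    yield "Hello"
    yield ", world"


if __name__ == "__main__":
    ttft, text = measure_ttft(fake_stream())
    print(f"TTFT: {ttft * 1000:.0f} ms, text: {text!r}")
```

With a real streaming client, the timer should start immediately before the request is sent so that network and queueing time are included in the measurement.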

To comprehensively assess the model's response quality, we benchmarked it on three datasets (all of the datasets can be found here):

- Single Turn Dataset: The model is evaluated for its ability to execute a precise function call, assessing both the accuracy of the selected function and the arguments.

- Parallel Call Dataset: In this scenario, the model demonstrates its capacity to handle multiple (2-6) function calls within a single message, a feature not supported by Fireworks and Anyscale.

- Multi-Turn Dataset: Designed to simulate a complex real-world environment, such as a healthcare appointment booking system, the model navigates between natural conversation, initiating function calls, asking clarifying questions, and, when necessary, transferring to customer service. The assessment focuses on the accuracy of intent classification and the correctness of function calls.
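To make "accuracy of the selected function and the arguments" concrete, the sketch below shows one possible way to score function calls. It is an assumption for illustration, not the exact metric used in our benchmark: a prediction counts as correct only when it contains the same set of calls (name plus arguments) as the expected answer, ignoring the order of parallel calls. The names `calls_match` and `accuracy` and the example record layout are hypothetical.

```python
import json
from collections import Counter
from typing import Any, Dict, List

Call = Dict[str, Any]  # assumed shape: {"name": str, "arguments": dict}


def _key(call: Call) -> str:
    """Canonical string for a call so that argument key order does not matter."""
    return json.dumps(
        {"name": call["name"], "arguments": call["arguments"]}, sort_keys=True
    )


def calls_match(predicted: List[Call], expected: List[Call]) -> bool:
    """True when both lists contain the same calls, ignoring their order."""
    return Counter(map(_key, predicted)) == Counter(map(_key, expected))


def accuracy(examples: List[Dict[str, List[Call]]]) -> float:
    """Fraction of examples whose predicted calls exactly match the expected ones."""
    if not examples:
        return 0.0
    hits = sum(calls_match(ex["predicted"], ex["expected"]) for ex in examples)
    return hits / len(examples)
```

Exact matching is strict: a call with the right function but one wrong argument scores zero. A softer variant could score the function name and the arguments separately.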

In the benchmark, we compared the model against other function-calling models including GPT-4, GPT-3.5, Firefunctions, Together.ai, and Anyscale. For Together.ai and Anyscale, we used mistralai/Mixtral-8x7B-Instruct-v0.1, as it represents their best offering. Empower-functions consistently delivers superior performance in all scenarios, especially on the multi-turn and parallel-calling datasets, which are closer to real-world use cases.

How to Use

We have made the model generally available on our platform today. You can experiment with our live demo for a hands-on experience with the model in a real-world use case. To use the model in your project, simply sign up for an account and obtain an API key. We also provide free credits for your trial; see our quick start guide.

The completion API we provide is fully compatible with the OpenAI API, allowing you to use the empower-functions model as a drop-in replacement. More details can be found in our function calling documentation.
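As a rough illustration of what "drop-in replacement" means in practice, the sketch below builds a request body in the OpenAI chat completions format and executes the tool calls found in an assistant reply. Several things here are assumptions rather than documented values: the model identifier `"empower-functions"`, the `book_appointment` tool and its parameters, and the helper names `build_request` and `run_tool_calls`. No network call is made; the function calling documentation has the actual endpoint and model names.

```python
import json
from typing import Any, Callable, Dict, List

# Hypothetical tool definition in the OpenAI "tools" format.
BOOK_APPOINTMENT_TOOL: Dict[str, Any] = {
    "type": "function",
    "function": {
        "name": "book_appointment",
        "description": "Book a doctor appointment.",
        "parameters": {
            "type": "object",
            "properties": {
                "doctor": {"type": "string"},
                "time": {"type": "string", "description": "ISO 8601 start time"},
            },
            "required": ["doctor", "time"],
        },
    },
}


def build_request(
    model: str, messages: List[Dict[str, Any]], tools: List[Dict[str, Any]]
) -> Dict[str, Any]:
    """Assemble the JSON body for an OpenAI-style chat completions request."""
    return {"model": model, "messages": messages, "tools": tools}


def run_tool_calls(
    assistant_message: Dict[str, Any],
    registry: Dict[str, Callable[..., Any]],
) -> List[Dict[str, Any]]:
    """Execute each tool call and return the "tool" messages for the next turn."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        fn = registry[call["function"]["name"]]
        # In the OpenAI format, arguments arrive as a JSON-encoded string.
        args = json.loads(call["function"]["arguments"])
        results.append(
            {
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(fn(**args)),
            }
        )
    return results
```

Because a single assistant message may carry several entries in `tool_calls`, the loop handles parallel calls the same way as a single call. The returned messages are appended to the conversation before the next request.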

