Model Sets Overview
Overview
If this is your first time here, we recommend reading our Mission Statement before starting with Model Sets.
Model Sets are critical components of Influxion's adaptive orchestration. They are where you configure your application requirements and monitor deployment behavior. You can think of a Model Set as a virtual model that is customized for your application and evolves with it.
A Model Set is, at its core, a collection of model deployments and behavioral requirements. Once configured, Influxion's intelligent routing determines the best models to use to achieve better results than can be accomplished with any single model. Influxion uses a control system to constantly monitor model behaviors and adjust model routing to satisfy your custom requirements, e.g., based on:
- Performance metrics
- Cost optimization
- Reliability and availability
- Accuracy metrics from integrated eval platforms
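To make the "collection of model deployments and behavioral requirements" idea concrete, here is a minimal sketch of how a Model Set could be represented as data. It is not Influxion's actual schema or API: the class names, field names, provider names, and model names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


# All names below are hypothetical; this is NOT Influxion's real schema.
@dataclass(frozen=True)
class Deployment:
    provider: str  # e.g., a hosted API provider or a self-hosted cluster
    model: str     # model identifier at that provider


@dataclass(frozen=True)
class Constraint:
    dimension: str      # the behavior being limited, e.g., "latency_seconds"
    upper_limit: float  # read as: dimension <= upper_limit


@dataclass
class ModelSet:
    name: str
    deployments: list[Deployment]
    objective: str  # the single dimension to optimize, e.g., "cost"
    constraints: list[Constraint] = field(default_factory=list)


# A Model Set that mixes a cheaper model with a frontier model,
# minimizes cost, and limits latency to 5 seconds.
example = ModelSet(
    name="support-bot",
    deployments=[
        Deployment(provider="provider-a", model="small-model"),
        Deployment(provider="provider-b", model="frontier-model"),
    ],
    objective="cost",
    constraints=[Constraint(dimension="latency_seconds", upper_limit=5.0)],
)
```

The point of the sketch is the shape: a list of deployments plus a set of requirements, which the routing layer is then free to satisfy however it can.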
Designing Model Sets
What do you care about?
Just as there's no single answer to this question, there's no single best way to design a Model Set. However, there are certain properties that make some Model Sets more useful than others.
A good Model Set usually includes several models that exhibit tradeoffs in the dimensions that you care about. Model and application workload behaviors fluctuate over both short and long time periods, so having headroom in which to operate lets Influxion's adaptive routing optimize more effectively while still satisfying your requirements. Your application requirements may also change over time, e.g., to reduce costs or improve performance.
A Model Set might include several models from the same provider with different cost/accuracy tradeoffs, for example, some cheaper, less accurate models in combination with some more expensive frontier models.
Alternatively, a Model Set might include models from multiple providers to be more robust to provider system errors or rate limits.
Open source models can enable more predictable latency, cost, and accuracy tradeoffs. For example, a Model Set might include several models derived from a common base model, e.g., with different parameter counts and/or quantization levels.
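One informal way to check whether candidate models "exhibit tradeoffs" is to look for models that are dominated, i.e., no better than some other candidate on any dimension and strictly worse on at least one. The sketch below does this for two dimensions, cost and accuracy. The model names and numbers are made up for illustration, and this is an offline design aid, not a description of how Influxion's routing works.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str
    cost: float      # lower is better (illustrative units)
    accuracy: float  # higher is better (illustrative score in [0, 1])


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on both dimensions and strictly better on one."""
    no_worse = a.cost <= b.cost and a.accuracy >= b.accuracy
    strictly_better = a.cost < b.cost or a.accuracy > b.accuracy
    return no_worse and strictly_better


def pareto_front(candidates: list[Candidate]) -> list[Candidate]:
    """Keep only candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Made-up numbers, purely for illustration.
candidates = [
    Candidate("small", cost=0.2, accuracy=0.70),
    Candidate("medium", cost=1.0, accuracy=0.80),
    Candidate("medium-alt", cost=1.2, accuracy=0.78),  # costs more and scores lower than "medium"
    Candidate("frontier", cost=5.0, accuracy=0.92),
]

front = pareto_front(candidates)
print([c.name for c in front])
```

A dominated model is not necessarily useless in a Model Set. It may still add robustness, for example if it comes from a different provider, but it adds no cost/accuracy tradeoff of its own.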
Objective and Constraints
Good Model Sets also specify desired behaviors that are achievable.
Model Sets support an objective dimension that will be optimized during deployment.
Objectives are minimized or maximized, depending on the dimension.
For example, latency and cost dimensions are minimized, whereas throughput dimensions like Tokens Per Second are maximized.
Objectives are optimized subject to constraints, i.e., constraints are the stricter requirements.
Constraints are quantifiable limits that Influxion's adaptive routing should respect.
They are specified as inequalities, e.g., latency ≤ 5 seconds.
While multiple constraints may be specified, be aware that each constraint limits options at runtime.
Especially for beginners, we recommend starting simple, e.g., with 0–2 constraints, and building up from there.
Being realistic about your requirements helps you get the most out of Model Sets, and AI deployments in general.
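The relationship between an objective and its constraints can be sketched as a two-step selection: first discard models that violate any constraint, then optimize the objective among the models that remain. The code below is a simplified, one-shot illustration under stated assumptions. Influxion's routing is described above as a control system that adjusts continuously, which this sketch does not attempt to model. All dimension names, model names, and metric values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical dimension names; the direction follows the text above:
# latency and cost are minimized, throughput is maximized.
MINIMIZED = {"latency_s", "cost_usd"}
MAXIMIZED = {"tokens_per_second"}


@dataclass(frozen=True)
class Constraint:
    dimension: str
    op: str        # "<=" or ">="
    limit: float

    def satisfied_by(self, metrics: dict[str, float]) -> bool:
        value = metrics[self.dimension]
        return value <= self.limit if self.op == "<=" else value >= self.limit


def choose(observed: dict[str, dict[str, float]],
           objective: str,
           constraints: list[Constraint]) -> Optional[str]:
    """Return the feasible model with the best objective value, or None if none is feasible."""
    if objective not in MINIMIZED | MAXIMIZED:
        raise ValueError(f"unknown objective dimension: {objective}")
    # Step 1: constraints are the stricter requirements, so apply them first.
    feasible = {name: m for name, m in observed.items()
                if all(c.satisfied_by(m) for c in constraints)}
    if not feasible:
        return None
    # Step 2: optimize the objective among the feasible models.
    pick = min if objective in MINIMIZED else max
    return pick(feasible, key=lambda name: feasible[name][objective])


# Made-up observations, purely for illustration.
observed = {
    "small":    {"latency_s": 1.2, "cost_usd": 0.2, "tokens_per_second": 90.0},
    "medium":   {"latency_s": 3.0, "cost_usd": 1.0, "tokens_per_second": 120.0},
    "frontier": {"latency_s": 6.5, "cost_usd": 5.0, "tokens_per_second": 40.0},
}

latency_limit = Constraint("latency_s", "<=", 5.0)   # latency <= 5 seconds
cheapest = choose(observed, "cost_usd", [latency_limit])
fastest = choose(observed, "tokens_per_second", [latency_limit])
impossible = choose(observed, "cost_usd", [Constraint("latency_s", "<=", 0.5)])
```

In this sketch, an unachievable constraint (latency ≤ 0.5 seconds) leaves no feasible model at all, and every additional constraint can only shrink the feasible set. That is the practical reason to start with few constraints and keep them realistic.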
Evals
Influxion integrates with LLM eval platforms, starting with DeepEval. Model Sets can use a subset of DeepEval metrics, either as part of the behavior settings or simply for monitoring purposes. We currently support:
- Answer Relevancy
- Bias
- Toxicity
- PII Leakage
- Summarization
Evals require additional LLM usage, so they are charged in addition to your Model Set gateway requests, using the same pricing structure.
Get Started with Model Sets
See the Model Sets guide for instructions on creating and managing Model Sets.