
Mistral AI of Experts

Mixtral 8x7B, a Sparse Mixture of Experts (SMoE) language model


About Mistral AI of Experts

Mistral AI introduces Mixtral 8x7B, a Sparse Mixture of Experts (SMoE) language model. Mixtral has the same architecture as Mistral 7B, with the difference that each layer is composed of 8 feedforward blocks (i.e. experts). For every token, at each layer, a router network selects two experts to process the current state and combines their outputs. This technique increases the number of parameters of a model while controlling cost and latency, as the model only uses a fraction of the total set of parameters per token.
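
To make the routing described above concrete, here is a minimal, illustrative sketch of a top-2 sparse MoE layer in PyTorch. The class and argument names (MoELayer, n_experts, top_k, hidden) are assumptions chosen for this example and are not Mixtral's actual implementation.

```python
# Minimal sketch of top-2 expert routing, assuming a simple per-token router.
# Illustrative only; not Mixtral's production code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, dim: int, hidden: int, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, n_experts, bias=False)  # router network
        self.experts = nn.ModuleList([                        # 8 feedforward blocks
            nn.Sequential(nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, dim))
            for _ in range(n_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (n_tokens, dim). For each token, pick the 2 highest-scoring experts.
        logits = self.router(x)                                # (n_tokens, n_experts)
        weights, idx = torch.topk(logits, self.top_k, dim=-1)  # top-2 per token
        weights = F.softmax(weights, dim=-1)                   # normalise the 2 gates
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                          # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, k:k+1] * expert(x[mask])
        return out

# Usage: route a batch of 4 token states through the layer.
layer = MoELayer(dim=16, hidden=64)
print(layer(torch.randn(4, 16)).shape)  # torch.Size([4, 16])
```

Only the two selected expert feedforward blocks run for each token, which is how the total parameter count grows while the per-token compute stays bounded.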

Key Features

4 features
  • Sparse Mixture of Experts (SMoE) architecture.
  • Router network selects two experts per token.
  • Increases the number of parameters while controlling cost and latency.
  • High performance on MT-Bench, TruthfulQA, and BBQ benchmarks.

Use Cases

4 use cases
  • Natural language processing.
  • Text generation.
  • Machine learning research.
  • Model fine-tuning.
Added April 23, 2024