Mistral AI of Experts

Mixtral 8x7B, a Sparse Mixture of Experts (SMoE) language model

Pricing

Free

Tool Info

Rating: N/A (0 reviews)

Date Added: April 23, 2024

Categories

Developer Tools, LLMs

Description

Mistral AI introduces Mixtral 8x7B, a Sparse Mixture of Experts (SMoE) language model. Mixtral has the same architecture as Mistral 7B, with the difference that each layer is composed of 8 feedforward blocks (i.e., experts). For every token, at each layer, a router network selects two experts to process the current state and combines their outputs. This technique increases the number of parameters of a model while controlling cost and latency, since only a fraction of the total parameters is used per token. Mixtral achieves high performance on various benchmarks, including MT-Bench, TruthfulQA, and BBQ.
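To illustrate the top-2 routing described above, here is a minimal PyTorch-style sketch of a sparse MoE layer. The class name, layer sizes, and activation function are assumptions chosen for clarity, not Mistral's implementation.

  # Minimal sketch of top-2 sparse MoE routing (illustrative only; sizes and
  # names are assumptions, not Mistral's code).
  import torch
  import torch.nn as nn
  import torch.nn.functional as F

  class Top2MoELayer(nn.Module):
      def __init__(self, dim=512, hidden=2048, num_experts=8, top_k=2):
          super().__init__()
          self.top_k = top_k
          # Router: a linear layer that scores each expert for every token.
          self.router = nn.Linear(dim, num_experts, bias=False)
          # Eight feedforward "expert" blocks with the shape of a standard FFN.
          self.experts = nn.ModuleList(
              nn.Sequential(nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, dim))
              for _ in range(num_experts)
          )

      def forward(self, x):                      # x: (tokens, dim)
          scores = self.router(x)                # (tokens, num_experts)
          weights, idx = scores.topk(self.top_k, dim=-1)
          weights = F.softmax(weights, dim=-1)   # normalize over the 2 chosen experts
          out = torch.zeros_like(x)
          for k in range(self.top_k):
              for e, expert in enumerate(self.experts):
                  mask = idx[:, k] == e          # tokens routed to expert e in slot k
                  if mask.any():
                      out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
          return out

Only the two selected experts run for each token, which is how the layer keeps per-token compute close to a dense model of much smaller size.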

Key Features

  • Sparse Mixture of Experts (SMoE) architecture.
  • Router network selects two experts per token.
  • Increases the total parameter count while controlling per-token cost and latency.
  • High performance on MT-Bench, TruthfulQA, and BBQ benchmarks.

Use Cases

  • Natural language processing.
  • Text generation.
  • Machine learning research.
  • Model fine-tuning.