
Mixtral of Experts

Mixtral 8x7B — Mistral AI — 2024-01

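Mistral's sparse mixture-of-experts (SMoE) model. Each layer has eight expert feed-forward blocks, and a router selects two of them for every token. The report introduced an open-weights MoE architecture in which each token has access to roughly 47B parameters but uses only about 13B at inference, and it reports results that match or exceed Llama 2 70B and GPT-3.5 on most of the benchmarks evaluated.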
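
The routing described above (eight experts per layer, two active per token) can be sketched as follows. This is a minimal illustration in plain Python, not Mistral's implementation: the function name top2_moe, the toy experts (simple scalings rather than real feed-forward blocks), and the gate weights are all assumptions made for the example. It assumes the gate weights are formed by a softmax over the two selected router logits.

```python
import math

def top2_moe(x, gate_weights, experts):
    """Route one token vector through the 2 highest-scoring experts.

    Hypothetical helper for illustration; names and shapes are assumptions.
    """
    # Router logits: one score per expert (dot product of the token
    # with that expert's gate row).
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    # Indices of the two largest logits, best first.
    top2 = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:2]
    # Softmax restricted to the two selected logits (shifted for stability).
    m = max(logits[i] for i in top2)
    exps = [math.exp(logits[i] - m) for i in top2]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the two selected experts' outputs.
    # The other six experts are never evaluated for this token.
    out = [0.0] * len(x)
    for w, i in zip(weights, top2):
        y = experts[i](x)
        out = [o + w * yi for o, yi in zip(out, y)]
    return out, top2, weights

# Toy setup (assumed): 8 experts, expert k multiplies its input by k + 1.
experts = [lambda v, k=k: [(k + 1) * vi for vi in v] for k in range(8)]
# Toy gate (assumed): expert k's logit is k times the first input feature.
gate_weights = [[float(k), 0.0] for k in range(8)]

out, chosen, weights = top2_moe([1.0, 2.0], gate_weights, experts)
print(chosen, weights, out)
```

Only the two chosen experts run for a given token, which is why the per-token compute tracks the active parameter count and not the total.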

References

arXiv arxiv.org/abs/2401.04088
Org page Mistral AI
Released 2024-01

Credited authors (26)

