Mixtral of Experts
Mixtral 8x7B — Mistral AI — 2024-01
Mistral AI's sparse mixture-of-experts model with eight feed-forward expert blocks per layer and a router that activates two of them per token. The report introduced an open-weights MoE architecture with strong performance at low active-parameter cost: roughly 13B of 47B total parameters are used per token.
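For intuition, here is a minimal sketch of top-2 routing over eight experts, assuming a PyTorch-style implementation; the class name `MoELayer`, its parameters, and the expert block shape are illustrative, not taken from Mixtral's actual code.

```python
# Hypothetical sketch of top-2 sparse mixture-of-experts routing,
# in the spirit of the architecture described in the report.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, dim: int, hidden: int, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(dim, n_experts, bias=False)  # router
        # Each expert is an independent feed-forward block.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, dim))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, dim). Route each token to its top-k experts.
        logits = self.gate(x)                           # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)  # (tokens, top_k)
        weights = F.softmax(weights, dim=-1)            # renormalize over the chosen k
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                # Tokens whose k-th routing choice is expert e.
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out
```

With eight experts and `top_k=2`, only two expert feed-forward blocks run per token, which is why the active-parameter count per token stays far below the model's total parameter count.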
References
- arXiv: arxiv.org/abs/2401.04088
- Org page: Mistral AI
- Released: 2024-01
Credited authors (26)