Pick a topic + a reading angle. Get a tailored digest at the time you choose. Here's the whole flow.
Same shape for everyone. The only things that change between users are the categories you subscribe to and the reader profile you pick.
Five lenses on the same paper. Pick the one that matches how you actually read.
Sample arXiv paper
Albert Q. Jiang, Alexandre Sablayrolles, Antoine Roux + others
Read on arXiv ↗
Researcher
For: Domain specialist active in NLP / LLM research
Mixtral 8x7B extends Mistral 7B's architecture by replacing each layer's single FFN with 8 expert FFNs and a learned top-2 routing function. Total parameter count is 47B; per-token active parameters are 13B. Routing is per-layer per-token, so an expert subset is selected dynamically — this is consistent with the Switch Transformer line but uses top-k=2 rather than top-1, trading slightly more compute for routing stability. Benchmarks: outperforms Llama 2 70B on MMLU (70.6 vs 69.9), HellaSwag (87.0 vs 84.9), and ARC-c (66.0 vs 64.5); matches GPT-3.5 on most reasoning + coding tasks. Multilingual + math gains attributed to the larger effective parameter pool reached through routing. Open-weights release with permissive license is the practitioner-facing contribution alongside the methodology.
Sparse MoE with top-2 routing delivers 70B-class quality at 13B inference cost — and Mistral shipped the weights, so the comparison reproduces.
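As a rough illustration of the top-2 routing described in the sample digest above, here is a minimal Python sketch. It assumes the gate weights are a softmax over the two selected router logits, and it uses toy scaling functions in place of real expert FFNs. The names `top2_gates`, `moe_layer`, and `make_scaling_expert`, along with the router weights, are hypothetical and chosen for illustration; none of this is code from the paper or from DIGEST.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Expert = Callable[[Sequence[float]], Vector]


def top2_gates(router_logits: Sequence[float]) -> List[Tuple[int, float]]:
    """Return (expert_index, gate_weight) for the two highest-scoring experts.

    Assumption: the weights are a softmax over the two selected logits only,
    so they always sum to 1.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)[:2]
    peak = max(router_logits[i] for i in ranked)  # subtracted for numerical stability
    exps = [math.exp(router_logits[i] - peak) for i in ranked]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(ranked, exps)]


def moe_layer(token: Sequence[float],
              router_weights: Sequence[Sequence[float]],
              experts: Sequence[Expert]) -> Vector:
    """Route one token through its top-2 experts and mix their outputs."""
    # One router logit per expert: dot product of the token with that expert's row.
    logits = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    output = [0.0] * len(token)
    for index, gate in top2_gates(logits):
        expert_out = experts[index](token)  # only the two selected experts run
        output = [acc + gate * val for acc, val in zip(output, expert_out)]
    return output


def make_scaling_expert(scale: float) -> Expert:
    """Toy stand-in for an expert FFN: multiplies the token by a constant."""
    return lambda token: [scale * x for x in token]


# Eight toy experts, mirroring the 8-expert layout in the sample digest.
experts = [make_scaling_expert(float(k)) for k in range(8)]

# Hypothetical router weights: experts 3 and 6 score this token equally high.
router_weights = [[5.0, 0.0] if k in (3, 6) else [0.0, 0.0] for k in range(8)]

print(moe_layer([1.0, 0.0], router_weights, experts))  # → [4.5, 0.0]
```

In this toy run, experts 3 and 6 each receive a gate weight of 0.5, so the output is the average of their two results while the other six experts are never evaluated. That is the sense in which per-token compute stays well below the total parameter count.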

We support every category arXiv publishes. Browse the full taxonomy below; pick what you read.
cs.AI: Artificial Intelligence. AI methods that don't fit narrower categories: planning, reasoning, agents, general ML applications.
cs.LG: Machine Learning. Statistical and theoretical learning, neural architectures, training methods, optimisation.
cs.CL: Computation and Language. NLP, transformers, multilingual modelling, dialog systems, retrieval-augmented generation.
cs.CV: Computer Vision and Pattern Recognition. Image understanding, video analysis, generative imagery, 3D reconstruction.
cs.IR: Information Retrieval. Search, ranking, recommendation systems, knowledge graphs.
cs.CR: Cryptography and Security. Cryptographic protocols, system security, adversarial ML, privacy-preserving computation.
math.ST: Statistics Theory. Probability theory, statistical inference, hypothesis testing, estimation.
math.OC: Optimization and Control. Convex / non-convex optimisation, control theory, operations research.
physics.comp-ph: Computational Physics. Numerical simulation, ML for physics, high-performance scientific computing.
astro-ph: Astrophysics. Cosmology, stellar physics, exoplanets, gravitational-wave detection.
q-bio.QM: Quantitative Methods. Statistical / ML methods applied to biological data: genomics, proteomics, neuroscience.
stat.ML: Machine Learning (Statistics). ML from a statistical lens: Bayesian methods, causal inference, uncertainty quantification.
econ.GN: General Economics. Behavioural economics, market design, mechanism theory, with frequent ML overlap.
We surface connections between papers across your categories — what links to what, before you start reading. The example below uses the Mixtral paper and two related works.
Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch
Establishes the mixture-of-experts routing pattern Mixtral builds on. It uses top-1 routing; Mixtral uses top-2, accepting slightly higher per-token compute in exchange for more stable gradients.
William Fedus, Barret Zoph, Noam Shazeer
arXiv ↗
Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch
Direct predecessor — Mixtral keeps Mistral 7B's overall architecture and replaces only the FFN block with the 8-expert SMoE layer, isolating the contribution of routing from other architectural choices.
arXiv ↗
Pick the rhythm that matches your week.
Steady drip — small batches, every weekday morning. Good if you read arXiv as part of your daily routine.
One bigger digest at the start of the week. Good if you batch your reading and don't want a daily inbox ping.
Honest boundaries. If DIGEST isn't right for your use case, we'd rather you find that out before signing up.
We only surface arXiv content. We don't scrape paywalled journals or PDFs that sit behind a login.
Patents follow a different reading mode. We don't include them in any digest.
We focus on papers from the last 30 days. Historical archive search isn't our scope.
We summarise English-language abstracts. Multi-language coverage is post-launch.
We don't distinguish between preprints and peer-reviewed publications. arXiv treats them similarly; so do we.