How DIGEST works

Pick a topic + a reading angle. Get a tailored digest at the time you choose. Here's the whole flow.

From arXiv to your inbox — 4 steps

Same shape for everyone. The only thing that changes between users is which categories you subscribe to and which reader profile you pick.

01 Pick
categories + profile

02 We fetch
every morning

03 We analyze
agents per profile

04 You receive
inbox digest
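The fetch step can be pictured with a small sketch. DIGEST's actual fetcher is not described on this page, so everything here is an assumption: it uses the public arXiv API endpoint and its `search_query`, `sortBy`, `sortOrder`, and `max_results` parameters, and the helper name `build_query_url` is hypothetical. The code only builds the request URL for a set of subscribed categories; it does not call the network.

```python
from urllib.parse import urlencode

# Public arXiv API endpoint (assumed here; DIGEST's own pipeline is not documented).
ARXIV_API = "http://export.arxiv.org/api/query"


def build_query_url(categories, max_results=50):
    """Build an arXiv API URL for the newest submissions in the given categories."""
    if not categories:
        raise ValueError("at least one category is required")
    # Category filters are combined with OR, e.g. "cat:cs.CL OR cat:cs.LG".
    search_query = " OR ".join(f"cat:{c}" for c in categories)
    params = {
        "search_query": search_query,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


# Hypothetical subscription: two categories, 25 newest papers.
url = build_query_url(["cs.CL", "cs.LG"], max_results=25)
```

The response to such a request is an Atom feed, which the later steps would parse and hand to the per-profile analysis.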

Choose your reading angle

5 / 5 lenses

Five lenses on the same paper. Pick the one that matches how you actually read.

Sample arXiv paper

Mixtral of Experts

Albert Q. Jiang, Alexandre Sablayrolles, Antoine Roux + others

Read on arXiv ↗

Researcher

For: Domain specialist active in NLP / LLM research

Mixtral 8x7B extends Mistral 7B's architecture by replacing each layer's single FFN with 8 expert FFNs and a learned top-2 routing function. Total parameter count is 47B; per-token active parameters are 13B. Routing is per-layer per-token, so an expert subset is selected dynamically — this is consistent with the Switch Transformer line but uses top-k=2 rather than top-1, trading slightly more compute for routing stability. Benchmarks: outperforms Llama 2 70B on MMLU (70.6 vs 69.9), HellaSwag (87.0 vs 84.9), and ARC-c (66.0 vs 64.5); matches GPT-3.5 on most reasoning + coding tasks. Multilingual + math gains attributed to the larger effective parameter pool reached through routing. Open-weights release with permissive license is the practitioner-facing contribution alongside the methodology.

Sparse MoE with top-2 routing delivers 70B-class quality at 13B inference cost — and Mistral shipped the weights, so the comparison reproduces.
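To make the top-2 routing in the sample digest concrete, here is a minimal pure-Python sketch for a single token. It is an illustration under stated assumptions, not Mixtral's code: the router logits are given directly rather than computed by a learned layer, and the eight "experts" are hypothetical stand-ins that only scale their input. What it shows is the mechanism the digest describes: pick the two highest-scoring experts, softmax over just those two scores, and run only the selected experts.

```python
import math


def top2_moe(x, router_logits, experts):
    """Route one token through its two highest-scoring experts.

    x: token vector (list of floats)
    router_logits: one score per expert (a learned function of x in a real model)
    experts: list of callables standing in for the expert FFNs
    """
    # Indices of the two largest router logits.
    top2 = sorted(range(len(router_logits)),
                  key=lambda i: router_logits[i], reverse=True)[:2]
    # Softmax over the two selected logits only, so the gate weights sum to 1.
    peak = max(router_logits[i] for i in top2)
    exps = [math.exp(router_logits[i] - peak) for i in top2]
    total = sum(exps)
    gates = [e / total for e in exps]
    # Only the selected experts run; the remaining experts are never called.
    mixed = [0.0] * len(x)
    for gate, idx in zip(gates, top2):
        out = experts[idx](x)
        mixed = [m + gate * o for m, o in zip(mixed, out)]
    return top2, gates, mixed


# Toy setup: 8 hypothetical experts that scale the input by 1..8.
calls = []  # records which experts actually ran


def make_expert(scale):
    def expert(x):
        calls.append(scale)
        return [scale * v for v in x]
    return expert


experts = [make_expert(s) for s in range(1, 9)]
logits = [0.2, 1.0, -0.5, 3.0, 0.1, -1.2, 0.7, 0.0]
chosen, gates, y = top2_moe([1.0, 2.0], logits, experts)
```

Because only two of the eight experts are evaluated per token, the per-token compute tracks the active parameter count rather than the total, which is the 13B versus 47B distinction in the digest.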

150+ arXiv categories

13 / 150+ categories

We support every category arXiv publishes. A sample of 13 is listed below; the full taxonomy is on arXiv. Pick what you read.

Computer Science

6 categories

cs.AI Artificial Intelligence

AI methods that don't fit narrower categories — planning, reasoning, agents, general ML applications.

cs.LG Machine Learning

Statistical and theoretical learning, neural architectures, training methods, optimisation.

cs.CL Computation and Language

NLP, transformers, multilingual modelling, dialog systems, retrieval-augmented generation.

cs.CV Computer Vision

Image understanding, video analysis, generative imagery, 3D reconstruction.

cs.IR Information Retrieval

Search, ranking, recommendation systems, knowledge graphs.

cs.CR Cryptography and Security

Cryptographic protocols, system security, adversarial ML, privacy-preserving computation.

Mathematics

2 categories

math.ST Statistics Theory

Probability theory, statistical inference, hypothesis testing, estimation.

math.OC Optimization and Control

Convex / non-convex optimisation, control theory, operations research.

Physics

2 categories

physics.comp-ph Computational Physics

Numerical simulation, ML for physics, high-performance scientific computing.

astro-ph Astrophysics

Cosmology, stellar physics, exoplanets, gravitational-wave detection.

Quantitative Biology

1 category

q-bio.QM Quantitative Methods

Statistical / ML methods applied to biological data — genomics, proteomics, neuroscience.

Statistics

1 category

stat.ML Machine Learning (Statistics)

ML from a statistical lens — Bayesian methods, causal inference, uncertainty quantification.

Economics

1 category

econ.GN General Economics

Behavioural economics, market design, mechanism theory — frequent ML overlap.

Browse the full list on arXiv →

Cross-reference analysis — Pro

Pro

We surface connections between papers across your categories — what links to what, before you start reading. The example below uses the Mixtral paper and two related works.

Mixtral of Experts

Albert Q. Jiang, Alexandre Sablayrolles, Antoine Roux + others

arXiv ↗

Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity

William Fedus, Barret Zoph, Noam Shazeer

Establishes the mixture-of-experts routing pattern Mixtral builds on. Switch uses top-1 routing; Mixtral uses top-2, accepting slightly higher per-token compute in exchange for more stable gradients.

arXiv ↗

Mistral 7B

Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch

Direct predecessor: Mixtral keeps Mistral 7B's overall architecture and replaces only the FFN block with the 8-expert SMoE layer, isolating the contribution of routing from other architectural choices.

arXiv ↗

Daily or weekly

Pick the rhythm that matches your week.

Daily

9:00 AM in your timezone

Steady drip — small batches, every weekday morning. Good if you read arXiv as part of your daily routine.

Weekly

Monday morning

One bigger digest at the start of the week. Good if you batch your reading and don't want a daily inbox ping.
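The two rhythms above can be expressed as a small scheduling sketch. This is not DIGEST's scheduler; the function name `next_delivery` and the `cadence` values are hypothetical, and the 9:00 send time for the weekly digest is an assumption (the page only says "Monday morning"). Given a timezone-aware "now", it returns the next send time: the next weekday at 9:00 for daily, the next Monday at 9:00 for weekly.

```python
from datetime import datetime, time, timedelta, timezone


def next_delivery(now, cadence, send_at=time(9, 0)):
    """Return the next send time at or after `now` in the user's own timezone.

    now: timezone-aware datetime in the user's timezone
    cadence: "daily" (weekdays only) or "weekly" (Mondays); hypothetical labels
    """
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    # Start from today's send time in the user's local wall-clock time.
    candidate = now.replace(hour=send_at.hour, minute=send_at.minute,
                            second=0, microsecond=0)
    if candidate < now:
        candidate += timedelta(days=1)
    if cadence == "daily":
        # Skip Saturday (5) and Sunday (6).
        while candidate.weekday() >= 5:
            candidate += timedelta(days=1)
    elif cadence == "weekly":
        # Advance to Monday (0).
        while candidate.weekday() != 0:
            candidate += timedelta(days=1)
    else:
        raise ValueError(f"unknown cadence: {cadence!r}")
    return candidate


# Example: a user at UTC-5 checks on Friday 8 March 2024 at 10:30 local time.
user_tz = timezone(timedelta(hours=-5))
friday = datetime(2024, 3, 8, 10, 30, tzinfo=user_tz)
daily_next = next_delivery(friday, "daily")
weekly_next = next_delivery(friday, "weekly")
```

A fixed UTC offset keeps the example self-contained. For real users, a `zoneinfo.ZoneInfo` object such as `ZoneInfo("Europe/Berlin")` can be passed as the `tzinfo` instead, so that 9:00 stays 9:00 across daylight-saving changes.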

What we don't do

5 known limits

Honest boundaries. If DIGEST isn't right for your use case, we'd rather you find that out before signing up.

No closed journals
We only surface arXiv content. We don't scrape paywalled journals or behind-login PDFs.
No patents
Patents follow a different reading mode. We don't include them in any digest.
Recent preprints only
We focus on papers from the last 30 days. Historical archive search isn't our scope.
English only at launch
We summarise English-language abstracts. Multi-language coverage is post-launch.
Preprints + accepted papers
We don't distinguish between preprints and peer-reviewed publications. arXiv treats them similarly; so do we.
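The 30-day recency limit in the list above amounts to a simple date filter. A minimal sketch, assuming each paper carries a timezone-aware publication timestamp; the helper name `is_recent` and the `window_days` parameter are hypothetical, not part of DIGEST.

```python
from datetime import datetime, timedelta, timezone


def is_recent(published, now, window_days=30):
    """True if `published` falls within the last `window_days` days before `now`."""
    age = now - published
    # Reject future timestamps as well as anything older than the window.
    return timedelta(0) <= age <= timedelta(days=window_days)


# Example timestamps (illustrative only).
now = datetime(2024, 3, 31, 12, 0, tzinfo=timezone.utc)
fresh = datetime(2024, 3, 10, 8, 0, tzinfo=timezone.utc)
stale = datetime(2024, 1, 15, 8, 0, tzinfo=timezone.utc)
kept = [p for p in (fresh, stale) if is_recent(p, now)]
```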
