Reports
Technical reports are treated here as public evidence trails: a way to connect names, organizations, and moments in the LLM timeline.
OlmoEarth v1.1: A more efficient family of OlmoEarth models
Ai2
Earth Observation Foundation Models · 2605.20804 · 2026-05-20
Qwen-Image-VAE-2.0 Technical Report
Alibaba Qwen
Image Generation / Vision Models · 2605.13565 · 2026-05-13
Qwen-Image-2.0 Technical Report
Alibaba Qwen
Image Generation / Vision Models · 2605.10730 · 2026-05-11
EMO: Pretraining Mixture of Experts for Emergent Modularity
Ai2
Mixture-of-Experts Language Models · 2605.06663 · 2026-05-07
MolmoAct2: Action Reasoning Models for Real-world Deployment
Ai2
Vision-Language-Action Models · 2605.02881 · 2026-05-04
Nemotron 3 Super: Open, efficient mixture-of-experts hybrid mamba-transformer model for agentic reasoning
NVIDIA
Reasoning Models · 2604.12374 · 2026-04-14
MedGemma 1.5 Technical Report
Google Gemini
Medical Multimodal Models · 2604.05081 · 2026-04-06
Olmo Hybrid: From Theory to Practice and Back
Ai2
Large Language Models · 2604.03444 · 2026-04-03
Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation
NVIDIA
Reasoning Models · 2603.19220 · 2026-03-19
GLM-OCR Technical Report
Z.ai
OCR / Document Intelligence Models · 2603.10910 · 2026-03-11
Sparse-BitNet: 1.58-bit LLMs are Naturally Friendly to Semi-Structured Sparsity
Microsoft
Large Language Models · 2603.05168 · 2026-03-05
Phi-4-reasoning-vision-15B Technical Report
Microsoft
Reasoning Models · 2603.03975 · 2026-03-04
GLM-5: Thinking, Coding, and Agentic Intelligence
Z.ai
Large Language Models · 2602.15763 · 2026-02-17
Nemotron ColEmbed V2: Top-Performing Late Interaction Embedding Models for Visual Document Retrieval
NVIDIA
Retrieval Embedding Models · 2602.03992 · 2026-02-03
Kimi K2.5: Visual Agentic Intelligence
Moonshot AI
Multimodal Agentic Models · 2602.02276 · 2026-02-02
Qwen3-ASR Technical Report
Alibaba Qwen
Speech and Audio Models · 2601.21337 · 2026-01-29
Qwen3-TTS Technical Report
Alibaba Qwen
Speech and Audio Models · 2601.15621 · 2026-01-22
TranslateGemma Technical Report
Google Gemini
Translation Models · 2601.09012 · 2026-01-13
Constitutional Classifiers++: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming
Anthropic
Alignment and Safety · 2601.04603 · 2026-01-08
MAI-UI Technical Report: Real-World Centric Foundation GUI Agents
Microsoft
Multimodal Agentic Models · 2512.22047 · 2025-12-26
NVIDIA Nemotron 3: Efficient and Open Intelligence
NVIDIA
Large Language Models · 2512.20856 · 2025-12-24
Nemotron 3 Nano: Open, efficient mixture-of-experts hybrid mamba-transformer model for agentic reasoning
NVIDIA
Reasoning Models · 2512.20848 · 2025-12-23
Seed-Prover-1.5: Stronger Training-Time and Test-Time Scaling for Neural Theorem Proving
ByteDance Seed
Reasoning Models · 2512.17260 · 2025-12-19
Nemotron-Math: Efficient Long-Context Distillation of Mathematical Reasoning from Multi-Mode Supervision
NVIDIA
Mathematical Reasoning Models · 2512.15489 · 2025-12-17
Olmo 3
Ai2
Large Language Models · 2512.13961 · 2025-12-15
LFM2 Technical Report
Liquid AI
Foundation Models · 2511.23404 · 2025-12-01
Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models
NVIDIA
Large Language Models · 2511.18890 · 2025-11-24
Step-Audio-EditX Technical Report
Stepfun
Speech and Audio Models · 2511.03601 · 2025-11-05
CWM: An Open-Weights LLM for Research on Code Generation with World Models
Meta AI
Code Language Models · 2509.12054 · 2025-09-24
EmbeddingGemma: Open Models for Text Similarity Search
Google Gemini
Text Embedding Models · 2509.20354 · 2025-09-24
Qwen3-Omni Technical Report
Alibaba Qwen
Multimodal Models · 2509.17765 · 2025-09-22
Continuous Audio Language Models
Kyutai
Speech and Audio Models · 2509.06926 · 2025-09-08
NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model
NVIDIA
Reasoning Models · 2508.14444 · 2025-08-20
xLAM-2 Technical Report
Salesforce AI Research
Agentic Language Models · 2508.14935 · 2025-08-20
MolmoAct: Action Reasoning Models that can Reason in Space
Ai2
Vision-Language-Action Models · 2508.07917 · 2025-08-11
GLM-4.5: Agentic, Reasoning, and Coding Foundation Models
Z.ai
Language Models · 2508.06471 · 2025-08-08
Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving with Tree Search and Reinforcement Learning
ByteDance Seed
Reasoning Models · 2507.23726 · 2025-07-30
Seed-Thinking-v1.5: Advancing Superb Reasoning Models with Reinforcement Learning
ByteDance Seed
Reasoning Models · 2507.19849 · 2025-07-24
Step-Audio 2: Cascaded Multimodal Large Language Models with Versatile Speech Capabilities
Stepfun
Audio Language Models · 2507.16632 · 2025-07-22
Voxtral Technical Report
Mistral AI
Speech Language Models · 2507.13264 · 2025-07-17
Apple Intelligence Foundation Language Models: Tech Report 2025
Apple
Multimodal Language Models · 2507.13575 · 2025-07-16
FlexOlmo: Open Language Models for Flexible Data Use
Ai2
Large Language Models · 2507.07024 · 2025-07-09
TxGemma: Open Therapeutic Language Models
Google Gemini
Biomedical Language Models · 2507.07023 · 2025-07-09
Evaluating the Critical Risks of Amazon's Nova Premier under the Frontier Model Safety Framework
Amazon
Model Safety / System Cards · 2507.06260 · 2025-07-07
MedGemma Technical Report
Google Gemini
Medical Multimodal Models · 2507.05201 · 2025-07-07
GLM-4.1V-Thinking and GLM-4.5V: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
Z.ai
Multimodal Models · 2507.01006 · 2025-07-01
ERNIE 4.5 Tiny Technical Report
Baidu
Language Models · 2025-06-30
ERNIE 4.5 Technical Report
Baidu
Multimodal Language Models · 2025-06-30
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention
MiniMax
Reasoning Large Language Models · 2506.13585 · 2025-06-16
Magistral: Efficient Training of Small Language Models for Reasoning
Mistral AI
Reasoning Models · 2506.10910 · 2025-06-12
Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models
Alibaba Qwen
Text Embeddings and Retrieval · 2506.05176 · 2025-06-05
Seed-Coder: Let the Code Model Curate Data for Itself
ByteDance Seed
Code Models · 2506.03524 · 2025-06-04
On Gemini Diffusion
Google Gemini
Diffusion Language Models · 2505.20099 · 2025-05-27
Gemma 3n Technical Report
Google Gemini
Multimodal Large Language Models · 2025-05-20
Amazon Nova Sonic Technical Report
Amazon
Speech Language Models · 2505.11298 · 2025-05-15
Qwen3 Technical Report
Alibaba Qwen
Large Language Models · 2505.09388 · 2025-05-14
Aya Vision: Advancing the Frontier of Multilingual Multimodality
Cohere
Multimodal Language Models · 2505.08751 · 2025-05-13
MiniMax-Speech: Intrinsic Zero-Shot Speech Understanding for Advanced Foundation Models
MiniMax
Speech Language Models · 2505.07916 · 2025-05-12
Seed1.5-VL Technical Report
ByteDance Seed
Vision-Language Models · 2505.07062 · 2025-05-11
DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning and Monte-Carlo Tree Search with Proof Assistant Feedback
DeepSeek
Mathematical Reasoning Models · 2504.21801 · 2025-04-30
Amazon Nova Premier Technical Report
Amazon
Large Language Models · 2025-04-30
Phi-4-mini-reasoning: Exploring the Limits of Small Reasoning Language Models in Math
Microsoft
Reasoning Models · 2504.21233 · 2025-04-29
Phi-4-reasoning Technical Report
Microsoft
Reasoning Models · 2504.21318 · 2025-04-29
BitNet b1.58 2B4T Technical Report
Microsoft
Large Language Models · 2504.12285 · 2025-04-16
Nemotron-CrossThink: Efficient Knowledge Distillation of Long Chain-of-Thought Reasoning
NVIDIA
Reasoning Models · 2504.13941 · 2025-04-15
GLM-Z1-Rumination: An Open Frontier-Class Reasoning Model Through Test-Time Scaling
Z.ai
Reasoning Models · 2025-04-15
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
Shanghai AI Laboratory
Vision-Language Models · 2504.10479 · 2025-04-14
Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning
ByteDance Seed
Reasoning Models · 2504.13914 · 2025-04-10
Kimi-VL Technical Report
Moonshot AI
Vision-Language Models · 2504.07491 · 2025-04-10
Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models
NVIDIA
Large Language Models · 2504.03624 · 2025-04-04
Hunyuan-T1: Scaling Up Test-Time Compute with Open-Source Reinforcement Learning
Tencent Hunyuan
Reasoning Models · 2504.02234 · 2025-04-03
ShieldGemma 2: Robust and Tractable Image Content Moderation
Google Gemini
Safety and Moderation Models · 2504.01081 · 2025-04-01
Command A: An Enterprise-Ready Large Language Model
Cohere
Large Language Models · 2504.00698 · 2025-04-01
Mistral Small 3.1 Technical Report
Mistral AI
Large Language Models · 2503.23335 · 2025-03-31
Tracing the thoughts of a large language model
Anthropic
Interpretability · 2025-03-27
On the Biology of a Large Language Model
Anthropic
Interpretability · 2025-03-27
Gemini Robotics-ER: Transforming Robotic Embodiment
Google Gemini
Robotics · 2503.20031 · 2025-03-27
QwQ-32B: Embracing the Power of Reinforcement Learning
Alibaba Qwen
Reasoning Models · 2503.20735 · 2025-03-27
Gemini Robotics: Bringing AI into the Physical World
Google Gemini
Robotics Multimodal Models · 2503.20020 · 2025-03-27
Gemma 3 Technical Report
Google Gemini
Multimodal Large Language Models · 2503.19786 · 2025-03-25
Qwen2.5-Omni Technical Report
Alibaba Qwen
Multimodal Models · 2503.20215 · 2025-03-23
Falcon-H1: A Family of Hybrid-Head Language Models for Efficient Reasoning
Technology Innovation Institute
Reasoning Models · 2503.16419 · 2025-03-20
The Amazon Nova family of models: Technical report and model card
Amazon
Multimodal Language Models · 2506.12103 · 2025-03-17
EXAONE Deep: Reasoning Enhanced Language Models
LG AI Research
Reasoning Models · 2503.12524 · 2025-03-16
ERNIE-X1 Technical Report
Baidu
Reasoning Models · 2025-03-16
Auditing language models for hidden objectives
Anthropic
Alignment and Safety · 2503.10965 · 2025-03-14
Gemini Embedding: Generalizable Embeddings From Gemini
Google Gemini
Text Embedding Models · 2503.07891 · 2025-03-11
Phi-4 Technical Report
Microsoft
Language Models · 2503.01743 · 2025-03-03
Qwen2.5-VL Technical Report
Alibaba Qwen
Vision-Language Models · 2502.13923 · 2025-02-19
Baichuan-M1: Pushing the Medical Capability of Large Language Models
Baichuan
Medical Language Models · 2502.12671 · 2025-02-18
Magma: A Foundation Model for Multimodal AI Agents
Microsoft
Multimodal Agent Models · 2502.13130 · 2025-02-18
Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming
Anthropic
Alignment and Safety · 2501.18837 · 2025-01-31
Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling
DeepSeek
Multimodal Large Language Models · 2501.17811 · 2025-01-29
Qwen2.5-1M Technical Report
Alibaba Qwen
Language Models · 2501.15383 · 2025-01-26
Scaling Granite Code Models to 128K Context
IBM Research
Code Language Models · 2501.15305 · 2025-01-25
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek
Large Language Models · 2501.12948 · 2025-01-22
UI-TARS: Pioneering Automated GUI Interaction with Native Agents
ByteDance Seed
Multimodal Agentic Models · 2501.12326 · 2025-01-21
Kimi k1.5: Scaling Reinforcement Learning with LLMs
Moonshot AI
Large Language Models · 2501.12599 · 2025-01-21
MiniMax-Text-01
MiniMax
Large Language Models · 2025-01-15
MiniMax-VL-01
MiniMax
Vision-Language Models · 2025-01-15
MiniMax-01: Scaling Foundation Models with Lightning Attention
MiniMax
Large Language Models · 2501.08313 · 2025-01-14
2 OLMo 2 Furious
Ai2
Large Language Models · 2501.00656 · 2024-12-31
DeepSeek-V3 Technical Report
DeepSeek
Large Language Models · 2412.19437 · 2024-12-27
OpenAI o1 System Card
OpenAI
Reasoning Models · 2412.16720 · 2024-12-21
Qwen2.5 Technical Report
Alibaba Qwen
Large Language Models · 2412.15115 · 2024-12-19
Alignment faking in large language models
Anthropic
Alignment and Safety · 2412.14093 · 2024-12-18
FastVLM: Efficient Vision Encoding for Vision Language Models
Apple
Vision-Language Models · 2412.13303 · 2024-12-17
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
DeepSeek
Vision-Language Models · 2412.10302 · 2024-12-12
Large Concept Models: Language Modeling in a Sentence Representation Space
Meta AI
Language Models · 2412.08821 · 2024-12-11
EXAONE 3.5: Series of Language Models for Real-world Use Cases
LG AI Research
Large Language Models · 2412.04862 · 2024-12-06
NVLM: Open Frontier-Class Multimodal LLMs
NVIDIA
Multimodal Language Models · 2412.04468 · 2024-12-05
PaliGemma 2: A Family of Versatile VLMs for Transfer
Google Gemini
Vision-Language Models · 2412.03555 · 2024-12-04
GLM-4-Voice: Towards Intelligent and Human-Like End-to-End Spoken Chatbots
Z.ai
Audio Language Models · 2412.02612 · 2024-12-04
Arctic-Embed 2.0: Multilingual Retrieval Without Compromise
Snowflake
Text Embeddings and Retrieval · 2412.04506 · 2024-12-03
Yi-Lightning Technical Report
01.AI
Large Language Models · 2412.01253 · 2024-12-02
Tulu 3: Pushing Frontiers in Open Language Model Post-Training
Ai2
LLM Post-Training · 2411.15124 · 2024-11-22
JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation
DeepSeek
Vision-Language Models · 2411.07975 · 2024-11-11
Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent
Tencent Hunyuan
Large Language Models · 2411.02265 · 2024-11-04
GPT-4o System Card
OpenAI
Model Safety / System Cards · 2410.21276 · 2024-10-25
Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation
DeepSeek
Vision-Language Models · 2410.13848 · 2024-10-18
Pixtral 12B
Mistral AI
Multimodal Large Language Models · 2410.07073 · 2024-10-09
Falcon Mamba 7B: The First Competitive Attention-free 7B Language Model
Technology Innovation Institute
Large Language Models · 2410.05355 · 2024-10-07
UGROUND-V1: A Fully Open Large Multimodal GUI Agent Model
OSU NLP Group
Multimodal Agentic Models · 2410.05243 · 2024-10-07
Moshi: a speech-text foundation model for real-time dialogue
Kyutai
Speech Language Models · 2410.00037 · 2024-09-30
MM1.5: Methods, Analysis and Insights from Multimodal LLM Fine-tuning
Apple
Multimodal Language Models · 2409.20566 · 2024-09-30
Emu3: Next-Token Prediction is All You Need
BIGAI
Multimodal Models · 2409.18869 · 2024-09-27
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models
Ai2
Vision-Language Models · 2409.17146 · 2024-09-25
Qwen2.5-Math Technical Report
Alibaba Qwen
Reasoning and Math Models · 2409.12122 · 2024-09-18
Qwen2.5-Coder Technical Report
Alibaba Qwen
Code Language Models · 2409.12186 · 2024-09-18
Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution
Alibaba Qwen
Vision-Language Models · 2409.12191 · 2024-09-18
xLAM: A Family of Large Action Models to Empower AI Agent Systems
Salesforce AI Research
Agentic Language Models · 2409.03215 · 2024-09-05
OLMoE: Open Mixture-of-Experts Language Models
Ai2
Large Language Models · 2409.02060 · 2024-09-03
Jamba 1.5 Technical Report
AI21 Labs
Language Models · 2408.12570 · 2024-08-22
XGen-MM (BLIP-3): A Family of Open Large Multimodal Models
Salesforce AI Research
Vision-Language Models · 2408.08872 · 2024-08-16
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search
DeepSeek
Mathematical Reasoning Models · 2408.08152 · 2024-08-14
ShieldGemma: Generative AI Content Moderation Based on Gemma
Google Gemini
Safety and Moderation Models · 2407.21772 · 2024-07-31
Gemma 2: Improving Open Language Models at a Practical Size
Google Gemini
Large Language Models · 2408.00118 · 2024-07-31
The Llama 3 Herd of Models
Meta AI
Large Language Models · 2407.21783 · 2024-07-31
Apple Intelligence Foundation Language Models
Apple
Multimodal Language Models · 2407.21075 · 2024-07-29
Falcon2-11B Technical Report
Technology Innovation Institute
Language Models · 2407.14885 · 2024-07-20
Qwen2-Audio Technical Report
Alibaba Qwen
Audio Language Models · 2407.10759 · 2024-07-14
PaliGemma: A versatile 3B VLM for transfer
Google Gemini
Vision-Language Models · 2407.07726 · 2024-07-10
InternLM-XComposer2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
Shanghai AI Laboratory
Vision-Language Models · 2407.03320 · 2024-07-03
Open Instruct: A Simple Method for Aligning Language Models with Human Preferences
Ai2
Large Language Models · 2406.18405 · 2024-06-26
Nemotron-4 340B Technical Report
NVIDIA
Large Language Models · 2406.11704 · 2024-06-17
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
DeepSeek
Code Language Models · 2406.11931 · 2024-06-17
CodeGemma: Open Code Models Based on Gemma
Google Gemini
Code Language Models · 2406.11409 · 2024-06-17
VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers
Microsoft
Speech and Audio Models · 2406.05370 · 2024-06-08
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data
DeepSeek
Reasoning and Math Models · 2405.14333 · 2024-05-23
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Meta AI
Multimodal Large Language Models · 2405.09818 · 2024-05-16
Arctic-Embed: Scalable, Efficient, and Accurate Text Embedding Models
Snowflake
Text Embeddings and Retrieval · 2405.05374 · 2024-05-08
Granite Code Models: A Family of Open Foundation Models for Code Intelligence
IBM Research
Code Language Models · 2405.04324 · 2024-05-07
DeepSeek-V2 Technical Report
DeepSeek
Large Language Models · 2405.04434 · 2024-05-07
Advancing Multimodal Medical Capabilities of Gemini
Google Gemini
Medical Multimodal Models · 2405.03162 · 2024-05-06
Snowflake Arctic: An Enterprise LLM
Snowflake
Large Language Models · 2405.00492 · 2024-04-30
SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation
ByteDance Seed
Multimodal Models · 2404.14396 · 2024-04-22
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework
Apple
Large Language Models · 2404.14619 · 2024-04-22
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
Microsoft
Language Models · 2404.14219 · 2024-04-22
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models
Google Gemini
Large Language Models · 2404.07839 · 2024-04-11
Jamba: A Hybrid Transformer-Mamba Language Model
AI21 Labs
Language Models · 2403.19887 · 2024-03-28
InternLM2 Technical Report
Shanghai AI Laboratory
Large Language Models · 2403.17297 · 2024-03-26
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
Apple
Multimodal Language Models · 2403.09611 · 2024-03-14
Gemma: Open Models Based on Gemini Research and Technology
Google Gemini
Large Language Models · 2403.08295 · 2024-03-13
DeepSeek-VL: Towards Real-World Vision-Language Understanding
DeepSeek
Vision-Language Models · 2403.05525 · 2024-03-08
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
Google Gemini
Multimodal Models · 2403.05530 · 2024-03-08
Yi: Open Foundation Models by 01.AI
01.AI
Large Language Models · 2403.04652 · 2024-03-07
DBRX: A Generalist Open Source LLM
Databricks
Large Language Models · 2402.19427 · 2024-02-29
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Microsoft
Large Language Models · 2402.17764 · 2024-02-27
Nemotron-4 15B Technical Report
NVIDIA
Large Language Models · 2402.16819 · 2024-02-26
Many-shot Jailbreaking
Anthropic
Alignment and Safety · 2402.03206 · 2024-02-12
SPIrit-LM: Interleaved Spoken and Written Language Model
Meta AI
Speech Language Models · 2402.05755 · 2024-02-09
Multilingual E5 Text Embeddings: A Technical Report
Microsoft
Text Embeddings and Retrieval · 2402.05672 · 2024-02-08
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
DeepSeek
Mathematical Reasoning Models · 2402.03300 · 2024-02-06
OLMo: Accelerating the Science of Language Models
Ai2
Large Language Models · 2402.00838 · 2024-02-01
DeepSeek-Coder: When the Large Language Model Meets Programming
DeepSeek
Code Models · 2401.14196 · 2024-01-25
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
DeepSeek
Mixture-of-Experts Language Models · 2401.06066 · 2024-01-11
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
Anthropic
Alignment and Safety · 2401.05566 · 2024-01-10
Mixtral of Experts
Mistral AI
Large Language Models · 2401.04088 · 2024-01-08
DeepSeek LLM Technical Report
DeepSeek
Large Language Models · 2401.02954 · 2024-01-05
Gemini: A Family of Highly Capable Multimodal Models
Google Gemini
Multimodal Models · 2312.11805 · 2023-12-19