This is a cloaked model provided to the community to gather feedback. It’s a powerful, all-purpose model supporting long-context tasks, including code generation. All prompts and completions for this model are logged by the provider as well as OpenRouter.
Note that this is a base model mostly meant for testing, you need to provide detailed prompts for the model to return useful responses.
DeepSeek-V3 Base is a 671B parameter open Mixture-of-Experts (MoE) language model with 37B active parameters per forward pass and a context length of 128K tokens. Trained on 14.8T tokens using FP8 mixed precision, it achieves high training efficiency and stability, with strong performance across language, reasoning, math, and coding tasks.
DeepSeek-V3 Base is the pre-trained model behind DeepSeek V3

Meta's latest class of model (Llama 3.1) launched with a variety of sizes & flavors. This is the base 405B pre-trained version.
It has demonstrated strong performance compared to leading closed-source models in human evaluations.
To read more about the model release, click here. Usage of this model is subject to Meta's Acceptable Use Policy.