Verify critical details (pricing, licensing, availability) with the model's source before making business decisions.
BAGEL-7B-MoT
Model family: bagel
An open-weight "unified" multimodal model: a single model that understands text, images, and video and generates text and images, released under the permissive Apache 2.0 license. Useful as a single self-hostable building block for mixed-media tasks.
- Open-weight: a model whose trained weights are freely downloadable, so you can run it yourself without contacting the creator. Llama, Mistral, Qwen, and Gemma are open-weight. Open-weight does not mean open-source: the training data and code often stay private. The license still governs what you can do with the weights, including whether you can use them commercially.
- Multimodal: a model that can handle more than one type of input or output, typically text plus images, sometimes plus audio or video. "GPT-4 Vision" and "Llama 3.2 11B Vision" are multimodal models that accept both text and images. A text-only model is technically "unimodal", though the term is rarely used; text-only is the assumed default.
Identity
- Creator: ByteDance
- Model family: bagel
- Release date: 2025-05-19
Technical specs
- Parameter count: 7B
- Context window: 33K tokens
- Modalities: Image Input, Image Output, Text, Video Input
- Primary capabilities: Chat, Image Generation, Instruction Following, Reasoning, Vision
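As a rough sizing aid for self-hosting, the listed parameter count can be turned into a weights-only memory estimate. This is a minimal sketch, not a measured figure: the function name and the bytes-per-parameter table are illustrative, it ignores activations, caches, and runtime overhead, and it assumes the listed 7B counts every weight that must be loaded. If that figure counts only the parameters active per token, the total to load will be larger.

```python
# Weights-only memory estimate from a parameter count.
# Assumption: the parameter count covers all weights that must be resident in memory.
BYTES_PER_PARAM = {
    "fp32": 4.0,
    "bf16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gib(num_params: float, dtype: str = "bf16") -> float:
    """Return the approximate size of the weights alone, in GiB."""
    if dtype not in BYTES_PER_PARAM:
        raise ValueError(f"unknown dtype: {dtype!r}")
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


# Print an estimate for each precision, using the 7B figure from this card.
for name in BYTES_PER_PARAM:
    print(f"{name}: {weight_memory_gib(7e9, name):.1f} GiB")
```

For 7B parameters at bf16 this comes to roughly 13 GiB for the weights alone, so actual memory needs will be higher once the runtime and a 33K-token context are accounted for.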
License
- License: Apache License 2.0
- Commercial use: Allowed
- Terms: Modification ✓, Redistribution ✓, Attribution ✓
Access
- Openness: Open Weight
- Access methods: Local runtime (vLLM), weights download (Hugging Face)
- Cost tier: Self-hosted only
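The Hugging Face weights download listed above might look like the following sketch. The repository ID and target directory are assumptions that this card does not state, so confirm the ID on the model's Hugging Face page first. The `snapshot_download` call comes from the `huggingface_hub` package, which needs to be installed separately.

```python
from pathlib import Path

# Assumption: this repository ID is not given in the card; verify it before use.
REPO_ID = "ByteDance-Seed/BAGEL-7B-MoT"


def download_weights(repo_id: str = REPO_ID,
                     target_dir: str = "models/BAGEL-7B-MoT") -> Path:
    """Download every file in the model repository into target_dir."""
    # Imported lazily so this module loads even without the package installed
    # (pip install huggingface_hub).
    from huggingface_hub import snapshot_download

    local_path = snapshot_download(repo_id=repo_id, local_dir=target_dir)
    return Path(local_path)
```

Calling `download_weights()` fetches the full repository, which is a multi-gigabyte transfer, so check available disk space before running it.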
Tags
- open-weight
- multimodal
- image-generation
- vision
- small
- china-based
- apache-2-0