Multimodal Interaction & World Model
The Seed Multimodal Interaction and World Model team is dedicated to developing models that have human-level multimodal understanding and interaction capabilities. The team is working to advance the exploration and development of multimodal assistant products.
Latest advancements

Seed1.5-VL
A vision-language multimodal large model that performs strongly on visual reasoning, image question answering, chart understanding and question answering, visual grounding and counting, video understanding, and GUI agent tasks.

BAGEL
An open-source unified multimodal model that supports image generation, image editing, style transformation, and image expansion, and delivers precise, photorealistic outputs.

UI-TARS
An open-source multimodal agent built on a powerful vision-language model. It can perform diverse tasks within virtual worlds.

Selected papers

May 20, 2025
Emerging Properties in Unified Multimodal Pretraining
Computer Vision
May 13, 2025
Seed1.5-VL Technical Report
LLM
Jan 21, 2025
UI-TARS: Pioneering Automated GUI Interaction with Native Agents
Computer Vision

Featured roles

Research Scientist/Engineer - Multimodal Interaction & World Model
Singapore
Experienced Hiring
Research Scientist - Foundation Model, Vision and Language
San Jose / Seattle
Experienced Hiring
Research Scientist, Multimodal Interaction & World Model - 2025 Start
Singapore
Campus Recruitment
Research Scientist Graduate (Foundation Model, Vision and Language) - 2025 Start (PhD)
San Jose / Seattle
Campus Recruitment
Research Scientist Intern - Multimodal Interaction & World Model - 2025 Start
Singapore
Internship
Student Researcher (Seed - Foundation Model - Vision and Language) - 2025 Start (PhD)
San Jose / Seattle
Internship
Advancing the frontier of intelligence, in service of humanity
Join ByteDance Seed
Copyright © 2026 Bytedance Seed
Disclaimer
Contact us: seed.feedback@bytedance.com