SenseTime SenseNova 6.7 Flash-Lite
Lightweight multimodal agent: native vision, reported 60% lower token use in search
Parameters: Smaller parameter scale (multimodal agent architecture)
Context Window: Not publicly disclosed
Knowledge Cutoff: Not publicly disclosed
Overview
SenseNova 6.7 Flash-Lite is SenseTime's lightweight multimodal agent model with a native multimodal architecture that removes the vision-to-text intermediate layer.
It can directly interpret complex web layouts, document structures, and financial charts, supporting integrated see-think-act workflows. Token consumption in information-search scenarios is reported to drop by 60%, with millisecond-level feedback.
It is suited to long-chain tasks such as data analysis, in-depth research, and PPT generation. SenseNova-Skills is open source on GitHub, and the release is intended to show that lighter, lower-cost multimodal agents can still reach high performance on complex workflows.
Use Cases
Data Analysis, Content Writing, Image Generation