The E-commercial Multimodal Advertising Dataset is one of the largest video captioning datasets, containing 100,000+ valid examples carefully selected from 1,000,000+ real product examples in both Chinese and English.
About
We present E-MMAD, an e-commercial multimodal advertising dataset containing 120 thousand valid examples carefully selected from 1.3 million real product examples in both Chinese and English. The data comes from China's largest e-commerce shopping platform (www.taobao.com). Notably, it is one of the largest video captioning datasets in this field. Each example has a product video (around 30 seconds), a title, a caption, and a structured information table, which is observed to play a vital role in practice.
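As a rough illustration of the four parts each example carries, the sketch below models one record in Python. The full data has not been released yet, so the class name, the field names, the file path, and the sample values are all assumptions made up for illustration, not the actual E-MMAD schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProductExample:
    """One hypothetical record; all field names are assumptions."""

    video_path: str  # path to the product video (around 30 seconds)
    title: str       # product title
    caption: str     # advertising caption, i.e. the captioning target
    # Structured information table as attribute -> value pairs.
    structured_info: Dict[str, str] = field(default_factory=dict)

    def info_words(self) -> List[str]:
        """Flatten the structured information table into a list of words."""
        words: List[str] = []
        for key, value in self.structured_info.items():
            words.extend(key.split())
            words.extend(value.split())
        return words


# Placeholder values only; they do not come from the dataset.
example = ProductExample(
    video_path="videos/000001.mp4",
    title="Women's cotton summer dress",
    caption="A light, breathable dress for hot days.",
    structured_info={"material": "cotton", "season": "summer"},
)
print(example.info_words())
```

Keeping the structured information as a plain attribute-to-value mapping makes it easy to feed to a model alongside the video and title, but the released format may differ.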
100,000+
high-level multimodal examples
4,000+
e-commerce product categories across various fields
300,000+
structured information words
Diversity
Real-world advertising descriptions are vivid and varied, so we collected a large-scale, high-quality, and reliable e-commercial multimodal advertising dataset. E-MMAD is collected entirely from real-life sources and carefully curated so that it meets real-world needs.
License
The E-MMAD dataset is available to download for non-commercial purposes under a Creative Commons Attribution 4.0 International License.
DOWNLOAD
We will release the full data before the conference starts.
Sponsor
Let's guess together and the answer will be revealed soon.
ACKNOWLEDGEMENTS
This work was created in response to real needs. We thank our data labeling teams and everyone else who contributed hard work and suggestions.