DeepFashion
This synthesis covers the major fashion resources: DeepFashion, DeepFashion2, DeepFashion3D, MMFashion, and Fashionpedia. It integrates dataset scope, benchmark role, and annotation philosophy, and relates each resource to the Fashion Emotion Lexicon (FEL).
- DeepFashion: Dataset + model · 2016 / CVPR
- DeepFashion2: Dataset + benchmark · 2019 / CVPR
- DeepFashion3D: Dataset + methodology · 2020 / ECCV
- MMFashion: Platform / toolkit · 2020 / arXiv
- Fashionpedia: Ontology + dataset · 2020 / ECCV
| Resource | Core Paper Contribution | What It Adds Beyond Earlier Resources | Most Defining Evidence |
|---|---|---|---|
| DeepFashion | Introduces a very large 2D fashion dataset and FashionNet, which jointly learns attributes and landmarks. | Moves the field from small, weakly annotated collections to rich, large-scale supervised clothing understanding. | 800K+ images, 1K attributes, 300K+ consumer–shop pairs, landmark-aware recognition and retrieval gains. |
| DeepFashion2 | Builds a clothing-centered multi-task benchmark for detection, pose, segmentation, and re-ID under realistic variation. | Adds dense landmarks, masks, deformation labels, and integrated evaluation under occlusion, zoom, and overlap. | 491K images, 801K+ items, up to 39 landmarks per category, 873K same-clothing pairs, stronger real-scene benchmark design. |
| DeepFashion3D | Shifts from 2D appearance to true 3D garment geometry with reconstruction, registration, and texture recovery tasks. | Introduces mesh-level supervision, UV space, and physically meaningful garment geometry. | 200K+ images, 2,078 3D garments, mesh/UV/texture annotations, Chamfer and 3D IoU evaluation. |
| MMFashion | Provides a unified PyTorch toolbox covering major fashion vision tasks through modular engineering design. | Does not mainly add a new dataset; instead it operationalizes task development and experimentation at scale. | Shared backbone–head framework, config-driven pipeline, support for attribute, retrieval, parsing, and compatibility tasks. |
| Fashionpedia | Introduces the first large-scale benchmark integrating categories, parts, attributes, and ontology structure. | Adds explicit semantic structure and part-level attribute localization beyond image-level labels. | 27 main categories, 19 part categories, 294 fine-grained attributes, pixel-level masks, ontology-aware evaluation and localization tasks. |
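To make these annotation layers concrete, here is a minimal parsing sketch for a DeepFashion2-style per-image annotation record. It assumes the JSON layout described in the DeepFashion2 repository (top-level `source` and `pair_id` keys plus one `itemN` dict per garment); the exact key names should be verified against the actual release.

```python
import json
from pathlib import Path

def load_deepfashion2_items(annos_path):
    """Parse one DeepFashion2-style per-image annotation file.

    Assumes the per-image JSON layout from the DeepFashion2 repo;
    verify key names against the actual dataset release.
    """
    record = json.loads(Path(annos_path).read_text())
    pair_id = record.get("pair_id")  # same-clothing pair identifier
    items = []
    for key, item in record.items():
        if not key.startswith("item"):
            continue  # skip 'source', 'pair_id', and other metadata keys
        items.append({
            "category": item.get("category_name"),     # one of 13 categories
            "bbox": item.get("bounding_box"),          # [x1, y1, x2, y2]
            "landmarks": item.get("landmarks"),        # flat [x, y, v, ...] list
            "segmentation": item.get("segmentation"),  # polygon mask(s)
            "style": item.get("style"),                # with pair_id -> re-ID label
            "pair_id": pair_id,
        })
    return items

# Illustrative usage (path is hypothetical):
# items = load_deepfashion2_items("train/annos/000001.json")
```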
In FEL, these resources are not treated as redundant competitors. They act as complementary evidence layers contributing different kinds of knowledge about fashion items.
DeepFashion anchors the system in 2D appearance evidence: category, attribute, landmark, and paired retrieval supervision. DeepFashion2 strengthens this with structure-aware real-world evidence, especially detection, dense landmarks, segmentation, and re-identification under difficult conditions.
Fashionpedia provides the semantic grounding layer by explicitly organizing categories, parts, and attributes into a more ontology-like structure. DeepFashion3D extends the system into physical garment geometry, supplying pose-aware 3D evidence rather than only 2D appearance. MMFashion serves as the engineering layer, helping operationalize experiments and benchmark implementations across tasks.
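One way to picture this layering is as a single item record whose fields are filled by different resources. The dataclass below is a hypothetical sketch of such an FEL evidence record; it is not an interface from any of the papers, and every field name is an illustrative assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

@dataclass
class GarmentEvidence:
    """Hypothetical FEL-style evidence record; all field names are
    illustrative assumptions, not taken from the cited resources."""
    # Appearance layer (DeepFashion): attribute and retrieval supervision.
    attributes: Dict[str, float] = field(default_factory=dict)
    # Structure layer (DeepFashion2): landmarks, masks, re-ID evidence.
    landmarks: List[Tuple[float, float]] = field(default_factory=list)
    mask: Optional[object] = None
    # Semantic layer (Fashionpedia): part -> localized attribute labels.
    part_attributes: Dict[str, List[str]] = field(default_factory=dict)
    # Geometry layer (DeepFashion3D): mesh/UV evidence.
    mesh_path: Optional[str] = None

    def layers_present(self) -> List[str]:
        """Report which evidence layers this item currently carries."""
        layers = []
        if self.attributes:
            layers.append("appearance")
        if self.landmarks or self.mask is not None:
            layers.append("structure")
        if self.part_attributes:
            layers.append("semantics")
        if self.mesh_path:
            layers.append("geometry")
        return layers
```

An item annotated only with DeepFashion-style labels would report `["appearance"]`; adding Fashionpedia-style part attributes extends it to `["appearance", "semantics"]`, and so on up the stack.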
In one sentence each, the resources answer different questions:

- DeepFashion: “What is visible?”
- DeepFashion2: “How is it arranged under real-world conditions?”
- Fashionpedia: “Which part has which meaning or property?”
- DeepFashion3D: “What is the garment’s physical form?”
- MMFashion: “How do we implement, benchmark, and iterate efficiently?” (see the config sketch after this list)
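MMFashion follows the mmcv-style config-driven pattern: a plain Python config file declares the model and data pipeline, and the training code only reads it. The snippet below is a hedged sketch of that pattern using `mmcv.Config` (a real mmcv API); the config path and the keys shown in the comment are illustrative assumptions, not copied from MMFashion's shipped configs.

```python
from mmcv import Config

# Hypothetical config file contents (e.g. configs/attribute_predict.py):
#   model = dict(backbone=dict(type='ResNet', depth=50),
#                head=dict(type='AttributeHead', num_classes=1000))
#   data = dict(samples_per_gpu=32)
cfg = Config.fromfile("configs/attribute_predict.py")  # illustrative path

print(cfg.model.backbone.type)  # config dicts support attribute access
cfg.data.samples_per_gpu = 64   # override a setting before building the runner
```

The design choice the table calls “config-driven” is exactly this: swapping a backbone or head means editing a dict in the config, not the training code.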
From an FEL perspective, the progression runs appearance → structure → semantics → geometry, while the toolkit layer enables reproducible experimentation across all of them.