Bob Sebastian – Poster

In online shopping, fashion classification matters more than it seems: when shoes show up in the bag section or jackets are labeled as pants, the whole shopping experience feels off. The biggest hurdle in building a system to prevent this is the data itself. Fashion datasets are rarely balanced; some categories overflow with images while others barely have enough, which naturally biases a model toward the larger groups.

To put the data on fairer ground, the smaller categories were boosted with oversampling combined with data augmentation, giving them extra variety to learn from, while the overrepresented categories were reduced with random undersampling so they would not dominate training. With the data balanced, the models were built. Convolutional Neural Networks served as the starting point; EdgeNet was added as a feature extractor to capture fine details such as textures, shapes, and patterns; and the vision transformer BeiT was tested for its ability to model complex visual structure.

The strongest result came from combining the two approaches: EdgeNet's strength in detailed features complemented BeiT's transformer representations, and the hybrid reached the highest accuracy of 83.47%. This shows that balancing the data on one side and blending different architectures on the other can produce a much stronger solution. In practice, it could help e-commerce platforms organize fashion products more accurately, reduce the errors that frustrate shoppers, and ultimately create a smoother, more enjoyable online shopping experience.
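
As a rough illustration of the balancing step described above, the sketch below resamples a labeled image set so every class ends up with the same number of training examples: minority classes are oversampled (with augmentation later supplying variety for the repeated images) and majority classes are randomly undersampled. The dataset layout, target count, and random seed are assumptions for illustration, not the exact setup behind the poster.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(42)

def balance_indices(labels, target_per_class=None):
    """Return resampled indices so every class has target_per_class examples.

    Minority classes are oversampled (indices drawn with replacement);
    majority classes are randomly undersampled (indices dropped).
    """
    labels = np.asarray(labels)
    counts = Counter(labels.tolist())
    if target_per_class is None:
        # A simple middle ground between the rarest and most common class.
        target_per_class = int(np.median(list(counts.values())))

    balanced = []
    for cls in sorted(counts):
        cls_idx = np.flatnonzero(labels == cls)
        if len(cls_idx) >= target_per_class:
            # Undersample: keep a random subset of the majority class.
            chosen = rng.choice(cls_idx, size=target_per_class, replace=False)
        else:
            # Oversample: repeat minority indices; augmentation at load time
            # keeps the repeated images from being exact duplicates.
            chosen = rng.choice(cls_idx, size=target_per_class, replace=True)
        balanced.append(chosen)
    return rng.permutation(np.concatenate(balanced))

# Toy example: class 0 is overrepresented, class 2 is rare.
labels = np.array([0] * 900 + [1] * 300 + [2] * 60)
idx = balance_indices(labels, target_per_class=300)
print(Counter(labels[idx].tolist()))  # each class now has 300 samples
```

In a real pipeline, the repeated minority images would be paired with random augmentations (flips, crops, color jitter) at load time, for example via torchvision transforms, so the model never sees literal copies.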
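The hybrid model described above can likewise be sketched as two backbones whose pooled features are concatenated and passed to a small classification head. This is only a sketch under assumptions: it treats "EdgeNet" as an EdgeNeXt variant and uses timm model names ("edgenext_small", "beit_base_patch16_224") as placeholders; the actual architectures, feature dimensions, and head design in the poster may differ.

```python
import torch
import torch.nn as nn
import timm  # assumed available; provides EdgeNeXt and BeiT backbones

class HybridFashionClassifier(nn.Module):
    """Concatenate CNN-style (EdgeNeXt) and transformer (BeiT) features."""

    def __init__(self, num_classes: int, pretrained: bool = True):
        super().__init__()
        # num_classes=0 makes timm return pooled feature vectors, not logits.
        self.cnn_backbone = timm.create_model(
            "edgenext_small", pretrained=pretrained, num_classes=0)
        self.vit_backbone = timm.create_model(
            "beit_base_patch16_224", pretrained=pretrained, num_classes=0)
        feat_dim = (self.cnn_backbone.num_features
                    + self.vit_backbone.num_features)
        self.head = nn.Sequential(
            nn.LayerNorm(feat_dim),
            nn.Dropout(0.2),
            nn.Linear(feat_dim, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Both backbones see the same 224x224 image batch.
        feats = torch.cat(
            [self.cnn_backbone(x), self.vit_backbone(x)], dim=1)
        return self.head(feats)

model = HybridFashionClassifier(num_classes=10, pretrained=False)
logits = model(torch.randn(2, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 10])
```

The backbones could be fine-tuned jointly or kept frozen while only the head trains; the 83.47% accuracy reported in the poster refers to the authors' own combined configuration, not to this sketch.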