Flipkart Grid 6
TEAM NAME : MaRS IITR
College/University: Indian Institute of Technology Roorkee
Tech stack:
Languages and frameworks: Python, HTML, CSS, JavaScript, React.js
Database: MongoDB
Database schema:
Input source:
Images can be processed either by capturing them in real-time using an external device
or by uploading stored images from a device. This flexibility allows for seamless
handling of both on-the-spot analysis and processing of previously saved files.
Solution:
To address the outlined requirements, we use a fine-tuned version of the Gemini API
model, a large language model (LLM) with integrated multimodal capabilities, combined
with prompt engineering to obtain the desired results. Before selecting the Gemini API,
we rigorously tested other models and APIs, including OpenAI's API, as well as
vision-based architectures such as VGG16, ResNet50, YOLO V8, YOLO V11, Paddle OCR,
and SAM 2 segmentation models. While these models performed well in specific areas,
the Gemini API emerged as the most reliable and efficient solution due to its superior
accuracy, multimodal integration, broader data analysis, and ability to handle complex
real-world scenarios. Below is a detailed explanation of how the architecture and logic
are applied to each function. We also added an option to view history, so that all
billing details are saved in our database.
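The prompt-engineering step described above can be illustrated with a minimal sketch. It is not the team's actual prompt or client code: the prompt wording, the function names, and the `call_model` callable (a stand-in for whichever Gemini client call is used) are all assumptions made for illustration. The idea is to ask the model for a fixed set of JSON keys and to parse the reply defensively.

```python
import json
from typing import Callable

# Hypothetical prompt template; the real prompts are not given in this document.
PROMPT_TEMPLATE = (
    "You are inspecting a product image for the task: {task}.\n"
    "Reply with a single JSON object containing exactly these keys: {keys}.\n"
    "Use null for any value you cannot read from the image."
)


def build_prompt(task: str, keys: list[str]) -> str:
    """Fill the template with the task name and the expected output keys."""
    return PROMPT_TEMPLATE.format(task=task, keys=", ".join(keys))


def parse_reply(reply: str, keys: list[str]) -> dict:
    """Pull the JSON object out of a reply that may contain surrounding text."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model reply")
    data = json.loads(reply[start:end + 1])
    # Keep only the requested keys; missing ones become None.
    return {key: data.get(key) for key in keys}


def analyze(
    image_bytes: bytes,
    task: str,
    keys: list[str],
    call_model: Callable[[str, bytes], str],
) -> dict:
    """Send prompt + image through call_model (assumed wrapper) and parse the result."""
    reply = call_model(build_prompt(task, keys), image_bytes)
    return parse_reply(reply, keys)
```

Passing the model call in as a function keeps the sketch independent of any particular SDK and makes it easy to test with a stubbed reply.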
https://universe.roboflow.com/college-74jj5/freshness-fruits-and-vegetables/dataset/7/images
A. Brand Detection
Custom Object Detection Models: Pre-trained on large datasets of brand logos and
fine-tuned with specific shipping scenarios to achieve a 95%+ identification rate.
B. Expiry Date Detection
Training on Multiformat Data: Model fine-tuned with real-world data containing diverse
date formats (e.g., MM/YY, DD-MM-YYYY) for high robustness.
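As an illustration of handling the date formats mentioned above, the following sketch normalizes a date string that has already been extracted from the packaging. It is a post-processing helper written for this example, not part of the fine-tuned model. The list of accepted formats, the function names, and the choice to treat a month-only date as the last day of that month are assumptions.

```python
import calendar
from datetime import date, datetime

# Assumed set of formats; extend as needed for other label styles.
FORMATS = ["%d-%m-%Y", "%d/%m/%Y", "%m/%y", "%m/%Y"]


def parse_expiry(text: str) -> date:
    """Try each known format in turn and return a normalized date."""
    text = text.strip()
    for fmt in FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue  # this format does not match, try the next one
        if "%d" not in fmt:
            # Month-only formats: assume the product expires at month end.
            last_day = calendar.monthrange(parsed.year, parsed.month)[1]
            return date(parsed.year, parsed.month, last_day)
        return parsed.date()
    raise ValueError(f"unrecognized date format: {text!r}")


def is_expired(text: str, today: date) -> bool:
    """True if the parsed expiry date is strictly before the given day."""
    return parse_expiry(text) < today
```

For example, `parse_expiry("05-11-2024")` yields 5 November 2024, and `parse_expiry("12/25")` yields 31 December 2025 under the month-end assumption.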
C. Item Counting
D. Freshness Detection
Freshness using LLMs: The freshness index detection system uses a deep learning
architecture based primarily on Convolutional Neural Networks (CNNs) and Vision
Transformers. By analyzing high-resolution images of fruits and vegetables, the model
extracts critical visual features such as color gradients, surface texture, bruising,
and microscopic changes. The system is trained on extensive datasets containing images
annotated with precise age, ripeness stage, and environmental conditions. The neural
network learns to recognize subtle indicators of degradation, enabling it to predict
the remaining shelf life, freshness percentage, and optimal storage conditions.
E. Data storage: