by Ekaterina Butyugina

We’re thrilled to celebrate the achievements of our latest graduates from Zurich’s online Batch #33, who have just wrapped up their Data Science journey with three remarkable, real-world projects.
This round of final presentations showcased how data science and AI can drive tangible impact across industries, from automating the inspection of bridge stay cables to opening up enterprise data through natural-language queries and improving revenue forecasts in fashion e-commerce.
Take a look at how our graduates are using data science to generate insights, push boundaries, and create real-world impact.
Students: Tobias Rieker, Amruthraj Gudimalla, Mastanvali Shaik, Pratik Shroff
VSL is a global leader in construction and infrastructure maintenance, specializing in advanced solutions for bridges, buildings, and complex civil engineering structures. A critical part of VSL’s work is the regular inspection of stay cables used in bridge structures to ensure safety, reliability, and long-term performance.
These inspections generate thousands of high-resolution images, which traditionally must be manually reviewed and labeled by experts - a process that is time-consuming, costly, and prone to human inconsistency as data volumes grow.
The goal of this project was to automate defect detection and segmentation on stay cables using deep learning, in order to:
Stage 1 – Defect vs. Non-Defect Filtering

An EfficientNet-V2 classifier rapidly filters incoming images, eliminating approximately 99% of non-defect images from further processing. This drastically reduces unnecessary downstream computation and human review.
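To make the filtering step concrete, here is a minimal sketch of how classifier scores might gate which images reach Stage 2. The `ScoredImage` type, the file names, the scores, and the 0.5 threshold are all illustrative assumptions; the team's actual EfficientNet-V2 setup and decision threshold are not described in this post.

```python
from dataclasses import dataclass


@dataclass
class ScoredImage:
    path: str           # hypothetical file name
    defect_prob: float  # assumed classifier output in [0, 1]


def filter_for_stage_two(images, threshold=0.5):
    """Keep only images whose defect probability reaches the threshold."""
    return [img for img in images if img.defect_prob >= threshold]


# Made-up scores for illustration only.
scored = [
    ScoredImage("cable_001.jpg", 0.02),
    ScoredImage("cable_002.jpg", 0.91),
    ScoredImage("cable_003.jpg", 0.10),
    ScoredImage("cable_004.jpg", 0.55),
]

forwarded = filter_for_stage_two(scored, threshold=0.5)
print([img.path for img in forwarded])
```

In a safety-critical setting, the threshold would typically be chosen to keep missed defects rare, even if that lets more non-defect images through to the second stage.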
Stage 2 – Defect Detection & Segmentation
Images containing defects are passed to instance segmentation models that both localize and precisely outline defects:
The system detects and segments multiple defect types, including critical defects such as Deep Scratches and Damaged Junctions / Weld Openings.
The models were evaluated using AP50 (Average Precision at IoU = 0.5).
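For readers unfamiliar with the metric: under AP50, a predicted defect mask counts as a correct detection when its Intersection over Union (IoU) with a ground-truth mask is at least 0.5. The sketch below shows only that IoU check on two tiny hand-made masks; it is not the full average-precision computation, and the masks are invented for illustration.

```python
def mask_iou(mask_a, mask_b):
    """IoU of two binary masks given as equally sized 2D lists of 0/1."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1
            if a or b:
                union += 1
    # Two empty masks have no overlap to measure; treat as IoU 0.
    return intersection / union if union else 0.0


# Toy masks: the prediction covers 4 of the 6 ground-truth pixels.
predicted = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
ground_truth = [
    [0, 1, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]

iou = mask_iou(predicted, ground_truth)  # 4 / 6
is_match_at_50 = iou >= 0.5
print(round(iou, 3), is_match_at_50)
```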
Key outcomes:
This performance translates directly into substantial time savings and cost reduction, significantly lowering the effort required for manual defect mask creation and review.
Figure 2: YOLOv8-Seg vs. Mask R-CNN Performance Comparison

This project demonstrates the practical feasibility of AI-assisted infrastructure inspection:
AI does not replace human experts - but it amplifies their effectiveness, allowing engineers to focus on high-risk, high-value decisions.
An Overview of an Agentic RAG System for Marle SMB Medical
Students: Johannes Hörl, Sven Mayer
Marle SMB Medical is a Swiss manufacturer of orthopedic implants, operating in a highly regulated and data-intensive environment. Like many data-driven organizations, Marle relies on ERP systems, CSV exports, and BI dashboards to manage production, inventory, and product lifecycles.
While the data existed, access to insights was limited: answering even simple business questions often required SQL expertise or deep system knowledge, creating bottlenecks and reducing data accessibility across the organization.
The goal of this project was to build an AI-powered intelligent agent that allows business users to query enterprise data in natural language, without SQL or BI expertise, while maintaining:
The result is GUIDE — Generative Understanding and Intelligent Data Extraction.
GUIDE is a hybrid Agentic RAG system that combines deterministic data access with semantic reasoning. Instead of relying purely on vector search or LLM reasoning, the system fuses physical evidence, metadata intelligence, and algorithmic decision logic.
GUIDE is built as a 4-layer hybrid intelligence system:

GUIDE successfully transformed how business users interact with enterprise data: natural-language questions are translated into auditable answers, no SQL or BI expertise is required, execution is fast thanks to in-memory Pandas and DuckDB, and hallucinations are guarded against through metadata grounding, explicit business rules, and blind evaluation logic.
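One way to picture metadata grounding is as a check that runs before any query is executed: every table and column the language model proposes must exist in a curated catalog. The sketch below is an assumption about how such a guard could look, not GUIDE's actual implementation; the `CATALOG` contents and the `validate_plan` helper are hypothetical.

```python
# Hypothetical metadata catalog: table -> column -> description.
CATALOG = {
    "inventory": {
        "item_id": "Unique article identifier",
        "stock_qty": "Units currently in stock",
        "warehouse": "Storage location code",
    }
}


def validate_plan(table, columns, catalog=CATALOG):
    """Return (ok, unknown_columns) for a query plan proposed by the LLM."""
    known = catalog.get(table)
    if known is None:
        # Unknown table: nothing in the plan can be grounded.
        return False, list(columns)
    unknown = [c for c in columns if c not in known]
    return not unknown, unknown


print(validate_plan("inventory", ["item_id", "stock_qty"]))
print(validate_plan("inventory", ["item_id", "price"]))
```

A plan that references an unknown column can then be rejected or sent back to the model with the list of offending names, rather than being executed on a guess.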

Figure 2: GUIDE’s User Interface
Planned improvements focus on increasing reliability through better knowledge. By systematically enriching metadata with clear column descriptions, business rules, and usage examples, the agent will gain stronger grounding and improved protection against hallucinations. Validated user queries can then be stored as reusable “golden examples”, allowing the system to recognize recurring patterns and resolve ambiguity more confidently over time. Finally, lightweight user feedback combined with performance analytics would enable continuous learning, ensuring the agent evolves based on real usage rather than assumptions.
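As a rough illustration of the "golden examples" idea, the sketch below looks up the stored question that is most similar to a new one and reuses its validated plan. It uses plain token overlap (Jaccard similarity) as a deliberately simple stand-in; a real system might compare embeddings instead. The stored questions, plan names, and the 0.4 cut-off are all hypothetical.

```python
def tokenize(text):
    """Lowercase, drop question marks, and split into a set of words."""
    return set(text.lower().replace("?", "").split())


def jaccard(a, b):
    """Share of tokens two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical store of validated questions and their query plans.
GOLDEN = [
    ("how many items are in stock per warehouse", "plan_stock_by_warehouse"),
    ("which products were discontinued last year", "plan_discontinued_products"),
]


def best_golden_match(question, examples=GOLDEN, min_score=0.4):
    """Return the plan of the most similar golden example, or None."""
    q = tokenize(question)
    scored = [(jaccard(q, tokenize(text)), plan) for text, plan in examples]
    score, plan = max(scored)
    return plan if score >= min_score else None


print(best_golden_match("How many items are in stock per warehouse?"))
print(best_golden_match("What is the weather today?"))
```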
In conclusion, Johannes and Sven demonstrated that enterprise AI systems do not need to choose between flexibility and reliability. By combining deterministic search, semantic reasoning, and algorithmic scoring, the system enables trustworthy AI-driven data access.
Students: Laura Kajtazi, Farhod Omonov, Daria Tsarova-Lenska
What drives a customer’s decision to purchase a fashion item? Can future sales be predicted solely from past behavior – or are there signals hidden in plain sight that could improve those predictions?
These questions formed the basis of our capstone project, conducted in collaboration with BestSecret, a European members-only online fashion platform specializing in off-price premium and luxury brands. BestSecret offers a curated assortment of clothing, shoes, and accessories from well-known international brands, combining dynamic pricing with limited-availability inventory.
Figure 1: Project Structure
As part of an ongoing effort to improve revenue forecasting, BestSecret challenged us to investigate whether product images could serve as an additional predictive signal. The project goal was clear: to outperform an established baseline revenue prediction model by incorporating visual information extracted from product images.
To tackle this challenge, we treated product images as an additional feature space and evaluated their impact on revenue prediction. Our approach focused on extracting fixed-size image embeddings and integrating them with structured product metadata. The dataset comprised approximately 10,000 product images, each aligned with corresponding sales and product metadata.
We generated visual embeddings using several state-of-the-art pre-trained vision models, including CLIP, EVA, and DINO. These embeddings capture high-level visual properties such as color distribution, shape, texture, and style, without requiring task-specific training. Dimensionality reduction techniques were applied where necessary to balance information preservation and model efficiency.
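Integrating an embedding with structured metadata can be as simple as flattening both into one feature row per product. The following is a minimal sketch of that concatenation step only (no embedding extraction or dimensionality reduction); the column names, the `img_emb_` prefix, and the four-dimensional vector are assumptions for illustration, since real embeddings typically have hundreds of dimensions.

```python
def fuse_features(metadata_row, embedding, prefix="img_emb_"):
    """Merge tabular metadata and a fixed-size embedding into one flat row."""
    fused = dict(metadata_row)  # copy so the input is not modified
    for i, value in enumerate(embedding):
        fused[f"{prefix}{i}"] = float(value)
    return fused


# Hypothetical metadata columns and a toy embedding.
metadata = {"brand_id": 17, "category": "shoes", "price": 129.0}
embedding = [0.12, -0.40, 0.88, 0.05]

row = fuse_features(metadata, embedding)
print(sorted(row))
```

Because every product gets the same embedding length, the fused rows share one schema and can be fed to a tabular model alongside the original attributes.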
Figure 2: What an Image Can Tell
In parallel, we experimented with prompt-based image analysis using large multimodal models from OpenAI, Gemma, and a pre-trained CLIP-based model.

Figure 3: Scoring Results for the Images
Our goal was to generate semantic descriptions from product images and assess whether higher-level visual concepts could improve predictive performance.
Figure 4: Modeling Results

The extracted visual features were combined with existing metadata and used to train gradient boosting regression models, including LightGBM and CatBoost, for future revenue prediction. Model performance was evaluated against a metadata-only baseline using standard regression metrics.
The multimodal models consistently outperformed the baseline, confirming that visual information provides a complementary predictive signal beyond traditional product attributes.
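A comparison of this kind usually comes down to computing the same error metrics for both models on a held-out set. The sketch below shows that calculation with RMSE and MAE as example metrics; the post does not say which metrics the team used, and the numbers are toy values for illustration only, not results from the project.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_improvement(baseline_error, candidate_error):
    """Fraction by which the candidate reduces the baseline's error."""
    return (baseline_error - candidate_error) / baseline_error


# Toy numbers for illustration only; NOT results from the project.
y_true = [100.0, 200.0, 300.0, 400.0]
baseline_pred = [120.0, 180.0, 330.0, 360.0]    # metadata-only model
multimodal_pred = [110.0, 190.0, 315.0, 380.0]  # metadata + image features

for name, metric in (("RMSE", rmse), ("MAE", mae)):
    base = metric(y_true, baseline_pred)
    multi = metric(y_true, multimodal_pred)
    print(name, round(base, 2), round(multi, 2),
          round(relative_improvement(base, multi), 2))
```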
This project demonstrates that incorporating image-based features into revenue prediction pipelines leads to measurable performance improvements. Visual embeddings extracted from product images represent a scalable and reusable signal that can enhance forecasting accuracy in fashion e-commerce.
With the completion of the Data Science capstone projects of the online Batch #33, we celebrate the impressive achievements of our graduates. Their work clearly demonstrates what happens when technical expertise meets creativity: data science unlocks its full potential for innovation.
A huge thank you goes to our partners, mentors, and instructors who supported the teams throughout their journey. Your knowledge, guidance, and collaboration helped transform ideas into working, real-world prototypes.
To our graduates: your curiosity, perseverance, and ambition to solve real problems are what make this community so special. We’re excited to see where your path leads next - and how you will continue shaping the future of AI and data-driven innovation.
If these stories inspire you, find out how you can become part of the next Data Science batch at Constructor Academy and start your own innovative project.