
Computer Science research results

| Pavlo Burda | Artificial Intelligence Research Research Software
Photo by Jeswin Thomas on Unsplash

In 2025, more than 20 computer science students completed their thesis projects with support from ICT Institute. In this article you can find the titles and topics of the completed research projects, as well as some of the main results. The projects ranged from data science and AI applications for e-health to software prototypes for cybersecurity analytics.

Research approach

Many computer science students want to do research focused on a real-world problem, but find it difficult and time-consuming to find companies willing to share their data. Likewise, companies find it challenging to formulate projects that are doable within the scope of a Bachelor project. At ICT Institute we try to bridge this gap by organising a business track, in which we select students, create teams and approach companies beforehand. Each year we invite companies to contribute projects; see for instance the business track invitation from 2024 and the business track call for projects from 2023. As you can see, the business track has a structured format and a fixed timeline, which allows us to offer better support for students. We are very happy to share the highlights of this year’s projects.

e-Health and AI

One team investigated nutrient estimation from food photos using neural networks, showing that BERT-like feed-forward networks perform comparably to convolutional neural networks (CNNs) at lower computational cost. In the second thesis, a GPT-4-based approach achieved lower but comparable accuracy relative to CNNs, while offering a more scalable and interpretable alternative in certain scenarios. The research question was provided by Ancora Health. Resulting theses:

  • Andrei Chirita – Automated Nutrition Estimation from Food Images Using Machine Learning and Image Segmentation
  • Yaroslav Zyryanov – Nutrition Estimation from Food Images via Multimodal Retrieval-Augmented Generation and CLIP-Based Retrieval

Another student analysed data from the Artros project of Hogeschool Utrecht and evaluated ML models for predicting diagnostic signals from inertial sensors in osteoarthritis patients. He concluded that a more compact sensor setup could enable practical, real-time feedback in therapy. In another project, from Delft Imaging, a student built a 3D-CNN to automate quality assessment of obstetric ultrasound blind sweeps. The model achieved a test ROC-AUC of 0.845, demonstrating the potential for real-time feedback in low-resource settings, although the approach is not yet clinically ready.

  • Homer O. Christou – Predicting KAM in knee osteoarthritis patients using wearable sensors: comparing ANN with Gradient boosting
  • Joana R. Gomes – An AI Quality Checker for Obstetric Ultrasound
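
For readers less familiar with the ROC-AUC metric quoted for the ultrasound quality checker, the sketch below shows one way to compute it. It is a minimal illustration and not the method used in the thesis: the function name `roc_auc` and the toy labels and scores are hypothetical. ROC-AUC can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise (rank-based) ROC-AUC for binary labels (1 = positive, 0 = negative)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = acceptable sweep, 0 = poor-quality sweep.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"ROC-AUC: {auc:.3f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives some context for a reported test score of 0.845.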

Scatter plots comparing predicted and actual nutrient values in the test set [A. Chiriţă].

Software Engineering and code quality

Students used data science to understand technical and IT challenges based on questions from DTACT and SIG. One student developed prototypes for text-to-SQL querying on threat-intelligence data. The second thesis developed an automated system to generate support tickets, also using a large language model (LLM). The third thesis explored the use of machine learning to identify unusual events in Windows Event Logs to help security analysts.

  • Kirill Nikolaevskii – Benchmarking Small LLMs for SQL Generation
  • Sohaib Ezzahir – Automated Ticket Generation Using LLMs for Customer Support
  • Duarte M. Moreira – Anomaly Detection for Cybersecurity Using Machine Learning

Another team studied software engineering as an activity, using data from open source projects. They collected data from a sample of large GitHub projects to understand how fast issues were resolved.

  • Ayman Errahmouni – Towards Analyzing Developer Efficiency in Enterprise-Driven Open Source Projects through Commit and Issue Tracking Data
  • Ahmet Y. Uckun – Benchmarking Software Maintainability in Enterprise-Driven Open Source Projects in Terms of Developer Engagement
  • Yusuf Z. Yokuş – Correlating Process and Quality Metrics in Enterprise OSS

High-level RAG-based LLM chatbot architecture [Ezzahir].

Another team evaluated a new programming tool: Microsoft Fabric. Fabric is a low-code tool intended to help companies build data processing scripts with minimal code. The team evaluated several of the tool's AI-based features, using anonymized data provided by Experience Fruit Quality.

  • Oliwer B. Dembicki – Enhancing Data Reliability for Power BI: A Practical Evaluation of Microsoft Fabric’s Preprocessing Capabilities
  • Manav S. Duarcadas – Leveraging AI in Power BI to Enhance Dashboard Usability and Business Decision-Making
  • Franco Marcon – A Comparison of Microsoft Fabric and Python based development of Data Pipelines

Another team tackled the problem of making better, more personal recommendations, using housing as the application domain, as suggested by Nieuw Wonen Nederland. The students developed and evaluated recommender systems based on content-based and collaborative filtering techniques.

  • Daan Fortuijn Harreman – Creating an Explainable and Adjustable Content Based Recommendation System
  • Raihan R. Kaskandar – Enhancing Real Estate Recommendations with Collaborative Filtering
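
To give an idea of what "explainable and adjustable" can mean for a content-based recommender, here is a minimal sketch. It is not taken from the theses: the listings, feature names, weights and the `recommend` function are all hypothetical, and features are assumed to be scaled to the range 0 to 1. Each listing is scored by a weighted closeness to the user's preference profile, and the per-feature contributions are kept so that a recommendation can be explained.

```python
from typing import Dict, List, Tuple

# Hypothetical listings with features scaled to 0..1 (assumption for illustration).
HOUSES: Dict[str, Dict[str, float]] = {
    "A": {"size": 0.8, "garden": 1.0, "near_centre": 0.2},
    "B": {"size": 0.4, "garden": 0.0, "near_centre": 0.9},
    "C": {"size": 0.7, "garden": 1.0, "near_centre": 0.5},
}


def score(profile: Dict[str, float], features: Dict[str, float],
          weights: Dict[str, float]) -> Tuple[float, Dict[str, float]]:
    """Return the total score and the contribution of each feature to it."""
    total_weight = sum(weights.values())
    contributions = {
        f: weights[f] * (1.0 - abs(profile[f] - features[f])) / total_weight
        for f in weights
    }
    return sum(contributions.values()), contributions


def recommend(profile: Dict[str, float],
              weights: Dict[str, float]) -> List[Tuple[str, float, Dict[str, float]]]:
    """Rank all listings from best to worst match for the given profile."""
    results = []
    for name, features in HOUSES.items():
        total, contributions = score(profile, features, weights)
        results.append((name, total, contributions))
    return sorted(results, key=lambda r: r[1], reverse=True)


# Hypothetical user: wants a large house with a garden, close to the centre.
profile = {"size": 0.8, "garden": 1.0, "near_centre": 0.9}
equal_weights = {"size": 1.0, "garden": 1.0, "near_centre": 1.0}
centre_weights = {"size": 1.0, "garden": 1.0, "near_centre": 6.0}

ranking_equal = [name for name, _, _ in recommend(profile, equal_weights)]
ranking_centre = [name for name, _, _ in recommend(profile, centre_weights)]
print(ranking_equal, ranking_centre)
```

Changing the weights is the "adjustable" part: the same profile yields a different ranking once location is weighted more heavily. The contribution dictionary is the "explainable" part, since it shows which features drove each score.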

Prototype BI dashboards comparison [Duarcadas].

Analysing industry data

In the business track we encourage students to use existing datasets that relate to important problems or concerns in society. One such concern is air pollution. In the Netherlands, air quality measurement data is available via Luchtmeetnet, and the city of Utrecht has its own air quality data based on its own sensor network. One team of two students studied the effectiveness of past policy interventions by the Utrecht municipality. They developed SARIMAX models to show reductions in particulate and NO2 concentrations and provided emission forecasts for Utrecht through 2030.

  • Michael Tedeev – How can SARIMAX models be used not only to predict PM2.5 and PM10 concentrations in Utrecht, but also to evaluate the effectiveness of past policy interventions?
  • Egor Vasiliev – A SARIMAX-Based Analysis of Low Emission Zones Impact and Forecasting of NO2 Concentrations

One group applied multiple machine-learning methods (multilayer perceptron, random forest, and XGBoost) to classify human versus machine clicks in phishing simulations. They achieved up to 99% accuracy and precision with the various methods. The work was based on data from Awareways, a company specialised in security awareness solutions.

Another student explored a waste-management dataset provided by Seenons to identify patterns, such as waste composition and collection location, that could point to recycling improvements and opportunities for pollution reduction.

  • Arnav Biswas – Identifying System Clicks in Simulated Email Phishing
  • Orestis Kalaydjian – Enhancing and Automating System-Generated Click Identification Using Machine Learning and Deep Learning Algorithms
  • Madhav Chawla – Environmental Performance Analysis and Waste Profile Optimization in Waste Management

Particulate concentration forecast for air quality dataset [Tedeev].

BT Excellent Thesis Award

Oliwer Dembicki received the Business Track Excellent Thesis Award 2025 for producing the best thesis in the Business Track (top 5%, ranked first among more than 20 students).

Oliwer’s thesis “Enhancing Data Reliability for Power BI: A Practical Evaluation of Microsoft Fabric’s Preprocessing Capabilities” explores whether Microsoft Fabric can realistically improve or replace the company’s current manual preprocessing workflows. This research gives a clearer picture of what Fabric can and cannot do, and of how small companies can approach automation without sacrificing accuracy and control over their data.

The award recognises the very good execution and the valuable, practical results achieved within real-world, small-company constraints: gaining practical insight into Fabric’s utility with a mix of methods, while disentangling workflows and interacting with busy employees.

Usability heatmap for Copilot and Manual preprocessing tasks across the synthetic dataset [Dembicki].

The BT Excellent Thesis Award is presented annually for the best thesis from the business track. See this article for the previous winner.

The companies

None of the projects would have been possible without the company supervisors who led the initiative and committed time to co-supervise the students. We would like to thank, in no particular order, Ancora Health, Awareways, DTACT, Experience Fruit Quality from Experience Data, Seenons, Nieuw Wonen Nederland, Software Improvement Group (SIG), Hogeschool Utrecht, Gemeente Utrecht, and Delft Imaging Systems.


Author: Pavlo Burda
Dr. Pavlo Burda is an IT consultant and researcher specializing in emerging cybersecurity threats and people analytics for security.