TARO (Taming Robots)
research in the area of artificial intelligence and machine learning

Main Research Areas

  • Applied Artificial Intelligence
  • Optimization/Automation with AI Technologies
  • Trust & Transparency
  • Social Responsibility
  • Privacy & Data Protection
  • Reproducibility & Accountability

Topics for Projects / Bachelor Theses

Bachelor Thesis, Computer Science Project: Using Large Language Models On-Premise

Design and implement prototypes utilizing large language models (LLMs) on consumer-grade hardware, such as laptops, desktop PCs or servers, to explore their feasibility and efficiency in a resource-constrained environment.

  • Systematically compare various open-source machine learning frameworks and libraries to identify the most suitable options for deploying LLMs on consumer hardware.
  • Develop a methodology for fine-tuning on-premise large language models with custom datasets, ensuring the models remain robust and effective within the hardware constraints.
  • Assess the practicality and performance of LLMs on standard consumer hardware, focusing on computational efficiency, memory usage, and scalability.
  • Investigate various IDE plugins that support LLM integration, highlighting their impact on development workflow and efficiency.
  • Establish benchmarks for model performance and resource consumption, comparing the optimized LLMs on consumer hardware against their performance on dedicated AI servers.
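A minimal sketch of how a throughput benchmark for the last task could be structured. The function `fake_generate` is a hypothetical stand-in for a real model call and is not part of any library; in a real setup it would be replaced by the inference call of whichever framework is chosen.

```python
import time

def fake_generate(prompt, max_tokens):
    """Hypothetical stand-in for a real LLM call; returns a list of tokens."""
    return [f"tok{i}" for i in range(max_tokens)]

def benchmark(generate, prompt, max_tokens, runs=3):
    """Return the mean tokens per second over several runs."""
    rates = []
    for _ in range(runs):
        start = time.perf_counter()
        tokens = generate(prompt, max_tokens)
        elapsed = time.perf_counter() - start
        # Guard against a zero reading on very coarse timers.
        rates.append(len(tokens) / elapsed if elapsed > 0 else float("inf"))
    return sum(rates) / len(rates)

tokens_per_second = benchmark(fake_generate, "Hello", max_tokens=1000)
```

Running the same harness on consumer hardware and on a dedicated AI server would give directly comparable numbers, provided the prompt, model and token limit are kept identical.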

Bachelor Thesis, Computer Science Project: Airplane Control Based on Machine Learning

This project offers three main objectives to choose from:

  • Flight control of an aircraft based on machine learning, using X-Plane as the simulator
  • Evaluation of training flights based on machine learning
  • Understanding and handling of ATC (air traffic control) communication
Variations of the above topic are also possible.

Bachelor Thesis, Computer Science Project: Machine Learning Model Operationalization Management (MLOps)

Design and implement a state-of-the-art machine learning platform using open-source tools.

  • Compare open-source machine learning tools and libraries
  • Use GPUs in a Kubernetes environment, incl. multi-GPU training
  • Data version control for machine learning projects
  • Continuous training, delivery and deployment of models
  • Automated distribution and scheduling of training pipelines on limited resources

Bachelor Thesis, Computer Science Project: Understanding of Business Documents

Invoices, orders, credit notes and similar business documents contain information needed for trade to occur between companies, much of it on paper or in semi-structured formats such as PDFs. These documents carry information not only in their text, but also in its spatial location. Ordinary natural language processing or computer vision methods alone are not sufficient to understand these kinds of documents.

In particular, invoices carry a lot of information in the spatial location of the text and of other elements such as lines and symbols, e.g. for tables. Different approaches can be followed for different problems. Simple pattern matching or mask techniques can already achieve good results, but these approaches do not work well on unseen data. Deep learning is a different approach to understanding these invoices.
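To illustrate why simple pattern matching works on known layouts but fails on unseen data, here is a minimal sketch. The field names, regular expressions and sample texts are illustrative assumptions, not taken from any real invoice format.

```python
import re

# Hypothetical patterns tuned to one English invoice layout.
PATTERNS = {
    "invoice_number": re.compile(
        r"Invoice\s*(?:Number|No\.?)\s*[:#]?\s*([A-Z0-9-]+)", re.IGNORECASE
    ),
    "total": re.compile(r"Total\s*:?\s*([0-9]+(?:\.[0-9]{2})?)", re.IGNORECASE),
}

def extract_fields(text):
    """Return the first match for each known field, or None if it is absent."""
    result = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1) if match else None
    return result

# A layout the patterns were written for.
known = "Invoice No: INV-2024-001\nTotal: 119.00"
# An unseen layout (German wording, comma as decimal separator).
unseen = "Rechnungsnummer 4711\nGesamtbetrag 119,00"

known_fields = extract_fields(known)
unseen_fields = extract_fields(unseen)
```

On the known layout both fields are found; on the unseen layout every field comes back as `None`, which is the generalisation gap that learned approaches try to close.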

The problem is divided into two major areas:
  • Preprocessing and OCR for text extraction
  • Document understanding and information extraction

Preprocessing and OCR for text extraction

This project focuses on the OCR part of the overall problem to understand business documents:
  • Identify and document the requirements for an OCR system to be used for understanding business documents
  • Research and select appropriate libraries or algorithms
  • Document preprocessing
  • Actual extraction of text information while preserving the spatial position
  • Post-processing, error correction and optimization for further document understanding processing
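As a sketch of what preserving the spatial position can mean in practice, the following example keeps a bounding box per word and rebuilds text lines from the coordinates. The `Word` structure and the tolerance value are assumptions made for illustration; real OCR engines return comparable word boxes, but in their own formats.

```python
from dataclasses import dataclass

@dataclass
class Word:
    text: str
    x: float       # left edge in pixels
    y: float       # top edge in pixels
    height: float  # box height in pixels

def group_into_lines(words, tolerance=0.5):
    """Group words whose top edges differ by at most tolerance * height."""
    lines = []
    for word in sorted(words, key=lambda w: (w.y, w.x)):
        for line in lines:
            if abs(line[0].y - word.y) <= tolerance * line[0].height:
                line.append(word)
                break
        else:
            # No existing line is close enough, so start a new one.
            lines.append([word])
    # Within each line, order the words from left to right.
    return [" ".join(w.text for w in sorted(line, key=lambda w: w.x)) for line in lines]

words = [
    Word("119.00", 400, 102, 12),
    Word("Total", 50, 100, 12),
    Word("Invoice", 50, 20, 12),
    Word("4711", 400, 21, 12),
]
lines = group_into_lines(words)  # ["Invoice 4711", "Total 119.00"]
```

Keeping the coordinates alongside the text lets later document understanding steps reason about columns, tables and key-value pairs instead of a flat character stream.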

Document understanding and information extraction

This project and/or thesis focuses on document understanding and information extraction.
  • State-of-the-art analysis: new papers on this topic are released on a monthly basis
  • Analyse currently available products or libraries (such as rossum.ai)
  • Develop different approaches
  • Implement and verify selected approaches

Bachelor Thesis, Computer Science Project: Environmental aspects of machine learning

The goal of this project is to measure the impact of various machine learning systems, tools, models and libraries on the environment.

Tasks include but are not limited to:

  • Measure the energy consumption of various machine learning tasks
  • Optimize machine learning training and inference for lower environmental impact
  • Literature research on regulatory aspects, e.g. the EU Green Deal
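A back-of-the-envelope sketch of how measured power draw can be turned into energy and emission estimates. The power, duration and grid intensity values below are made-up placeholders, not measurements; a real study would use readings from a power meter or hardware counters and a documented grid carbon intensity.

```python
def energy_kwh(avg_power_watts, duration_seconds):
    """Energy = average power x time, converted from watt-seconds to kWh."""
    return avg_power_watts * duration_seconds / 3600 / 1000

def emissions_kg(energy, grid_intensity_kg_per_kwh):
    """Estimated CO2-equivalent emissions for a given energy use."""
    return energy * grid_intensity_kg_per_kwh

# Placeholder values: a 300 W machine training for two hours
# on a grid with an assumed intensity of 0.4 kg CO2e per kWh.
training_energy = energy_kwh(avg_power_watts=300, duration_seconds=2 * 3600)
training_emissions = emissions_kg(training_energy, grid_intensity_kg_per_kwh=0.4)
```

The same two functions can be applied to training and inference separately, which makes it possible to compare optimizations on a common scale.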

Variations of the above topic are also possible.

OCR for Medieval Manuscripts

The goal of this project is to build an application that converts images of medieval handwritten manuscripts into machine-encoded text.

Tasks include but are not limited to:

  • Find and adapt suitable open-source solutions
  • Train machine learning models to recognize handwritten manuscripts
  • Build an application that converts images into machine-encoded text

Variations of the above topic are also possible.

Bachelor Thesis, Computer Science Project: Network Security by implementing an Intrusion Detection System using Open-Source ML Frameworks (NIDS)

The goal of this project is to build a real-time NIDS using machine learning.

Tasks include but are not limited to:

  • Find and adapt suitable open-source solutions
  • Use self-learning algorithms instead of static rulesets
  • Evaluate and optimize performance in terms of throughput and hardware

Variations of the above topic are also possible.

Bachelor Thesis, Computer Science Project: Mobile Dashcam App using Open-Source ML Frameworks

The goal of this project is to build a mobile application that uses the camera to extract traffic information in real time.

Tasks include but are not limited to:

  • Find open-source models or train models to recognize various objects e.g. traffic signs
  • Evaluate and optimize the performance of model inference on mobile devices
  • Evaluate and optimize the model for mobile devices in terms of speed, battery usage and other criteria

Variations of the above topic are also possible.

Computer Science Project: Teach Computers to Play Arcade Games with Reinforcement Learning

Arcade games from the 80s are fun and have very handy features: the graphics are simple, the controls are very limited and the game mechanics are generally very simple. This makes them well suited for a computer to teach itself to play.

Teach the computer to play an arcade game.

  • Find a game suitable to be controlled by a computer
  • Research reinforcement learning for games
  • Select a framework (e.g. [1], [2], [3])
  • Implement learning and training (e.g. [4])
  1. https://github.com/google/dopamine
  2. https://aws.amazon.com/about-aws/whats-new/2018/11/amazon-sagemaker-announces-support-for-reinforcement-learning/
  3. https://github.com/uber-research/atari-model-zoo
  4. https://deepmind.com/research/dqn/
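To give a feel for the learning step, here is a minimal sketch of tabular Q-learning on a toy corridor environment. It does not use any of the frameworks listed above, and the environment, constants and function names are assumptions chosen for illustration; an arcade game would need a far larger state representation, typically with a neural network as in [4].

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # index 0 = move left, index 1 = move right
ALPHA, GAMMA = 0.5, 0.9

def step(state, action_index):
    """Toy environment: move along the corridor, reward 1 on reaching the goal."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def train(episodes=200, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Q-learning is off-policy, so a purely random
            # behaviour policy is enough for this tiny problem.
            action = rng.randrange(2)
            next_state, reward, done = step(state, action)
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q

q_table = train()
# Greedy policy for the non-terminal cells: pick the action with the highest value.
greedy_policy = [max(range(2), key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
```

After training, the greedy policy should choose "move right" in every cell, since that is the shortest path to the rewarded goal.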

Other Topics

  • Software Defined Storage in the Cloud Native area
  • Streaming in conjunction with AI
  • Debugging and Verification of AI Output
  • Moral Responsibility/Ethics & Bias of AI
  • Continuous Delivery with Machine Learning
  • ... or ... bring your own topic!