Moderator: A warm welcome to our program "Digitalblick". Today we are talking with Professor Dr. Simon Mertens from the Institut für Angewandte Informatik about a research project that made some headlines last year, although not quite in the way originally planned. Professor Mertens, it's good to have you here.
Prof. Mertens: Good evening, Mr. Meier. Thank you for the invitation.
Moderator: Professor Mertens, your project, which started at the beginning of 2024, was supposed to deal with the local implementation and optimization of large language models. Can you briefly outline the original approach?
Prof. Mertens: Gladly. Our idea was to run the increasingly powerful large language models (LLMs) that were emerging at the time not only in the cloud, but locally on high-performance workstations. We wanted to reduce the dependency on external servers and open up new applications at the edge, that is, directly where the data is generated. It was a very ambitious goal, but we were optimistic.
Moderator: And how did the first steps go? One had the impression that you ran into problems quite quickly.
Prof. Mertens: That is unfortunately correct. Our first major stumbling block was the hardware. We were using one of the fastest workstations of 2024, equipped with the latest GPUs and plenty of RAM. Yet even this machine was simply not up to the demands of the LLMs we wanted to test. The models, even though smaller variants existed, required exorbitant amounts of VRAM and compute power. We are talking about models with tens or even hundreds of billions of parameters. Even loading the models often led to crashes or to extremely long waiting times that made any research impossible. It was a sobering awakening from our optimism.
Moderator: That sounds like a technical dead end. But there were also personal tragedies within the project team, if I am correctly informed?
Prof. Mertens: Yes, unfortunately. And that was a heavy blow that affected all of us deeply. In the middle of the year, our esteemed colleague and head of hardware optimization, Dr. Anton Gruber, died completely unexpectedly of heart failure. He was over 70 and a brilliant mind whose experience and calm manner were invaluable to our team. His death not only left a professional gap, it also hit us hard on a human level. He was a driving force, and his loss was extremely demotivating.
Moderator: My sincere condolences. And on top of everything, the funding was then cut as well, correct?
Prof. Mertens: That is right. At the end of 2024 our research funds were cut. Officially, the reasons given were the "lack of tangible results" and the "missing demonstration of feasibility" of the local LLM implementation. I understand the decision from a purely economic perspective; we had no working prototypes to show. But it was frustrating, because we knew we were at the limit of what was technically feasible at the time and had run into exactly those extreme requirements. The combination of technical hurdles, the loss of Dr. Gruber, and the shortage of money then brought the project to a virtual standstill.
Moderator: A true object lesson in the pitfalls of research. Looking back today, in 2025, what do you take away from this failed project?
Prof. Mertens: Well, one often learns more from failures than from successes. We learned that scaling LLMs down to consumer hardware is an even bigger challenge than we thought. It requires massive leaps in hardware efficiency or entirely new architectures. And it showed us once again how important the human factor is in research. The loss of a team member can cause an entire project to fail, regardless of the technology. It was an expensive but instructive failure, one that revealed the limits of what was feasible at the time.
Moderator: Professor Mertens, thank you very much for these candid insights into your research project.
Prof. Mertens: My pleasure, Mr. Meier.
July 30, 2025
Interview: Failed LLM research, a look back at the year 2024
July 29, 2025
Estimating the hardware requirements for large language models
Since the advent of ChatGPT in late 2022, most people are familiar with how to use these AI systems to execute prompts. Even non-programmers are able to generate stories and create summaries of existing content on the internet. Countless tutorials are available that explain what an LLM is and how to use it to answer questions. The rough cost of running the individual building blocks locally can be estimated as follows:
| task | price (US$) |
| desktop PC | 1,000 |
| vector database with Wikipedia | 10,000 |
| vector database for multiple documents | 50,000 |
| minimalist large language model | 100,000 |
| large language model | 1,000,000 |
The assumption is that the same dilemma still holds today, in the year 2025: running a state-of-the-art large language model requires supercomputer-grade hardware costing around 1 million US$.
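As a rough back-of-the-envelope sketch of where these numbers come from, the dominant cost driver is GPU memory: the model weights alone need roughly the parameter count times the bytes per parameter, plus some overhead for activations and the KV cache. The overhead factor and the example model sizes below are assumptions for illustration, not measurements.
# Back-of-the-envelope VRAM estimate: parameters x bytes per parameter.
# The 20% overhead for activations and KV cache is an assumed value, not a measurement.
def estimate_vram_gb(params_billion, bytes_per_param=2, overhead=1.2):
    """Approximate GPU memory in gigabytes needed to load a model at fp16 for inference."""
    return params_billion * bytes_per_param * overhead

for params in (7, 70, 400):
    print(f"{params}B parameters: ~{estimate_vram_gb(params):.0f} GB of VRAM")
Even the smallest of these figures strains a single consumer GPU, while the larger ones require multi-GPU servers, which is consistent with the price range in the table above.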
July 20, 2025
Simple chatbot in Python
The most basic implementation of a chatbot works with predefined question-answer pairs stored in a Python dictionary. The user has to enter one of the predefined questions (or a sentence containing it) to get an answer, so the chatbot is essentially a database lookup tool. Even though the program is less advanced than current large language models and less mature than the Eliza software, it is a good starting point for becoming familiar with chatbot development from scratch. The source code consists of fewer than 50 lines of Python, including the dataset.
def run_chatbot():
    # Predefined question-answer pairs: the complete "knowledge" of the chatbot.
    # Keys are lowercase so they can match the lowercased user input.
    knowledge_base = {
        "hello": "Hi there! How can I help you today?",
        "how are you": "I'm a computer program, so I don't have feelings, but thanks for asking!",
        "what is your name": "I am a simple chatbot.",
        "who created you": "I was created by a programmer.",
        "what can you do": "I can answer questions based on my internal knowledge base.",
        "tell me a joke": "Why don't scientists trust atoms? Because they make up everything!",
        "what is the capital of france": "The capital of France is Paris.",
        "what is the largest ocean": "The Pacific Ocean is the largest ocean.",
        "what is the highest mountain": "Mount Everest is the highest mountain in the world.",
        "what is the square root of 9": "The square root of 9 is 3.",
        "what is the weather like today": "I'm sorry, I cannot provide real-time weather information.",
        "how old are you": "I don't have an age in the human sense.",
        "what is python": "Python is a high-level, interpreted programming language.",
        "what is ai": "AI stands for Artificial Intelligence, which is the simulation of human intelligence processes by machines.",
        "where are you from": "I exist in the digital realm!",
        "can you learn": "I don't learn in the same way humans do. My responses are pre-programmed.",
        "what is gravity": "Gravity is a fundamental force of nature that attracts any two objects with mass.",
        "what is photosynthesis": "Photosynthesis is the process used by plants, algae, and cyanobacteria to convert light energy into chemical energy.",
        "what is the speed of light": "The speed of light in a vacuum is approximately 299,792,458 meters per second.",
        "thank you": "You're welcome! Is there anything else I can assist you with?"
    }
    print("Welcome to the simple Q&A Chatbot!")
    print("Type 'quit' or 'exit' to end the conversation.")
    print("-" * 40)
    while True:
        user_input = input("You: ").strip().lower()
        if user_input in ["quit", "exit"]:
            print("Chatbot: Goodbye! Have a great day.")
            break
        # Look for a known question as a substring of the user input.
        found_answer = False
        for question, answer in knowledge_base.items():
            if question in user_input:
                print(f"Chatbot: {answer}")
                found_answer = True
                break
        if not found_answer:
            print("Chatbot: I'm sorry, I don't understand that question. Can you please rephrase it?")

if __name__ == "__main__":
    run_chatbot()
July 17, 2025
Can AI replace human programmers?
In the year 2025, there is no clear answer to this question. Maybe it is possible to replace human programmers with an AI, maybe not. What we can say for sure is that, for simpler tasks, AI is already more capable than a human.
These simpler tasks include the game of chess, the Tetris video game, and the ability to answer programming-related questions. The first chess AI that was superior to a human grandmaster was Deep Blue in 1997. Playing Tetris at a grandmaster level has also been demonstrated. What is missing is proof for more advanced tasks. Despite the existence of large language models, most existing software was written by human programmers. There are tools available, such as git and programmer-friendly IDEs, that claim to improve efficiency, but coding remains a human task. What current AI systems are able to do is solve minor tasks within a programming project, for example writing a hello world app in Python or answering a detailed programming question.
The task of creating an entire application consisting of thousands of lines of code is a demanding problem. Some progress has been made in this direction, but the outcome remains unclear. What we can say for sure is that the importance of large language models for programming tasks will grow in the future.
A possible benchmark for judging an AI is its ability to contribute to existing software projects. The AI needs to create a commit that gets accepted into a project as a meaningful contribution. If an AI is able to do so multiple times for different projects, this would be proof that the AI can replace human programmers.
From a technical perspective, a commit is a changeset in an existing project. It can be a bugfix or an additional feature. At least for simpler projects like a prime number generator or a tic-tac-toe video game, current LLMs are able to produce such changes out of the box (see the sketch below). The open question is whether they can do so for more advanced projects like a larger video game or an entire operating system.
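To illustrate the scale of such a simpler project, the following sketch shows the kind of self-contained program that current LLMs can typically generate out of the box. It is an illustrative example written for this post, not a commit taken from any real repository.
# Illustrative "simpler project" scale: a self-contained prime number generator
# using the Sieve of Eratosthenes. Example written for this post, not from a real repository.
def primes_up_to(limit):
    """Return all prime numbers up to and including limit."""
    if limit < 2:
        return []
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for n in range(2, int(limit ** 0.5) + 1):
        if sieve[n]:
            for multiple in range(n * n, limit + 1, n):
                sieve[multiple] = False
    return [n for n, is_prime in enumerate(sieve) if is_prime]

if __name__ == "__main__":
    print(primes_up_to(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
A change at this scale touches a few dozen lines at most; the open question above concerns changes that must respect thousands of existing lines and their surrounding context.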
Real-world software projects consist of 10,000 or more lines of code. In addition, there is extensive documentation and forum discussion that needs to be understood before a commit can be created. Even for an AI of the year 2025, this would be a complex task.
July 13, 2025
AI-generated window desktop
A minimalist GUI prototype written in Python and pygame was generated with an AI. It is possible to click on the taskbar and the File/Edit menu bar, but launching additional programs is not possible. The source code consists of about 180 lines and was created entirely by a large language model:
import pygame
import sys

# --- Pygame Initialization ---
pygame.init()

# --- Screen Dimensions ---
SCREEN_WIDTH = 1000
SCREEN_HEIGHT = 700
screen = pygame.display.set_mode((SCREEN_WIDTH, SCREEN_HEIGHT))
pygame.display.set_caption("Pygame: Desktop Simulation")

# --- Colors ---
WHITE = (255, 255, 255)
BLACK = (0, 0, 0)
LIGHT_BLUE = (173, 216, 230)
LIGHT_GREEN = (144, 238, 144)
DARK_GRAY = (50, 50, 50)
TOOLBAR_GRAY = (70, 70, 70)
BUTTON_HOVER = (90, 90, 90)
BUTTON_ACTIVE = (120, 120, 120)

# --- Font for text ---
font_small = pygame.font.Font(None, 24)   # For menu items, etc.
font_medium = pygame.font.Font(None, 30)  # For window titles
font_large = pygame.font.Font(None, 36)   # For main elements

# --- Desktop Background ---
desktop_bg_color = (60, 60, 100)  # A dark blue/purple for a desktop feel

# --- Taskbar/Top Bar Properties ---
taskbar_height = 40
taskbar_rect = pygame.Rect(0, 0, SCREEN_WIDTH, taskbar_height)
start_button_rect = pygame.Rect(5, 5, 80, 30)  # x, y, width, height
start_button_text = "Start"
start_menu_active = False
start_menu_rect = pygame.Rect(5, taskbar_height, 150, 150)  # Example menu size
start_menu_items = ["Terminal", "Browser", "Editor", "Settings"]
start_menu_item_rects = []  # To store rects for click detection

# --- Window Properties (as classes for easier management) ---
class Window:
    def __init__(self, x, y, width, height, color, title, content_text=""):
        self.rect = pygame.Rect(x, y, width, height)
        self.title_bar_height = 25
        self.title_bar_rect = pygame.Rect(x, y, width, self.title_bar_height)
        self.content_rect = pygame.Rect(x, y + self.title_bar_height, width, height - self.title_bar_height)
        self.color = color
        self.title = title
        self.content_text = content_text
        self.active_menu_message = ""  # To show what menu item was clicked
        # Menu button rects (File and Edit)
        self.file_menu_rect = pygame.Rect(self.title_bar_rect.x + 5, self.title_bar_rect.y + 2, 40, self.title_bar_height - 4)
        self.edit_menu_rect = pygame.Rect(self.title_bar_rect.x + 50, self.title_bar_rect.y + 2, 40, self.title_bar_height - 4)

    def draw(self, surface):
        # Draw window content area
        pygame.draw.rect(surface, self.color, self.content_rect)
        pygame.draw.rect(surface, BLACK, self.content_rect, 2)  # Border
        # Draw title bar
        pygame.draw.rect(surface, TOOLBAR_GRAY, self.title_bar_rect)
        pygame.draw.rect(surface, BLACK, self.title_bar_rect, 2)  # Border
        # Draw title text
        title_surface = font_medium.render(self.title, True, WHITE)
        title_rect = title_surface.get_rect(centerx=self.title_bar_rect.centerx, centery=self.title_bar_rect.centery)
        surface.blit(title_surface, title_rect)
        # Draw menu buttons (File, Edit)
        pygame.draw.rect(surface, DARK_GRAY, self.file_menu_rect)
        file_text = font_small.render("File", True, WHITE)
        file_text_rect = file_text.get_rect(center=self.file_menu_rect.center)
        surface.blit(file_text, file_text_rect)
        pygame.draw.rect(surface, DARK_GRAY, self.edit_menu_rect)
        edit_text = font_small.render("Edit", True, WHITE)
        edit_text_rect = edit_text.get_rect(center=self.edit_menu_rect.center)
        surface.blit(edit_text, edit_text_rect)
        # Draw content text
        content_surface = font_small.render(self.content_text, True, BLACK)
        content_rect = content_surface.get_rect(topleft=(self.content_rect.x + 10, self.content_rect.y + 10))
        surface.blit(content_surface, content_rect)
        # Draw active menu message
        if self.active_menu_message:
            message_surface = font_small.render(self.active_menu_message, True, BLACK)
            message_rect = message_surface.get_rect(topleft=(self.content_rect.x + 10, self.content_rect.y + 40))
            surface.blit(message_surface, message_rect)

    def handle_click(self, pos):
        if self.file_menu_rect.collidepoint(pos):
            self.active_menu_message = "File menu clicked!"
            return True
        elif self.edit_menu_rect.collidepoint(pos):
            self.active_menu_message = "Edit menu clicked!"
            return True
        return False

# Create our two custom windows
window1 = Window(100, 100, 350, 250, LIGHT_BLUE, "My Documents", "Welcome to Window One!")
window2 = Window(500, 350, 400, 280, LIGHT_GREEN, "Application", "This is Window Two.")

# --- Game Loop ---
running = True
while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False
        elif event.type == pygame.MOUSEBUTTONDOWN:
            mouse_pos = event.pos
            # Handle Start button click
            if start_button_rect.collidepoint(mouse_pos):
                start_menu_active = not start_menu_active  # Toggle menu visibility
            elif start_menu_active and start_menu_rect.collidepoint(mouse_pos):
                # Check if a start menu item was clicked
                for i, item_rect in enumerate(start_menu_item_rects):
                    if item_rect.collidepoint(mouse_pos):
                        # In a real app, you'd launch something here
                        print(f"Launched: {start_menu_items[i]}")
                        window1.content_text = f"Launched: {start_menu_items[i]}"
                        start_menu_active = False  # Close menu after selection
            else:  # If click outside start menu, close it
                start_menu_active = False
            # Handle clicks on window menus
            window1.active_menu_message = ""  # Clear previous messages
            window2.active_menu_message = ""
            if window1.handle_click(mouse_pos):
                pass  # Handled by window object
            elif window2.handle_click(mouse_pos):
                pass  # Handled by window object

    # --- Drawing ---
    screen.fill(desktop_bg_color)  # Desktop background

    # Draw Taskbar/Top Bar
    pygame.draw.rect(screen, TOOLBAR_GRAY, taskbar_rect)
    pygame.draw.rect(screen, BLACK, taskbar_rect, 1)  # Border

    # Draw Start button
    pygame.draw.rect(screen, DARK_GRAY, start_button_rect)
    pygame.draw.rect(screen, BLACK, start_button_rect, 1)
    start_text_surface = font_medium.render(start_button_text, True, WHITE)
    start_text_rect = start_text_surface.get_rect(center=start_button_rect.center)
    screen.blit(start_text_surface, start_text_rect)

    # Draw Start Menu if active
    if start_menu_active:
        pygame.draw.rect(screen, TOOLBAR_GRAY, start_menu_rect)
        pygame.draw.rect(screen, BLACK, start_menu_rect, 2)
        start_menu_item_rects = []  # Clear and re-populate for current frame
        for i, item in enumerate(start_menu_items):
            item_y = start_menu_rect.y + 10 + i * 30
            item_rect = pygame.Rect(start_menu_rect.x + 5, item_y, start_menu_rect.width - 10, 25)
            start_menu_item_rects.append(item_rect)
            # Check for hover effect (optional but nice for menus)
            if item_rect.collidepoint(pygame.mouse.get_pos()):
                pygame.draw.rect(screen, BUTTON_HOVER, item_rect)
            item_text_surface = font_small.render(item, True, WHITE)
            item_text_rect = item_text_surface.get_rect(topleft=(item_rect.x + 5, item_rect.y + 2))
            screen.blit(item_text_surface, item_text_rect)

    # Draw Windows
    window1.draw(screen)
    window2.draw(screen)

    # --- Update the Display ---
    pygame.display.flip()

# --- Quit Pygame ---
pygame.quit()
sys.exit()
July 02, 2025
VLA models for reproducing motion capture trajectories
For decades, an important but unsolved problem in robotics has been how to reproduce a motion capture demonstration. The initial situation was that a teleoperated robot was able to pick and place objects in a kitchen, and all the data was recorded with a computer, but replaying this data did not work. The reason is that if the same motor movements are sent to the robot during the replay step, they result in chaotic behavior, because the objects are in different positions and new obstacles may be present that were not there during the demonstration.
The inability to replay recorded movements prevented the development of more advanced robots, and it was a major criticism of motion capture and teleoperation in general. Some attempts to overcome this bottleneck were made in robotics, such as kinesthetic teaching and the preprogramming of keyframes, but these minor improvements did not solve the underlying problem.
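To make the failure mode concrete, here is a minimal sketch with made-up one-dimensional coordinates: the recorded gripper trajectory ends exactly where the object stood during the demonstration, so blind replay misses the object as soon as it has moved.
# Minimal sketch of the replay problem (made-up 1-D coordinates).
# Open-loop replay only works if the world is in exactly the recorded state.
recorded_trajectory = [0.0, 2.0, 4.0, 5.0]  # gripper positions from the teleoperated demo
object_at_demo_time = 5.0                   # where the cup stood during recording
object_at_replay_time = 7.0                 # where the cup stands now

def replay_reaches_object(trajectory, object_position, tolerance=0.1):
    """Check whether the replayed trajectory ends close enough to the object."""
    return abs(trajectory[-1] - object_position) <= tolerance

print(replay_reaches_object(recorded_trajectory, object_at_demo_time))    # True
print(replay_reaches_object(recorded_trajectory, object_at_replay_time))  # False: the object has moved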
A possible answer to the replay problem in motion capture are vision-language-action models, which should be explained briefly. The idea is to create an additional layer that is formulated in natural language. A neural network converts the mocap recording into natural language, and then actions are generated for the perceived symbols. The natural language layer increases robustness and makes it possible to fix errors in the motion planner. The AI engineer can see in the textual logfile why the robot failed at a certain task; for example, a certain object was labeled incorrectly, or the motion planner generated a noisy trajectory. These detail problems can be fixed within the existing pipeline.
Vision-language-action models (VLA models for short) address the symbol grounding problem. They translate low-level sensory perception into high-level natural language. The resulting symbolic state space has the same syntax as a text adventure and can be solved with existing PDDL-like planners. Let me give an example of a longer planning horizon.
Suppose a robot is supposed to clean up a kitchen. First, the needed steps are generated on the high-level layer, e.g. remove the objects from the table, transport the objects into the drawer, clean the table, clean the floor. These abstract steps are formulated in words, similar to a plan in a text adventure. In the next step, the high-level actions are translated into low-level servo commands. The servo commands are sent to the robot, which cleans up the kitchen.
The main cause of failure is the translation between the high-level and the low-level layer. The robot needs to convert sensory perception into language, and language into motor actions. A VLA model implements such a translation.
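As a minimal, hypothetical sketch of this two-layer idea, the following program summarizes a perceived scene as natural-language symbols, plans abstract steps in words, and maps each step to placeholder servo commands. All symbol names and the command table are invented for illustration; a real VLA model would learn both translation steps with neural networks.
# Hypothetical sketch of the two-layer idea: perception -> language -> plan -> servo commands.
# Symbol names and the command table are invented for illustration only.
def perceive_scene():
    """Stand-in for a vision model that describes the scene in words."""
    return ["cup on table", "plate on table", "drawer closed"]

def plan_high_level(scene):
    """Stand-in for a language-level planner (text-adventure style steps)."""
    plan = []
    if "drawer closed" in scene:
        plan.append("open drawer")
    for item in scene:
        if item.endswith("on table"):
            obj = item.replace(" on table", "")
            plan.append(f"pick up {obj}")
            plan.append(f"place {obj} in drawer")
    plan.append("wipe table")
    return plan

# Placeholder mapping from abstract language steps to low-level servo commands.
SERVO_COMMANDS = {
    "open drawer": ["move_arm(drawer)", "pull(handle)"],
    "pick up": ["move_arm(object)", "close_gripper()"],
    "place": ["move_arm(drawer)", "open_gripper()"],
    "wipe table": ["move_arm(table)", "sweep()"],
}

def to_servo_commands(step):
    """Stand-in for the low-level policy that grounds a language step in motion."""
    for prefix, commands in SERVO_COMMANDS.items():
        if step.startswith(prefix):
            return commands
    return ["noop()"]

if __name__ == "__main__":
    for step in plan_high_level(perceive_scene()):
        print(step, "->", to_servo_commands(step))  # the textual log makes failures inspectable
Because the plan exists as plain text, an engineer can read exactly which step went wrong, which is the debugging advantage described above.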
