June 21, 2026

Vision and language dataset generator

The screenshot consists of a random scene generator plus a textual annotation for a food collecting robot. The algorithm generates a maze including food items, and the text widget shows the description of the scene.

Such a setup is useful to generate a synthetic dataset with picture/text pairs to train a neural network.

No comments:

Post a Comment