Artists can now check whether their tracks appear in AI training datasets
Developed by researcher Alex Reisner and unveiled by The Atlantic, the tool allows artists to explore whether their work appears in datasets linked to AI training
American magazine The Atlantic has unveiled a new tool that allows artists to check whether their music appears in datasets used to train generative AI models.
Developed by researcher Alex Reisner, AI Watchdog draws on four large-scale datasets that have circulated within the AI development community. Together, the collections contain millions of tracks, enabling users to search for works by artists ranging from global chart-toppers to independent and underground musicians.
The tool builds on AI Watchdog's broader mission of tracking the materials used in AI training. First launched in 2025 to document books, academic research and video content incorporated into generative AI systems, the platform has now expanded to include music datasets.
According to The Atlantic, the tool currently draws on four datasets identified by Reisner through research papers and AI data-sharing platforms. Two of the datasets contain around 10 million tracks each, while the other two each contain more than 100,000 recordings.
The platform's dataset exploration page explains the limitations of the search tool, stating: “See whose work tech companies are using to train their generative-AI models. Search for an author, musician, YouTube channel, screenwriter, or actor. AI companies may omit certain works when training, so the presence of a work in a dataset is not definitive proof that it was used. Companies often use multiple datasets in training, so the absence of a given work is also not proof that it hasn't been used. Note that some datasets contain multiple copies of certain works.”
The launch arrives at a time of heightened scrutiny of the relationship between artificial intelligence and copyright. Over the past year, artists, publishers, collecting societies and record labels have increasingly called for greater transparency regarding the datasets used to train commercial AI systems. Several high-profile legal disputes between major music companies and AI developers have centred on allegations that copyrighted recordings were used without authorisation. Meanwhile, policymakers in both Europe and North America continue to examine how existing copyright frameworks apply to generative AI technologies.
For independent artists in particular, tools such as AI Watchdog may offer a clearer picture of how their work circulates within the AI ecosystem. Although the platform cannot confirm whether a specific recording was ultimately used to train a model, it provides a rare public window into datasets that have often remained difficult for creators to inspect.
Read Alex Reisner's full article in The Atlantic, and explore the AI Watchdog database on the platform's dataset exploration page.
