Adobe used Midjourney AI images to train its own AI model Firefly



Adobe’s generative AI model Firefly has been marketed as a leader in ethical AI, trained on licensed stock photos from Adobe’s own vast Adobe Stock library. However, a recent report claims otherwise. It states that some of the training data for Firefly may have come from a less-than-ideal source: competitor Midjourney. And Midjourney’s training data origins are murky, with many suspecting it scraped unlicensed images from the internet.

As Bloomberg reports, Adobe downplays the issue, claiming only a small percentage (around 5%) of training images came from this source. Still, when it launched Firefly, Adobe offered enterprise users an indemnity against copyright claims, positioning the tool as the “safe” alternative to the likes of Midjourney.

So, is Firefly still copyright safe? Adobe maintains that the AI-generated images in its training set remain safe thanks to its rigorous moderation process. The company says all submitted images, including AI-generated ones, are checked for copyrighted material, trademarks, and recognizable characters.

This news comes amid reports of Adobe paying artists for their videos to train AI. Based on Bloomberg’s report, could Adobe end up training its video model on Sora-generated footage, too?

Training AI on AI?

Training AI with AI images raises some interesting questions and potential issues of its own. First off, AI models learn from the data they’re trained on. If the AI images used for training contain errors, biases, or inconsistencies, the new AI model will inherit those flaws. This can lead to the new model perpetuating stereotypes, generating nonsensical content, or simply not functioning well.

This also leads to the question of creativity. AI models trained solely on AI images might struggle with true creativity. They may simply become adept at remixing existing styles and concepts rather than generating truly novel ideas.

Then there are copyright issues. If the AI images used for training are themselves derived from copyrighted material without permission, that creates a legal gray area: the new AI model might unintentionally generate content that infringes on copyrights. It’s also challenging to ensure proper licensing and to identify biases that may have been introduced at earlier stages of the creation process.

Only time will tell how this shakes out for Adobe and the future of AI-generated content. One thing’s for sure: the conversation about ethical AI training data is far from over.

[via Tom’s Guide, Bloomberg]
