Oxylabs Unveils First-of-its-kind YouTube Datasets to Power Responsible AI

Oxylabs, a leading web intelligence platform and proxy provider, introduces industry-first YouTube datasets composed entirely of consent-based data. Every one of the millions of original videos in the datasets comes with its creator’s explicit consent to be used for AI training, helping to bridge the gap between creators and innovators.

“In the ecosystem aiming to find a fair balance between respecting copyright and facilitating innovation, YouTube streamlining consent giving for AI training and providing creators with flexibility is an important step forward. Many channel owners have already opted in for their videos to be used in developing the next generation of AI tools. This enables us to create and provide high-quality, structured video datasets. Meanwhile, AI developers have no trouble verifying the data’s legitimate origin,” said Julius Černiauskas, CEO at Oxylabs.

All datasets offered by Oxylabs include videos, transcripts, and rich metadata. While such data has many potential use cases, Oxylabs refined and prepared it specifically for AI training, which is the use that the content creators have knowingly agreed to.

Large volumes of high-quality video data are fundamental for developing multimodal AI that can seamlessly handle text, audio, and visual data when performing tasks or generating different types of content. Acquiring such data in a convenient way that establishes a transparent link between creators and AI companies is a challenge the industry is still trying to solve. Structured, AI-ready YouTube datasets are now part of this emerging, improved model for training AI on public data.

Importantly, consent-based datasets also allow AI companies and creators to be on the same page regarding fair AI development. That development is still riddled with unanswered questions about how copyrighted material can fuel rather than stall innovation.

“These datasets offer a breath of fresh air to a tense ecosystem in dire need of facilitating systematic cooperation between creators and AI companies based on mutual agreement. The next wave of tools that will shake the market can now be built on data that all can agree is right for AI training. Hopefully, this also marks a better, more sustainable way forward,” concluded Černiauskas.

The release of ethically sourced YouTube datasets continues Oxylabs’ longtime mission to establish and promote ethical industry practices, previously marked by co-founding the Ethical Web Data Collection Initiative (EWDCI) and introducing an industry-first transparent tier framework for proxy sourcing.

