The Open Source Initiative (OSI) today released version 1.0 of its Open Source AI Definition to clarify what constitutes open source AI. This gives the industry a standard by which to validate whether an AI system can be deemed Open Source AI.
The definition covers code, model, and data information, with the latter being a contentious point due to legal and practical concerns. Mozilla, a long-time open source advocate, is partnering with OSI to promote openness and transparency in AI systems.
Understanding how AI systems work – so they can be researched, scrutinized and potentially regulated – is essential to ensuring a system is truly open source. Ayah Bdeir, senior strategic advisor on AI strategy at Mozilla, told SD Times on the “What the Dev?” podcast that AI systems are influenced by a number of different components – algorithms, code, hardware, data sets and more.
As an example, she cited that there are data sets to train models, data sets to test them, and data sets to fine-tune them, and that opening only some of these pieces creates a false sense of transparency that leads organizations to claim their systems are open source. “When it comes to AI, in traditional open source software, there’s a very clear separation between code that is written, a compiler that is used, and a license that is possessed. Each one of them can have an open license or a closed license and it’s very clear how each one of them applies to this concept of openness.”
However, in AI systems, many components influence the system, Bdeir said. “There are algorithms, there’s code, there’s hardware, there are data sets. There’s a data set to train, there’s a data set to test, there’s a data set to fine tune, and sort of this idea that if the code is open, that means their AI systems are open, which is not accurate.” This does not allow the fundamental reuse or study of the system that an open source mentality requires – the four freedoms to use, study, modify and share, she explained.
“The open source AI definition by OSI is an attempt to put a real fine point on what open source AI is and isn’t, and how to have a checklist that checks for whether something is or isn’t, so that this ambiguity between claiming that something is open source or actually doing it is not there anymore,” she said.
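To make the checklist idea concrete, here is a minimal, hypothetical sketch in Python. The component names and criteria below are illustrative assumptions for this article, not OSI’s actual checklist.

```python
# Hypothetical sketch of an "open source AI" checklist.
# The components and criteria are illustrative, not OSI's official list.

REQUIRED_COMPONENTS = {
    "code": "training and inference code available under open terms",
    "model_parameters": "weights and parameters available under open terms",
    "data_information": "enough detail about the training data for a skilled "
                        "person to recreate a substantially equivalent system",
}

def missing_components(system: dict[str, bool]) -> list[str]:
    """Return the required components a candidate system fails to provide openly."""
    return [name for name in REQUIRED_COMPONENTS if not system.get(name, False)]

# Example: open code alone would not be enough under this (hypothetical) checklist.
candidate = {"code": True, "model_parameters": False, "data_information": False}
print(missing_components(candidate))  # ['model_parameters', 'data_information']
```

The point of such a checklist is exactly the one Bdeir describes: each component is evaluated on its own, so a system cannot be labeled open source simply because one piece of it is.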
The debate over data information was among the most controversial in coming up with the definition, Bdeir said. How do organizations that are training their models with proprietary data protect it from being used in open source AI? Bdeir explained there are competing schools of thought around data in particular. In one school of thought, the data set must be made completely open and available in its exact form for an AI system to be considered open source. “Otherwise,” she said, “you cannot replicate this AI system. You cannot look at the data itself to see what it was trained on, or what it was fine tuned on, etc. And therefore it’s not really open source.”
In another school of thought, where she said some of the more hands-on builders reside, making the data available is not realistic. “Data is governed by laws that are different in different countries. Copyright laws are different in different countries, and licenses on data are not always super clear and easy to find, and if you inadvertently or mistakenly distribute data sets that you have no rights to, you are liable legally.”
The OSI solution to this problem is to require data information rather than the data in a data set itself. The wording, Bdeir said, is that the organization must provide “sufficiently detailed information about the data used to train the system so that a skilled person can recreate a substantially equivalent system using the same or similar data.”
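As an illustration of what “data information” (as opposed to the data itself) might look like, here is a hypothetical disclosure record; the field names and values are assumptions for illustration, not a format prescribed by OSI.

```python
# Hypothetical "data information" record for a model: it describes the training
# data in enough detail to recreate a similar system without distributing the
# data itself. All field names and values here are illustrative assumptions.
data_information = {
    "pretraining_corpus": {
        "description": "publicly crawled web text, filtered for language and quality",
        "approximate_size": "1.2 trillion tokens",
        "collection_period": "2022-2024",
        "processing": "language identification, deduplication, quality filtering",
        "availability": "sources and filtering scripts documented; raw data not redistributed",
    },
    "fine_tuning_data": {
        "description": "instruction-following examples written by contractors",
        "approximate_size": "150,000 examples",
        "availability": "annotation guidelines and schema published",
    },
}
```

A record along these lines would let a skilled person attempt to assemble the same or similar data, which is the standard the definition’s wording sets, without forcing the organization to redistribute data it may not have the rights to share.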