Notes on BLOOM, RAIL and openness of AI

"an initial analysis of a growing movement ... to make key machine learning tools and technologies open and shared. “Open Artificial Intelligence”...

BLOOM is a large language model released under RAIL – a new copyright license that combines an Open Access approach to licensing with behavioral restrictions aimed to enforce a vision of responsible AI use", both resulting from "the work of BigScience, a network of over 1000 AI researchers facilitated by HuggingFace." Meta's LLM, OPT, was released under a similarly designed license, while Stable Diffusion was released under CreativeML Open RAIL-M, a modified version of RAIL.

"Open AI" (as opposed to OpenAI, which isn't open) also needs to tackle "other parts of the AI technological stack... [e.g.] sharing of training datasets (... cf. AI_Commons initiative) or openness of algorithms".

BLOOM, "an open and collaboratively developed alternative" which can generate text "in 46 natural and 13 programming languages", stands for "BigScience Large Open-science Open-access Multilingual Language Model".

With the aim of "enforcing responsible uses of AI technologies", BigScience "decided to introduce behavior restrictions... included as an attachment to the license, which lists thirteen such restrictions. Their range is very broad", covering uses that:

  • violate laws and regulations...
  • exploit or harm minors...
  • discriminate against or harm “individuals or groups based on social behavior or known or predicted personal or personality characteristics”....
  • fail to disclaim that text created with BLOOM is machine-generated...
  • provide “medical advice and medical results interpretation”

Balancing "open sharing and enforcement of responsible use" is hard. "Creators of RAIL envision a decentralized system... AI developers are free to implement licenses with various restrictions, based on their ethical preferences."

"BLOOM model is not just openly licensed, but also the result of ...commons-based peer production... [the] model at the heart of Wikipedia or the Firefox browser... opens the black box not just of the model itself, but also of how LLMs are created and who can be part of the process... treat participatory approaches ... – as crucial to the success of open initiatives."

While many open-source projects can dominate their sector (WordPress, Wikipedia), BLOOM's use restrictions will probably prevent it from dominating. Curbing "unethical and harmful uses... will be achieved not through the proliferation of openly shared models, but through the proliferation of licenses ... so that use restriction becomes a standard."

RAIL meets neither "the Open Source Initiative definition of open code licenses [nor]... [the] Open Definition". It is not alone in this: "the Can’t Be Evil licenses also challenged established open licensing models, while seeking to uphold the spirit of open sharing."

