AI Archives - Creative Commons
https://creativecommons.org/tag/ai/
Wed, 13 May 2026 14:55:21 +0000

From Signals to Infrastructure: Strengthening the Commons for the AI Era
https://creativecommons.org/2026/05/13/from-signals-to-infrastructure-strengthening-the-commons-for-the-ai-era/?utm_source=rss&utm_medium=rss&utm_campaign=from-signals-to-infrastructure-strengthening-the-commons-for-the-ai-era
Wed, 13 May 2026 08:00:45 +0000

The post From Signals to Infrastructure: Strengthening the Commons for the AI Era appeared first on Creative Commons.

We recently shared an update on the evolution of CC signals. As AI systems increasingly extract value from the commons without adequate consent, attribution, or transparency, sustaining a healthy commons requires stronger governance and accountability. This reflects a shift in our approach: from expressing preferences to rebalancing power to protect the commons.

In this post, we outline our plans to build upon and strengthen CC signals in order to support our goal of sustained access to human knowledge. We do not have all the answers yet. What we do have is a framework for how we will work toward them.

Recap: What’s At Stake

When it comes to AI, copyright operates in a landscape that is uneven and often unclear. Because of this, the CC licenses, while still important, are not sufficient to address how content is used in AI systems. You can read more on this here. CC licenses also do not fully capture the range of intentions creators and data holders have in an AI-mediated world.

Across the web, creators, communities, and institutions are turning to multiple forms of defensive enclosure to restrict access. These include:

  • Legal (e.g. licensing): some open access publishers, ACM among them, now recommend CC BY-NC-ND as a mechanism of control, which negatively impacts human collaboration.
  • Technical (e.g. CAPTCHAs, bot blocking, rate limiting): news publishers are deploying these measures, which negatively impacts archiving efforts.
  • Financial (e.g. paywalled APIs): X introduced these after its acquisition, which negatively impacts researchers.

The problem is that these tools treat all machine use as the same, regardless of the purpose. In trying to limit large-scale extraction by AI developers, they also block public interest uses like research, preservation, and accessibility.
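To make the point about blunt tools concrete, here is a minimal sketch using the robots.txt parser in Python's standard library. It is an illustration only: robots.txt is assumed here as one common form of bot blocking, and the crawler names `ExampleArchiveBot` and `ExampleAITrainerBot`, the site URL, and the helper `allowed_agents` are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# A blanket rule of the kind a site might deploy to stop AI scraping.
BLANKET_RULES = [
    "User-agent: *",
    "Disallow: /",
]

# A rule that singles out one crawler by name and admits everyone else.
TARGETED_RULES = [
    "User-agent: ExampleAITrainerBot",
    "Disallow: /",
    "",
    "User-agent: *",
    "Allow: /",
]

# Hypothetical crawler names, one per purpose.
AGENTS = ["ExampleArchiveBot", "ExampleAITrainerBot"]
URL = "https://example.org/articles/1"


def allowed_agents(rules, agents, url):
    """Return the agents that the given robots.txt rules permit to fetch url."""
    parser = RobotFileParser()
    parser.parse(rules)
    return [agent for agent in agents if parser.can_fetch(agent, url)]


# The blanket rule shuts out the archive crawler along with the AI crawler.
print(allowed_agents(BLANKET_RULES, AGENTS, URL))
# The targeted rule admits the archive crawler, but only because of its name.
print(allowed_agents(TARGETED_RULES, AGENTS, URL))
```

Even the targeted rule keys on a self-declared crawler name rather than on purpose, so it cannot distinguish research or preservation from large-scale extraction when both arrive under the same name.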

While our research is ongoing, there are early indications of a more fragmented and potentially shrinking commons, along with a weakening of long-standing public interest protections.

Building the Next Generation Infrastructure of Sharing

Open access through CC licenses created a spectrum of sharing. Today we need something similar for AI: a spectrum of participation, where creators and data-holding stewards are active participants in how knowledge is produced, shared, and used.

The commons we have built over the past 25 years did not emerge on its own. It was designed through legal frameworks, technical standards, and shared norms. The AI era requires the next generation of that infrastructure. We want a future where the global knowledge commons remains accessible, and where AI systems engage with it in ways that are transparent, accountable, and aligned with the public good.

Our Plans

CC is advancing several high-impact interventions as part of the CC signals framework to restore trust, strengthen participation, and embed public interest values into the AI knowledge ecosystem.

  1. Helping People Make Informed Decisions in the Current Moment
  2. Making Attribution the Norm in AI
  3. Building New Tooling that Protects Public Interest Uses while Restoring Agency

Helping People Make Informed Decisions in the Current Moment

AI systems are using CC-licensed works in ways that are causing many to question whether the existing CC license suite still aligns with their goals.

These concerns take different forms: attribution that disappears inside AI systems, sensitive knowledge stripped from its original context, growing concentrations of value and power, and no clear mechanisms for reciprocity or accountability. But they share a common root: uncertainty about what the CC licenses actually mean in this new environment.

We want people who choose to CC license to do so with confidence. We also want institutions with CC licensing embedded in their policies to have a clear picture of what the licenses do and do not cover when it comes to AI.  Over the next six months, we will provide sector-specific interim guidance to support CC licensors in navigating the new questions that AI raises for them. This guidance is not intended to resolve all legal ambiguity. Instead, during this period of uncertainty, we want to preserve the practice of sharing that AI is currently putting at risk, while we develop new tools and practices that address our communities’ concerns.

We will be holding a series of sector-specific virtual events to collect feedback on this interim guidance. Sign up for the CC newsletter for more information as soon as it becomes available. 

Making Attribution the Norm in AI

Attribution has always been a cornerstone of the commons. It supports participation, enables transparency, and allows knowledge to be traced, evaluated, and built upon.

Today’s AI ecosystem is eroding this norm. Most generative systems do not meaningfully acknowledge the sources they rely on. As AI increasingly mediates access to knowledge, this has serious consequences: loss of provenance, reduced trust, and fewer incentives to share. The first iteration of CC signals included attribution as a preference; today we believe that attribution must be a requirement. 

Our plan is to define best practices for attribution in AI contexts. AI developers often claim that attribution is simply not possible in LLMs. But this is a consequence of choices made during design, not a technical inevitability. We believe there is value in envisioning what attribution practices could look like in an AI ecosystem that prioritized them. And while there is no going back in time, we can demand attribution where it is technically possible within existing systems, such as Retrieval Augmented Generation (RAG), a method where AI systems pull from specific, traceable sources to generate responses. 

Our work will involve detailing ideal attribution guidance for AI systems, end users, and creators. We will then demonstrate how attribution can be realized in RAG models. This initiative serves two purposes: building shared understanding of what attribution in AI can and cannot currently achieve, and giving creators and AI users the tools to advocate for attribution as a baseline expectation. Strengthening attribution helps ensure that knowledge can circulate widely without losing connection to the people and communities who created it.
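As a rough illustration of why attribution is tractable in RAG, the sketch below keeps source metadata attached to every retrieved passage and returns it alongside the response. Everything in it is hypothetical: the `Source` record, the toy word-overlap retriever, and the sample corpus with its made-up authors are stand-ins, not CC's guidance or any real system's API. The credit line follows the title, author, source, license pattern that CC recommends for attributing licensed works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """A passage together with the metadata needed to credit it (hypothetical)."""
    title: str
    author: str
    url: str
    license: str
    text: str


def credit(source: Source) -> str:
    """Format a title / author / source / license credit line."""
    return f'"{source.title}" by {source.author} ({source.url}), {source.license}'


def retrieve(query: str, corpus: list[Source], k: int = 2) -> list[Source]:
    """Toy retriever: rank sources by how many query words their text shares."""
    words = set(query.lower().split())
    scored = [(len(words & set(s.text.lower().split())), s) for s in corpus]
    matches = [pair for pair in scored if pair[0] > 0]
    ranked = sorted(matches, key=lambda pair: pair[0], reverse=True)
    return [source for _, source in ranked[:k]]


def answer_with_attribution(query: str, corpus: list[Source]) -> tuple[str, list[str]]:
    """Return a response plus one credit line per passage that informed it."""
    hits = retrieve(query, corpus)
    # A real system would hand the hits to a language model at this point;
    # the sketch simply joins the retrieved text so the data flow stays visible.
    body = " ".join(source.text for source in hits)
    return body, [credit(source) for source in hits]


# Made-up sample corpus; titles, authors, and URLs are placeholders.
CORPUS = [
    Source("Tide Pools", "A. Rivera", "https://example.org/tide-pools",
           "CC BY 4.0", "tide pools shelter sea stars and anemones"),
    Source("Desert Nights", "B. Chen", "https://example.org/desert-nights",
           "CC BY-SA 4.0", "desert foxes hunt at night"),
]

body, credits = answer_with_attribution("what lives in tide pools", CORPUS)
print(body)
for line in credits:
    print(line)
```

The design choice that matters is that the credit metadata travels with the passage from retrieval to output, so the response never loses its link to the works it drew on.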

CC is looking to connect with experts working on attribution standards and developers working on AI systems that preserve attribution. If that describes your work, we would love to hear from you. 

Building New Tooling that Protects Public Interest Uses While Restoring Agency

Copyright alone cannot do this work. We believe maintaining a human-centered internet requires meaningful guardrails, upheld collectively. Our goal is to support an ecosystem that balances openness with agency, and access with accountability.

First, we are advocating for the development and use of carefully scoped AI opt-outs that sustain creator agency while protecting public interest uses. To address this need, we proposed additions to the IETF's AI Preferences vocabulary (the IETF is the body that sets foundational internet standards) that would help strike the right balance between creator agency and public interest reuse. It is essential that opt-out tooling and any related legislation protect public interest uses. This includes enabling cultural heritage institutions to preserve and analyze content, and supporting not-for-profit research and educational organizations in their work.

Second, we are doing research and development for a new tool designed to enable conditional access to openly shared collections and compilations. It will allow data stewards to set terms for accessing and using a collection or compilation that protect the sustainability of their technical infrastructure. These stewards may include libraries, archives, research institutions, data repositories, public knowledge projects, and cultural heritage organizations. Resource-heavy bulk reusers of data may be subject to more conditions, while public interest uses would be exempt from them entirely.
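One way to picture tiered conditions is as a small policy function. The sketch below is purely illustrative and does not describe the tool CC is developing: the purpose labels, the volume threshold, the condition names, and the `AccessRequest` type are all invented for the example.

```python
from dataclasses import dataclass

# Hypothetical purpose labels treated as public interest uses.
PUBLIC_INTEREST = {"preservation", "nonprofit_research", "education", "accessibility"}

# Hypothetical cutoff (items fetched per month) for "resource-heavy bulk" reuse.
BULK_THRESHOLD = 100_000


@dataclass(frozen=True)
class AccessRequest:
    purpose: str
    items_per_month: int


def conditions_for(request: AccessRequest) -> list[str]:
    """Return the conditions a steward might attach to a request (illustrative)."""
    # Public interest uses carry no conditions, whatever their volume.
    if request.purpose in PUBLIC_INTEREST:
        return []
    # Assumed baseline for every other reuser.
    conditions = ["attribution"]
    # Heavier load on the steward's infrastructure brings more conditions.
    if request.items_per_month >= BULK_THRESHOLD:
        conditions += ["transparency_report", "infrastructure_contribution"]
    return conditions


print(conditions_for(AccessRequest("preservation", 5_000_000)))
print(conditions_for(AccessRequest("ai_training", 500)))
print(conditions_for(AccessRequest("ai_training", 250_000)))
```

The check on purpose comes first so that a large archive crawl is never swept up by the volume rule, which is the distinction blunt access controls fail to make.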

Without practical legal tools to define conditions for AI development, collections are left with blunt options: allow unrestricted extraction by AI developers, or restrict access entirely. Neither option reflects the goals of most knowledge stewards. This research and development is informed by close consultation with community members and stakeholders, such as dialogue with practitioners in the African context this past year, as well as broader explorations in the movement, such as this analysis on sharing of cultural heritage by Open Future Foundation, and the development of NOODL to rebalance power for marginalized language communities.

Many want to continue sharing their collections while ensuring that AI developers use them responsibly by respecting attribution, ensuring transparency, and meeting other safeguards aligned with their public interest missions. We want to build tooling to enable this in standardized, legally enforceable ways. 

What Happens Next

The exploration of these kinds of tools requires us to look beyond copyright alone, which is a real paradigm shift for CC, and not one we take lightly. We believe that investigating the risks and benefits of legal tools that support conditional access is an essential part of stewarding the long-term health of the commons. We need to preserve access to valuable knowledge resources while ensuring that the institutions and communities who steward them remain active participants in shaping the AI ecosystem.

Here is where things stand. This month, we are convening a workshop in London to begin working through the design and governance questions that new tooling raises. Later this year, we will be seeking pilot adopters to help us test and refine the approach in practice. We will share updates as this work develops. 

We have a clear plan, with these initiatives entering pilot phases within the year. Like many nonprofits, our ability to accelerate depends directly on the resources we have available. Support from our Open Infrastructure Circle has made progress to date possible, and as we mark our 25th anniversary, we have set a goal to raise $5 million to advance the next iteration of CC signals. If you are able, we invite you to support this work.

Let’s collectively build what the commons needs next.

How to Keep the Internet Human
https://creativecommons.org/2026/02/12/how-to-keep-the-internet-human/?utm_source=rss&utm_medium=rss&utm_campaign=how-to-keep-the-internet-human
Thu, 12 Feb 2026 19:16:53 +0000

It is time to update our mental models about open knowledge

I like to say I am a “writer who lawyers”. I begin here because I want to name my biases up front. I am a lawyer, but I come to this work first and foremost as a writer thinking about the conditions that will allow us to continue to share knowledge publicly. And in spite of—or perhaps because of—the fact that I am a lawyer, I have a healthy skepticism about the power of legal terms and conditions. The law will play a role, but the challenge of keeping the internet human will ultimately be navigated by the stories we imagine and tell. 

We need new stories. 

I spent the first 15 years of my legal career working in intellectual property. For most of that time, I was part of the open movement, fighting overly restrictive intellectual property laws to promote access to knowledge. But over time, I began to feel like the message of open licensing did not resonate with me in the same way, especially in my identity as a writer. Eventually I left the open movement to go into the field of privacy. 

Immersing myself in digital privacy led me to realize why the story of open felt incomplete. We had been undervaluing the role of boundaries around reuse. The tension between the instinct to share and the need for boundaries around reuse is the point. And right now, that tension is completely out of balance. Instead, what exists online is a free-for-all.

Disequilibrium/a broken commons graphic. Pursuit of knowledge leads to the instinct to share, which leads to a free-for-all.

If you are familiar with the concept of a commons, you know it requires shared rules that govern reuse of resources. Those shared rules represent a mutual commitment by producers and reusers, and they ensure that the cycle leads to collective benefit and begins again. A free-for-all, on the other hand, has no shared rules. As a result, we are losing the instinct to share. 

What happened to the commons? 

It would be easy to blame AI for this situation, but it is not so straightforward. AI is simply speeding up and exacerbating longstanding challenges with open knowledge. As privacy scholar Daniel Solove has written, “AI is continuous with the data collection and use that has been going on throughout the digital age.” 

In preparation for this talk, I went back and reread the brilliant CC Summit keynote “Open As In Dangerous” by Chris Bourg from 2018 and the seminal Paradox of Open report by the Open Future Foundation. For many years, these and countless other voices have been warning us about the vulnerabilities that open knowledge creates. Whether it is the use of CC-licensed photos for facial surveillance technology or the creation of Grokipedia, it is clear that open content is particularly vulnerable to abuse. 

But of course, it is not just open content that is vulnerable. All content online today has essentially been treated as fair game. The free-for-all extends to everything online. 

This has led to a vast renegotiation of what it means to share publicly, still currently underway. We see this in the massive wave of litigation against AI services, the rise of paywalls and commercial licensing deals, the introduction of new technologies to increase control over content in ways that scale back the open web, and the extreme backlash against AI by creators and the general public.

All of this constitutes a threat to open access to knowledge. It is unlikely that the incentives to share can outweigh all of the growing countervailing forces at play: economic, moral, safety-related, and more. We cannot respond by accepting these risks and harms as inherent and inevitable costs of publicly sharing knowledge.

Changing our mental models

To meet the moment, we need to rethink our most fundamental assumptions about open knowledge. 

The old taxonomies no longer apply. 

For a very long time, we have used categories to help us determine the appropriate rules for sharing knowledge. Open content could be licensed one way, while open data had different parameters. This distinction no longer applies when everything online is used as data by machines. Even the difference between copyrighted material and public domain is not very useful, since even copyrighted works are largely used by machines for the public domain material within them (e.g., facts and ideas). 

Copyright is not the main event.

The original “enemy” of the open movement was copyright, and things were simpler back then. Even the most restrictive open license was more permissive than the default under copyright law, so any boundaries we set around the commons were still fighting the copyright war. Overly restrictive copyright laws still cause problems today, but they are no longer the biggest threat to the commons. In fact, it is copyright’s weakness in the context of machine reuse that is the real challenge. The inapplicability of copyright in protecting against unwanted machine reuse guts the CC licenses of the same ability, creating the free-for-all even on CC-licensed content. And importantly, because the aim was to avoid having CC licenses impose restrictions on activity that was otherwise allowed under copyright, this was by design.

We have to stop confusing property with morality.

This is where I depart from my younger self and from many of my peers in the open movement. I think we have let important principles like the notion that facts and ideas should not be privately owned, or the fact that some permissionless reuse plays a critical role in free expression, convince us that the scope of copyright is an ethical line. The logic goes: if no one can own it, then no rules should apply. This leads to an impoverished sense of morality, where the only justification for constraint is property rights. As Robin Wall Kimmerer says, “In that property mindset, how we consume doesn’t really matter because it’s just stuff and the stuff all belongs to us. There is no moral constraint on consumption.” 

The ethics of sharing—which is what open is about—needs to be broader than what we can own. 

Boundaries benefit us all.

Boundaries on reuse are what create the reciprocity that fuels a commons. Without them, there is no assurance that sharing leads to collective benefit, and people lose their instinct to share. But boundaries can also have social value in their own right. Even when sharing in public, people rightfully expect some boundaries around how their works are used, regardless of what copyright law says. This is foundational in the field of privacy, but somehow we lose sight of it when we are sitting in the realm of content sharing. Daniel Solove writes: “People expect some degree of privacy in public, and such expectation is reasonable as well as important for freedom, democracy, and individual wellbeing.” Similarly, we establish boundaries around reuse of knowledge because those protections serve us all. 

Open should not be a purity test. 

The open movement has had incredible success creating global standards, and this has helped make it so successful. But the emphasis on standardization has led us to hyper-focus on definitions, and this focus is distracting us from the bigger picture. What matters is not open versus closed, or even abundance versus scarcity. We need to focus on values, not prescriptions. Open licensing has always been conditional, and it has always been a spectrum. This means we have to accept that there will be gray areas. What we lose in certainty, we will gain in relevance and moral clarity. As Rebecca Solnit says, “Categories are where thoughts go to die.” 

Where do we go from here? 

All of this leads back to where we began. We have to reconstruct the mutual commitment that keeps the commons cyclical.

Equilibrium/a healthy commons graphic. Pursuit of knowledge leads to the instinct to share, which leads to mutual commitment, which leads to collective benefit, which leads back to the pursuit of knowledge.

Rebuilding the mutual commitment that comes with sharing knowledge requires us to balance opposing values. On the one hand, we must protect important freedoms of the reusing public. On the other, we must establish boundaries around responsible reuse. The goal is to be as open as possible and as restrictive as necessary. And before we start panicking about slippery slopes, we should remember there is an important limiting principle we can leverage:  does the boundary shift power in ways that further concentrate it or redistribute it? We can also ask whether there are ways to mitigate a boundary’s effect on access. 

We already have a good sense of the dimensions of boundaries around responsible reuse. They all have roots in the existing CC license suite.

Attribution: While the AI landscape complicates methods and norms for attribution, the principle is more important than ever for informational integrity, authors’ rights, and transparency.

Reciprocity: Molly Van Houweling calls this “extractability,” the idea that those extracting facts and ideas from others’ works have a moral responsibility to ensure that knowledge remains extractable by others. This is essentially about crafting a ShareAlike obligation for the age of AI. 

Financial sustainability: This has been a longtime challenge in the open movement, and it is more urgent than ever. It is not about preserving business models, it is about financially sustaining the production of knowledge and culture as public goods. 

Prohibitions on harmful use cases: This dimension may feel less familiar in open licensing, but the sentiment is one we hear regularly. There are simply some use cases or even actors that feel out of bounds for people sharing knowledge because of the harm they cause. 

How do we catalyze a mutual commitment around prosocial boundaries in the current free-for-all environment? Open Future Foundation’s Paul Keller has written: “For any response to succeed in preserving a diverse and sustainable information ecosystem, collective action is required—both bottom-up, through coordinated action by information producers, and top-down, through political will to enable redistribution via fiscal interventions.” There is no single solution, and we need to tackle it from all directions. 

For the bottom-up efforts, we can leverage the tools we have. Norms and social pressure have a role to play, though it is hard to put full faith in voluntary action right now. We can also explore methods for legal control, including both contract and copyright law. As Nilay Patel has said, “Copyright is the only functioning regulation on the internet,” which makes it impossible to avoid considering it as one lever to employ.1 Finally, there is the strategy of controlling access. This is the most uncomfortable tactic because of the collateral damage it risks, and it requires extreme care. But if AI companies will not pay attention voluntarily, technical controls around access look increasingly necessary. 

There are many in the open movement already experimenting with these efforts, including the Mozilla Data Collective, the differentiated access model proposed by Europeana and the Open Future Foundation, the NOODL license, and many more. Creative Commons is also actively thinking about how to build a framework that re-instills mutual commitment into the ecosystem. Many of you have been following along as we experiment with an AI preference signals framework we’ve been calling CC signals. While the path we will take is evolving, the goal is the same. We need to come together to define and sustain the boundaries that serve us all. 

I will end with the words of Ruha Benjamin: “We need to give the voice of the cynical, skeptical grouch that patrols the borders of our imagination a rest.” 

We can imagine a better way. 


1 While copyright law is ill-equipped to function as a method of control over machine reuse (and rightly so, considering the importance of not treating facts and ideas as private property), copyright law still has a role to play because of the uncertainty around its application on a global scale. Granting copyright permission in exchange for agreement to certain conditions could still be a valuable offer to some reusers. 

 

AI and the Commons: A Reading List
https://creativecommons.org/2025/09/03/ai-and-the-commons-a-reading-list/?utm_source=rss&utm_medium=rss&utm_campaign=ai-and-the-commons-a-reading-list
Wed, 03 Sep 2025 16:50:34 +0000

Distorted Forest Path © by Lone Thomasky & Bits&Bäume is licensed under CC BY 4.0

Here at CC, we have the goal of defending and sustaining the digital commons in the face of developments in artificial intelligence.

We’ve recently introduced a new framework, CC signals, to offer a new way for stewards of large collections of content to indicate their preferences for how machines (and the humans controlling them) should contribute back to the commons.

As we develop our approach, we’re taking inspiration from the work of our partners, community, and other stakeholders. We’re particularly interested in efforts to understand:

  • How AI scrapers are reshaping the web 
  • Copyright, labor, surveillance, and resistance
  • The effects of a new economy of data licensing
  • Emerging ideas for more ethical AI and consensual data governance 

We’re reading (a lot!) on these topics, to help ensure that CC signals become part of a diverse set of solutions for protecting the commons in the unfolding AI future. Here’s some of the writing that’s shaping our thinking:

We’d love for you to read and learn alongside us, share your thoughts, and contribute other articles and resources to this list! Connect with us on LinkedIn, Bluesky, or Mastodon.

Understanding CC Licenses and AI Training: A Legal Primer
https://creativecommons.org/2025/05/15/understanding-cc-licenses-and-ai-training-a-legal-primer/?utm_source=rss&utm_medium=rss&utm_campaign=understanding-cc-licenses-and-ai-training-a-legal-primer
Thu, 15 May 2025 17:51:13 +0000

Whether you are a creator, researcher, or anyone licensing your work with a CC license, you might be wondering how it can be used to train AI. Many AI developers, who wish to comply with the CC license terms, are also seeking guidance. 

The application of copyright law to AI training is complex. The CC licenses are copyright licenses, so it follows that applying CC licenses to AI training is just as complex. 

The short answer is: AI training is often permitted by copyright. This means that the CC license conditions have limited application to machine reuse. This also means that using a more restrictive CC license in an effort to prevent AI training is not an effective approach. In fact, restrictive licensing may actually end up preventing the kind of sharing you want (like allowing for translation, for example), while not being effective to block AI training. 

For the long answer, read our new guide that provides a legal analysis and overview of the considerations when using CC-licensed works for AI training. 

👉  For an at-a-glance overview, head over to the Using CC-Licensed Works for AI training webpage

👉  For a more in-depth analysis, check out our handy PDF download

👉 For those who love a visual, take a look at our supplementary flowchart

If the CC licenses have limited application to machine reuse, what agency do creators have in the AI ecosystem? 

This is an important question. As you’ve heard us talk about before, we’re actively developing a CC preference signals framework to help bridge this gap. The framework is designed to offer new choices for stewards of large collections of content to signal their preferences when sharing their works, using scaffolding inspired by the architecture of the CC licenses. This is not mediated through copyright or the CC licenses. It is governed by something that tends to be even more widely adopted: a social contract. Stand by for the release of the paper prototype of the CC preference signals framework at the end of June 2025.

While you are here, please consider making an annual recurring donation via our Open Infrastructure Circle. This work will require substantial resources over many years.

CC @ SXSW: Protecting the Commons in the Age of AI
https://creativecommons.org/2025/04/09/cc-sxsw-protecting-the-commons-in-the-age-of-ai/?utm_source=rss&utm_medium=rss&utm_campaign=cc-sxsw-protecting-the-commons-in-the-age-of-ai
Wed, 09 Apr 2025 15:18:38 +0000

SXSW by Creative Commons is licensed under CC BY 4.0

If you’ve been following along on the blog this year, you’ll know that we’ve been thinking a lot about the future of open, particularly in this age of AI. With our 2025-2028 strategy to guide us, we’ve been louder about a renewed call for reciprocity to defend and protect the commons as well as the importance of openness in AI and open licensing to avoid an enclosure of the commons. 

Last month, we took some of these conversations on the road and hosted the Open House for an Open Future during SXSW in Austin, TX, as part of a weekend-long Wiki Haus event with our friends at the Wikimedia Foundation. 

During the event, we spoke with Audrey Tang and Cory Doctorow about the future of open, especially as we look towards CC’s 25th anniversary in 2026. Several themes emerged from this wide-ranging conversation that capture both where we’ve been over the last 25 years and where we should focus for the next 25, including:

  • The Fight for Technological Self-Determination: Contractual restrictions are increasingly being used to lock down essential technologies, from printer ink to hospital ventilators. The push for openness and economic fairness must go beyond just content-sharing and extend to fighting for the rights of people to repair, modify, and use technology freely.
  • Shifting from Resistance to Building Alternatives: The open movement is not just about opposing corporate restrictions but also about creating viable, open alternatives. Initiatives like Gov Zero show that fostering decentralized, user-controlled platforms can help counteract monopolistic digital ecosystems.
  • The Power of Exit as a Lever for Change: Simply having the option to leave restrictive platforms can influence corporate behavior. Efforts like Free Our Feeds and Bluesky aim to create credible exit strategies that prevent users from being locked into exploitative digital environments.
  • Beyond Copyright: New Frameworks for Openness and Innovation: While Creative Commons began as a response to copyright limitations, the next phase should focus on broader issues like supporting an infrastructure for open sharing, ethical AI development, and open governance models that empower communities rather than just limiting corporate control.
  • Reclaiming the Ethos of Open Source and Free Software: The movement must reconnect with its ethical roots, focusing on freedom to create, share, and innovate—not just openness for the sake of efficiency. This includes resisting corporate capture of “openness” and ensuring technological advances serve public interest rather than private profit.

Since the proliferation of mainstream AI, we’ve been analyzing the limitations of copyright (and, by extension, the CC licenses since they are built atop copyright law) as the right lens to think about guardrails for AI training. This means we need new tools and approaches in this age of AI that complement open licensing, while also advancing the AI ecosystem toward the public interest. Preference signals are based on the idea that creators and dataset holders should be active participants in deciding whether and how their content is used for AI training. Our friends at Bluesky, for example, have recently put forth a proposal on User Intents for Data Reuse, which is well worth a read to conceptualize how a preference signals approach could be considered on a social media platform. We’ve also been actively participating in the IETF’s AI Preferences Working Group, since submitting a position paper on the subject in mid-2024.

SXSW by Creative Commons is licensed under CC BY 4.0

As CC gets closer to launching a protocol based on prosocial preference signals (a simple pact between those stewarding the data and those reusing it for generative AI training), we had the opportunity during SXSW to chat with some great thought leaders about this very topic. Our panelists, who joined us to explore sharing in the age of AI, were Aubra Anthony, Senior Fellow, Technology and International Affairs Program at Carnegie Endowment for International Peace; Zachary J. McDowell, PhD, Assistant Professor, Department of Communication, University of Illinois at Chicago; Lane Becker, President, Wikimedia LLC at Wikimedia Foundation; and our very own Anna Tumadóttir, CEO, Creative Commons. A few key takeaways from this conversation included:

  • Balancing Norms and Legal Frameworks: There is a growing interest in developing normative approaches and civil structures that go beyond traditional legal frameworks to ensure equitable use and transparency.
  • Navigating AI Traffic and Commercial Use: Wikimedia is adapting to the influx of AI-driven bot traffic and exploring how to differentiate between commercial and non-commercial use. The idea of treating commercial traffic differently and finding ways to fundraise off bot traffic is becoming more prominent, raising important questions about sustainability in an open knowledge ecosystem. From CC’s perspective, we’ve found that as our open infrastructures mature they become increasingly taken for granted, a notion that is not conducive to a sustainable open ecosystem.
  • Openness in the Age of AI: There is growing reticence around openness, with creators becoming more cautious about sharing content due to the rise of generative AI (note, this is exactly what our preference signals framework is meant to address, so stay tuned!). We should emphasize the need for open initiatives to adapt to the broader social and economic context, balancing openness with creators’ concerns about protection and sustainability.
  • Making Participation Easy and Understandable: To encourage widespread participation in open knowledge systems and for preference signal adoption, tools will need to be simple and intuitive. Whether through collective benefit models or platform cooperativism, ease of use and clarity are essential to engaging the broader public in contributing to open initiatives.

Did you know that many social justice and public good organizations are unable to participate in influential and culture-making events like SXSW due to a lack of funding? CC is a nonprofit organization and all of our activities must be cost-recovery. We’d like to sincerely thank our event sponsor, the John S. and James L. Knight Foundation, for making this event and these conversations possible. If you would like to contribute to our work, consider joining the Open Infrastructure Circle, which will help to fund a framework that makes reciprocity actionable when shared knowledge is used to train generative AI.

The post CC @ SXSW: Protecting the Commons in the Age of AI appeared first on Creative Commons.

]]>
From Strategy to Action: Focus Areas for 2025 https://creativecommons.org/2025/03/03/from-strategy-to-action-focus-areas-for-2025/?utm_source=rss&utm_medium=rss&utm_campaign=from-strategy-to-action-focus-areas-for-2025 Mon, 03 Mar 2025 18:24:20 +0000 https://creativecommons.org/?p=75883 Astronomical Clock by olemartin is licensed under CC BY-NC-SA 2.0. The team here at Creative Commons was delighted to publicly release our new organizational strategy on January 22, after almost a year of intensive team, community, and board consultations. For the next several years, our focus will be to: Strengthen the open infrastructure of sharing…

The post From Strategy to Action: Focus Areas for 2025 appeared first on Creative Commons.

]]>
Astronomical Clock by olemartin is licensed under CC BY-NC-SA 2.0.

The team here at Creative Commons was delighted to publicly release our new organizational strategy on January 22, after almost a year of intensive team, community, and board consultations. For the next several years, our focus will be to:

  • Strengthen the open infrastructure of sharing
  • Defend and advocate for a thriving creative commons
  • Center community

These goals are high level, as they tend to be when packaged up as part of a multi-year strategy. These goals should also feel familiar, for an organization whose mission it is to empower individuals and communities around the world through technical, legal, and policy solutions that enable the sharing of education, culture, and science in the public interest. But there are important nuances included in these goals and subsequent short-, medium-, and long-term objectives that point to intentional and meaningful shifts in the ways we operate to meet this moment. 

Of course the legal layer of the open infrastructure—the CC licenses and legal tools themselves—must be strengthened. But also, new sharing frameworks must be explored for changing times. 

Of course we must ensure the ongoing survival of the commons. But strategies need to evolve from solely being a sensible argument around opening up access to information. We know that greater access facilitates advances in education, in the scientific arena, and in our ability to understand and appreciate the diversity of cultural heritage that exists. However, those who previously saw the obvious benefits to sharing may now be hesitant, uncertain about how their works will be used or contextualized, through advances in Artificial Intelligence (AI) and machine learning. 

Finally, one might think that centering community goes without saying, but actually, it doesn’t. As an organization that has only achieved what it has because of a strong community of advocates bringing their expertise and passion to bear, we know we cannot continue to impact the social norms and legal frameworks of sharing without full participation.

So what does all of this mean for our work today, and throughout this year? Since we are currently operating in the age of AI, where all content also functions as data, we are focusing our work in two key areas:

  1. Data governance, shaped by legal and norms-based infrastructure to facilitate sharing.
  2. Sustaining open licensing in the age of AI, as high value contributions to the commons at scale that must be sustained through reciprocity.

This focus is guided by CC’s core principle: ideas and facts should not be commodified. As we reimagine sharing in the age of AI, we also draw on our history which reminds us to resist the reflex to expand copyright. Instead, we believe developing new norms, as part of a healthy data governance framework that prioritizes sharing in the age of AI, is the best approach to meeting our mission.  

Data Governance

Our friends at Open Future define data governance as “how rules for data use are created and enforced. This includes laws, standards, and social norms that guide what people can and can’t do with data. Good governance ensures fair and responsible data sharing.”

CC plays a unique role within data governance across the open internet. The CC licenses provide a form of legal and social norms guidance that has facilitated sharing on the internet for the last 25 years. We think of CC’s role within data governance as providing critical infrastructure that enables community-driven, fair, and responsible data sharing. The challenge is that what is considered fair and responsible data sharing is not static; it evolves based on context. And while this has always been true, AI has brought issues of fairness, transparency, trust, accountability, and more to the forefront for CC and for our many collaborators and colleagues who are committed to human-centered approaches to data governance. 

In 2025, we need to continue to explain how the CC licenses interact with AI training, and champion preference signals as a way to advance the data governance we need to meet this moment. You’ve heard from us on this subject in the past, and there is much more to come as we find partners to pilot this work with in the coming months. Policy and legal environments will also continue to play a significant role in both driving and influencing the data governance landscape of the future. CC’s role in advocating for balanced copyright and policies that drive access to knowledge, especially as new legislation, particularly around AI, is passed and implemented, is instrumental in representing civil society and advocating on behalf of the public interest.

Sustaining Open Licensing in the Age of AI

The use of the CC licenses has resulted in billions of items being released openly. Today, these items have also become parts of AI training sets; this is a significant shift that is influencing the norms around open licensing. Our priority is increasing sustainable sharing and access, but we now must also ask, “What about AI?” We believe that openly licensed collections of content, which act as high-value contributions to the commons, must continue to be prioritized.

However, many creators (artists, researchers, educators, and everyone in between) are understandably concerned about their contributions to the commons being reduced to small pieces of data within huge datasets where they lose agency over how their works are being used. We believe that the antidote to this is reciprocity. We believe it is time for the open movement to ask for something in return when there is disproportionate benefit from use of open datasets. We aim to do this by developing relationships with AI model builders on behalf of those who contribute to the commons, ensuring that training datasets remain collectively owned and sustain the commons, and that data governance principles are respected.

We need more open educational, cultural, scientific, and research data to allow more rapid scientific discovery and collaboration. Sharing must continue in the age of AI and we are committed to supporting open licensing at scale, taking the context of AI into consideration. 

There are new and layered complexities in the open sharing world, and we’re excited and determined to help clarify and address these challenges. We’d like to see open sharing grow as a collective strategy to advance the public interest. In 2025 (and beyond, I’m sure), we will be finding ways to facilitate agency for the movement and even more sharing and access, while ensuring that the commons remain resilient and sustainable.

If you’d like to support this work, consider joining the Creative Commons Open Infrastructure Circle. Our most dedicated supporters ensure that every day we can show up and do the valuable work of preserving and growing the global commons of knowledge and culture from which we all benefit.

The post From Strategy to Action: Focus Areas for 2025 appeared first on Creative Commons.

]]>
The AI Action Summit & Civil Society’s (Possible) Impact https://creativecommons.org/2025/02/18/the-ai-action-summit-civil-societys-possible-impact/?utm_source=rss&utm_medium=rss&utm_campaign=the-ai-action-summit-civil-societys-possible-impact Tue, 18 Feb 2025 18:51:45 +0000 https://creativecommons.org/?p=75852 The Conciergerie, Paris by Mustang Joe is marked with CC0 1.0. On February 10 and 11, 2025, the government of France convened the AI Action Summit, bringing together heads of state, tech leaders, and civil society to discuss global collaboration and action on AI. The event was co-chaired by French President Macron and Indian Prime…

The post The AI Action Summit & Civil Society’s (Possible) Impact appeared first on Creative Commons.

]]>
The Conciergerie, Paris by Mustang Joe is marked with CC0 1.0.

On February 10 and 11, 2025, the government of France convened the AI Action Summit, bringing together heads of state, tech leaders, and civil society to discuss global collaboration and action on AI. The event was co-chaired by French President Macron and Indian Prime Minister Modi. This was the third such Summit in just over a year, the first two in the UK and South Korea respectively. The next one is to be hosted in India, with a firm date not yet set.

Creative Commons was invited to be an official participant in the Summit, and given room to speak on a panel about international AI governance. Given our continued advocacy for public interest AI, and on-the-ground work, particularly in the US and EU, to interrogate new governance structures for data sharing, open infrastructures, and data commons, the Summit was an important venue to contribute to the global conversation.

We focused on three things in our panel and direct conversations:

  1. Civil society matters, and must continue to be included. While we may not hold the pen on drafting declarations, or be in the negotiating room with world leaders and their ample security teams, we must continue to (loudly) bring our perspectives to these spaces. If we aren’t there, then nobody is. Without civil society, there can be no public interest. 
  2. The importance of openness in AI. What it means, who benefits from it, and how we think critically about ongoing (dis)incentives to participate in the open knowledge ecosystem.
  3. Local solutions for local contexts, local content, and local needs.

Civil Society Matters

Civil society matters because we represent real concerns from real people. A people-centered approach to AI must inevitably be a planet-centered approach as well; one simply cannot and should not exist without the other.

The civil society contingent at the Summit also included major philanthropic foundations who have long focused on public interest technology. Encouragingly (we hope) they have joined forces with private investment and governments to launch Current AI, a coalition which is advocating ‘global collaboration and local action, building a future where open, trustworthy technology serves the public interest’. The Summit also saw the launch of ROOST (Robust Open Online Safety Tools), which was born out of a conversation at a prior Summit around the absence of reliable, robust, high-quality open source tooling for trust and safety. ROOST adds a critical building block to the open source AI ecosystem: tools that allow anyone to run safety checks on datasets before use and training should (hopefully) result in safer model performance.

But philanthropy is not a business model for something that is set to become ubiquitous public infrastructure at a greater level than is already the case with the internet currently. The investments of philanthropy alone will not be enough to steer the public interest conversation to the top of the action agenda. There must be matching political will and public investment, and we’ll be watching closely for evidence that actions are following words.

Our view is that governments should prioritize investment in publicly accessible AI, which meets open standards and allows for equitable access. These are key drivers of innovation and every sector stands to benefit. Governments can lead the way on investing in compute, (re)training people, and preparing and encouraging high quality openly licensed datasets, to level the playing field for researchers, innovators, open source developers, and beyond.

Openness in AI

Openness in AI continues to be a broad and multifaceted topic. How do we continue to foster open sharing, making it resilient, safe, and trustworthy, while we’re hearing from our community examples of creators and organizations choosing more restrictive licenses, or hesitating to share at all, in an attempt to regain agency over how their content is used as training data? Our future depends on protecting the progress of the last 20 years of open practices. The answer does not lie in a misguided shift from CC BY to CC BY-NC-ND. We have to think more holistically.

The CC licenses alone are not a governance framework in and of themselves, but what they represent are absolutely critical components of legal and social norms that support data governance that can serve the public interest.

In the context of data governance, we see our role in helping negotiate preferences for reuse of datasets containing openly licensed works. We need to ensure that folks are still incentivized to participate and contribute to the commons, while feeling their voices are heard and their work is contributing in mutually-beneficial ways. If you are the steward of a large open dataset, we want to hear from you.

Local Solutions for Local Contexts

From CC’s perspective, local solutions for local contexts are where we need to put our energy. As Janet Haven from Data & Society frames it, let’s focus on collaboration for AI governance, rather than striving for a single, global governance structure. One size does not fit all, and even issues that are global needs, like planetary survival, will require very different efforts by country or region. It was rather encouraging to hear examples of “small” language models from across the world, that emphasize language preservation and cultural context. Efforts to record, catalog, and digitize language and cultural artifacts are underway. This is yet another area where we see a need to systematically articulate and clearly signal preferences for reuse, so that local efforts thrive and are respected appropriately.

Where We Go From Here

We heard from many fellow civil society organizations that the tone in France differed markedly from previous Summits in the UK or South Korea. There was a welcome diversity of civil society voices on panels and in workshops, with a steady drumbeat of calls for safe, sustainable, and trustworthy AI. “Open source” and “public interest” were phrases uttered in many major interventions. But aside from us collectively being able to fill a few volumes on how we define these terms anyway (sustainable for whom?), the real impact of the Summit will be seen in the ways in which we collaborate from now on.

The political discussions at the Summit focused heavily on the false dichotomy of regulation versus innovation – and yes, the language used heavily fed into the narrative that those are mutually exclusive. The heavy emphasis on the desire for regional investment (and superiority), while offering global collaboration, was mildly disheartening but also fully expected. Political statements around public interest were repeated but vague. Canadian Prime Minister Trudeau emphatically urged everyone to not forget the people, stating that “the benefits must accrue to everyone”. Whether those in power will pay attention to that message is anyone’s guess. Take, for example, The Paris Charter on Artificial Intelligence in the Public Interest, which says all of the right things but is lacking in both widespread endorsement and meaningful steps towards implementation.

We are clear-eyed on the fact that AI is here, has been for quite some time, and will not go away. We need collaborative, pragmatic approaches to steer towards what we see as beneficial outcomes and public interest values. While there were glimmers of hope from some who hold legislative and executive power, it’s clear that civil society has a lot of advocacy work ahead of us.

The Summit culminated in countries signing onto a declaration, with notable omissions from the United States and UK. As always, it is only once the media cycle moves on that we will see any lasting impact. In the meantime, let’s not wait for another global Summit to take action.

The post The AI Action Summit & Civil Society’s (Possible) Impact appeared first on Creative Commons.

]]>
Why Digital Public Goods, including AI, Should Depend on Open Data https://creativecommons.org/2025/01/27/why-digital-public-goods-including-ai-should-depend-on-open-data/?utm_source=rss&utm_medium=rss&utm_campaign=why-digital-public-goods-including-ai-should-depend-on-open-data Mon, 27 Jan 2025 17:34:43 +0000 https://creativecommons.org/?p=75806 Acknowledging that some data should not be shared (for moral, ethical and/or privacy reasons) and some cannot be shared (for legal or other reasons), Creative Commons (CC) thinks there is value in incentivizing the creation, sharing, and use of open data to advance knowledge production. As open communities continue to imagine, design, and build digital…

The post Why Digital Public Goods, including AI, Should Depend on Open Data appeared first on Creative Commons.

]]>
Acknowledging that some data should not be shared (for moral, ethical and/or privacy reasons) and some cannot be shared (for legal or other reasons), Creative Commons (CC) thinks there is value in incentivizing the creation, sharing, and use of open data to advance knowledge production. As open communities continue to imagine, design, and build digital public goods and public infrastructure services for education, science, and culture, these goods and services – whenever possible and appropriate – should produce, share, and/or build upon open data.

Open Data by Auregann is licensed under CC BY-SA 3.0.

Open Data and Digital Public Goods (DPGs)

CC is a member of the Digital Public Goods Alliance (DPGA) and CC’s legal tools have been recognized as digital public goods (DPGs). DPGs are “open-source software, open standards, open data, open AI systems, and open content collections that adhere to privacy and other applicable best practices, do no harm, and are of high relevance for attainment of the United Nations 2030 Sustainable Development Goals (SDGs).” If we want to solve the world’s greatest challenges, governments and other funders will need to invest in, develop, openly license, share, and use DPGs.

Open data is important to DPGs because data is a key driver of economic vitality with demonstrated potential to serve the public good. In the public sector, data informs policy making and public services delivery by helping to channel scarce resources to those most in need; providing the means to hold governments accountable and foster social innovation. In short, data has the potential to improve people’s lives. When data is closed or otherwise unavailable, the public does not accrue these benefits.

CC was recently part of a DPGA sub-committee working to preserve the integrity of open data as part of the DPG Standard. This important update to the DPG Standard was introduced to ensure only open datasets and content collections with open licenses are eligible for recognition as DPGs. This new requirement means open datasets and content collections must meet the following criteria to be recognised as a digital public good:

  1. Comprehensive Open Licensing:
    1. The entire data set/content collection must be under an acceptable open licence. Mixed-licensed collections will no longer be accepted.
  2. Accessible and Discoverable:
    1. All data sets and content collection DPGs must be openly licensed and easily accessible from a distinct, single location, such as a unique URL.
  3. Permitted Access Restrictions:
    1. Certain access restrictions – such as logins, registrations, API keys, and throttling – are permitted as long as they do not discriminate against users or restrict usage based on geography or any other factors.
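
As a rough illustration only, the three criteria above could be expressed as a checklist in code. This is a minimal sketch, not the DPGA's actual review process: the `Collection` fields, the `unmet_criteria` function, and the set of licence identifiers are all hypothetical names chosen for this example, and reading "mixed-licensed" as "at least one item is not under an acceptable open licence" is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

# Hypothetical licence identifiers, for illustration only; the DPG Standard
# maintains its own list of acceptable open licences.
ASSUMED_OPEN_LICENCES = {"CC0-1.0", "CC-BY-4.0", "CC-BY-SA-4.0"}

# Access restrictions that the criteria above name as permitted.
PERMITTED_RESTRICTIONS = {"login", "registration", "api_key", "throttling"}


@dataclass
class Collection:
    """Hypothetical description of a dataset or content collection."""
    item_licences: List[str]
    url: Optional[str]
    restrictions: Set[str] = field(default_factory=set)
    discriminates_between_users: bool = False


def unmet_criteria(c: Collection) -> List[str]:
    """Return the names of the criteria that the collection does not meet."""
    problems = []
    # 1. Comprehensive open licensing: every item must be openly licensed.
    if not c.item_licences or any(
        lic not in ASSUMED_OPEN_LICENCES for lic in c.item_licences
    ):
        problems.append("comprehensive open licensing")
    # 2. Accessible and discoverable: a distinct, single location.
    if not c.url:
        problems.append("accessible and discoverable")
    # 3. Permitted access restrictions: only the named kinds, applied
    #    without discriminating between users.
    if c.discriminates_between_users or not c.restrictions <= PERMITTED_RESTRICTIONS:
        problems.append("permitted access restrictions")
    return problems


mixed = Collection(
    item_licences=["CC-BY-4.0", "All rights reserved"],
    url="https://example.org/data",
    restrictions={"api_key"},
)
print(unmet_criteria(mixed))  # ['comprehensive open licensing']
```

In this sketch, an API key requirement passes the third check because it is one of the permitted restriction types, while a single non-open item is enough to fail the first.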

The DPGA writes: “This new requirement is designed to increase trust and confidence in all DPGs by ensuring that users can fully engage with solutions without concerns over intellectual property infringement. Simplifying access and usage aligns with the DPGA’s goal of making DPGs truly open and accessible for widespread adoption… it helps foster an environment and ecosystem where innovation can thrive without legal uncertainties.”

AI and Open Data

As CC examines AI and its potential to be a public good that helps solve global challenges, we believe open data will play a similarly important role.

CC recognizes AI is a rapidly developing space, and we appreciate everyone’s diligent work to create definitions, recommendations, and guidance for and warnings about AI. After two years of community consultation, the Open Source Initiative released version 1.0 of the Open Source AI Definition (OSAID) on October 28, 2024. This definition is an important step in starting the conversation about what open means for AI systems. However, the OSAID’s data sharing requirements remain contentious, particularly around whether and how training data for AI models should be shared.

CC is of the opinion that just because it is difficult to build and release open datasets, that does not mean we should not encourage it. In cases where training data should not or cannot be shared, we encourage detailed summaries that explain the contents of the dataset and give instructions for reproducibility, but nonetheless that data should be defined as closed. When data can be made open and shared, it should be.

We agree with Liv Marte Nordhaug, CEO, Digital Public Goods Alliance who said in a recent post: “With regards to AI systems, there is a need to ensure that we don’t inadvertently undermine the open data movement and open data as a category of DPGs by advancing an approach to AI systems that is more permissive than for other categories of DPGs. Maintaining a high bar on training data could potentially result in fewer AI systems meeting the DPG Standard criteria. However, SDG relevance, platform independence, and do-no-harm by design are features that set DPGs apart from other open source solutions—and for those reasons, the inclusion of [AI] training data is needed.”

Next Steps

CC will continue to work with the DPGA, and other partners, as it develops a standard as to what qualifies an AI model to be a digital public good. In that arena we will advocate for open datasets, and consideration of a tiered approach, so that components of an AI model can be considered digital public goods, without the entire model needing to have every component openly shared. Updated recommendations and guidelines that recognize the value of fully open AI systems that use and share open datasets will be an important part of ensuring AI serves the public good.


¹Digital Public Goods Standard
²Data for Better Lives. World Bank (2021). CC BY 3.0 IGO

The post Why Digital Public Goods, including AI, Should Depend on Open Data appeared first on Creative Commons.

]]>
Six Insights on Preference Signals for AI Training https://creativecommons.org/2024/08/23/six-insights-on-preference-signals-for-ai-training/?utm_source=rss&utm_medium=rss&utm_campaign=six-insights-on-preference-signals-for-ai-training Fri, 23 Aug 2024 14:49:02 +0000 https://creativecommons.org/?p=75346 “Eagle Traffic Signals – 1970s” by RS 1990 is licensed via CC BY-NC-SA 2.0. At the intersection of rapid advancements in generative AI and our ongoing strategy refresh, we’ve been deeply engaged in researching, analyzing, and fostering conversations about AI and value alignment. Our goal is to ensure that our legal and technical infrastructure remains…

The post Six Insights on Preference Signals for AI Training appeared first on Creative Commons.

]]>
“Eagle Traffic Signals – 1970s” by RS 1990 is licensed via CC BY-NC-SA 2.0.

At the intersection of rapid advancements in generative AI and our ongoing strategy refresh, we’ve been deeply engaged in researching, analyzing, and fostering conversations about AI and value alignment. Our goal is to ensure that our legal and technical infrastructure remains robust and suitable in this rapidly evolving landscape.

In these uncertain times, one thing is clear: there is an urgent need to develop new, nuanced approaches to digital sharing. This is Creative Commons’ speciality and we’re ready to take on this challenge by exploring a possible intervention in the AI space: preference signals. 

Understanding Preference Signals

We’ve previously discussed preference signals, but let’s revisit this concept. Preference signals would empower creators to indicate the terms by which their work can or cannot be used for AI training. Preference signals would represent a range of creator preferences, all rooted in the shared values that inspired the Creative Commons (CC) licenses. At the moment, preference signals are not meant to be legally enforceable. Instead, they aim to define a new vocabulary and establish new norms for sharing and reuse in the world of generative AI.

For instance, a preference signal might be “Don’t train,” “Train, but disclose that you trained on my content,” or even “Train, only if using renewable energy sources.”
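
To make the idea of a vocabulary concrete, here is a minimal sketch of how the three example signals above might be modeled and checked by someone assembling a training set. Everything in it is an assumption made for illustration: the `Signal` names, the `honours` function, and its two parameters are hypothetical and do not describe any published CC protocol. Since the signals are not meant to be legally enforceable, the check only models what a reuser who chooses to respect them would do.

```python
from enum import Enum


class Signal(Enum):
    """Hypothetical encoding of the three example signals named above."""
    DONT_TRAIN = "Don't train"
    TRAIN_WITH_DISCLOSURE = "Train, but disclose that you trained on my content"
    TRAIN_IF_RENEWABLE = "Train, only if using renewable energy sources"


def honours(signal: Signal, *, discloses_sources: bool,
            uses_renewable_energy: bool) -> bool:
    """Return True if a planned training use would respect the signal."""
    if signal is Signal.DONT_TRAIN:
        return False
    if signal is Signal.TRAIN_WITH_DISCLOSURE:
        return discloses_sources
    if signal is Signal.TRAIN_IF_RENEWABLE:
        return uses_renewable_energy
    raise ValueError(f"unknown signal: {signal!r}")


# A reuser that discloses its sources but does not use renewable energy.
plan = {"discloses_sources": True, "uses_renewable_energy": False}
for s in Signal:
    print(s.value, "->", honours(s, **plan))
```

The point of the sketch is that each signal carries its own condition, so the result depends on how the reuser behaves and is more than a single yes or no.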

Why Do We Need New Tools for Expressing Creator Preferences?

Empowering creators to be able to signal how they wish their content to be used to train generative AI models is crucial for several reasons:

  • The use of openly available content within generative AI models may not necessarily be consistent with creators’ intention in openly sharing, especially when that sharing took place before the public launch and proliferation of generative AI. 
  • With generative AI, unanticipated uses of creator content are happening at scale, by a handful of powerful commercial players concentrated in a very small part of the world.
  • Copyright is likely not the right framework for defining the rules of this newly formed ecosystem. As the CC licenses exist within the framework of copyright, they are also not the correct tools to prevent or limit uses of content to train generative AI. We also believe that a binary opt-in or opt-out system of contributing content to AI models is not nuanced enough to represent the spectrum of choice a creator may wish to exercise.  

We’re in the research phase of exploring what a system of preference signals could look like and over the next several months, we’ll be hosting more roundtables and workshops to discuss and get feedback from a range of stakeholders. In June, we took a big step forward by organizing our most focused and dedicated conversation about preference signals in New York City, hosted by the Engelberg Center at NYU.

Six Highlights from Our NYC Workshop on Preference Signals

  • Creative Commons as a Movement

Creative Commons is a global movement, making us uniquely positioned to tackle what sharing means in the context of generative AI. We understand the importance of stewarding the commons and the balance between human creation and public sharing. 

  • Defining a New Social Contract

Designing tools for sharing in an AI-driven era involves collectively defining a new social contract for the digital commons. This process is essential for maintaining a healthy and collaborative community. Just as the CC licenses gave options for creators beyond no rights reserved and all rights reserved, preference signals have the potential to define a spectrum of sharing preferences in the context of AI that goes beyond the binary options of opt-in or opt-out. 

  • Communicating Values and Consent

Should preference signals communicate individual values and principles such as equity and fairness? Adding content to the commons with a CC license is an act of communicating values; should preference signals do the same? Workshop participants emphasized the need for mechanisms that support informed consent by both the creator and user.

  • Supporting Creators and Strengthening the Commons

The most obvious and prevalent use case for preference signals is to limit use of content within generative AI models to protect artists and creators. There is also the paradox that users may want to benefit from more relaxed creator preferences than they are willing to grant to other users when it comes to their content. We believe that preference signals that meet the sector-specific needs of creators and users, as well as social and community-driven norms that continue to strengthen the commons, are not mutually exclusive. 

  • Tagging AI-Generated vs. Human-Created Content

While tags for AI-generated content are becoming common, what about tags for human-created content? The general goal of preference signals should be to foster the commons and encourage more human creativity and sharing. For many, discussions about AI are inherently discussions about labor issues and a risk of exploitation. At this time, the law has no concept of “lovingly human,” since humanness has been taken for granted until now. Is “lovingly human” the new “non-commercial”? Generative AI models also force us to consider what it means to be a creator, especially as most digital creative tools will soon be driven by AI. Is there a specific set of activities that need to be protected in the process of creating and sharing? How do we address human and generative AI collaboration inputs and outputs?

  • Prioritizing AI for the Public Good

We must ensure that AI benefits everyone. Increased public investment and participatory governance of AI are vital. Large commercial entities should provide a public benefit in exchange for using creator content for training purposes. We cannot rely on commercial players to set forth industry norms that influence the future of the open commons. 

Next Steps

Moving forward, our success will depend on expanded and representative community consultations. Over the coming months, we will:

  • Continue to convene our community members globally to gather input in this rapidly developing area;
  • Continue to consult with legal and technical experts to consider feasible approaches;
  • Actively engage with the interconnected initiatives of other civil society organizations whose priorities are aligned with ours;
  • Define the use cases for which a preference signals framework would be most effective;
  • Prototype openly and transparently, seeking feedback and input along the way to shape what the framework could look like;
  • Build and strengthen the partnerships best suited to help us carry this work forward.

These high-level steps are just the beginning. We hope to pilot a framework within the next year. Watch this space as we explore and share more details and plans. We’re grateful to Morrison Foerster for providing support for the workshop in New York.

Join us by supporting this ongoing work

You have the power to make a difference in a way that suits you best. By donating to CC, you are not only helping us continue our vital work, but you also benefit from tax-deductible contributions. Making your gift is simple – just click here. Thank you for your support.

The post Six Insights on Preference Signals for AI Training appeared first on Creative Commons.

]]>
Questions for Consideration on AI & the Commons https://creativecommons.org/2024/07/24/preferencesignals/?utm_source=rss&utm_medium=rss&utm_campaign=preferencesignals Wed, 24 Jul 2024 16:24:08 +0000 https://creativecommons.org/?p=75311 “Eight eyes. Engraving after C. Le Brun” by Charles Le Brun is licensed via CC0. The intersection of AI, copyright, creativity, and the commons has been a focal point of conversations within our community for the past couple of years. We’ve hosted intimate roundtables, organized workshops at conferences, and run public events, digging into the…

The post Questions for Consideration on AI & the Commons appeared first on Creative Commons.

]]>
“Eight eyes. Engraving after C. Le Brun” by Charles Le Brun is licensed via CC0.

The intersection of AI, copyright, creativity, and the commons has been a focal point of conversations within our community for the past couple of years. We’ve hosted intimate roundtables, organized workshops at conferences, and run public events, digging into the challenging topics of credit, consent, compensation, transparency, and beyond. All the while, we’ve been asking ourselves: what can we do to foster a vibrant and healthy commons in the face of rapid technological development? And how can we ensure that creators and knowledge-producing communities still have agency?

History and Evolution

When Creative Commons was founded over 20 years ago, sharing on the internet was broken. With the introduction of the CC licenses, the commons flourished. Licenses that enabled open sharing were perfectly aligned with the ideals of giving creators a choice over how their works were used.

Those who embrace openly sharing their work have a myriad of motivations for doing so. Most could not have anticipated how their works might one day be used by machines: to solve complex medical questions, to create otherworldly pictures of dogs, to train facial recognition systems – the list goes on.

Can we continue to foster a vibrant and healthy commons in today’s technological environment? How can we think innovatively about creator choice in this context?

Preference Signals

Preference signals for AI are the idea that an agent (creator, rightsholder, entity of some kind) is able to signal their preference with regard to how their work is used to train AI models. Last year, we started thinking more about this concept, as did many in the responsible tech ecosystem. But to date the dialog is still fairly binary, offering only all-or-nothing choices, with no imagination for how creators or communities might want their work to be used.

Enabling Commons-Based Participation in Generative AI

What was once a world of creators making art and researchers furthering knowledge risks being reduced to a world of rightsholders owning, controlling, and commercializing data. In this bleak future, it’s no longer a photo album, a poetry book, or a family blog. It’s content, it’s data, and eventually, it’s tokens.

We recognize that there is a perceived tension between openness and creator choice. Namely, if we give creators choice over how to manage their works in the face of generative AI, we may run the risk of shrinking the commons. To potentially overcome this tension, or at least better understand the effect of generative AI on the commons, we believe that finding a way for creators to indicate “no, unless…” would be positive for the commons. Our consultations over the course of the last two years have confirmed that:

  • Folks want more choice over how their work is used.
  • If they have no choice, they might not share their work at all (under a CC license or strict copyright).

If these views are as wide-ranging as we perceive, we feel it is imperative that we explore an intervention and bring far more nuance into how this ecosystem works.

Generative AI is here to stay, and we’d like to do what we can to ensure it benefits the public interest. We are well-positioned with the experience, expertise, and tools to investigate the potential of preference signals.

Our starting point is to identify what types of preference signals might be useful. How do these vary or overlap in the cultural heritage, journalism, research, and education sectors? How do needs vary by region? We’ll also explore exactly how we might structure a preference signal framework so it’s useful and respected, asking, too: does it have to be legally enforceable, or is the power of social norms enough?

Research matters. It takes time, effort, and most importantly, people. We’ll need help as we do this. We’re seeking support from funders to move this work forward. We also look forward to continuing to engage our community in this process. More to come soon.

]]>