This Insights and Signals Report was written by Brittany Amell, with thanks to John Willinsky, John Maxwell, and William Bowen for their feedback and contributions.
At a Glance
Insights & Signals Topic Area | Responses to generative AI and LLMs
Key Participants | European Union, Canada
Timeframe | 2022 to Present
Keywords or Key Themes | Generative AI, open scholarship, trust, credibility, open access
Summary
Policy Insights and Signals Reports scan the horizon to identify and analyse emerging trends and early signals for their potential to impact future policy directions in open access and open, social scholarship. They tend to highlight shifts in technology, public opinion and sentiment, and regulatory changes both within and outside of Canada. Like OSPO’s policy observations, Insights and Signals Reports aim to support partners in crafting proactive, responsive, and forward-thinking strategies.
This Insights and Signals Report is the first in a series that will focus on evolving discussions centered around artificial intelligence (AI), particularly generative AI (genAI) and large language models (LLMs), and the implications these may have for open access and open social scholarship. Interested in other Insights and Signals Reports focused on AI? You can find them here and here.
Items discussed in this report include:
- A brief introduction to generative artificial intelligence, with comments from John Maxwell
- The world’s first artificial intelligence act passed in May 2024 by the Council of the European Union
- The inclusion of artificial intelligence in Canada’s Digital Charter Implementation Act (2022), along with critiques from Joanna Redden (2024), Associate Professor in the Faculty of Information and Media Studies at Western University
- Several responses to AI in Canada from journals, post-secondary institutions, scholarly associations and granting agencies, as well as some core concerns raised by these groups
- An announcement from Prime Minister Justin Trudeau (2024) regarding plans to dedicate 2.4 billion dollars towards ‘securing Canada’s AI advantage’
- Responses from INKE partners John Willinsky (Founder, Public Knowledge Project) and John Maxwell (Associate Professor of Publishing at Simon Fraser University)
- Some proposed discursive silences for consideration, such as perspectives on data mining as an extractive colonial practice, and Indigenous data sovereignty
This report ends with some key questions and considerations.
Introducing generative artificial intelligence, briefly
Widespread debates about the future of artificial intelligence and the need for ethical frameworks and regulatory policies to mitigate potential harms, re-ignited in 2022 when OpenAI first released its generative AI system ChatGPT, continue to receive attention from scholars and media alike. Generative artificial intelligence tools like ChatGPT and Microsoft’s Bing (both powered by OpenAI’s GPT-4) and Google’s Gemini (previously Bard) can be used to generate poetry, essays, code, translations, and exam responses, as well as images and videos.
However, while these tools have huge potential, they’re also creating challenges and, in some cases, harm (e.g., “Data Harm Record” by Redden et al. 2020; see also, generally, Broussard 2024, Noble 2018 and O’Neil 2016). According to Josh Nicholson, Co-Founder and CEO of Scite.ai, Meta’s Galactica large language model for science was “taken down in under a week because it was so problematic. You could give it prompts and get a full paper back, but it could be racist or completely wrong” (quoted in Wiley 2024, 1).
As much as generative artificial intelligence (or genAI, as we refer to it here) is used to describe tools like the ones mentioned above, the term also marks a fundamental shift in the design of algorithms, the sets of instructions that can be followed to complete a task (Danaher et al. 2017). This shift entailed a move away from ‘top-down’ algorithms (where rulesets are exhaustively defined by programmers) towards ‘bottom-up’ machine-learning algorithms (where an algorithm is essentially trained to develop its own set of rules). Machine-learning algorithms involve giving the machine some data, a goal, and feedback to tell it when it is on the correct path, and then some time alone to work out the best way to complete the instructions and achieve the end (Fry 2018, p. 11). In other words, genAI relies on various statistical techniques to produce the outputs many of us marvel at (Gorwa et al. 2020; Whiteley 2023).
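To make this shift concrete, the sketch below is a minimal, purely illustrative example written for this report (it is not how ChatGPT or any production system is built). It ‘trains’ a tiny bigram model by counting which words follow which in a small text sample, and then generates text by sampling the next word according to those learned probabilities.

```python
# A minimal, hypothetical sketch of 'bottom-up' learning and probabilistic
# next-word selection. Real LLMs use neural networks trained on vast corpora;
# this toy bigram model only illustrates the basic idea: the rules are learned
# from data rather than written by hand, and output is sampled by probability.
import random
from collections import defaultdict, Counter

training_text = (
    "the cat sat on the mat the cat chased the mouse "
    "the mouse ran under the mat"
)

# 'Training': count which word tends to follow which (the model's learned rules).
counts = defaultdict(Counter)
words = training_text.split()
for current_word, next_word in zip(words, words[1:]):
    counts[current_word][next_word] += 1

def sample_next(word: str) -> str:
    """Pick the next word at random, weighted by how often it followed `word`."""
    followers = counts[word]
    choices, weights = zip(*followers.items())
    return random.choices(choices, weights=weights)[0]

# 'Generation': start from a prompt word and sample a short continuation.
word = "the"
output = [word]
for _ in range(8):
    word = sample_next(word)
    output.append(word)
print(" ".join(output))  # e.g. "the cat sat on the mat the mouse ran"
```

Scaled up by many orders of magnitude, and with simple counting replaced by neural networks, this learn-from-data-then-sample-by-probability pattern is the basic mechanism behind the outputs described next.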
That’s right: the poem you asked ChatGPT to generate is the result of probabilistic calculations that select words based on their likelihood of fitting (statistically) the context. That sounds less romantic, but perhaps that’s important. INKE partner John Maxwell (Associate Professor of Publishing at Simon Fraser University) certainly thinks so.
Responding to this piece over email, Maxwell wrote, “I am increasingly concerned that, in responding (both individually and collectively; casually and formally) to the rise of deep learning, generative text and image technologies – by following industry’s suit in calling them ‘AI’ – we are both playing into the hands of a group of very canny investment promoters and at the same time undermining our own critical sensibilities around the various innovations gathered under this umbrella.” (You can read John Maxwell’s response in full by scrolling to the ‘Responses from the INKE Partnership’ section below.)
Maxwell isn’t alone in sounding the alarm. In a post for the London School of Economics and Political Science blog, Paul Whiteley argues that ‘artificial intelligence’ is a misnomer, a term coined by computer scientist John McCarthy in 1955 (and anthropomorphised by the rest of us ever since).
Even ChatGPT agrees. When we asked ChatGPT if it thought users tend to anthropomorphize it, it generated the following ‘statistical output’:
Yes, people often do anthropomorphize me! It’s quite natural; human language and interaction often involve assumptions about shared experiences, understanding, or intentions. Since my responses are conversational and mimic human language patterns, users might sometimes attribute thoughts, feelings, or intentions to me, even though I don’t have experiences, consciousness, or personal understanding. The familiarity of conversation can make it easy to forget that responses come from patterns rather than an inner sense of self. It’s fascinating because it highlights how strongly people are wired to seek connection, even with technology that’s fundamentally different from human minds.
Maxwell cautions against this anthropomorphizing, however, because it makes it difficult to develop meaningful policy responses:
While, for instance, ‘LLM’ defines a specific technology with specific characteristics to which we might develop a coherent policy response, the term ‘AI’ is a sci-fi catch-all that carries with it a set of well-publicized (and monetized) utopian, dystopian, and eschatological connotations. We are not likely to develop any meaningful response to such an ill-defined object. I am hopeful that, as we begin to develop guidelines and policy, we can start to meaningfully differentiate and refer to these technologies by their specific names and functions, instead of repeating the hype: that these massive, mysterious, corporate-owned black boxes will simultaneously solve humanity’s problems while also posing an existential threat.
EU Announces Approval of World’s First Artificial Intelligence Act
On the policy front, the Council of the EU officially approved the world’s first artificial intelligence act in May 2024. The legislation will follow a “‘risk-based’ approach, which means the higher the risk to cause harm to society, the stricter the rules,” explains the Council in a press release.
“With the AI act, Europe emphasizes the importance of trust, transparency and accountability when dealing with new technologies while at the same time ensuring this fast-changing technology can flourish and boost European innovation,” says Mathieu Michel, Belgian secretary of state for digitisation, administrative simplification, privacy protection, and building regulation (Council, 2024).
Four governing bodies will be responsible for ensuring the AI act is enforced: a “scientific panel of independent experts,” an AI Office within the European Commission, an advisory forum for stakeholders, and an AI Board consisting of member states’ representatives (Council, 2024).
Closer to Regulating Artificial Intelligence in Canada?
In Canada, artificial intelligence is included in Bill C-27 (the Digital Charter Implementation Act, 2022), which has remained under committee consideration in the House of Commons since completing its second reading in April 2023.
According to Joanna Redden (2024), an Associate Professor in the Faculty of Information and Media Studies at Western University who is critical of Canada’s proposed AI legislation in its current form, many Canadians already have little trust in the growing use of AI, even though Canada was the first country to introduce a national AI strategy. Echoing concerns held by those such as Andrew Clement (Professor Emeritus, University of Toronto), as well as McKelvey and colleagues (2024), Redden suggests this distrust is due, in part, to a lack of meaningful public consultation, “despite deep concerns from the public.” Redden notes that the proposed legislation is not only “already out of step with the needs of Canadians” but also “falls short of the regulatory approaches taken by other nations,” such as the act recently approved by the European Union or the Executive Order issued by the White House in 2023.
Some of the needs identified by Redden include ensuring that uses of AI by businesses and governments are transparent, and that there are mechanisms with teeth for maintaining oversight and accountability. In addition, Redden argues that dedicated funding for maintaining public “AI registries” is critical (note: AI registries, such as this one developed by Redden and colleagues, track how and where AI and other automated systems are being used).
If successfully passed, the Digital Charter Implementation Act (otherwise known as “An Act to enact the Consumer Privacy Protection Act, the Personal Information and Data Protection Tribunal Act and the Artificial Intelligence and Data Act and to make consequential and related amendments to other Acts”) would be the first Act focused on regulating AI in Canada.
Currently, generative AI managers and developers in Canada are asked to voluntarily commit to a Code of Conduct on the Responsible Development and Management of Advanced Generative AI Systems. The voluntary code foregrounds six outcomes: accountability, safety, fairness and equity, transparency, human oversight and monitoring, and validity and robustness.
Responses to AI in Canada from Journals, Post-Secondary Institutions, Scholarly Associations, and Granting Agencies
In a step intended to curb the misuse of generative AI, several journals (e.g., see here, here, and here), post-secondary institutions (e.g., see here and here for examples, as well as here for broader commentary and here for HESA’s AI in Canadian postsec observatory), scholarly associations (e.g., see here and here), and granting agencies (read more here and here) are in the process of developing and releasing statements, policies, and/or guidelines that set out expectations regarding what is considered fair or responsible use of AI in their respective contexts.
Concerns relating to accountability, authorship, transparency, disclosure of use, responsibility, accuracy, bias, safety, confidentiality and privacy, and copyright and intellectual property consistently recur across these statements, policies, and guidelines. Several journals as well as larger publishers such as SAGE, Elsevier, and Wiley explicitly forbid listing an AI tool (such as ChatGPT) as an author because, as Tulandi (2023) writes in one Canadian medical journal, “authorship implies responsibilities and tasks that can only be attributed to and performed by humans” (para 7).
In addition to the concerns raised above, CARL, or the Canadian Association of Research Libraries (2023), recommends that responses to generative AI also consider the social impacts associated with AI use, including who has access to generative AI tools and who may not, whether for financial reasons or otherwise. Citing the work of a group of researchers associated with The Open University in the UK, CARL (2023) also raises the question of whether combining open-access and paywalled articles in the datasets used to train generative AI could enhance its reliability and credibility, while noting the potential legal implications.
As debates relating to the Digital Charter Implementation Act continue, Prime Minister Justin Trudeau (2024) has announced plans to dedicate 2.4 billion dollars towards ‘securing Canada’s AI advantage,’ though it is worth noting that just 2 percent of this funding (50 million dollars) is meant for exploring the social impacts associated with the increased use of AI.
Key Questions and Considerations
The insights and signals outlined above indicate that recent developments in generative AI, along with its continuing integration into academic contexts and into society more broadly, pose significant challenges for those involved in developing policy responses.
At the same time, while paying attention to existing responses remains important, equally important and worthy of consideration are the discursive silences or absences (i.e., what is not being said, or perhaps what is not being amplified and picked up in reports on this topic).
Some notable absences include whether LLM use of online content constitutes fair dealing (along with other copyright considerations). Other important absences include the social and ecological impacts of large language models (LLMs) and the materials, labour, and compute power required to sustain them, paralleling conversations about sustainability in the digital humanities (such as this one by Johanna Drucker 2021 or this one by Joanna Tucker 2022). Others include the implications of LLMs for Indigenous data sovereignties (UNESCO has released a report related to ensuring data sovereignty in light of accelerated developments in AI), as well as the implications that Indigenous (data) sovereignties can have for thinking through ‘ethical AI’ (Gaertner 2024; Roberts and Montoya 2023).
For instance, reflecting on the intersection between new(er) forms of AI and Indigenous data sovereignty and stewardship, Associate Professor David Gaertner (Institute for Critical Indigenous Studies, UBC) makes further connections between AI trained on large language models and the impacts of settler colonialism:
AI technologies, trained on large language models, mirror the disenfranchisement and violence imposed through settler colonialism while redistributing it at scale. Algorithms institutionalize a power dynamic where dominant linguistic and cultural narratives are further entrenched and amplified in the social infrastructure while personal expression is rendered increasingly obsolete and niche. The emergence of reports detailing how the internet is currently cannibalizing itself and generating new AI content from existing AI material, also known as the ‘dead internet theory,’ further amplifies these concerns.
However, many will point out that questions surrounding the ethical development and use of AI are often directed at private companies and organizations that make use of their models. What about community members, academics, and stakeholders? Will AI remain solely in the hands of private firms and for-profit companies?
Not necessarily. Some are beginning to explore commons-based approaches to machine learning. One example is Open Future, a non-profit based in Europe that aims to explore the potential risks and benefits associated with new frameworks for sharing AI models under open licenses. The following comment, shared with us by John Willinsky (Founder, Public Knowledge Project and INKE partner), moves in a similar direction:
While there are reasons to be concerned about recent advances in AI, academics also have a responsibility to explore the potential contributions and advances that AI may hold for research and scholarship. The Public Knowledge Project has for some time looked to AI to solve pressing issues around resource equity and quality in scholarly communications, if only with limited success. It is now engaged in research on the ability of Large Language Models to address the long-standing challenge of developing a sustainable means for Diamond OA journals to publish in the standard formats of HTML and PDF, as well as export files in JATS XML. (John Willinsky’s comment can be read in full below.)
Interestingly, several AI models have already been released under the Apache 2.0 license (a permissive software license that allows use of the software for any purpose, including distribution, modification, and the distribution of modified versions without royalties, according to Wikipedia), and more may be released before the end of this year (there are rumours that OpenAI will release its latest model, ‘Orion,’ in December 2024).
Responses from the INKE Partnership
Response from John Willinsky (Founder, Public Knowledge Project):
While there are reasons to be concerned about recent advances in AI, academics also have a responsibility to explore the potential contributions and advances that AI may hold for research and scholarship. The Public Knowledge Project has for some time looked to AI to solve pressing issues around resource equity and quality in scholarly communications, if only with limited success. It is now engaged in research on the ability of Large Language Models to address the long-standing challenge of developing a sustainable means for Diamond OA journals to publish in the standard formats of HTML and PDF, as well as export files in JATS XML. The principal focus of this work is to establish if LLMs can be sufficiently tuned to reliably automate HTML and JATS XML markup of author manuscripts (given that such markup currently requires technical skills or payments that exceed the capacity of most Diamond OA journals). This work has reached an initial proof of concept stage, with further work continuing around its comparative value (given other tools) and ways of incorporating and sustaining such a markup service in the editorial workflow.
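To give a sense of what automating JATS XML markup with an LLM might look like in practice, here is a minimal, hypothetical sketch. It is not PKP’s implementation; it simply assumes an OpenAI-style chat completions API (the openai Python package and a ‘gpt-4o’ model are used here purely for illustration) and prompts the model to add basic structural markup to a plain-text manuscript.

```python
# A minimal, hypothetical sketch of using an LLM to draft JATS XML markup for a
# plain-text manuscript. This is NOT the Public Knowledge Project's implementation;
# it only illustrates the general idea of prompting a model to add structural markup.
from openai import OpenAI  # assumes the openai Python package is installed

client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

manuscript = """Title: Open Infrastructures and Diamond OA
Abstract: This short example abstract stands in for a real manuscript.
Introduction
Diamond OA journals often lack the resources to produce JATS XML..."""

prompt = (
    "Convert the following manuscript into minimal JATS XML. "
    "Wrap the title in <article-title>, the abstract in <abstract>, and each "
    "section in <sec> with a <title> element. Return only the XML.\n\n" + manuscript
)

response = client.chat.completions.create(
    model="gpt-4o",  # hypothetical model choice for this sketch
    messages=[{"role": "user", "content": prompt}],
)
jats_draft = response.choices[0].message.content
print(jats_draft)  # a draft only; it still needs validation and editorial review
```

Any such output would still need to be validated against the JATS schema and reviewed by an editor, which speaks to the further work Willinsky describes around comparative value and integration into editorial workflows.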
Response from John Maxwell (Associate Professor of Publishing at Simon Fraser University):
I am increasingly concerned that, in responding (both individually and collectively; casually and formally) to the rise of deep learning, generative text and image technologies — by following industry’s suit in calling them “AI”— we are both playing into the hands of a group of very canny investment promoters and at the same time undermining our own critical sensibilities around the various innovations gathered under this umbrella. The term “AI” has for many decades already served as an aspirational brand name for a wide variety of computational technologies. Today, it is the brand name for massive capital investment funding a collection of disparate deep learning approaches: LLMs, Diffusion Models, facial recognition systems, and other tools. These are specific technologies which both exist and are making an impact on scholarship and labour. But while, for instance, “LLM” defines a specific technology with specific characteristics to which we might develop a coherent policy response, the term “AI” is a sci-fi catch-all that carries with it a set of well-publicized (and monetized) utopian, dystopian, and eschatological connotations. We are not likely to develop any meaningful response to such an ill-defined object. I am hopeful that, as we begin to develop guidelines and policy, we can start to meaningfully differentiate and refer to these technologies by their specific names and functions, instead of repeating the hype: that these massive, mysterious, corporate-owned black boxes will simultaneously solve humanity’s problems while also posing an existential threat. It would appear that venture capitalists and hype-men have beaten us to the starting line here; we need to rapidly bring a critical apparatus (e.g., from science and technology studies) to the analysis and development of effective policy and guidelines.
References
“C-27 (44-1): An Act to Enact the Consumer Privacy Protection Act, the Personal Information and Data Protection Tribunal Act and the Artificial Intelligence and Data Act and to Make Consequential and Related Amendments to Other Acts.” Parliament of Canada. Accessed October 30, 2024. https://www.parl.ca/legisinfo/en/bill/44-1/c-27.
Broussard, Meredith. 2024. More than a Glitch: Confronting Race, Gender, and Ability Bias in Tech. First MIT Press paperback edition. Cambridge, MA: The MIT Press.
Canadian Association of Research Libraries. 2023. “Generative Artificial Intelligence: A Brief Primer for CARL Institutions.” https://www.carl-abrc.ca/wp-content/uploads/2023/12/Generative-Artificial-Intelligence-A-Brief-Primer-EN.pdf.
Drucker, Johanna. 2021. “Sustainability and Complexity: Knowledge and Authority in the Digital Humanities.” Digital Scholarship in the Humanities 36 (2): ii86–ii94. https://doi.org/10.1093/llc/fqab025.
Fitzpatrick, Kathleen. 2024. “Open Infrastructures for the Future of Knowledge Production.” Keynote presented at Creative Approaches to Open Social Scholarship: Canada (An Implementing New Knowledge Environments (INKE) Partnership Gathering). Montreal, Canada. https://doi.org/10.25547/6GG1-7B37.
Fry, Hannah. 2018. Hello World: Being Human in the Age of Algorithms. New York: W.W. Norton & Company.
Gaertner, David. 2024. “Indigenous Data Stewardship Stands against Extractivist AI.” UBC Faculty of Arts (blog). June 18, 2024. https://www.arts.ubc.ca/news/indigenous-data-stewardship-stands-against-extractivist-ai/.
Gorwa, Robert, Reuben Binns, and Christian Katzenbach. 2020. “Algorithmic Content Moderation: Technical and Political Challenges in the Automation of Platform Governance.” Big Data & Society 7 (1): 2053951719897945. https://doi.org/10.1177/2053951719897945.
McKelvey, Fenwick, Sophie Toupin, and Maurice Jones. 2024. “Introduction.” In Northern Lights and Silicon Dreams: AI Governance in Canada (2011-2022), edited by Fenwick McKelvey, Jonathan Roberge, and Sophie Toupin, 7–30. Montreal, Canada: Shaping AI. https://www.amo-oma.ca/wp-content/uploads/2024/04/ORA-CA-Policy.pdf.
Noble, Safiya Umoja. 2018. Algorithms of Oppression: How Search Engines Reinforce Racism. New York: New York University Press.
O’Neil, Cathy. 2016. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. New York: Crown.
Pride, David. 2023. “CORE-GPT: Combining Open Access Research and AI for Credible, Trustworthy Question Answering.” CORE (blog). March 17, 2023. https://blog.core.ac.uk/2023/03/17/core-gpt-combining-open-access-research-and-ai-for-credible-trustworthy-question-answering/.
Redden, Joanna, Jessica Brand, and Vanesa Terzieva. 2020. “Data Harm Record.” Data Justice Lab. 2020. https://datajusticelab.org/data-harm-record/.
Redden, Joanna. 2024. “Canada’s AI Legislation Misses the Mark.” Western News (blog). April 12, 2024. https://news.westernu.ca/2024/04/proposed-ai-legislation/.
Roberts, Jennafer Shae, and Laura N. Montoya. 2023. “In Consideration of Indigenous Data Sovereignty: Data Mining as a Colonial Practice.” In Proceedings of the Future Technologies Conference (FTC) 2023, Volume 2, edited by Kohei Arai, 180–96. Cham: Springer Nature Switzerland. https://doi.org/10.1007/978-3-031-47451-4_13.
Roberts, Jennafer Shae. 2024. “In Consideration of Indigenous Data Sovereignty: Data Mining as a Colonial Practice.” Montreal AI Ethics Institute (blog). January 23, 2024. https://montrealethics.ai/in-consideration-of-indigenous-data-sovereignty-data-mining-as-a-colonial-practice/.
Tucker, Joanna. 2022. “Facing the Challenge of Digital Sustainability as Humanities Researchers.” Journal of the British Academy, 10: 93–120. https://doi.org/10.5871/jba/010.093.