Illustration of a robot reading books. Artificial intelligence

On AI and its uses of scientific research

Artificial intelligence. Image by Mohamed Hassan, Pxhere.

Having spent a large part of his professional career in the Global South, the Sea Around Us principal investigator, Dr. Daniel Pauly, quickly learned how difficult and onerous it is to access scientific literature in the region, even when working at renowned universities or institutes.

The effort that researchers working in countries outside North America, Europe and Australia have to make to write their dissertations, papers and other scientific contributions would be unimaginable for their peers in the Global North. From outdated library collections to low-bandwidth internet connections, the obstacles are many, and downloading a PDF of a recent publication for free with a few clicks is unheard of in many places. If payment and delivery are required, the task may become even more difficult.

Knowing this, Dr. Pauly and the Sea Around Us have striven for over a quarter century to make research available to all. Whether it’s by pushing students and colleagues to distribute their work in Fisheries Centre Research Reports, encouraging those who can afford it to publish papers in open access journals, or even making some papers available on websites or personal pages (and doing so ourselves), part of our mission is to make sure most of our data and analyses are freely accessible. Through outreach and, for certain content, even translations into several languages, we also make an effort to put these materials in front of interested parties and remind them that they are always available.

As a not-for-profit research unit that works globally, treats each country equally and collaborates with local researchers whenever possible, our interest is to advance science and promote conservation everywhere. But what happens when all of this effort is used by for-profit corporations for their own benefit?

A recent article in The Atlantic revealed that Meta, the parent company of Facebook, Instagram and WhatsApp, and OpenAI, which owns ChatGPT, DALL-E, and Sora, downloaded a large part of the data set contained in Library Genesis, or LibGen, to train their artificial intelligence models. LibGen hosts more than 7.5 million books and 81 million research papers, which have been pirated and uploaded to its system. Among these books and papers, there are at least 30 authored or co-authored by Dr. Pauly, Dr. Maria ‘Deng’ Palomares, the Sea Around Us project manager, and other colleagues. This is surely the case for many other researchers and research groups.

According to The Atlantic investigation, although Meta employees had some ethical concerns about using that pirated-yet-copyrighted material to train the company's AI, they ultimately received approval to download it using the file-sharing protocol BitTorrent, which in turn meant uploading such material to other users. The article also notes that it is impossible to know which parts of the data set Meta and OpenAI used to train their models.

The question, then, is whether it is fair that research produced for the public good ends up being used, through legitimate or murky means, by big tech to sell ads, keep people glued to their platforms, and sell their digital products.

Similarly, we ponder the risks of feeding research meant to support global conservation, climate resilience, equity, diversity, and inclusion to platforms that can be trained to be biased and spread misinformation that supports anti-environmentalism, climate denialism, inequality, racism, xenophobia, violence, and misogyny, among many other unethical values that threaten social justice and human rights.

The development of artificial intelligence and large language models like ChatGPT will, undeniably, continue. However, sensible policies may have to be put in place to regulate the way these systems access, interpret, translate and disseminate information, scientific or otherwise, so that human, social, and environmental rights and freedoms are not jeopardized.