Preprint content moderation avant la lettre?
Content moderation is vital on preprint servers like arXiv.org. Today, more than ever, moderators are experiencing immense pressure due to the rising number of AI slop papers, while sociologists have shown how moderators function as gatekeepers, who scan and sort submissions.1 But how did the practice of preprint content moderation start?
While preprints were initially distributed by authors themselves, moderation was inherent to the practice of preprint communication. But as soon as the distribution of preprints became the purview of libraries in the late 1950s and early 1960s, the practices of controlling the preprint literature had to be made more explicit. However, since libraries did not perform peer review or otherwise judge the academic quality of papers, their practices encompassed collecting, sorting, and registering incoming papers. As the influx of preprints grew and researchers’ information demands increased, library staff improved their handling of the newest accessions and devised innovative methods to process papers – essentially prefiguring the later content moderation on preprint servers.

Library staff cataloging preprints at the CERN Central Library in 1968 (c) CERN
“No attempt would be made to select the papers according to the scientific value of the work presented therein.”2 – this was how the library at CERN qualified its work to control the preprint literature. With the rising demand for preprints, the library at the Geneva laboratory began to understand itself as an important gatekeeper of the preprint literature, without, however, claiming for itself an editorial role like that of journal editors. Instead, the 1965 CERN Library Staff Manual states: “The usefulness of a special library depends in large measure on its selectivity.” At the same time, it warns that a “heterogeneous mass of vaguely related documentation can choke or crowd out the relevant and important items.”
Libraries were aided by physicists in making their selections and were innovative in developing new methods to select and classify content. At the CERN library, staff introduced a pragmatic categorization system to manage the constant stream of preprints. Papers were given simple subject categories from the field of high-energy physics, such as “theoretical particle physics,” “high energy experimental physics,” “experimental techniques,” “detectors,” or “accelerators,” purely to enable the list to be sorted in a hierarchical order.3 Libraries also employed “scientific information officers,” who scanned the papers sent in and helped categorize them.4 Scientific information officer was a specific role in the organization, usually filled by people trained in both physics and librarianship who had retired from active scientific research but, thanks to their academic expertise, remained in touch with recent developments in the field. As with content moderation today, the library at CERN recognized early on that such forms of content moderation introduced “certain dangers”: the manual admits, “the rejection of ‘border’ material is inevitably somewhat arbitrary.” There are no ‘objective’ criteria that can determine what counts as useful information.

Job ad by the CERN Scientific Information Service in the September 1967 issue of the New Scientist.
- Reyes-Galindo, L. Automating the Horae: Boundary-Work in the Age of Computers. Soc. Stud. Sci. 46(4), 586–606 (2016).
- See Roth, P. H. Formalizing informal communication: an archaeology of the pre-web preprint infrastructure at CERN. Minerva (2026).
- Ibid.
- Roth, P. H. How libraries classified physics preprints before arXiv and set the stage for distinguishing insiders from outsiders. Nat Rev Phys 8, 188–189 (2026).
Tuesdays 10 o’clock at the CERN library
Tuesdays 10 o’clock was an important time for physicists working at CERN. Every week at that time, starting in the early 1960s, a ritual would play out that also structured much of the local research community’s habits of acquiring new information about what was happening in high-energy physics and related fields. As one informant who used to work for the Scientific Information Service at CERN described it to me:
“A librarian would appear carrying a large pile of newly-received preprints, each one marked with its report number for subsequent filing. She would lay out the preprint one-by-one on the display on top of the wooden drawers where the back collection was stored, the preceding ones were taken away for copying in response to requests and later in the day filed. Each preprint had a small slip attached by a paper clip to the first page, on which one could give one’s name if one wanted to be sent a copy in the internal mail.”

Image from September 1968 CERN Courier (No. 9 Vol. 8) showing the preprint displays at the library in the back.
“Preprints” have served as an important means to rapidly inform members of the global physics community about the newest developments and findings in the field. While personal contact was essential to keep abreast of rapid developments until the 1940s, the sharing of lecture notes, unpublished reports, or copies of manuscripts through the mail or at gatherings later gained considerable importance to compensate for the dispersion of the community. However, as historian David Kaiser notes: “No one could afford to rely on published sources alone.”1 High-energy physicists are known as a lot that thrives on oral communication and personal networks. The infamous discussions in front of blackboards have been a staple of the popular culture of physics across the 20th century.
Thus, the practice of sharing one’s notes or manuscripts was based on personal contacts. Authors would keep track of those researchers who were active in the same research area or who could otherwise be interested in one’s work. So, asking them to send their papers and notes to an institution was a breach in a system based on a convention of “private communication”. In a study of communication behaviours among high-energy theoretical physicists conducted for the American Institute of Physics and published in 1967, the authors reveal that the majority of scientists relied on “personal mailing lists” to keep up with the newest developments in the field.2

The library at CERN played a major role in the early development of a preprint culture in physics. In the late 1950s, the librarian Luisella Goldschmidt-Clermont ventured on a daring mission: she began soliciting preprints from physicists to collect and display at the library. When Mrs. Goldschmidt-Clermont started asking physicists not only to send their unpublished papers to the CERN library to put on display for the local research community, but also to share their personal contacts with the institution, she was introducing radical changes into the communication behaviors of high-energy physicists. It was unconventional to request that physicists send their preprint papers to the library instead of directly to their network of colleagues. Nevertheless, the local physicists with whom she spoke showed support for her project, while the higher echelons at the CERN library voiced concern that her preprint system might distort the mechanisms for making claims to priority in science, which are usually registered through formal publication. So, Goldschmidt-Clermont had recourse to the one argument that couldn’t be denied: the mandate of CERN. In a proposal to the directorate in 1961 to set up the preprint collection and distribution system, she therefore emphasized the “openness” policies that were enshrined in the founding of CERN:
“… CERN’s contribution to this [preprint system] is intended mostly as a ‘conversion’ of its present efforts in the field of preprints towards this project. CERN would benefit directly from this conversion as more material would become available to its scientists. CERN would also benefit indirectly from this conversion; by an inexpensive gesture of good will, it would share with the Member States laboratories a privilege (the preprint service) which CERN is almost alone to enjoy at present in Europe; by its Convention, CERN is bound to contribute to ‘international cooperation in nuclear research, … This cooperation may include … the promotion of contacts between … scientists, the dissemination of information, …’ (CERN Convention, Article II, para 3 c)”
Thus, it could be argued that the CERN library is where important groundwork for the current culture of “open science” was laid in the 1960s.
- David Kaiser (2005). Drawing Theories Apart. The Dispersion of Feynman Diagrams in Postwar Physics. Chicago/London: University of Chicago Press.
- Miles A. Libbey, Gerald Altman (1967). The Role and Distribution of Written Informal Communication in Theoretical High Energy Physics. New York: American Institute of Physics.
DESY and the High-Energy Physics Index
From May 21 to 23, 2024, I was a guest at the “Deutsches Elektronen-Synchrotron” (DESY), Germany’s national accelerator center, located in the west of Hamburg. It was established in 1959 and has contributed substantially to particle physics research over the decades. What is less well known is that in 1963 the library at DESY began publishing a bi-weekly “High-Energy Physics Index” (HEP Index): the print-out of a computer database of physics literature, which included a list of recent preprints and in which all the bibliographical information was sorted into author, subject, and report number indexes. As an international publication, the HEP Index has contributed significantly to the normalization and formalization of preprints in the field, and therefore constitutes a crucial bibliographic instrument in the history of preprints.

The decommissioned ARGUS detector visitors see as they enter the DESY site. Photo (c) the author.
During my visit, I was a guest of DESY’s library, located in one of the central administrative buildings on campus. I had the pleasure of learning much about the early work at the library from Dietmar Schmidt, who began working at the library in 1973, became its head in 1982, and retired in 2007, as well as from Antje Daum, who has been a librarian at DESY since the 1980s. Both were very kind in showing me around the site and displayed a sincere interest in my project.

A printed issue of the HEP Index, first published in 1963 by DESY. Photo (c) the author, courtesy of the DESY library.
Schmidt told me that two things distinguished the efforts at DESY from the existing efforts to catalog the physics literature at SLAC or CERN at the time. First, the literature documentation was not restricted to preprints and reports, but covered “conventional” physics literature as well, i.e., journal papers, conference proceedings, and (text) books. Second, literature documentation at the DESY library was computerized from the very beginning. Schmidt, who studied physics at the University of Hamburg, was employed in part because he possessed expertise in computer programming. He said that when he joined DESY in 1973, the first task Kurt Mellentin, then director of the library and documentation service, gave him was to “rewrite the existing programs for literature documentation – correction programs, print programs for the HEP Index – which were all coded in IBM Assembler, into PL/I,” which took him one and a half years. Schmidt told me that PL/I was introduced “because it enabled fine word processing [schöne Textverarbeitung]” and that it was in use until the mid-1990s, when Unix systems took over.

Magnetic tapes containing the cumulative database of the HEP Index. Photo (c) the author, courtesy of DESY.
Compiling the HEP Index required not only bibliographic skills, but also considerable expertise in high-energy physics. For this reason, many who worked on making the Index were (former) physicists now working in the library and documentation service. Physicists active in one of DESY’s many research groups were also regularly consulted for their understanding of the field. Compiling the Index for the bi-weekly publication was rather unconventional: the newest library acquisitions – journal issues and conference proceedings – were scanned “manually” for relevant titles to include in the index. Additionally, a system had been established, similar to those at CERN and SLAC, in which authors would send their unpublished or submitted preprints to the DESY library. These too were reviewed for inclusion in the HEP Index.
The HEP Index had a further significance beyond its role as a bibliography of high-energy physics literature: it also contributed to the preprints and reports database at SLAC in California. Beginning in the early 1960s, the DESY library shared its cumulative database with the SLAC library, particularly for its lists of “conventional” publications. The magnetic tapes containing the bibliographic information were shipped across the Atlantic in exchange for tapes containing the preprints acquired at SLAC. This transatlantic connection eventually fed into the establishment of SPIRES at Stanford, the global online database of high-energy physics literature, which today is the INSPIRE website containing all the bibliographical information in the field.