Chapter 6: Automated Curation
Bots are perhaps the central innovation in digital encyclopedic production, representing a significant departure from the production logistics involved in composing print encyclopedias. More than 1,700 bots currently perform a wide variety of writerly tasks in Wikipedia, an environment particularly suited to them. The Wikipedian community and audience work alongside these bots: automated applications, usually built with a simple scripting language, that execute basic actions when triggered by specific input. The exigencies of information overload and human frailty in the form of short attention spans, error-proneness, and slowness provide impetus for their presence and labor in this particular situation. Bots can handle these tasks more efficiently, more consistently, and more accurately. While they indirectly shape textual production by freeing up human contributors to concentrate on higher-order concerns, they themselves also directly alter the physical and topical scope of the text by composing and editing text, inserting images, maintaining links, and monitoring human activity.
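To make the trigger-and-response pattern concrete, a minimal sketch in Python follows; it is not drawn from any actual Wikipedia bot, and the field names, rules, and action labels are illustrative assumptions.

```python
# A minimal sketch of the trigger-action pattern described above: the bot
# inspects each incoming edit and fires a scripted response when a simple
# condition matches. All field names and rule details are invented here.

def handle_edit(edit: dict) -> str | None:
    """Return the name of an action to take for an edit, or None to ignore it."""
    # Trigger: a talk-page comment that lacks the wiki signature markup.
    if edit["namespace"] == "Talk" and "~~~~" not in edit["text"]:
        return "tag-as-unsigned"
    # Trigger: a common typo in article text.
    if " teh " in edit["text"]:
        return "fix-typo"
    return None  # no rule matched; leave the edit for human attention

# Each incoming edit is checked against the rules in turn.
print(handle_edit({"namespace": "Talk", "text": "I disagree with this section."}))
```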
Their automated labor makes possible the inclusion of entries on less-popular topics such as small towns. The time-consuming tedium of gathering such specific information on hundreds of towns, cities, counties, or parishes and then composing individual entries on each one was simply beyond the reach of a writer working by hand, and so no major print encyclopedias included them.1 Successful planning and execution of an encyclopedia requires a focus on managing breadth so as to result in a useful product. Devoting compositional energy to minor town data diverts time and attention from creating articles that a wider spectrum of readers might find useful. Additionally the practical considerations of print production make updating specific, quickly outdated information such as census numbers a time-consuming and expensive task that is slow to see distribution. In contrast the affordances of a digital, real-time encyclopedia mean such updates can be made without physical or opportunity cost—and without much human effort.
The human effort involved in collaborating with bots is largely confined to three activities: initial design and development, policing activity, and fixing or updating bots as they experience breakdown or complete their tasks. In each of these stages, human intention and initiative are evident. At deployment bots become proxies that sally forth on their own, carrying out human intention but exerting their own agency that is both productive and occasionally perverse. They also demonstrate the recursive properties of articulated agency through instances of breakdown, repair, and repurposing. As they interact with humans, technologies, texts, and a variety of community attitudes toward automated processes, they serve as one of the most striking examples of distributed authorial agency within the system.
Nonhumans have always been with us, and we have always worked with them, whether we acknowledged it or not, as Bruno Latour has argued extensively over the past quarter century. Now those of us who work with words are working alongside bots more each day. We are so accustomed to relying on automated algorithms such as spell-check or search engines that we barely give them a thought. Most writers have also been working with intelligent agents in one form or another since Microsoft launched Office 97, which included the Office Assistant. This version introduced the all-too-familiar animated helper Clippy, which constantly inquired as to whether human users needed help and then accessed answers from a database as required. When Google Wave launched in 2009, full-fledged bots were built-in composing elements, accessible from the GUI. The world was not ready for Wave, which Google discontinued in 2010. All current indications are that most casual users still aren't quite ready to push a button and knowingly loose a bot on our texts. But corporations are: robots developed by the company Narrative Science2 routinely contribute articles on stock fluctuations to Forbes and write data-driven pieces for prominent newspapers like the New York Times. In 2014 the Los Angeles Times became the first paper to publish a bot-written story on an earthquake; the bot posted the story three minutes after the quake struck. By late 2014 Wired reported that bots now outnumber humans on the web, accounting for 56 percent of all website visits.3
Questions about automated authorship have concerned intellectual property specialists for decades, since a paper on the subject appeared in the Journal of the Patent Office Society in 1969.4 Related articles on legal aspects of automated authorship continue to contend with the inevitable issue of nonsentient agency, which has no legal precedents.5 It is certainly not too early to consider the implications of automated writing agents. Their collaborative presence will only become increasingly pervasive in composing environments, just as it continues to become ever more present in our daily lives as we come to rely on prescriptive bots like iPhone’s Siri to help manage our daily affairs.
Bots have served as the material originators, composers, and editors of thousands of articles, performing work that is essential to the ever-broadening scope of digital encyclopedias. The article on Darwin, Minnesota, a town that is home to the Biggest Ball of Twine in Minnesota, serves as a microcosm of these automated performances and points to some pressing questions about the specific form of rhetorical agency performed by nonsentient entities. I suggest that these bots are actors that, through interaction with each other and with humans, demonstrate rhetorical agency.
Historical Context
Humans have sought this sort of automated help for millennia, although our efforts at design and deployment of robotic assistants have more often than not fallen short of our goals. The earliest recorded successes in developing automated tools to mark or keep track of information were water clocks, which were built in Babylon and Egypt in the sixteenth century B.C.E.6 The Greek engineer Ctesibius used his extensive knowledge of pneumatics and hydraulics to design a fanciful and complex water clock that incorporated automated moving figures circa 270 B.C.E. Hero of Alexandria, who extended Ctesibius’s research in his handbooks Automata, Pneumatica, and Mechanica, wrote the first documentation on workable robots aside from mythology.7 The Antikythera Mechanism, a differential gear arrangement for calculating astronomical information, is thought to have been built around 87 B.C.E. These mechanisms for calculating information form just some of the early products of what Kevin LaGrandeur has called “a very old archetypal drive that pits human ingenuity against nature via artificial proxies.”8
Most automatons, such as Ctesibius's moving figures, were intended primarily for decoration, entertainment, and occasionally education. This remained true well into the twentieth century, when Disney theme parks' animatronics served all these purposes simultaneously. However, as Jonathan Sawday has written, in the Middle Ages automatons also became integrated in narrative tropes of wizardry, magic, and natural philosophy. Automatons that served as oracles were attributed to figures such as Albertus Magnus, whose automaton was purportedly destroyed by Thomas Aquinas as a device of the devil. Stories of brazen heads who functioned as oracles appeared in continental tales, demonstrating the desire for automatons who could not just perform or calculate, but invent and bestow information.9
The Enlightenment saw an increased interest in automatons. A prominent early example was Vaucanson's Duck, created between 1733 and 1739. This duck-shaped robot was purported to move, eat, and defecate by use of an automated digestive system that was shown to be a hoax. Still Vaucanson displayed the duck in Paris and then toured throughout the major cities of the continent and on to England, sparking the public's modern fascination with automatons. In her visual history of automatons, Frances Terpak has argued that "the ingenious mechanical workings of such frivolous automata were appreciated not only by the general public but also by the scientifically inclined," who appreciated an opportunity to educate themselves on innovative mechanisms10 and occasionally used them for educational and research purposes. Chambers found the topic significant enough to include a brief article "Androides" in his 1728 edition, describing it as "an Automaton, in figure of a Man; which by virtue of certain Springs, &c. duly contrived, Walks, Speaks, &c."11 He referenced Albertus Magnus and mistakenly cross-referenced it to an article titled "Automaton" that did not actually appear in the edition.
Engineers had better luck with simple automatons that could aid in textual reproduction. The first manual for a pantograph, which enables a human user to trace a 3-D object onto a flat surface, was published in 1631 by Christoph Scheiner, a German Jesuit and professor of mathematics. He had purportedly learned of the device thirty years earlier from a painter living in southern Germany. Terpak has written that “even though the pantograph produces only an outline of the original . . . its precision and efficiency presented a major breakthrough for the reproduction of images.”12 This machine accomplished work only through the simultaneous movement of a human user.
The eighteenth century saw the development of writing automatons that performed on their own but could create only text that included short, preprogrammed sayings or drawings. One of the first such devices, built in 1753 by Friedrich von Knauss, consisted of a metal hand that emerged from stylized smoke to write six lines of text in four minutes. Other writing automatons typically were built as seated, articulated dolls that were programmed by stacked cams, such as Pierre Jaquet-Droz's "The Scribe."13 Other drawing automatons were programmed to create portraits of royalty and presented as political gifts to those depicted. These robots existed primarily for entertainment and could only repeatedly scrawl their programmed texts. They neither educated their audiences nor produced information.
In the twentieth century, the military-industrial complex heavily sponsored development of intelligent systems and basic artificial decision making. This technology was also deployed in the private sector in industries as diverse as paper-making14 and air traffic direction.15 The advent of the web made simple automated coding available to anyone with sufficient access and knowledge, and bots have become a vital element of the Wikipedian infrastructure.
Bots in Wikipedia
Perhaps the best known of the Wikipedia bots is Rambot, named after Derek Ramsey, who created it in 2002 as an out-of-work computer science graduate from RIT.16 He noticed the dearth of articles on towns and cities and surmised that most of the information needed to build out these texts could be found on the Census Bureau's website. He also noticed that many casual editors were reluctant to create a brand-new article page and hypothesized that creating a series of basic article stubs would encourage them to start contributing to articles on places. After creating the first three thousand county articles by hand, he turned to the daunting task of creating 33,932 city articles. Inspired by other bots already handling small tasks within the system, he wrote his own script to handle the entry creation. During the week of October 19–25, 2002, Rambot completed all the city articles. As Andrew Lih noted in his history of Wikipedia, that week saw the first mass article creation in the project's history, which extended the then fifty-thousand-article project by 60 percent. However, the community reception of this development was mixed: "Others viewed his work as an abomination—an unintelligent automaton systematically spewing rote text, fouling the collection of articles. Wikipedia was supposed to be a project started by humans and controlled by humans. Was an article where every other word was a number or a statistic a well-crafted start or a data dump?"17 Eventually the community reached a consensus about the value of the work; the articles have stayed intact ever since.
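Rambot's actual script is not reproduced in the sources cited here, but the mail-merge pattern it embodies can be sketched in a few lines of Python; the template wording, field names, and sample values below are invented for illustration.

```python
# A speculative sketch of the stub-generation pattern described above:
# one row of census data in, one templated article stub out. The template
# wording, field names, and sample values are illustrative assumptions;
# Rambot's real script and data schema are not reproduced here.

STUB_TEMPLATE = (
    "'''{name}''' is a city in {county} County, {state}, United States. "
    "The population was {population} at the {census_year} census."
)

def make_stub(row: dict) -> str:
    """Fill the fixed sentence template with one town's census data."""
    return STUB_TEMPLATE.format(**row)

def make_all_stubs(rows: list[dict]) -> dict[str, str]:
    """Map article titles to stub text for an entire batch of towns."""
    return {f"{row['name']}, {row['state']}": make_stub(row) for row in rows}

rows = [{"name": "Darwin", "county": "Meeker", "state": "Minnesota",
         "population": 350, "census_year": 2010}]
print(make_all_stubs(rows)["Darwin, Minnesota"])
```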
As of this writing, 1,724 bots work on Wikipedia, performing a surprising variety of tasks that were formerly the exclusive domain of humans who were expected to use their critical judgment and authorial agency to responsibly attend to these duties.18 Since its initial contribution, Rambot has been updated with functionality to improve existing entries with some intelligence.19 Plans are underway for the Rambot Translation Project, which will translate all Rambot articles into other languages for inclusion in other language-based Wikipedia editions.20 Other bots work to discipline human users. SineBot monitors human identity by adding {{Unsigned}} and {{UnsignedIP}} tags to unsigned edits, adding signatures when unsigned users can be traced, tracking recurrent violators through their IP addresses, and placing warnings on User Talk pages if behavioral problems persist. It also works to "report vandalism and suspected personal attacks to antivandalism IRC channels."21 According to its User page, SineBot has made more than two million "contributions" to the system as of this writing. SpellBot, another prominent bot in the project, corrects common typos with the aid of humans who approve its edits. An army of other bots manage interwiki links, tags, and redirects as well as perform general maintenance tasks such as resetting sandboxes.22 Still others monitor the Recent Changes stream, reverting pages that have been vandalized.23 These particular bots search for anonymous edits, which are statistically more likely to be vandalism, as well as words from a list of common sophomoric terms (for example "poop"). Lih estimates that bots have "helped tremendously by catching well over 50 percent of the obvious vandalism."24 And finally bots act as archivists, handling automated archiving for heavy-use pages such as the Help Desk.25
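The antivandalism heuristics described above might be sketched as follows; the scoring, threshold, and word list are invented assumptions, and real bots of this era used considerably more elaborate rule sets.

```python
# A schematic sketch of the two vandalism signals named above: anonymity
# (an IP address standing in for a username) and a short list of sophomoric
# terms. The scoring and threshold are invented for illustration.

import re

SOPHOMORIC_TERMS = {"poop"}  # "poop" is the chapter's own example
IP_PATTERN = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")  # IPv4 name = anonymous editor

def looks_like_vandalism(username: str, added_text: str) -> bool:
    score = 0
    if IP_PATTERN.match(username):
        score += 1  # anonymous edits are statistically more likely to be vandalism
    if set(added_text.lower().split()) & SOPHOMORIC_TERMS:
        score += 2  # blocklisted terms are a stronger signal
    return score >= 2  # only revert on strong evidence; humans review the rest

print(looks_like_vandalism("192.0.2.7", "poop poop poop"))                 # True
print(looks_like_vandalism("GoodFaithEditor", "Darwin is in Meeker County."))  # False
```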
Automated Agency
The problem of accounting for nonhuman agents and their agency is challenging. With the exception of work by Miller, Cooper, and Rickert, most studies of rhetorical agency have limited considerations to humans. Indeed this is how we culturally approach such questions: from a human-centric viewpoint. In the current prevailing understanding of this problem, nonhumans, which are always understood as objects, simply do not have the capacity for agency. Accordingly our understanding of what might constitute performance is tied to the human: speech, gesture, and texts with an attributable author who assumes responsibility. But these bots act as attributable authors, generating texts that function persuasively within a socially constrained network of agents and texts. Further they themselves shape the texts written by humans, repair human mistakes, and otherwise function in many ways as active parts of the community (although not in deliberative capacities).
Latour offers the hybrid as a term for understanding the ways that humans and machines working together can accomplish things neither could as separate, unarticulated actors. In his example, a gun held by a human becomes a human-gun hybrid that is capable of acts that neither actor might accomplish alone. This model represents the physical unity of the tool-in-hand, creating a contingent third actor. While bots and their coders never physically touch, the writing and launch of a coded entity creates a third actor that is the deployed code, writing that lives and performs in ways that are very different from text on the page. This writing is transformed into an actor that accomplishes curatorial work that neither its author nor its unlaunched code could accomplish alone. It is also an actor with a recursive life cycle that incorporates generative potential as well as the potential for destruction, for breakdowns, adaptations, and redeployments.
Articulated process of design, deployment, performance, and either shutoff or recursion. The figure diagrams this process as a cyclical network linking the coder's creative labor and craft, the community, intention/delegation, the design/deployment process, agency, the bot's compositional labor and agency, and recursion.
The bot's performance and agency must be considered not just within the technological, social, and textual environments it operates in, but also in terms of its design and deployment. Without a human coder, there would be no bot to begin with, and the two actors engage in what Amy Propen has called the "co-construction of agency."26 Accounting for the coder also allows us to separate the two kinds of work that are involved in this assemblage: creative work and compositional work.
Compositional work is most often understood as creative labor to the extent that it is naturalized in the terminology we associate with writing: “I majored in creative writing,” or “We have a creative nonfiction speaker series.” But in the case of a bot who writes, these two types of work are distinct. The creative labor is performed by the coder as he or she designs, develops, and tweaks the bot itself. The bot’s human creator is also a textual curator and a very specific sort of writer: one who writes code that is constrained in mundane ways by community rules, the limits of the technological environment the bot is built within and built for, and the coder’s knowledge and skills. Coding is a craft, one that on its good days demonstrates clear artistry through elegant syntax and work-arounds. Intention also resides with the coder, who builds the bot for specific purposes and tasks: after all, a bot who searches for typos does not have the same purpose or functionality as one who creates maps. The bot itself, being nonsentient, demonstrates intention only by delegation. The sentient decision-making process of figuring out how a bot might best be used within the system, the rhetorical process of seeking community approval, and the creative process of writing the code that “hatches” the bot all point to a different performance of agency than that demonstrated by the bot, who performs the actual work and thus drives material change within the system.
Once it is deployed, the bot performs its own agency as it carries out its tasks and interacts with other actants who may be community members, editors, readers, other bots, textual features, or elements (bots, linkages, photos) and layers (code, servers, databases) of the technological infrastructure that is Wikipedia. Wikipedian bots perform the process of encyclopedic composition, making limited decisions of the sort that place them a bit above Latour's famous door opener. They compose articles by inserting census data into a predetermined script. They decide whether or not they will perform a given task, such as reverting a vandalized page, or whether a particular word is a typo that should be fixed. Then they decide the most appropriate solution for the problem at hand (for example a choice between two or more possible word spellings). These bots react to their environment, initiate action with it, and effect change both within the texts and sometimes within the broader scope of the project, as when Wikipedia rather suddenly expanded to cover thousands of towns.
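A schematic sketch of such an if-then sequence, using an invented typo table and frequency counts, suggests what these limited decisions look like in practice.

```python
# A sketch of the if-then decision sequence described above: first, is the
# word a known typo at all; second, which of the candidate corrections
# applies? The typo table, frequency counts, and tie-breaking rule are all
# invented for illustration.

TYPO_TABLE = {
    "teh": ["the"],
    "recieve": ["receive"],
    "abot": ["about", "abbot"],  # ambiguous: two plausible corrections
}

WORD_FREQUENCY = {"the": 10_000, "receive": 500, "about": 9_000, "abbot": 40}

def correct_word(word: str) -> str:
    candidates = TYPO_TABLE.get(word.lower())
    if candidates is None:
        return word               # not a known typo: leave the text alone
    if len(candidates) == 1:
        return candidates[0]      # unambiguous: apply the single correction
    # Ambiguous: choose the candidate most common in ordinary prose.
    return max(candidates, key=lambda c: WORD_FREQUENCY.get(c, 0))

print(correct_word("recieve"))  # -> receive
print(correct_word("abot"))     # -> about
print(correct_word("Darwin"))   # -> Darwin (unchanged)
```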
When we examine an article's primary text, it is not immediately apparent which text was written by humans and which was written by bots. Which sort of writer contributed which text can be discerned only by careful reading of the page history. Still the work of the bot is never completely black-boxed because its activities are tracked on two levels: the page history of each article that it impacts, which is available to anyone with access to the article, and the activity history of the bot itself, which is available to the coder. Its performance also appears in project logs available to curators who devote their time to tracking vandalism or spam, close copyediting, and other projects that bots contribute to.
Emergency robot shutoff button for RussBot. Image courtesy Wikipedia under CC BY-SA license.
As it performs its tasks, the bot possesses the potential for perverse performances of agency, which invites a clear social response through the quarantine procedures imposed by the Wikipedian community. Because of their capacity for unique and unintentional action, bots are quarantined, tracked, and approved by the Bot Approvals Group (BAG). As the Bot Guidelines note, "because bots are potentially capable of editing far faster than humans can, have a lower level of scrutiny on each edit than a human editor, may cause severe disruption if they malfunction or are misused, and are held to a high standard by the community, high standards are expected before a bot is approved for use on designated tasks."27 An unsupervised bot can wreak havoc in the text and leech necessary resources from the infrastructure. In order to prevent these blunders, strict rules govern their editing speed. A bot performing high-priority tasks is permitted to edit once every four seconds; lower-priority task bots may edit every ten seconds. Lower speeds are required during typically high-use periods: Wednesdays, Thursdays, and between 1200 and 0400 UTC on any given day. Wikipedian bots may be trusted to perform essential tasks, but they were quickly deemed untrustworthy enough to require an Emergency Robot Shut-Off Button on their individual User pages. When Wikimedia Labs launched, bot hosting transitioned to the Labs servers in order to lessen the potential for negative impact on the larger system.
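The editing-speed rules can be sketched as a simple throttle; the four- and ten-second intervals come from the guidelines quoted above, while the doubling during peak periods is an assumption, since the guideline specifies only lower speeds.

```python
# A sketch of the editing-speed rules quoted above: one edit every four
# seconds for high-priority tasks, every ten seconds otherwise, and slower
# during high-use periods (Wednesdays, Thursdays, and 1200-0400 UTC). The
# guideline says only "lower speeds," so the 2x slowdown is an assumption.

import time
from datetime import datetime, timezone

def edit_interval(high_priority: bool) -> float:
    """Seconds to wait between edits under the community speed rules."""
    base = 4.0 if high_priority else 10.0
    now = datetime.now(timezone.utc)
    in_peak = now.hour >= 12 or now.hour < 4       # 1200-0400 UTC
    in_peak = in_peak or now.weekday() in (2, 3)   # Wednesday or Thursday
    return base * 2 if in_peak else base

def throttled_edits(pages: list[str], high_priority: bool = False) -> None:
    for page in pages:
        print(f"editing {page}")
        time.sleep(edit_interval(high_priority))   # pause before the next edit

throttled_edits(["Darwin, Minnesota", "Dassel, Minnesota"], high_priority=True)
```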
Automated performances of agency have the potential to move a project both forward and backward, occasionally taking a good portion of the larger system down in the process. The bots' performance of agency during a malfunction is not equivalent to the large-scale oppression Campbell mentions in her discussion of malign performances, but it is harmful to community morale, its creator's social capital, the articles impacted, and potentially entire project sectors or servers. A bot suddenly run amok can hobble the entire system as it slows servers and archival processes to a crawl, frustrating human editors. Bots' unique agency can also be used for purposefully malicious tasks such as spreading spam on targeted entries or committing more malevolent acts of vandalism. And complete and sudden shutdown through human intervention always remains within the realm of possibility during these automated performances.
Recursion is a necessary event in this loop of labor and agency when we consider the case of a writing bot with the capacity to continue running as long as the technological infrastructure supports it. If and when a bot comes to rest, so to speak, it is typically for one of four reasons: it has been shut down, it has completed its current tasks, it has been taken down temporarily for tweaks, or it has been decommissioned. In each of these cases, there is a consolidation of full agency to the coder, who must either reprogram the bot for a new task set, make necessary changes/repairs, or decide to leave the bot deactivated. This return reunites the bot and coder in the creative work of the design and development process, where intentions may be reconsidered and redelegated. After these creative issues are resolved through the coder’s skill and craft, then the bot is deployed again for another cycle of compositional work. There exists the potential for a nearly eternal return, since these bots, being lines of code, do not have material aspects that can corrode or degrade. Their code may become corrupt and require fixing, or their servers may be shut down—as might Wikipedia itself, for that matter. But barring these catastrophic events, the bot exists in a recurring loop of deployment, labor, and return, maintaining a discrete agency that intersects with other agents along the way.
Identity and Attributed Agency
This division of not just agency but also identity between the bot and its maker has not always been clear, and both aspects have been negotiated over time by the Wikipedian community. The controversy surrounding Rambot's initial deployment also inspired policy discussion about Wikipedian identity, since Ram-Man originally ran the bot under his own user ID. As a result during this initial run it was impossible to differentiate his own edits from those made by Rambot. A new policy emerged that mandated separate user names and pages for bots, and the Wikipedia policy on bots continues to make a strict distinction between the human user who creates a bot and the bot itself. Bots are required to have their own names and User pages (hence the user name of Ram-Man's bot is Rambot), thus assuming identities that are textually equivalent to human users. Their scripts must be able to leave notes and comments akin to the ones human users leave when making an edit; that is, an automatically logged signature and a cordial description of the changes made.28
These naming conventions, along with award practices, provide a compelling illustration of not just the ways that rhetorical agency is articulated, but also the rather surprising extent of nonhuman performance found within the text. Considering the sort of agency a bot might demonstrate is a peculiar task, given the fact that as a society, we are generally reluctant to formally admit any attribution of agency to nonsentient actors. However, daily attributions of agency to nonhuman objects are common in everyday life, as Latour (writing as Johnson) also has pointed out in his discussion of “gremlins”: “we are constantly granting mysterious faculties to gremlins inside every conceivable home appliance, not to mention cracks in the concrete of our nuclear plants.”29 We talk regularly to our computers and cars and are not uncommonly convinced that they answer back. Miller and others have also noted that certain humanoid characteristics of expert systems and intelligent agents facilitated easy attribution of agency by the humans they interacted with.30 As a result human users frequently develop personal relationships with them, especially if the agent convincingly simulates true interaction, as with the MUD softbot Julia, who convincingly conversed with other system users.31 This act of attribution, Miller suggests, is a necessary condition for agency “because it is what creates the kinetic energy of performance and puts it to rhetorical use. Agency, then, is not only the property of an event, it is the property of a relationship between rhetor and audience. . . . We understand agency as an attribution made by another agent [emphasis original], that is, by an entity to whom we are willing to attribute agency. It is through this process of mutual attribution that agency does, indeed produce the agent.”32 Part of our willingness to attribute agency must indeed be simple anthropomorphism. However, we are also accustomed to nonhuman actions shaping our lives (as when a computer crashes close to a deadline) and accustomed to resisting the constraints they impose (as when we admonish our students that a crashed hard drive does not warrant an extended deadline).
While attribution is an important rhetorical element, it is not a requirement for agency: these objects perform agency and shape the environment around them whether or not humans decide to recognize it. Subaltern groups also do not require attribution or recognition by dominant communities for their performances of agency but instead rely in some instances on shielding themselves from attribution through the use of hush-harbors and similar safe spaces. That aside, human acts of attribution have the potential to reveal quite a bit about how the ways in which we live with technology differ from the things we are willing to admit believing about technology. For example this quote from a Wikipedian discussing the impact of Rambot reveals an attribution that is not unlike the ways that we might describe any technological intervention that spurs invention in writing. Editor Meelar agreed with the assessment of Ram-Man (Rambot's creator) that providing article stubs for each town increases the chance that users will build them out, since the practice removes the slightly intimidating barrier of starting a new page in Wikipedia: "The point is that I wouldn't have bothered to write any of my contributions, and probably many other users wouldn't either, if Rambot hadn't given me a starting point and some organization."33
Notice that Meelar, who is aware of the process that led to these articles being created, does not attribute their creation to Ram-man. Faced with a choice between attributing authorial agency to a human or a faceless bot with no ingratiating characteristics, he says that Rambot gave him a starting point and imposed organization, thus shaping his own actions. Another similar instance of attribution is the Barnstar award on Rambot’s User page. The award reads, “For the seemingly impossible amount of useful, accurate, and completely unbiased articles you have created and continue to update, despite being one of the youngest contributors to Wikipedia, I award you with this special Barnstar Award, the Tireless Contributor Barnstar. Batteries not included.”
Barnstar awards represent community acknowledgment and appreciation of individual Wikipedians who demonstrate outstanding contributions to the project. This particular Barnstar notes Rambot's adherence to socially negotiated community guidelines through creating "useful, accurate, and completely unbiased articles." Note that through this award's placement on the bot's User page, it directly celebrates the bot's contributions rather than its creator's; no such award has been presented on Ram-Man's User page. Similar awards appear to be fairly common on bot User pages. For example the vandalism-reverting VoABot II's User page is littered with ten awards for its actions: seven Anti-Vandalism Barnstars, two Defender of the Wiki Barnstars, and the WikiChevrons for protecting articles on the main page. Only two additional awards were clearly meant for its creator: a Surreal Barnstar for creating the bot and a Da Vinci Barnstar from a user who clearly addressed the developer by saying, "I am very impressed by your bots and software work." The Bot-Builder award's title indicates it is also meant for the coder, but the language commends both builder and bot: "On February 27, 2008, VoABot II fought off six repeat vicious vandalism attacks on the Wikipedia entry-Ukraine. [sic] The response rate was exceptional, under 1 minute. As a creator of this defending machine, you are bestowed this 'Bot Builder Award.' Much thanks from the Wikipedia community."34 Bot and bot creator are both commended for fighting off vandalism attacks, but one entity is acknowledged as the coder who performed the craft/work of building the bot, which in turn performed the actual work of fighting vandalism. This attribution implicitly acknowledges a distinction between the different forms of creative and compositional work performed by coder and bot respectively. Other examples further demonstrate the extent and types of work performed by automated writers.
Bots at Work
The specific rhetorical situation of the small-town Wikipedia articles affords text written primarily by bots, since these articles tend to attract so little human attention. This heavy proportion of bot-generated text does not hold for all town entries. For instance the article on Syracuse, New York, has followed a very different trajectory from its creation. Because Syracuse is far more populous, its article saw immediate human involvement as well as continuous human-driven additions and improvements. However, the questions that interest me do not revolve around the central features or compositional life of a town or city article, but rather around the special technological affordance of using bots to write such articles and the sort of agency they demonstrate. Therefore I focus primarily on the Darwin, Minnesota, article because it is typical of many articles for small U.S. towns that rarely capture the public's attention and consequently see little human editing.35
Primary text of Wikipedia article on Darwin, Minnesota. Image courtesy Wikipedia under CC BY-SA license.
As seen in figure 12, the primary text of the Wikipedia article on Darwin, Minnesota, is brief and basic. It contains only essential statistics on the town's location, population, geography, and demographics. We learn that the city has a total area of 2.15 square miles and that Darwin's 350 residents use U.S. Route 12 as the main route. In the 2000 census, the median household income was US$34,286, and 10 percent of the population lived below the poverty line. When the 2010 census information was automatically integrated, the income information was not updated, but the bot-written update included the fact that 96.9 percent of the population identified as white. Despite the article's eleven years of existence, we learn nothing of Darwin's primary industry, of its celebrations—nothing that cannot be found in census stats or other government data, because human editors have not added this sort of information. Until 2012 no information regarding the Twine Ball, the town's most significant claim to fame, appeared on the page.
Discussion page for Darwin, Minnesota, article. Image courtesy Wikipedia under CC BY-SA license.
Regardless this article is inarguably a coherent text that has been created within a prescribed textual and rhetorical situation. The composers of this article are participating in a larger community and structure and have produced an article that is meant to invite reciprocal participation from that community. It meets the Wikipedian article conventions of including a descriptive title, the standard project byline, and an introductory overview followed by subsections and a list of references. A map and at-a-glance subject stats are listed on the right-hand sidebar, as they are for all town entries. The tone of the text, while dry, is unbiased and straightforwardly informative, meeting the community’s symbolic constraints of neutral point of view and no original research. The strategic, fairly dense inclusion of interwiki and external links not only provides pathways for human readers but also increases page rank for search engines. The Discussion page shows that there have been no backchannel discussions about the content of the Darwin article.
On October 30, 2007, an anonymous visitor inquired as to the origins of the town name, but no reply has been offered as of this writing. The article has been flagged as being "within the scope of WikiProject Minnesota, a collaborative effort to improve the coverage of articles related to Minnesota on Wikipedia." Visitors are invited to contribute to this curatorial project, and the article has been rated as "start-class" and of "mid-importance" to the project. So far this appears to be a sleepy little entry. But over on the History pages, we find a surprising amount of activity: sixty edits at this writing, thirty-five of which were made by bots. The initial article creation is credited to Ram-Man, but that edit was actually made by Rambot, since the bot was still running under the human user's name at that point. Through attention to edit history details, it becomes evident that not just one but thirteen other edits involve bot-written article creation in other Wikipedia translation projects.
The English text itself has been largely bot-written from its inception: the initial article consisted of 358 words, and as of this writing, the article stands at 678 words. In other words approximately half of the article text has been in place since the first bot-written draft. Judging merely by sheer textual bulk, we might conclude that this article has been effectively created by bots. However, a closer look at the edit types tells us more about the specific sort of work that has happened here.
The article has been edited fifty-eight times during its decade-plus existence, and thirty-four of those edits—59 percent of the total—have been performed by bots. Analysis of the edit types in this article, noting which were performed by humans and which by bots, reveals the extent to which bots perform originary, compositional labor that goes far beyond simple correction of typos or reversion of vandalism to include entry creation and deletion, image additions, fact checking, categorization, interwiki linking, and updates to the text itself.
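A tally like this one can be roughly replicated against the live encyclopedia with the MediaWiki API; the sketch below classifies an edit as bot-made whenever the username ends in "bot," a crude heuristic that would miss, for instance, Rambot's earliest edits made under Ram-Man's account, so its output is approximate at best.

```python
# A rough sketch of replicating the edit tally above via the MediaWiki API.
# Usernames ending in "bot" are counted as bot edits, which is a crude
# heuristic; note also that a single request returns at most one batch of
# revisions, so very long histories would require pagination.

import requests

API = "https://en.wikipedia.org/w/api.php"

def edit_share(title: str) -> tuple[int, int]:
    """Return (bot_edits, total_edits) for one article's revision history."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "user",
        "rvlimit": "max",
        "format": "json",
    }
    data = requests.get(API, params=params, timeout=30).json()
    page = next(iter(data["query"]["pages"].values()))
    users = [rev.get("user", "") for rev in page.get("revisions", [])]
    bot_edits = sum(1 for user in users if user.lower().endswith("bot"))
    return bot_edits, len(users)

bots, total = edit_share("Darwin, Minnesota")
print(f"{bots} of {total} edits ({bots / total:.0%}) made by bot-named accounts")
```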
Bots and humans rarely perform the same edits in this article, with the exception of fairly simple curatorial tasks: the addition and correction of facts, and maintenance of categorization. Bots handled the majority of tasks associated with building an entry from the ground up: initial creation, translation into other languages, formatting with templates, adding images, adding information boxes, and both adding and updating facts. Humans categorized the article topic under “Cities in Minnesota” and “Meeker County, Minnesota” and added and updated images and some facts. They also performed minor edits, template tweaks—and, perversely, vandalism.
Fifty-one percent of the twenty-nine human edits to this article were either acts of vandalism or reversions of vandalism. Most involve humans vandalizing the entry by tinkering with statistical information and then other humans reverting that vandalism. Some instances of vandalism may or may not have been malicious: the January 3, 2007, instance simply involves an editor typing "afdsfasdf asdf," which may merely indicate a user who was exploring the ease of editing a wiki and was unaware that such activities should be confined to the user's sandbox. This sort of edit would not automatically trigger a ping to the vandalism activity lists, and so this activity was not reverted until five days later. (The proximity to a major holiday may also help explain the slowness of this reversion.) In 2011 an unregistered user drastically lowered the population data and inserted claims that "the entire city of Darwin was created by Delton Voigt, who also invented Pop-Tarts" and that the town claimed both the world's largest Pop-Tart and ownership of Russia. These edits were reverted the same day.
The agency humans exercised in their edits demonstrates the protean possibilities of its deployment: they spend much of their energy twiddling, squabbling, and undoing additions without contributing significantly to the piece. They tweak numbers to disseminate misinformation and then effectively cancel each other out; they type randomly and then leave, never returning to add meaningfully to the page. Although the reversions, minor edits, and fact updates that humans performed did unquestionably contribute to the quality of the article, they do not consistently demonstrate the sort of productive text building that bots do in this article. Almost all the generative textual work, additions of images, and information updates to this article were done by automated actors and have been for more than a decade. The same is true of innumerable other articles in this encyclopedia.
Can this automated labor be described as authorial invention in the rhetorical sense? Clearly when Rambot was deployed for the initial creation of these articles, it brought many texts into being that did not previously exist. However the bot never browsed other city entries, realized that Darwin was missing from the master list, and decided to create an article on that topic. We cannot claim that the bot worked through the gritty cognitive moments of conceiving of a topic, applying its own originality or intellect to the process, and then writing an entry with all the starts, stops, twists, and turns that we associate with the writing process. Nothing about this fits our definitions of human invention. Rather the bot ran its script, pulled in the information it was directed to access, and plugged that information into the spaces where it belonged. Other bots came along because this article was in their prescribed path and performed the duties they were directed to pursue.
The compositional work these bots do constitutes persuasive rhetorical performance, although the process necessarily looks rather different from human performances. As the bots performed the work of building this article, they ran algorithms and scripts. They did not make cognitive choices about words based on flow or sound or other stylistic implications. Rather they followed a series of if-then sequences in order to determine whether there were typos to be corrected or vandalism to be reverted. They did not design the visual aspects of the page to meet their own aesthetic standards but instead applied a template and placed illustrations in the prescribed sectors of the page. And in performing these compositional tasks, they did not learn new skills or hone existing ones. Instead they did what they were built to do and moved on. They did not conduct creative work, and yet we have here cogent articles that contribute to and shape this textual project. But none of these factors mean these bots did not possess and perform agency; they point, rather, to our own refusal to attribute agency to nonhumans.
Wikipedia’s writing bots actively create and correct text. They interact with each other and with humans as well as with the larger technological infrastructure. Their uncreative compositional work directly shapes and extends the existing encyclopedia, and they are more consistent and productive in this labor than most human contributors. This work extends beyond mere compilation and involves some of the most vital tasks of textual curation: management of metadata, internal and external links, maps, and facts. In the case of this article, bots performed a good portion of the heavy lifting of composition and are largely responsible for the existence of the text.
Automated composing agents are a welcome, long-wished-for change for overwhelmed encyclopedists as they face down the anxieties of information overload and the formidable pragmatic challenges of curating an ever-expanding text. The creation and deployment of bots by expert coders provide a view into real uses of automated composition. They also demonstrate the articulated nature of rhetorical agency as this complex balance of intention and performance shifts between creator, who performs the craft and creative labor of coding, and bot, which performs the compositional work. Both forms of labor are vital in the development, deployment, and execution of automated curation.