Yannick Trapman-O'Brien

April 25 Archive Highlight; "T2000 Part 1; The Dataset Awakens"

Added 2025-05-01 03:18:14 +0000 UTC

After hitting 2000 calls in the Telelibrary, I resolved to dive into the data again (just as I did after the first 1000 calls) and to see what I could learn, and what if anything has changed about the composition and operation of the System. This is a HUGE undertaking - especially for someone with Fisher-Price Data Science skills such as myself. As such, you’re going to be getting these updates in bits and pieces over the next many months. You’ll find one of those bits and pieces below, along with accompanying [footnotes] that will probably become longer than the piece itself, which will prove either impenetrably dense for those of you who don’t love process or positively enchanting for those of you like me in a romantic relationship with methodology. For now, let’s start by defining a particular data set:

What can data tell us about a Telelibrary call?

At the current moment (April 2025), Laurie Allen is the chief of the Digital Innovation Division (LC Labs) in the Digital Strategy Directorate at the Library of Congress. Before that she was a Research Director at Monument Lab in 2017, and became my “Data Mentor” by virtue of the misfortune of having her office at UPenn situated one wall away from the Digital Media Lab where I transcribed all 4000 Public Monument Proposals from the Citywide Exhibition. Somewhere between those two points in time Laurie once told me:

“people think that data is factual and can tell you things in the same way they think that Monuments are factual and can tell you things; both are just constructions that we use to tell a story.”

So if we’re going to figure out what story the data will tell us about a Telelibrary Call, we first need to be clear what that data is — and what a call is.

When I pulled the latest round of data on April 26th, 2025, I had performed the Telelibrary 2072 times. However, this larger number (splashed around as frequently as possible for cheap shock value) belies a more complicated truth: not every Telelibrary show happens on the same terms.

Specifically, I’d say there are approximately 4 categories of calls in my records:

Standard: Bread and butter of the Telelibrary, these calls are what we’d consider “normal” (if that word ever applies to this show), and were identified by screening out the more unique use cases below. After doing so, we’re left with a group for whom we can bet fairly safely that the User on the call joined the Waitlist, (frantically) clicked a New Hours Posted email, selected a time, and attended their Telelibrary before being invited to make a contribution.
Patrons: Folks who subscribe to my Patreon at certain tiers get onto a “Shortlist” of exclusive times, letting them gain access to an easy, unhurried booking once every 5-month cycle. Patrons can use these for themselves, or give them as gifts to friends [1].
Guests & Tests: Since its second month of existence, the distribution of the Telelibrary has been defined by demand far outstripping supply. The advantage of this is that it allows me to avoid marketing entirely, and to build performance times around my ever-changing schedule and release them without abundant lead time. One of the disadvantages of this System is that it meant the chances of my friends seeing the work dropped to approximately 0 (dear reader, if you’ve ever sent that text or drafted that post inviting/begging your friends to come see your new show, imagine having to also tell them they’ll be testing their reflexes against 400 strangers). Having folks skip the public line didn’t sit right with me [2], so I assessed my availability and began adding a small number of performances each month that were to be given at my discretion, until everyone I love and cherish had the chance to experience my work.

Just kidding—the vast majority of the people in my personal life have still never done the Telelibrary. But these times still proved crucial for coordinating with producers and other opportunities for commissions (see below), connecting with artists and possible collaborators, and occasionally trouble-shooting issues or testing new ideas, features, or systems.
Commissions: I’ve done the Telelibrary at a number of Festivals (including the Philadelphia Fringe and the Denver and True/False film festivals), and for a pretty eclectic mix of small affinity groups, University classes, and more. I’ve even done a number of private calls - that’s right, if you just email me directly, for a pretty penny you can commission a performance of the Telelibrary on your schedule for yourself or a friend. Generally all these commissioned tickets are partially subsidized for the caller - in some cases they are entirely prepaid. In all cases, there’s a pretty distinct context at play separate from a Standard Telelibrary call.

It’s taken me a fair amount of time to “clean” the data from the master roster, but thanks to a few standardized pieces of metadata I consistently input [3], I was able to identify a subset of those 2072 times that we can call “Standard;” for the purposes of the next few posts, we’ll name those 1,655 calls “The Sample.”

Which gives us our first piece of information: as many as 20.13% of all of my calls fall outside the limits of what I’d define as “Standard:” roughly one in five. A fraction of that is due to the inconsistency of my early record keeping (a number of entries from my first month are so different in convention as to be not functional), but mostly this speaks to a more interesting truth: even as I have worked to share the basic model of the Telelibrary with other creators as something worth considering and implementing in their own practice, I myself have made a living by supplementing it with the variations we discussed above. That’s not something that I think invalidates the Standard model of low-overhead, low-throughput, high-repetition work. Instead, I look to it as additional evidence for its strength; the stability of that model let me build a regular performance practice that I can operate with very few new production decisions, month to month, while the flexibility of it has let me experiment and play with all sorts of different options and opportunities. Once a production is up and running, you can continue to play and investigate what else it might become, and new invitations to your work for new audiences can coexist with the old.
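For the process-minded, the arithmetic behind that one-in-five figure can be sketched in a few lines of Python. The two totals come straight from this post; the variable and function names are invented for illustration and are not part of any actual Telelibrary tooling.

```python
# Totals reported in this post (data pulled April 26th, 2025).
TOTAL_CALLS = 2072    # every Telelibrary performance on record
SAMPLE_CALLS = 1655   # calls classified as "Standard" (The Sample)


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


non_standard_calls = TOTAL_CALLS - SAMPLE_CALLS
non_standard_share = percent(non_standard_calls, TOTAL_CALLS)
sample_share = percent(SAMPLE_CALLS, TOTAL_CALLS)

print(non_standard_calls)   # 417
print(non_standard_share)   # 20.13
print(sample_share)         # 79.87
```

In other words, 417 of the 2072 calls sit outside The Sample, and the remaining 1,655 make up just under 80% of everything on record.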

But - back to “The Sample:” what do those 80% of calls look like? Well, for today’s post, let’s limit ourselves to figuring out who’s calling, and how often.

In those 1,655 calls, we see 1164 User IDs active - about 82.20% of all User IDs at the time the data was captured. I was a bit surprised to see how many Users are exclusive to non-standard calls, but on reflection, commissions in particular tend to involve the creation of many new User IDs in a short period of time, and with a higher proportion of new users than Standard calls.

Speaking of which—what is the proportion of new users to returning callers?

The overwhelming majority of Users in the Sample (971 out of the 1164) have only attended the Telelibrary once, compared with 193 who have done so more than once. The chart above shows us that almost 96% of Users have called 3 times or fewer, and the average number of calls for a User in the Sample is 1.422 calls. That being said, a careful reader (or fellow chart fanatic) will also notice that this 96% of Users only account for a nudge over 78% of calls. That’s because 158 calls in The Sample come from just 10 User IDs. That’s right, just 10 users account for 9.55% of the total number of calls surveyed. Indeed, the top 5 most frequent callers account for 6.22%, and the User ID with the highest number of calls in the dataset accounts for 1.63% of The Sample, with 27 recorded calls [7]. We can see how notable the skew in total number of calls to a small few dedicated users is when comparing the “upper limits” with our previous numbers:

(a bit confusing in the time-signature change on the page-turn, but to put it simply: there are almost as many Users with exactly 3 calls on their record in the Sample as there are Users with 4 or more calls). We can see that precipitous drop off another way in this visualization:

It’s a pretty drastic step-down, but while the population skews heavily towards first-time callers, I also feel these charts help point to the disproportionate influence of Users who return, and the meaningful ways that a few dedicated Users have written themselves into a significant amount of the DNA of the Telelibrary. I’ve written before about how the balance of recurring and repeating users helps keep the ecosystem of the Telelibrary both broad and deep. But how has this balance shifted?
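For anyone who wants to reproduce this kind of breakdown, here is a minimal Python sketch of tallying calls per User and measuring how concentrated they are. It assumes a hypothetical input of one User ID per completed call; the function name, the toy roster, and its IDs are invented for illustration and are not real Telelibrary data. The closing lines simply re-derive three ratios from totals quoted above.

```python
from collections import Counter


def caller_summary(call_user_ids, top_n=10, cutoff=3):
    """Summarize how calls are spread across Users.

    call_user_ids: one User ID per completed call (hypothetical input shape).
    """
    calls_per_user = Counter(call_user_ids)
    total_calls = sum(calls_per_user.values())
    total_users = len(calls_per_user)

    # Call counts for Users at or under the cutoff (e.g. 3 calls or fewer).
    light_users = [n for n in calls_per_user.values() if n <= cutoff]
    # Calls made by the top_n most frequent Users.
    top_calls = sum(n for _, n in calls_per_user.most_common(top_n))

    return {
        "users": total_users,
        "calls": total_calls,
        "one_time_users": sum(1 for n in calls_per_user.values() if n == 1),
        "mean_calls_per_user": round(total_calls / total_users, 3),
        "pct_users_at_or_under_cutoff": round(100 * len(light_users) / total_users, 2),
        "pct_calls_from_those_users": round(100 * sum(light_users) / total_calls, 2),
        "pct_calls_from_top_n": round(100 * top_calls / total_calls, 2),
    }


# Toy roster, NOT real data: six one-time callers plus one User with 4 calls.
toy = ["u1", "u2", "u3", "u4", "u5", "u6"] + ["u7"] * 4
summary = caller_summary(toy, top_n=1)
print(summary["pct_users_at_or_under_cutoff"])  # 85.71
print(summary["pct_calls_from_those_users"])    # 60.0

# The same ratios, using only totals published in this post.
print(round(1655 / 1164, 3))        # 1.422 mean calls per User
print(round(100 * 158 / 1655, 2))   # 9.55 share of calls from the top 10 Users
print(round(100 * 27 / 1655, 2))    # 1.63 share from the most frequent caller
```

The toy roster shows the same shape as The Sample in miniature: most Users sit at or under the cutoff, yet they account for a noticeably smaller share of the calls.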

Looking at the same Dataset when pulled at the end of 2022, at first the numbers seem fairly even:

However, looking at the full range shows us that there is some difference:

Over time, there appears to be a slight trend towards a higher average number of visits, which is coming from a slight shift towards a higher proportion of returning callers who call more often. This leads to a lot of questions: is this general shift in users returning driven by a broad trend, or a few users? Are users who joined in one year more likely to return than those who joined in another? What would a “heat map” of user activity look like? Answering those questions and understanding this slight shift in center of gravity will take more refined research.

But there are plenty more things to investigate about The Sample. What do these Users do with the Telelibrary? How do they behave? How much is anyone paying? All excellent questions — for another month. For now, I’m off to enjoy the last few gasps of April.

Here’s hoping you all have a soft evening - and that you go hard for May Day.

[1] Which they quite often do! An interesting question would be to go through the data, find all the instances of a User whose first call came through the shortlist, and determine what percentage (if any) of those came back for additional visits through the normal waitlist.

[2] I freely admit that some of the hills I die on for the Telelibrary can be pretty arbitrary, but I’ll say this one at least has a very clear origin: when I was in University, I did shows and research in New York City a few summers in a row, and always made a point of standing on line for Shakespeare in the Park. I found the flattening of status in those long park mornings incredibly appealing; it felt truly public, and while it was undoubtedly an imperfect system, the idea of something so in demand being distributed through such a stubbornly outdated and egalitarian system gave me a certain kind of deep satisfaction. Equally, when I would eventually learn of ways that some were allowed to buy or cheat their way to a ticket (or when I caught one of my professors sending his intern to get himself and his husband a ticket (and leave none for the intern!)), I was truly childlike in my inflamed injustice. I find that spirit of stubborn-headed, obstinately shaped fairness keeps me very loyal to my own strange little ticket ritual.

[3] I keep a fair number of data points for each record [4], but the cleaning is actually predominantly made possible by just two documenting conventions. The first is tracking User IDs, which became almost instantly very important when I was stunned to find callers calling back again and had to have a way to connect all of their data without relying on personal details like names, numbers, and emails (all of which are subject to change beyond my control). Entering the User ID on a record is the first step in my data entry, and I generally do it even before the call begins, just as soon as a User confirms their call by text. Ergo, User ID on a record means the call actually happened, no User ID means it didn’t (missed call or no-show, cancellation, etc.). The second big driver of documentation is figuring out if I’m actually making any money - or rather, if I’m making enough to keep going. In the early days, at that strange moment in which “I guess we’ll all just quarantine for two weeks” became “I think my full time job is now Phone,” I began to view the “Pay What You Choose” model as excellent data for eventually setting a “Standard” price. To that end, I started separating out payments made for non-standard calls so I could each week view the average contribution for a “Standard” call.
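As a rough illustration of how those two conventions combine, here is a minimal Python sketch that averages contributions per week over completed Standard calls only. Everything in it is an assumption made for the example: the field names (`week`, `user_id`, `standard`, `payment`), the `standard` flag itself, the IDs, and the dollar amounts are hypothetical, and this is not the actual spreadsheet workflow.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows, NOT real data. A blank user_id means the call never happened.
rows = [
    {"week": "2025-W14", "user_id": "U0001", "standard": True,  "payment": 40.0},
    {"week": "2025-W14", "user_id": "U0002", "standard": True,  "payment": 20.0},
    {"week": "2025-W14", "user_id": "",      "standard": True,  "payment": 0.0},    # no-show
    {"week": "2025-W14", "user_id": "U0003", "standard": False, "payment": 100.0},  # commission
    {"week": "2025-W15", "user_id": "U0001", "standard": True,  "payment": 30.0},
]


def weekly_standard_average(rows):
    """Average contribution per week, over completed Standard calls only."""
    by_week = defaultdict(list)
    for row in rows:
        happened = bool(row["user_id"])       # convention 1: a User ID means the call happened
        if happened and row["standard"]:      # convention 2: non-standard payments kept apart
            by_week[row["week"]].append(row["payment"])
    return {week: round(mean(payments), 2) for week, payments in by_week.items()}


averages = weekly_standard_average(rows)
print(averages)  # {'2025-W14': 30.0, '2025-W15': 30.0}
```

Note that the no-show and the commission both drop out of the first week's average, which is the whole point of keeping the two conventions separate.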

[4] A record for a call is one row in a spreadsheet, with fields for the following data:

User ID, Name Registered, Email Registered, Phone Number Registered, Additional Notes at time of Booking, Remaining Credits on the Account (relevant for returning users), User Name Assigned, System Name Assigned, Notes (a record of everything accessed/selected on the call, everything unlocked, and notes to myself), Selection Codes (1-3 character shorthand for everything chosen, copied from paper notes), Call Length, Payment Amount, Payment Method, and Fees (PayPal, currency conversion, etc.)

That is, as the kids would say, a lot. At the same time, almost everything there overlaps with what I need to know to run bookings, what I need to know to pay the IRS/not go to jail, and what I need to know to perform the show. I’d say the most consistent data in the set comes from that which is imported from my booking website, followed by payment data (sometimes people pay days, weeks or even months late, so I periodically have to scrub through and make updates), then Notes (which began more lackadaisical but have become more and more detailed as repeat callers proliferate and Telelibrary functions and features become more complicated), and then finally, least consistent of all, Call Length and Selection Codes. Call Length is something I began tracking to determine if the piece needed to be longer than its initial 45 minute run time (spoiler: it did) and to try to calculate out my exact hourly wage and whether longer shows paid better (spoiler: I lost interest in the former and found I preferred not knowing the latter, so that I wouldn’t subconsciously pressure Users to shorten/lengthen calls). Selection Codes are vital during a performance, when I track them by hand in paper notes, and thus keep a reference of the entire call, but so far, aside from some occasional activation through the Stock Market [5][6] and some periodic data scrapes, they are rarely as vital to keep constantly up to date.

[5] Which is to say, the first of two concurrently operating User Generated Stock Markets. But that’s a story for another post…

[6] a footnote ON a footnote? Someone stop this man.

[7] This user is the same as the most frequent caller overall, but those numbers at the top frequency look different; a Patreon caller who had followed me at the $12 level or up since 2021 could have had as many as 10 exclusive visits by now. Which accounts in part for the following when we look at these same stats for ALL TELELIBRARY CALLS:

The percent of 1-time Users is pretty narrowly the same, but returning calls now account for 43.34% of all calls performed. Looking at those higher numbers, we see the density of calls above 4+/10+ and beyond is all a little greater, pulling the skew even sharper. (Interestingly the top slot is roughly equally representative, with a top # of Calls at 34, or 1.64% of all calls). Whether we are looking at The Sample or All Calls, just 3 of the top 10 most frequent Users are not and have never been Patrons; I’d say that means that having a Patreon has allowed even more access to my most interested Users, but that it doesn’t necessarily mean those audiences are 1:1.