Alex and Evelyn chat with Sol Messing, Research Associate Professor at New York University with the Center for Social Media and Politics, about what to make of the open source info Twitter provided on its ranking algorithm. They also chat about the blue tick debacle, concerns about YouTube's treatment of political videos in India, and Midjourney's pretty not-okay approach to content moderation.
Twitter is (partially) open sourcing its recommendation algorithm. In this special episode, Evelyn and Alex are joined by New York University Research Associate Professor Sol Messing to talk through what he found in the code.
Join the conversation and connect with Evelyn and Alex on Twitter at @evelyndouek and @alexstamos.
Moderated Content is produced in partnership by Stanford Law School and the Cyber Policy Center. Special thanks to John Perrino for research and editorial assistance.
Like what you heard? Don't forget to share the podcast with friends!
Alex Stamos:
Political satire in China is pretty not okay, which is also a pretty significant understatement. That's almost like a Mel Brooks bit: there are political prisoners chained to the wall and somebody saying, "Oh, this is pretty not okay."
Evelyn Douek:
That's right. It's not very not okay or illegal or possibly punishable by being disappeared, it's just pretty not okay.
Hello and welcome to Moderated Content's Weekly News Update from the world of trust and safety with myself, Evelyn Douek, and Alex Stamos. We are headed straight to our Twitter corner this morning, and the breaking news is that we finally have the long-awaited open source algorithm that has been promised for a while. So, kudos to Musk for really coming through. Or is it? In order to answer that question, we're joined very luckily by Sol Messing, who is a research associate professor at New York University with the Center for Social Media and Politics and has a bunch of relevant expertise here. So Sol, thank you so much for joining, and maybe, just to start, could you tell our listeners why you are really well-placed to read the code and talk about it?
Sol Messing:
Totally. Thanks so much for having me. So in my last role, I was actually the head of what's called discovery data science at Twitter, which is data science for Twitter's ranking and recommendation systems, also created the Applied Sciences Group, which was kind of modeled after Facebook's Core Data Science team.
Alex Stamos:
And, a Stanford alum.
Sol Messing:
True.
Alex Stamos:
Sol, this last week Twitter uploaded a bunch of code to GitHub, not very much annotation. They just dumped out a bunch of files and immediately there was a feeding frenzy as everybody jumped into it to try to figure out what was going on. What were some of the first things that you saw when you dove into the code? We'll dive into some of the controversies, but overall, what did you see in the structure of this thing?
Sol Messing:
Totally. So first of all, it appears to be super redacted; it is incomplete. What is there is a lot of the code that governs the For You home timeline ranking system: everything from candidate generation, which is sort of figuring out which tweets to rank, to the actual ranking system itself, which determines what you see on the platform.
Alex Stamos:
And so, this is code that would run on a front-end web interface. So you'll hit some load balancer, TLS termination, and then you'll go into a system. Would this be running in that first system that is putting the timeline together, or is this a backend service whose output is then composited later?
Sol Messing:
It's a little bit of each. It has a lot of the code that's sort of the mid-layer before it gets to the front-end. It has the code that shows how to actually fit the models it uses, including the heavy ranker. It doesn't have any of the parameters in that code, though. It's missing a lot of the configuration files, and it doesn't tell you the features, that is, what things it's actually using to predict what you're going to like in the home timeline.
Alex Stamos:
So, it refers to all these variables, but we don't know how those variables are being created. Are they being created at post time? So if I post something on Twitter, are classifiers running and then attaching a bunch of weights to that tweet, or how does that generally work?
Sol Messing:
There is some real-time data collection that is going into this ranking system. A lot of what is there is the inputs to the heavy ranker, which is really the heart of Twitter's ranking system. I think the most revealing thing that we saw is actually a ReadMe file that I saw posted by Jeff Allen, a former Facebook data scientist. And if we take that ReadMe file at face value, it suggests some really interesting things about how the site is doing the ranking live. It has the formula for a lot of the actions that you take on Twitter: a fave seems to be worth half a retweet, a reply seems to be worth like 27 retweets, and a reply with a response from the author is worth a whopping 75 retweets.
Alex Stamos:
These are numbers that they played with, looked at the statistics, and saw what behavior it created. Is that accurate?
Sol Messing:
What optimizes overall engagement? That's what they're really trying to figure out, and that's what they've sort of done, in a sense, with these numbers. It's not quite as simple as a hard and fast rule, though, because if you think about what happens when a tweet is first posted and there's no data, what's going to happen is that Twitter's deep learning system is going to do a bunch of heavy lifting and then predict the likelihood of each of those little actions in that formula based on the tweet's author, their network, any initial engagements, the tweet text, and 100, maybe 1,000 other signals and embeddings. And so when you first post that tweet, what happens in that little window really shapes who sees it and who engages with it in the future. So that model is doing a lot of work, and it's tremendously important, even though there's this formula that says, "If you retweet something, we're going to give it a boost in ranking for other people downstream regardless of the model."
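To make the mechanics Sol describes concrete, here is a minimal sketch, in Python, of how a weighted engagement score along these lines might be computed. The weights mirror the ratios from the ReadMe (a fave worth half a retweet, a reply worth 27, a reply that draws an author response worth 75), but the action names, the example probabilities, and the linear combination itself are illustrative assumptions rather than Twitter's actual heavy ranker code.

```python
# Hypothetical sketch of a weighted engagement score for a tweet.
# Weights reflect the ratios discussed above (relative to a retweet = 1.0);
# the action names and predicted probabilities are made up for illustration.

ACTION_WEIGHTS = {
    "favorite": 0.5,                      # a fave ~ half a retweet
    "retweet": 1.0,                       # baseline unit
    "reply": 27.0,                        # a reply ~ 27 retweets
    "reply_with_author_response": 75.0,   # a reply the author answers ~ 75 retweets
}

def expected_engagement_score(predicted_probs: dict[str, float]) -> float:
    """Combine the model's predicted probability of each action with its weight.

    For a brand-new tweet with no engagement data, a deep learning model would
    supply these probabilities from author, network, and text signals.
    """
    return sum(
        ACTION_WEIGHTS[action] * prob
        for action, prob in predicted_probs.items()
        if action in ACTION_WEIGHTS
    )

# Example: a tweet the model expects to get mostly faves and a few replies.
probs = {
    "favorite": 0.30,
    "retweet": 0.05,
    "reply": 0.02,
    "reply_with_author_response": 0.005,
}
print(expected_engagement_score(probs))  # 0.30*0.5 + 0.05*1.0 + 0.02*27 + 0.005*75 ≈ 1.115
```

The point of the sketch is just the structure: a handful of fixed action weights multiplied by per-tweet predicted probabilities, which is why that early window of engagement on a new tweet matters so much.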
Alex Stamos:
So, I think there's a bunch of interesting things here. One of them is that they did include what they call visibility lib, which is relevant to this podcast about moderated content: it seems to be the thing that takes a number of safety variables and labels and can change the visibility of stuff. The problem is it's mostly empty when you look at the directories, so it doesn't seem like we've got any of the actual classifiers, and there are very few rules as to what they actually consider unsafe in certain circumstances. Which is unfortunate, because a big part of Musk's pitch for open sourcing the algorithm is that you could see what people are calling censorship when you either down-rank or label certain things, but from my read, we can't really tell what Twitter's policies are, at least from a code perspective.
Sol Messing:
Totally. So a lot of the trust and safety code seems to be gone, and that was presumably a purposeful decision to prevent bad actors from learning too much and gaming those systems, which is a coherent approach to safety.
Alex Stamos:
It makes sense. It's something a number of us have pointed out: one of the challenges with algorithmic transparency is that some of these algorithms are how you stop spam, how you stop hate speech, how you stop people dog-piling on each other. It's how you do child safety work, and releasing those does make it, not trivial, but a lot easier for folks to reverse engineer them and try to defeat them.
Sol Messing:
Well, some of the abuse scores, which are what hate speech is categorized under, some of that code is actually in this release. If you just search the code base for "abuse," you can see a number of places where it actually isn't redacted. It's true that most of it is not there, but some of it is. There are some safety parameters in the code which map directly to Twitter's policies and their formal public documents. Some of those parameters are actually not in what Twitter has published publicly. There are entries like high crypto spam score [inaudible 00:07:55]-
Evelyn Douek:
That one's not working very well. Whatever they're doing there is not effective.
Sol Messing:
Right, well, they're trying to do something there, and one concern that folks have raised is, will this give scammers hints about how to get around what they're attempting to do, how to get around those detection systems?
Alex Stamos:
So, let's talk about some of the specific things people have pointed out. It turns out there was not only one commit, there were three commits on March 31st when they released the algorithm. The third commit, and as of right now the current last one, is entitled "Remove stats collection code measuring how often tweets from specific user groups are served." So, what is it that people found, Sol, when they looked through this code on the measurement side?
Sol Messing:
So first of all, it's comical that you would go in and try to delete that code in a pull request, as if once it's off GitHub it's gone. No. First of all, it's all over the internet, and second of all, the whole point of GitHub is that you can track every change ever made by anyone.
Alex Stamos:
This commit has 200-some comments, which I think is a record for the most comments I've ever seen on a single commit.
Sol Messing:
It's unbelievable. So what's in the commit? There's this reference to an Elon DDG config, and essentially what that means is that Twitter has created an entire suite of metrics about Musk's personal Twitter experience. The code shows that they fed those metrics into DDG, the Duck Duck Goose as they call it, their experimentation platform, which at least historically has been used to figure out whether or not to ship products: how are we doing on the metrics?
Alex Stamos:
So take a step back: this is their A/B testing framework, a framework that allows them to manipulate the algorithm for small sets. You have groups of people that you test algorithmic changes on, and when you do that, you collect statistics on certain things. And in those statistics, the things that they measured were VITs, which I believe stands for Very Important Tweeters, so like VIPs, though we don't know who is defined as a VIT; Democrats; Republicans; and Elon. Elon himself: one of the things they measure in A/B testing is whether or not they're improving his visibility. Is that an accurate assumption?
Sol Messing:
Yeah, that seems to be the case. So first of all, there was a bunch of reporting that engineers are super concerned about how anything they ship is going to affect the CEO's personal experience, and this seems to be suggesting that Twitter was at one time or is currently looking at that actively and using that to help them figure out what to ship and what not to ship.
Alex Stamos:
So, part of their acceptance criteria is that they don't destroy Elon's experience, and perhaps that every change slightly improves his... Which, looking at some of the things they've done, where they've made these tweaks and all of a sudden it's all Elon all the time in For You even if you're not following anybody, demonstrates that perhaps they've over-optimized on this one thing, right?
Sol Messing:
Right, well, they reverted that to be clear, but absolutely.
Alex Stamos:
So this doesn't show them actually... Because we don't really know; we don't have the code to see whether, or how much, they're putting a thumb on the scale for Elon. They clearly are, though, and the fact that it's so important that it is one of the top-level stats they track is pretty interesting, along with people who are Democrats or Republicans, which, again, we don't know how they classify. Does that mean elected officials, or is it a classifier of who is pro-Democrat or pro-Republican? But it is interesting that that is a top-level statistic they're gathering.
Evelyn Douek:
To my mind, the Elon thing is fun and amusing, but the fact that they are testing how different changes affect Democrats and Republicans and using that to decide whether to ship something I think is super interesting and says something about what the priorities are within Twitter, even if it is just that we won't make this change that we otherwise would've, that would improve the product experience if it's going to look like it's favoring one group over the other group.
Sol Messing:
Absolutely. We know that conservative accounts tend to share more misinformation than liberal accounts. We know that Musk has alleged that Democrats and big tech are colluding to enforce policy violations unequally across parties. And so, if you have this partisan equality stat as part of your ship criteria, the thing you're looking at when you're figuring out whether to actually ship a feature, and it's on the same footing with policy violation frequency and these other really important metrics, you can see how that stat could really affect the type of health and safety features that actually make it into the site in production.
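For listeners who haven't worked with an experimentation platform, here is a rough, hypothetical sketch of what "group-level metrics as ship criteria" can look like in practice. None of the group names, metrics, or thresholds below come from Twitter's code; they are stand-ins for the kinds of slices (VITs, Democrats, Republicans, a single account) and the kind of ship/no-ship check discussed above.

```python
# Hypothetical sketch of per-group ship criteria in an A/B test.
# Group names, metric, and threshold are illustrative, not from Twitter's code.
from statistics import mean

def group_deltas(control: dict[str, list[float]],
                 treatment: dict[str, list[float]]) -> dict[str, float]:
    """Relative change in a metric (e.g. impressions per user) for each group."""
    return {
        group: (mean(treatment[group]) - mean(control[group])) / mean(control[group])
        for group in control
    }

# Imagine impressions-per-user samples for each slice the platform tracks.
control = {
    "democrats":   [110, 95, 102],
    "republicans": [108, 97, 100],
    "vits":        [5000, 5200, 4900],
}
treatment = {
    "democrats":   [112, 96, 103],
    "republicans": [99, 90, 94],
    "vits":        [5100, 5300, 5000],
}

deltas = group_deltas(control, treatment)

# A ship decision might then require that no slice diverges too far from another,
# e.g. block the launch if Democrats and Republicans move apart by more than 2%.
blocked = abs(deltas["democrats"] - deltas["republicans"]) > 0.02
print(deltas, "blocked:", blocked)
```

The mechanism itself is standard A/B testing; what Sol and Evelyn are flagging is which slices get tracked and which ones are allowed to veto a launch.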
Alex Stamos:
The other thing people found was references to Ukraine, and it is interesting: while we can't see the actual backend classifiers, we can see some of the categories of what they consider to be bad, because we can see the variable names for the labels.
Sol Messing:
Well, there were some initial reports that Twitter was down-ranking tweets about Ukraine. I actually looked at this code, and I can tell you that those claims are not correct. First of all, the code itself is for Twitter's audio-only, sort of Clubhouse clone called Spaces, so the code is governing Spaces content on the platform, not ordinary tweets on the home timeline. Second, this is something called crisis misinformation, and it just so happens that they're calling out the current global crisis by name in the code. This is consistent with Twitter's crisis misinformation policy. It is almost certainly the case that they're not down-ranking Ukraine content in general. They're down-ranking crisis misinformation related to Ukraine, and that's what that reference is, and it is only in reference to Spaces crisis misinformation in this particular spot in the code.
Alex Stamos:
So, it's interesting, because I think in another place, public interest rules [inaudible 00:13:58], we can see the labels that can be applied, and it's a pretty reasonable list: moment of death, deceased user, private information, right to privacy, violent sexual conduct, hacked materials, and one of them is abuse policy Ukraine crisis misinformation. So, pretty clearly they have enough Ukraine crisis-related disinformation that they decided to create a specific category as part of the response. There was some incorrect interpretation there, but it's just one entry in a pretty long and reasonable list of the different kinds of labels that they can apply to a certain piece of content.
Sol Messing:
It seems that way to me.
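As a rough illustration of what a visibility library like the one Alex mentions might do, here is a hedged sketch mapping safety labels to visibility actions. The label names follow the list read out above; the action names, severity ordering, and lookup logic are assumptions for illustration, not the contents of Twitter's actual visibilitylib.

```python
# Hypothetical sketch of how a visibility library could map safety labels
# to visibility actions. Label names follow the list discussed above; the
# actions, severity ordering, and fallback behavior are illustrative assumptions.
from enum import Enum

class VisibilityAction(Enum):
    DROP = "drop"                  # remove from timelines entirely
    INTERSTITIAL = "interstitial"  # hide behind a warning screen
    DOWNRANK = "downrank"          # keep, but push lower in ranking
    ALLOW = "allow"                # no restriction

LABEL_TO_ACTION = {
    "moment_of_death": VisibilityAction.INTERSTITIAL,
    "deceased_user": VisibilityAction.INTERSTITIAL,
    "private_information": VisibilityAction.DROP,
    "violent_sexual_conduct": VisibilityAction.DROP,
    "hacked_materials": VisibilityAction.DOWNRANK,
    "ukraine_crisis_misinformation": VisibilityAction.DOWNRANK,
}

def visibility_for(labels: list[str]) -> VisibilityAction:
    """Return the most restrictive action implied by a tweet's labels."""
    severity = [VisibilityAction.DROP, VisibilityAction.INTERSTITIAL,
                VisibilityAction.DOWNRANK, VisibilityAction.ALLOW]
    actions = [LABEL_TO_ACTION.get(label, VisibilityAction.ALLOW) for label in labels]
    if not actions:
        return VisibilityAction.ALLOW
    return min(actions, key=severity.index)

print(visibility_for(["hacked_materials", "ukraine_crisis_misinformation"]))
# VisibilityAction.DOWNRANK
```

The takeaway is structural: in a design like this, a label such as "Ukraine crisis misinformation" is just one entry in a rule table, not a blanket down-ranking of an entire topic.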
Alex Stamos:
So, the funny thing about the modification of GitHub is that the whole point here is transparency. If you delete the files, that's no indication anything actually changed; in fact, it's almost impossible for them to have changed the running code that fast. So they delete it off of GitHub, but it's not like they fixed the problem by removing the Elon stuff. Almost certainly the Elon-specific code is still running in production, and I just find this hilarious on multiple levels. One, it obviously had embarrassing stuff in it, but two, this has been a good demonstration, as we discuss algorithmic transparency, of its limits, because we watched in real time while the new Elon Twitter tried to manipulate us and lie to us about what their algorithm does, and there's no possible way for us to verify whether what they're telling us is the truth. We know there's stuff missing, and then we watched them delete it because it's the Keystone Cops over there, but there's no reason why any other company doing the equivalent couldn't just be smarter and remove the controversial stuff beforehand.
Sol Messing:
Absolutely. Is this a step forward for transparency, as Musk and Twitter would like us to believe? I'm skeptical as well. You can't learn much from this release in and of itself. There are no underlying features or parameters, and the data itself is not there. And Musk has just recently said, "We're going to make it prohibitively expensive. We're going to charge you half a million dollars a year if you want enough data to actually run an algorithmic audit," which is basically running a bunch of small experiments to really understand what's happening on the platform. That price is prohibitive for academics, for essentially almost anyone outside of Twitter, and certainly anyone who's going to be doing public interest research on this.
Alex Stamos:
Well, thank you, Dr. Messing, for your time. It was great to have you.
Sol Messing:
Thanks so much.
Evelyn Douek:
One thing to say though, Alex, is that the A/B testing has perhaps been very effective, because the news broke this week that Musk is now the most followed person on Twitter, overtaking Barack Obama. So, worth every cent of the millions of dollars, I'm sure, and again, therapy is cheaper. The big news this week, the story that everyone's following right now, is the clown show that is the removal, or not, of the verified check marks. Starting of course on April Fool's Day, Twitter did what it does best, and there were some fantastic spoofs of otherwise verified users. But, Alex, what's going on here? Most people still have blue check marks. We now have this added language that says, when you click on a blue check mark, that they have the verified badge either because they paid for it or because they're notable, and there's no way of telling which is which. Either they're really important or they have $8, one of the two. Enter at your own peril, user. So just like everything else, this has been a bit of a spectacle.
Alex Stamos:
Right, you and I have talked about this a lot. Twitter under Musk's direction is trying to change the semantics of what the blue check mark means, from "we think you are notable and we have verified your identity" to "you have $8." The level of verification is that you have a credit card, but there does not seem to be any tying of the payment information to the name you use on the platform. In fact, there have been a couple of tests: Ben Wittes just had a blue check mark while calling himself the Russian Embassy and got suspended for that because it was obvious, but that's the kind of thing that happens post facto, and clearly there's no verification going into those steps. So, that was always going to be a difficult challenge. The other challenge is that Twitter has relied upon the fact that it was a place where celebrities, actors, politicians, academics, lots of people who get paid for their time, would effectively donate their time to Twitter and create free content that Twitter would sell ads against.
This was the economic model, and flipping that economic model, where you expect, say, LeBron James, one of the most popular people in the world, to create content and then to pay you for the privilege, was always going to be incredibly hard. And so we've seen celebrities and newspaper outlets and all these folks saying, "We will not pay for Twitter." Obviously, LeBron can afford the $8, but as a matter of principle, he does not want to pay money to a site that he provides free content to when he is not benefiting economically downstream at all. As a result, the blue check mark was on the way to becoming just a demonstration that you are willing to pay for clout, and becoming incredibly embarrassing. And so, they've come to this weird place where they did not remove blue check marks from most people.
Mine is still up there, and it says, "This account is verified because it's subscribed to Twitter Blue or it's a legacy verified account," so they're trying to sell people on the $8 by implying that you might be the same as LeBron James, and it's just not going to work. A number of people have asked for their blue check marks to be removed because they do not want to be seen in that category of people, and eventually Twitter is going to have to follow through on its threat to remove the check mark from people who don't pay, and at that point, the value of the blue check mark will go down. In fact, a number of people have joked that they'll probably pay the $8, because apparently the only way you can get rid of the blue check mark is to pay the eight bucks and then push the button to cancel it.
Or, you could tweet directly at Elon and get it done. He removed the New York Times after somebody tweeted a meme to him, and so it's just a complete mess. There's absolutely no way this is a viable economic model and they have now destroyed the value of the blue check mark in a way that I don't think is recoverable. Even if they go back to having real verification, after this, it is not going to be seen as a positive thing, and I think this is the beginning of the end. This is the beginning of the end because Twitter relies upon those people to create content for it, and the fact that the entire creative class and class of famous people who used to make Twitter an interesting place are moving en masse is a big problem for them.
Evelyn Douek:
The New York Times story is a personal favorite, because you can't have any big move in Twitterland these days without a ham-fisted dose of hypocrisy along with it. The ostensible principle behind this is that everyone should be treated equally: we should have the most powerful people and you, the regular user, all on the same platform and treated in the same way. And then, "Oh, except for the people that Elon doesn't like. They will be handpicked to suffer certain fates on Musk's platform." He removed the blue check mark from the main New York Times account, not all of its other subsidiary accounts, just for fun over the weekend. I've seen reporting that part of the reason why this is happening is that there is no good way to remove the blue check marks en masse. So, it's not necessarily an intentional move of keeping blue check marks for a whole bunch of people; it's just going to take a while.
Alex Stamos:
That's weird. They've fired so many people that you can't do just a basic data warehouse query: come up with the list of people who have paid for Twitter Blue, run it against the database, and remove the check marks for everybody else. That's the kind of thing that Twitter, at least, used to be able to do in a couple of hours in an afternoon.
Evelyn Douek:
Meanwhile, Musk is losing his core constituency on Twitter, because this week, devastatingly, Catturd2, the previous thought leader on the platform, was unhappy with content moderation on Twitter. Twitter was removing a bunch of posts about the Trans Day of Vengeance, a protest that was being organized, and because it was done ham-fistedly, they blanket-removed the tweets that were calling for the protest, because they associated "vengeance" with a call for violence, which is now, without any context, removed under the new hate speech policy, but they also removed posts and froze accounts that were just commenting on the event. And so, that swept up a lot of people who are usually in Musk's fan camp, including Catturd2 and Marjorie Taylor Greene. So, everything is going over very well on Twitter this week.
Alex Stamos:
Catturd2, and Marjorie Taylor Greene, and Ben Shapiro are really The Federalist Papers authors for the 21st century. If you have lost them on the definition of free speech and free expression, I don't think you're going to be able to survive. It's really unfortunate. I'm sorry, Elon, but you are part of the deep state now. Welcome.
Evelyn Douek:
That's right, dark moment. So, quickly heading back to India this week: there was some reporting that YouTube's new CEO, Neal Mohan, said YouTube would look into claims that videos from the recently disqualified opposition leader Rahul Gandhi were being artificially suppressed, in particular videos that were critical of Prime Minister Modi's BJP and its relationship with the prominent businessman Mr. Adani. There was some evidence that these videos were getting significantly lower views than the number of engagements and likes on comparable videos from Mr. Gandhi's account would have suggested, and the Wall Street Journal had a letter from Mohan saying that they were looking into it. There's no proof here of what's going on; this is all just speculation. But I think it's worth mentioning, because of course India is worth watching, and we've talked a lot about Twitter's activities in India, and it's obviously highly unlikely that Twitter is the only platform that the Indian government is leaning on.
And so, we don't get as much reporting about what's going on at other platforms, but I think it is worth noting that there's certainly no reason to think that the BJP and the government aren't also leaning on other platforms in the same way. And, just a small follow-up from a story that we had last week about Midjourney. So, Midjourney took the content moderation capitulations of other platforms in places like India and said, "Hold my beer."
There was reporting in the Washington Post this week that on Midjourney you can't use the tools to generate images of Xi Jinping, because they just want to minimize the drama. They said that political satire, and this is really deep free speech theory here, "political satire in China is pretty not okay," and so the ability of people in China to use Midjourney to create images is more important than the ability to generate satire, which is just a beautiful articulation of free speech principles. I think it's just a nice indication of how all of these tools are going to run into content moderation problems, and you have no guarantee that anyone thoughtful is going to be deciding the principles of how we approach this.
Alex Stamos:
So, today's the first day of class here at Stanford. Welcome back, Stanford students. I am teaching trust and safety engineering, and this year I've added a lecture on generative AI trust and safety issues. And I now have the worst quote I've ever heard from any CEO of an AI company regarding trust and safety issues. "We just want to minimize drama" is not an ethical framework by which you can make these decisions. Sorry, sir, but Mr. Holz, you're probably going to want to hire a team that thinks about these things and thinks about what your fundamental rules are and what impact you want to have on the world, because just minimizing drama is not going to cut it. Anyway, that is now replacing Mark Zuckerberg's "making the world more open and connected," which I think was the leader for almost 10 years, as one of the silliest driving quotes from a CEO. "We're just going to minimize drama" is not the kind of thing you really want to put on a t-shirt or a poster in your corporate offices.
Evelyn Douek:
It's certainly not an ethical framework, although, to be fair, maybe he's just saying the quiet part out loud, because a lot of trust and safety decisions over the years have been governed by the impulse to just try to minimize drama, not that that has necessarily been effective.
Alex Stamos:
No, you've got to appreciate the honesty of "political satire in China is pretty not okay," which is also a pretty significant understatement. That's almost like a Mel Brooks bit: there are political prisoners chained to the wall and somebody saying, "Oh, this is pretty not okay."
Evelyn Douek:
It's not very not okay or illegal or possibly punishable by being disappeared, it's just pretty not okay.
Alex Stamos:
Just to be serious, you're right, the trust and safety issues here are huge, partially because these systems are totally unpredictable and trying to prevent them from doing anything is incredibly hard. And so, there's all kinds of issues that are going to pop up. The idea that you could just blanketly keep one of these incredibly complex models from saying anything that is going to offend the Chinese Communist Party is just ridiculous. So even if your goal is to reduce drama, this is not the way to do it, and I do hope they rethink because it's just a fundamentally unethical way to put products out into the world.
Evelyn Douek:
To turn that frown upside down though, if you want something delightful, go and have a look for the Stable Diffusion images that people were circulating on Twitter in response to this story of Xi Jinping at a pride parade, which I thought were just lovely, so this story has a happy ending. And, I think that's all for the week. And, are you excited for first day back at class? Alex, all ready to go?
Alex Stamos:
I am. I'm super excited for class. I love teaching. I love the students, and once again, we'll see if I can get canceled this quarter with everything going on around speech on campus. We had a fake professor like me, a non-tenure track professor, just show an image of the Prophet Muhammad and get fired by a university, and that was just one section of her art history class. I teach a class about hate speech and bullying and harassment and child sexual exploitation and NCII, and we talk about situations in which there have been anti-Semitic attacks against journalists, and we put up real tweets and talk about the difficulty of moderating when you have such incredible creativity from horrible racists who are trying to intimidate these people. And so, I'm saying to the lady who showed one image of old Arabic art, "Hold my beer," because here we go once more into the breach, good friends. But you know what, Stanford students are resilient, intelligent young adults, and I am sure that I will not end up being destroyed in the Stanford Daily or having protests outside my classroom this quarter.
Evelyn Douek:
Well, that's great. I am starting teaching for the first time tomorrow afternoon, and that was exactly the rah-rah-rah pick-me-up motivational speech that I needed, Alex.
Alex Stamos:
What are you teaching now?
Evelyn Douek:
Platform regulation and the First Amendment, which is a totally non-controversial topic with no strong feelings or difficult free speech issues, that's for sure.
Alex Stamos:
Right, yes. It is interesting because I feel like a lot of the campus speech stuff, Stanford was a little bit of an island in the craziness. We're no Oberlin here. The student body is much more politically mixed than some of these really progressive liberal arts universities or Evergreen State College or something, but it has come for us, once again, time comes for all men and speech controversies come for all professors, I think. So anyway, it's going to be a good time, looking forward to the class.
If you're a Stanford student, you're welcome to join CS 152, and my colleague Shelby Grossman is teaching the politics of internet abuse. So, if you're not a CS student but you're interested in this area, I do recommend taking her class. Our classes do a final project together, so it's really cool to see the more policy-oriented folks paired with the more tech-oriented folks; you can watch their faces as they realize that they're going to have to work with each other in the future, and they can work out their problems in the safe space of this class rather than doing so in industry.
Evelyn Douek:
Well, we will be holed up in the safe space that is the law school and not [inaudible 00:30:12]-
Alex Stamos:
[inaudible 00:30:12] law school, no controversy [inaudible 00:30:13].
Evelyn Douek:
None whatsoever, and so-
Alex Stamos:
We'll have a Stanford law section after you get [inaudible 00:30:21]-
Evelyn Douek:
That's right, exactly. Six years, how's that for a teaser, everyone? Stay tuned, and then you'll hear what Evelyn really thinks. And with that, this has been your Moderated Content Weekly Update. This show is available in all the usual places, including Apple Podcasts and Spotify, and show notes are available at law.stanford.edu/moderatedcontent. This episode wouldn't be possible without the research and editorial assistance of John Perrino, policy analyst at the Stanford Internet Observatory. It's produced by the wonderful Brian Pelletier. Special thanks also to Justin Fu and Rob Huffman.