01.
Overview
Through the X Moderation Research Consortium (“TMRC” or the “Consortium”), X shares large-scale datasets concerning platform moderation issues with a global group of members comprising public interest researchers from across academia, civil society, NGOs, and journalism who study platform governance issues.
Through the Consortium, X will continue to support our existing disclosures of datasets of persistent platform manipulation campaigns, which consist of material that was posted in violation of our platform manipulation and spam policy. Over time, we intend to share similarly comprehensive data about other policy areas with the Consortium.
Transparency is core to our mission and has been a critical part of X from the start. In October 2018, we launched the first archive in the industry of potential foreign information operations we had seen on X. The Consortium continues and expands on that access. We’ve designed the Consortium as an industry-leading effort to increase transparency around X’s content moderation policies and enforcement decisions, so credible, public interest researchers can independently investigate, learn, and produce insights that inform the public, policymakers, and other researchers.
Our goal is to provide increased transparency about more issues that impact the health of the platform, while grappling with the considerable safety, security, and integrity challenges in this space. We hope that expanded transparency through disclosures to the Consortium can help us all learn and build the necessary societal defenses and capacities to protect public conversation.
02.
FAQs
Who is eligible to join the Consortium?
Consortium membership is by application. Our intent is to be inclusive while aiming to ensure the privacy and security of the Consortium’s data, and its ethical and public interest use. The Consortium welcomes applications from researchers with diverse backgrounds and experiences, using varied methodologies, who undertake data-driven analysis related to content moderation.
To be an eligible candidate for membership, applicants must demonstrate the following:
- That they hold a primary institutional affiliation with an academic, journalistic, nonprofit, or civil society research organization. If they are students, they must be master’s or PhD level students; undergraduate students are ineligible at this time.
- Prior experience and relevant skills for data-driven analysis. Consortium datasets are primarily shared as JSON files and require technical skills to analyze.
- A specific public interest research use case for the data provided by the Consortium. (“Public interest research use case” means non-commercial research for journalistic, academic, or non-profit/civil society purposes.)
- Industry-standard plans and systems for safeguarding the privacy and security of the data provided by the Consortium. Consortium members are required to sign a data use agreement.
More information on eligibility and a link to the application is at the bottom of this page.
What data is shared with the Consortium?
To start, we are continuing our ongoing disclosures of persistent platform manipulation campaigns and information operations, which are prohibited by X’s platform manipulation and spam policy. (Manipulation that we can reliably attribute to a government or state-linked actor is considered an information operation.) Over time, we intend to share similarly comprehensive data about persistent platform manipulation campaigns that are not attributable to state-backed actors, as well as other content moderation policy areas and enforcement decisions, and we will update this page with more information when we do. The exact data types we share may vary depending on the types of activity in question.
Members of the Consortium have access to an archive of information operations datasets starting from 2018. We have attributed these information operations either publicly or internally. Once our teams have identified, removed, and investigated these campaigns and any associated violative content, we share datasets with Consortium members. These datasets include profile information, posts, and media (e.g., images and videos) from accounts we believe are connected to state-linked information operations. Posts and media that were deleted are not included in the datasets. Unlike the public historic archive, the data the Consortium has access to is not hashed. Note that not all of the accounts we identified as connected to these campaigns actively posted, so the number of accounts represented in the datasets may be lower than the total number of accounts attributed to the information operation and enforced against.
All Consortium datasets require members to be able to analyze large datasets due to their size.
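As a rough illustration of what working with files of this size can involve, the sketch below reads records one at a time instead of loading a whole file into memory. It assumes a newline-delimited JSON layout and a hypothetical `language` field; this page only states that datasets are primarily shared as JSON files, so the actual structure and field names may differ and should be taken from the dataset Readme.

```python
import io
import json
from collections import Counter
from typing import Iterable, Iterator, TextIO


def iter_records(handle: TextIO) -> Iterator[dict]:
    """Yield one JSON object per non-empty line (assumed layout, see above)."""
    for line in handle:
        line = line.strip()
        if line:
            yield json.loads(line)


def count_by_field(records: Iterable[dict], field: str) -> Counter:
    """Count records by the value of one field without keeping them in memory."""
    counts: Counter = Counter()
    for record in records:
        counts[record.get(field, "unknown")] += 1
    return counts


# Tiny in-memory stand-in for a dataset file; "language" is a hypothetical field.
sample = io.StringIO(
    '{"id": "1", "language": "en"}\n'
    '{"id": "2", "language": "es"}\n'
    "\n"
    '{"id": "3", "language": "en"}\n'
)
language_counts = count_by_field(iter_records(sample), "language")
```

With a real file, the same functions could be applied to a handle returned by `open(path, encoding="utf-8")`, so memory use stays roughly constant regardless of file size.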
How is the publicly accessible information operations archive different from what the Consortium has access to?
Beginning in October 2018, we published the first comprehensive, public archive of data related to state-backed information operations. From that date through early 2022, when we launched the X Moderation Research Consortium, we publicly shared 37 datasets of attributed platform manipulation campaigns originating from 17 countries, spanning more than 200 million posts and nine terabytes of media.
With the advent of the X Moderation Research Consortium, we have discontinued public dataset releases, instead focusing on releasing data to the Consortium. The existing archive of information operations datasets continues to be available for download below. While no content has been redacted, some account-specific information has been hashed to protect account privacy.
Why is the publicly accessible information operations archive hashed?
For accounts with fewer than 5,000 followers, we hashed certain identifying fields (such as user ID and screen name) in the publicly accessible archive. While we’ve taken precautions to minimize false positives in these datasets, we’ve nevertheless hashed select fields to reduce the potential for negative impact on authentic or compromised accounts, while still enabling longitudinal research, network analysis, and assessment of the underlying content created by these accounts.
Members of the Consortium are provided access to unhashed versions of these datasets for research. Consortium members agree to the terms of a data license agreement limiting usage of the unhashed datasets to research purposes, with provisions to ensure the researcher may only use the datasets pursuant to specific limitations and in conjunction with appropriate security measures.
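For readers unfamiliar with this kind of hashing, the sketch below illustrates the general idea described in this answer: identifying fields are replaced by a deterministic digest only for accounts under the follower threshold. The field names, the use of SHA-256, and the absence of a salt are assumptions made for illustration and do not describe the actual method applied to the archive.

```python
import hashlib

FOLLOWER_THRESHOLD = 5000  # from the text: accounts with fewer than 5,000 followers
HASHED_FIELDS = ("user_id", "screen_name")  # hypothetical field names


def hash_value(value: str) -> str:
    """Return a deterministic hex digest (SHA-256 is an assumption)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def pseudonymize(account: dict) -> dict:
    """Hash identifying fields for accounts below the follower threshold."""
    result = dict(account)  # copy so the input record is left untouched
    if account["follower_count"] < FOLLOWER_THRESHOLD:
        for field in HASHED_FIELDS:
            result[field] = hash_value(str(account[field]))
    return result


# Made-up example records, not taken from any dataset.
small = {"user_id": "123", "screen_name": "example_a", "follower_count": 42}
large = {"user_id": "456", "screen_name": "example_b", "follower_count": 9000}

hashed_small = pseudonymize(small)
unchanged_large = pseudonymize(large)
```

Because the same input always produces the same digest, an account keeps a stable pseudonym across records, which is what makes longitudinal and network analysis possible on hashed data.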
Where else can I access X data for research purposes?
If you are an academic, check out free academic access to our API for research here. Learn more about general API access here.
What can I do if I believe I've been included here in error?
If you believe your account has been included in one of these datasets in error, please log into your X account and file a suspension appeal here for our full review.
03.
Download Hashed Information Operations Archive (2018-2022)
You can download the datasets by entering your email address and clicking “Submit”. Your use of the datasets is governed by the X Developer Agreement and Policy. By clicking “Submit”, you agree to the X Developer Agreement and Policy.
If you believe your account has been included in one of the datasets in error, please log into your X account and file a suspension appeal here. We carefully review these cases, and may be able to help restore potentially compromised accounts, or accounts that may have been included in error.
Download the corresponding Dataset Readme. Read more on our blog.
People’s Republic of China - Changyu Culture (December 2021) - 112 Accounts
- Account Information (19.09 KB)
- Post Information (2.57 MB)
- Media (4.61 GB, 3 archives)
People’s Republic of China - Xinjiang (December 2021) - 2048 Accounts
- Account Information (350.04 KB)
- Post Information (3.46 MB)
- Media (2.3 GB, 3 archives)
Russia IRA North Africa (December 2021) - 16 Accounts
- Account Information (11.66 KB)
- Post Information (2.48 MB)
- Media (3.31 GB, 2 archives)
Russia East Africa (December 2021) - 50 Accounts
- Account Information (3.86 KB)
- Post Information (1.11 MB)
- Media (1.65 GB, 1 archive)
Tanzania (December 2021) - 268 Accounts
- Account Information (48.77 KB)
- Post Information (1.58 MB)
- Media (1.81 GB, 1 archive)
Uganda (December 2021) - 418 Accounts
- Account Information (93.19 KB)
- Post Information (62.14 MB)
- Media (19.92 GB, 11 archives)
Venezuela (December 2021) - 277 Accounts
- Account Information (60.42 KB)
- Post Information (82.07 MB)
- Media (20.18 GB, 11 archives)
Mexico (December 2021) - 276 Accounts
- Account Information (63.9 KB)
- Post Information (2.49 MB)
- Media (2.84 GB, 2 archives)
Download the corresponding Dataset Readme. Read more on our blog.
Iran (February 2021) - 238 Accounts
- Account Information (38 KB)
- Post Information (285.2 MB)
- Media (32.4 GB, 16 archives)
Armenia (February 2021) - 35 Accounts
- Account Information (8.3 KB)
- Post Information (46.7 KB)
- Media (1.2 GB, 1 archive)
Russia GRU (February 2021) - 69 Accounts
- Account Information (8.9 KB)
- Post Information (14.3 MB)
- Media (1.8 GB, 1 archive)
Russia IRA (February 2021) - 31 Accounts
- Account Information (4.6 KB)
- Post Information (36.6 MB)
- Media (2.6 GB, 2 archives)
Download the corresponding Dataset Readme. Read more on our blog.
Iran (September 2020) - 104 Accounts
- Account Information (7.1 KB)
- Post Information (292 KB)
- Media (16.7 GB, 8 archives)
Russia (September 2020) - 5 Accounts
- Account Information (2 KB)
- Post Information (180 KB)
- Media (10 MB, 1 archive)
Thailand (September 2020) - 926 Accounts
- Account Information (45 KB)
- Post Information (2.3 MB)
- Media (2.9 GB, 2 archives)
Cuba (September 2020) - 526 Accounts
- Account Information (45.7 KB)
- Post Information (666 MB)
- Media (49.2 GB, 25 archives)
Saudi Arabia (September 2020) - 33 Accounts
- Account Information (2.9 KB)
- Post Information (24 KB)
- Media (5.8 GB, 3 archives)
Download the corresponding Dataset Readme. Read more on our blog.
China (May 2020) - 23750 Accounts
- Account Information (1 MB)
- Post Information (73.2 MB)
- Media (31 GB, 17 archives)
Turkey (May 2020) - 7340 Accounts
- Account Information (533 KB)
- Post Information (5 GB)
- Media (821 GB, 391 archives)
Russia (May 2020) - 1152 Accounts
- Account Information (85 KB)
- Post Information (353 MB)
- Media (108 GB, 54 archives)
Download the corresponding Dataset Readme. Read more about these datasets on X.
Egypt (February 2020) - 2541 Accounts
- Account Information (191 KB)
- Post Information (1 GB)
- Media (575 GB, 204 archives)
Honduras (February 2020) - 3104 Accounts
- Account Information (178 KB)
- Post Information (137 MB)
- Media (75 GB, 34 archives)
Indonesia (February 2020) - 795 Accounts
- Account Information (57 KB)
- Post Information (207 MB)
- Media (58 GB, 21 archives)
Serbia (February 2020) - 8558 Accounts
- Account Information (470 KB)
- Post Information (5.7 GB)
- Media (2.3 TB, 981 archives)
SA_EG_AE (February 2020) - 5350 Accounts
- Account Information (388 KB)
- Post Information (4.2 GB)
- Media (977 GB, 330 archives)
Download the corresponding Dataset Readme. Read more about this dataset on X.
Ghana / Nigeria (March 2020) - 71 Accounts
- Account Information (18 KB)
- Post Information (27 MB)
- Media (17 GB, 6 archives)
Download the corresponding Dataset Readme, and read more about these datasets on our blog.
Saudi Arabia (October 2019) - 5,929 Accounts
- Account Information (512 KB)
- Post Information (4.3 GB)
- Media (1.3 TB)
Download the corresponding Dataset Readme, and read more about these datasets on our blog.
China (July 2019, set 3) - 4,301 Accounts
- Account Information (258 KB)
- Post Information (913 MB)
- Media (604 GB, 224 archives)
Saudi Arabia (April 2019) - 6 Accounts
- Account Information (1 KB)
- Post Information (38 KB)
- Media (357 MB, 1 archive)
Ecuador (April 2019) - 1,019 Accounts
- Account Information (57 KB)
- Post Information (85 MB)
- Media (173 GB, 56 archives)
United Arab Emirates (March 2019) - 4,248 Accounts
- Account Information (355 KB)
- Post Information (227 MB)
- Media (680 GB, 304 archives)
Spain (April 2019) - 259 Accounts
- Account Information (20 KB)
- Post Information (7 MB)
- Media (16 GB, 9 archives)
United Arab Emirates / Egypt (April 2019) - 271 Accounts
- Account Information (20 KB)
- Post Information (30 MB)
- Media (45 GB, 19 archives)
Download the corresponding Dataset Readme, and read more about these datasets on our blog.
China (July 2019, set 1) - 744 Accounts
- Account Information (41 KB)
- Post Information (158 MB)
- Media (85 GB, 32 archives)
China (July 2019, set 2) - 196 Accounts
- Account Information (14 KB)
- Post Information (169 MB)
- Media (40 GB, 17 archives)
Download the corresponding Dataset Readme, and read more about these datasets on our blog.
Catalonia (June 2019) - 130 Accounts
- Post Information (1.5 MB)
- Media (2.74 GB, 3 archives)
Iran (June 2019, set 1) - 1,666 Accounts
- Post Information (316 MB)
- Media (258 GB, 111 archives)
Iran (June 2019, set 2) - 248 Accounts
- Post Information (318 MB)
- Media (183 GB, 29 archives)
Iran (June 2019, set 3) - 2,865 Accounts
- Post Information (46 MB)
- Media (55 GB, 18 archives)
Russia (June 2019) - 4 Accounts
- Post Information (260 KB)
- Media (72 MB, 2 archives)
Venezuela (June 2019) - 33 Accounts
- Post Information (64 MB)
- Media (24 GB, 13 archives)
Download the corresponding Dataset Readme, and read more about these datasets on our blog.
Iran (January 2019) - 2,320 Accounts
- Post Information (717 MB)
- Media (202 GB, 89 archives)
Bangladesh (January 2019) - 15 Accounts
- Post Information (2.6 MB)
- Media (77 MB, 3 archives)
Russia (January 2019) - 416 Accounts
- Post Information (120 MB)
- Media (63.7 GB, 8 archives)
Venezuela (January 2019, set 1) - 1,196 Accounts
- Post Information (1 GB)
- Media (359 GB, 75 archives)
Venezuela (January 2019, set 2) - 764 Accounts
- Post Information (136 MB)
- Media (81 GB, 37 archives)
Download the corresponding Dataset Readme, and read more about these datasets on our blog.
Internet Research Agency (October 2018) - 3,613 Accounts
- Post Information (1.2 GB)
- Media (274 GB, 300 archives)
Iran (October 2018) - 770 Accounts
- Post Information (168 MB)
- Media (65.7 GB, 52 archives)
04.
Applying To Join The Consortium
Thank you for your interest in joining the X Moderation Research Consortium! Please read this full overview before filling out the application linked here and below.
Consortium membership is by application. Our intent is to be inclusive while aiming to ensure the privacy and security of the Consortium’s data, and its ethical and public interest use. The Consortium welcomes applications from researchers with diverse backgrounds and experiences, using varied methodologies, who undertake data-driven analysis related to content moderation.
To be an eligible candidate for membership, applicants must demonstrate the following:
- That they hold a primary institutional affiliation with an academic, journalistic, nonprofit, or civil society research organization. If they are students, they must be master’s or PhD level students; undergraduate students are ineligible at this time.
- Prior experience and relevant skills for data-driven analysis. Consortium datasets are primarily shared as JSON files and require technical skills to analyze.
- A specific public interest research use case for the data provided by the Consortium. (“Public interest research use case” means non-commercial research for journalistic, academic, or non-profit/civil society purposes.)
- Industry-standard plans and systems for safeguarding the privacy and security of the data provided by the Consortium. Consortium members are required to sign a data use agreement.
Consortium Ineligibility
Additionally, applicants are ineligible to join the Consortium if they:
- Are undergraduate students; only master’s or PhD level students are eligible.
- Hold an industry or government position as their primary institutional affiliation.
- Do not hold a primary institutional affiliation with an academic, journalistic, nonprofit, or civil society research organization.
- Plan to share the Consortium’s data with governments or other outside parties.
Application Processing and Review
Applications will be reviewed by X, and applicants will be notified of acceptances or rejections. Successful applicants will be researchers who have a demonstrable history of independent research or who have met other criteria that demonstrate an ability to be entrusted with the Consortium data and to pursue research for a qualified purpose. Qualified research for purposes of the Consortium is academic, journalistic, nonprofit, or civil society research that aims to better understand content moderation and issues of platform integrity.
Once accepted into the Consortium, Qualified Researchers are provided access to datasets to work independently. X makes no representations about the quality, nature, or frequency of the Consortium’s datasets, releases, or updates, or about the work or type of qualified research Consortium members pursue. X does not review or participate in the decisions or work product of the Consortium’s Qualified Researchers.
Your decision to complete this application is completely voluntary. By submitting your application you give us permission to use your answers to evaluate your eligibility to become a member of the Consortium. Your individual responses are confidential and your personal information will only be used to evaluate your eligibility to participate. If you wish to withdraw your application after submitting it, please respond to the email you will receive confirming our receipt of your application. You can also contact X by clicking here.
Tips on Filling Out and Submitting the Application
We recommend reviewing the application in full, drafting responses in advance in a separate document, and entering all final responses in the form when you are ready to submit. The more information you share with us, the easier it is for us to review and consider the eligibility of your application.
Please fill out the application form in English.