A collection of open data sets for all sorts of interesting things.
- Lumen Database A searchable database that collects and analyzes legal complaints and requests for removal of online materials (DMCA take-downs), helping Internet users know their rights and understand the law.
- CensoredPlanet A censorship measurement platform that collects data using multiple remote measurement techniques in more than 200 countries. Publishes reports and offers its raw data sets for download.
- CORE Research Paper Database CORE currently contains 207,255,818 searchable open access articles and research papers collected from 10,286 data providers around the world, which you can search using keywords.
- Public Intelligence An international, collaborative research project aimed at aggregating the collective work of independent researchers around the globe who wish to defend the public’s right to access information.
- HuggingFace Offers models based on transformers for PyTorch and TensorFlow 2.0. There are thousands of pre-trained models to perform tasks such as text classification, extraction, question answering, and more.
- Information Operations Archive An archive of publicly available and attributed data from known online information operations. The archive currently consists of over 10 million messages from Russian and Iranian state-sponsored influence operations on Twitter and Reddit, and will be updated on an ongoing basis.
- US National Archives An independent agency of the United States government charged with the preservation and documentation of government and historical records. It is also tasked with increasing public access to those documents which make up the National Archive.
- US Library of Congress The Library of Congress is the research library that officially serves the United States Congress and is the national library of the United States. It is the oldest federal cultural institution in the U.S.
- UK National Archives One of the world's largest archives, containing over 11 million historical government and public records, from Domesday Book to modern government files. Includes paper records, digital records, websites, photographs, posters, maps, drawings and paintings.
- ICO Search The Information Commissioner's Office (ICO) upholds information rights in the public interest, promoting openness by public bodies and data privacy for individuals. ICO is an executive non-departmental public body, sponsored by the Department for Digital, Culture, Media and Sport.
- Netronline Public Records An online directory and portal to the U.S. Tax Assessors', Treasurers' and Recorders' offices that have developed websites for retrieving available public records.
- U.S. Traffic Cameras and Reports Covers state, county and city traffic cameras, as well as how to access and file traffic accident reports. Many state, county and city police departments provide online forms for filing accident reports, and some even provide searchable online accident databases.
- MuckRock Helps thousands of users file, track, and share public records and FOIA requests at the state, local, and federal levels within the United States, and produces original reporting on government transparency.
- WikiLeaks An international non-profit organization that publishes leaks and classified media from governments, companies and organizations alike. All data is provided by anonymous sources. #FreeAssange.
- Cryptome Publishes documents that are prohibited by governments worldwide, particularly material on freedom of expression, privacy, cryptology, dual-use technologies, national security, intelligence and secret governance, including open, secret and classified documents.
- ICIJ Luxembourg Leaks Also known as "LuxLeaks", a collection of over 350 documents about Luxembourg's tax rulings set up by PricewaterhouseCoopers from 2002 to 2010 for the benefit of its clients.
- Distributed Denial of Secrets A journalist 501(c)(3) non-profit devoted to enabling the free transmission of data in the public interest. Aims to avoid political, corporate or personal leanings and to act as a beacon of available information.
An organized and importable .html bookmark file that includes everything listed on this page.
A PDF copy that contains everything on this page for offline use. Updated - 24/9/2022.