# Awesome AI Safety [Awesome](https://github.com/sindresorhus/awesome)

[License](./LICENSE)
[Contributing](./CONTRIBUTING.md)
[Discord](https://gisk.ar/discord)
[Mastodon](https://fosstodon.org/@Giskard)
[Hits](http://hits.dwyl.com/Giskard-AI/awesome-ai-safety)

Figuring out how to make your AI safer? Wondering how to avoid ethical biases, errors, privacy leaks, or robustness issues in your AI models?

This repository contains a curated list of papers and technical articles on AI quality and safety that should help 📚

## Table of Contents

You can browse papers by machine learning task category, and use hashtags like `#Robustness` to explore AI risk types.

1. [Tabular Machine Learning](#tabular-machine-learning)
2. [Natural Language Processing](#natural-language-processing)
3. [Computer Vision](#computer-vision)
4. [Recommendation System](#recommendation-system)
5. [Time Series](#time-series)
6. [General ML Testing](#general-ml-testing)

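Since the hashtags are plain text, you can also filter the list locally with standard tools. A minimal sketch (the `list.md` filename and sample entries below are stand-ins, not files in this repository):

```shell
# Sketch: filter a local copy of the list by risk-type hashtag (case-insensitive).
# "list.md" is a stand-in for a saved copy of this README.
printf '%s\n' \
  '* Paper A `#Robustness`' \
  '* Paper B `#Bias`' > list.md

# Print only the entries tagged #Robustness
grep -i '#robustness' list.md
```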
## Tabular Machine Learning

* [Machine Learning Model Drift Detection Via Weak Data Slices](https://arxiv.org/pdf/2108.05319.pdf) (Ackerman et al., 2021) `#DataSlice` `#Debugging` `#Drift`
* [Automated Data Slicing for Model Validation: A Big Data - AI Integration Approach](https://ieeexplore.ieee.org/abstract/document/8713886) (Chung et al., 2020) `#DataSlice`
* [Interacting with Predictions: Visual Inspection of Black-box Machine Learning Models](https://dl.acm.org/doi/abs/10.1145/2858036.2858529) (Krause et al., 2016) `#Explainability`

## Natural Language Processing

* [Beyond Accuracy: Behavioral Testing of NLP Models with CheckList](http://homes.cs.washington.edu/~marcotcr/acl20_checklist.pdf) (Ribeiro et al., 2020) `#Robustness`
* [Pipelines for Social Bias Testing of Large Language Models](https://openreview.net/pdf/8be28761ea130113e3be7747870c434f53e9b309.pdf) (Nozza et al., 2022) `#Bias` `#Ethics`
* ["Why Should I Trust You?": Explaining the Predictions of Any Classifier](https://arxiv.org/abs/1602.04938) (Ribeiro et al., 2016) `#Explainability`
* [A Unified Approach to Interpreting Model Predictions](https://arxiv.org/abs/1705.07874) (Lundberg et al., 2017) `#Explainability`
* [Anchors: High-Precision Model-Agnostic Explanations](https://homes.cs.washington.edu/~marcotcr/aaai18.pdf) (Ribeiro et al., 2018) `#Explainability`
* [Explanation-Based Human Debugging of NLP Models: A Survey](https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00440/108932/Explanation-Based-Human-Debugging-of-NLP-Models-A) (Lertvittayakumjorn et al., 2021) `#Debugging`

### Large Language Models

* [Holistic Evaluation of Language Models](https://arxiv.org/abs/2211.09110) (Liang et al., 2022) `#General`
* [Learning to summarize from human feedback](https://proceedings.neurips.cc/paper/2020/file/1f89885d556929e98d3ef9b86448f951-Paper.pdf) (Stiennon et al., 2020) `#HumanFeedback`

## Computer Vision

* [Domino: Discovering Systematic Errors with Cross-modal Embeddings](https://arxiv.org/pdf/2203.14960.pdf) (Eyuboglu et al., 2022) `#DataSlice`
* [Explaining in Style: Training a GAN to explain a classifier in StyleSpace](https://arxiv.org/pdf/2104.13369.pdf) (Lang et al., 2022) `#Explainability`
* [Model Assertions for Debugging Machine Learning](https://ddkang.github.io/papers/2018/omg-nips-ws.pdf) (Kang et al., 2018) `#Debugging`

## Recommendation System

[Contributions are welcome 💕](CONTRIBUTING.md)

## Time Series

[Contributions are welcome 💕](CONTRIBUTING.md)

## General ML Testing

* [Machine learning testing: Survey, landscapes and horizons](https://ieeexplore.ieee.org/abstract/document/9000651/) (Zhang et al., 2020) `#General`
* [Quality Assurance for AI-based Systems: Overview and Challenges](https://arxiv.org/abs/2102.05351) (Felderer et al., 2021) `#General`
* [Metamorphic testing of decision support systems: A case study](https://digital-library.theiet.org/content/journals/10.1049/iet-sen.2009.0084) (Kuo et al., 2010) `#Robustness`
* [A Survey on Metamorphic Testing](https://ieeexplore.ieee.org/document/7422146) (Segura et al., 2016) `#Robustness`
* [Testing and validating machine learning classifiers by metamorphic testing](https://www.sciencedirect.com/science/article/abs/pii/S0164121210003213) (Xie et al., 2011) `#Robustness`
* [The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/aad9f93b86b7addfea4c419b9100c6cdd26cacea.pdf) (Breck et al., 2017) `#General`
* [The Disagreement Problem in Explainable Machine Learning: A Practitioner’s Perspective](https://arxiv.org/pdf/2202.01602.pdf) (Krishna et al., 2022) `#Explainability`