Data, Free, Open Source
CLEANLAB
Automatically detect data and label issues in your ML datasets
Apache-2.0
ABOUT
ML models are only as good as the data they train on, yet real-world datasets are plagued by label errors, outliers, duplicates, and other quality issues that silently degrade model performance. Cleanlab automatically identifies these issues using your existing models, enabling you to clean datasets and train more robust ML systems without changing your modeling code.
INSTALL
pip install cleanlab
INTEGRATION GUIDE
1. Automatically detect label errors, outliers, and duplicates in image, text, audio, and tabular datasets
2. Train robust classification and regression models that are resilient to noisy labels
3. Improve consensus labels and estimate annotator quality for multi-annotator data
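To make the first item more concrete, here is a minimal sketch of the general idea behind finding label errors from a model's predicted probabilities. It is a simplified, standard-library-only illustration and not Cleanlab's actual implementation; the function name `find_suspect_labels` and the toy data are hypothetical. It assumes integer class labels and one row of class probabilities per example, ideally produced out-of-sample (for instance via cross-validation).

```python
def find_suspect_labels(labels, pred_probs):
    """Return indices of likely label errors, most suspicious first.

    Hypothetical helper for illustration only; not part of Cleanlab.
    labels: list of int class ids, one per example.
    pred_probs: list of per-class probability lists, one per example.
    """
    n_classes = len(pred_probs[0])

    # Per-class threshold: the average probability the model assigns to
    # class k over the examples whose given label is k.
    sums = [0.0] * n_classes
    counts = [0] * n_classes
    for y, p in zip(labels, pred_probs):
        sums[y] += p[y]
        counts[y] += 1
    thresholds = [s / c if c else 1.0 for s, c in zip(sums, counts)]

    suspects = []
    for i, (y, p) in enumerate(zip(labels, pred_probs)):
        # Classes whose probability clears that class's own threshold.
        confident = [k for k in range(n_classes) if p[k] >= thresholds[k]]
        if not confident:
            continue
        likely = max(confident, key=lambda k: p[k])
        if likely != y:
            # Lower probability for the given label = more suspicious.
            suspects.append((p[y], i))

    suspects.sort()
    return [i for _, i in suspects]


# Toy data (assumed): examples 2 and 5 carry labels the model disagrees with.
labels = [0, 0, 0, 1, 1, 1]
pred_probs = [
    [0.90, 0.10],
    [0.80, 0.20],
    [0.10, 0.90],
    [0.20, 0.80],
    [0.10, 0.90],
    [0.85, 0.15],
]
print(find_suspect_labels(labels, pred_probs))  # [2, 5]
```

In the library itself, the comparable entry point is `cleanlab.filter.find_label_issues(labels, pred_probs)`, which takes the same two inputs; consult the Cleanlab documentation for its options and exact behavior before relying on it.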
TAGS
data-quality, data-cleaning, label-errors, outlier-detection, machine-learning, data-centric-ai, python, active-learning