AIToday

MIT launches first-ever Data-Centric AI course covering techniques to improve datasets rather than just models

Source: Hacker News


3 Key Points

  1. A new course titled 'Data-Centric AI' ran from January 16–26, 2024 at MIT, co-taught by Anish, Curtis, and Jonas. The course covers algorithms to find and fix common issues in ML data and to construct better datasets, concentrating on supervised learning tasks like classification.

  2. Data-Centric AI treats dataset improvement as a systematic engineering discipline, in contrast to traditional machine learning classes that focus on producing effective models for a given dataset. The course emphasizes practical techniques not covered in most ML classes to address the 'garbage in, garbage out' problem in real-world ML applications.

  3. Topics included label errors, confident learning, class imbalance, outliers, distribution shift, dataset creation and curation, data-centric evaluation of ML models, and data curation for LLMs. Each lecture included an accompanying hands-on lab assignment in Python / Jupyter Notebook.
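To give a rough sense of what "finding label errors" with confident learning can look like in practice, here is a minimal sketch in Python with NumPy. It is not the course's lab code. The function name `find_label_issues`, the toy data, and the simplified flagging rule are all assumptions made for illustration. It also assumes the predicted probabilities come from a model that did not train on the examples being scored (for instance via cross-validation) and that every class appears at least once among the given labels.

```python
import numpy as np

def find_label_issues(labels, pred_probs):
    """Flag examples whose given label conflicts with a confident prediction.

    Simplified confident-learning-style rule (illustrative assumption):
      1. The threshold for class k is the mean predicted probability of
         class k over the examples currently labeled k.
      2. An example "confidently" belongs to class k if its predicted
         probability for k meets that threshold; if several classes
         qualify, the one with the highest probability is used.
      3. An example is flagged when its confident class differs from
         its given label.
    """
    labels = np.asarray(labels)
    pred_probs = np.asarray(pred_probs, dtype=float)
    n_classes = pred_probs.shape[1]

    # Per-class thresholds: average self-confidence of each class.
    thresholds = np.array(
        [pred_probs[labels == k, k].mean() for k in range(n_classes)]
    )

    # Boolean matrix: does example i clear the threshold for class k?
    above = pred_probs >= thresholds
    has_confident_class = above.any(axis=1)

    # Among the classes that clear their threshold, take the most probable.
    masked = np.where(above, pred_probs, -np.inf)
    confident_class = masked.argmax(axis=1)

    return has_confident_class & (confident_class != labels)

# Toy data (hypothetical): six examples, two classes.
labels = [0, 0, 0, 1, 1, 1]
pred_probs = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],  # labeled 0, but the model is confident it is class 1
    [0.2, 0.8],
    [0.1, 0.9],
    [0.3, 0.7],
]

issues = find_label_issues(labels, pred_probs)
print(np.flatnonzero(issues))  # indices of suspected label errors
```

Using per-class thresholds rather than a single global cutoff means a class the model is systematically less sure about is not penalized for that. Flagged examples are candidates for human review, not proven errors.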
