Researchers boost multilingual hate speech detection by combining web-scale pre-training with LLM-generated synthetic labels across four languages.

arXiv cs.CL · April 14, 2026

AI Summary

  • Continued pre-training on unlabeled OpenWebSearch.eu data improved BERT models by ~3% average macro-F1 across 16 benchmarks, with larger gains in low-resource languages
  • Ensemble of four open-source LLMs (Mistral-7B, Llama3.1-8B, Gemma2-9B, Qwen2.5-14B) generated synthetic annotations for hate speech detection
  • LightGBM meta-learner ensemble outperformed simpler strategies like mean averaging and majority voting for combining LLM predictions
  • Study covers English, German, Spanish, and Vietnamese, demonstrating improved cross-lingual generalization for hateful content detection
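The combination strategies above can be sketched side by side. The data below is synthetic and the names are illustrative; the paper's actual meta-learner is LightGBM, which is swapped here for a tiny hand-rolled logistic regression so the sketch stays dependency-free while showing the same stacking idea: train a small model on the per-LLM probability scores instead of averaging or voting over them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical hate-speech probabilities from four LLM annotators
# (standing in for Mistral-7B, Llama3.1-8B, Gemma2-9B, Qwen2.5-14B)
# on 200 texts; values are synthetic, for illustration only.
y = rng.integers(0, 2, size=200)                       # gold labels
noise = rng.normal(0.0, 0.25, size=(200, 4))
llm_probs = np.clip(y[:, None] * 0.6 + 0.2 + noise, 0.0, 1.0)

# Strategy 1: mean averaging of the four probability scores.
mean_pred = (llm_probs.mean(axis=1) >= 0.5).astype(int)

# Strategy 2: majority voting over thresholded per-LLM decisions.
votes = (llm_probs >= 0.5).astype(int)
majority_pred = (votes.sum(axis=1) >= 2).astype(int)

# Strategy 3: a stacked meta-learner trained on the raw scores.
# (The paper uses LightGBM; plain logistic regression via gradient
# descent keeps this sketch free of external dependencies.)
w, b = np.zeros(4), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(llm_probs @ w + b)))
    grad = p - y
    w -= 0.1 * (llm_probs.T @ grad) / len(y)
    b -= 0.1 * grad.mean()
meta_pred = (1.0 / (1.0 + np.exp(-(llm_probs @ w + b))) >= 0.5).astype(int)

for name, pred in [("mean averaging", mean_pred),
                   ("majority voting", majority_pred),
                   ("meta-learner", meta_pred)]:
    print(f"{name}: accuracy {(pred == y).mean():.2f}")
```

The meta-learner can weight annotators unequally (e.g. down-weighting a systematically noisy LLM), which is what lets stacking beat uniform averaging or voting when annotator quality varies.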
