International Journal of Current Science Research and Review

Explainable, Evidence-Based Verification of Arabic Claims via Multi-Source Retrieval and Cross-Lingual NLI

Ahmad Alfaqehi, Masters in AI graduate student, Computer Science Department, College of Computers and Information Technology, Taif University, Taif, Kingdom of Saudi Arabia
Khalid Aljuaid, Masters in AI graduate student, Computer Science Department, College of Computers and Information Technology, Taif University, Taif, Kingdom of Saudi Arabia
Abdullah Sheikh, Assistant Professor of Computer Science, Computer Science Department, College of Computers and Information Technology, Taif University, Taif, Kingdom of Saudi Arabia

Vol 9 No 6 (2026): Volume 09 Issue 06 June 2026

Article Date Published: 29 June 2026 | Page No.: 3567-3578

Abstract :

We present a training-free, explainable system for verifying Arabic-language claims. It combines Arabic Named-Entity Recognition (NER), parallel multi-source evidence retrieval, dense semantic reranking, and cross-lingual Natural Language Inference (NLI) under a single weighted verdict aggregator. Entities are extracted with CAMeLBERT-mix-NER and used to bias a parallel search over trusted Arabic RSS feeds, Google News, a verified-account X (Twitter) endpoint, and DuckDuckGo. Retrieved snippets are reranked by a multilingual-E5 encoder and scored by an XLM-RoBERTa-large checkpoint fine-tuned on XNLI/ANLI. Per-source entailment and contradiction probabilities are then combined through a weighted aggregator with multiplicatively capped priors over source authority, learned domain reputation, author credibility, and recency. We evaluate on the AraFacts benchmark and make three contributions: (i) a corrected, openly unit-tested aggregator that lets all retrieved evidence, not only official sources, drive the verdict; (ii) a rigorous, reproducible baseline study showing that the natural class imbalance of AraFacts (94% of claims are labelled false) makes accuracy misleading and that even a well-tuned classical text-only classifier reaches only 0.40 macro-F1; and (iii) an explainable system packaged for deployment as a Streamlit application, a FastAPI service, and a Telegram bot, each exposing a per-source evidence trail. We also document and correct an evaluation error in an earlier version of this work. Code, scripts, and unit tests are released for full reproducibility.
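The point about class imbalance can be illustrated with a small worked example. The sketch below is not the authors' evaluation code: it scores a trivial predictor that always answers "false" on a hypothetical 100-claim sample in which 94 claims are labelled false. The class names "true" and "other", and the 3/3 split of the remaining six claims, are assumptions made purely for illustration and are not taken from AraFacts.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label that occurs."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical sample: 94% "false"; the other two classes are assumed.
y_true = ["false"] * 94 + ["true"] * 3 + ["other"] * 3
# Trivial predictor that ignores the claim text entirely.
y_pred = ["false"] * len(y_true)

acc = accuracy(y_true, y_pred)
mf1 = macro_f1(y_true, y_pred)
print(f"accuracy = {acc:.2f}, macro-F1 = {mf1:.2f}")
# prints: accuracy = 0.94, macro-F1 = 0.32
```

Under these assumptions the always-"false" predictor scores 0.94 accuracy while its macro-F1 is about 0.32, because the two minority classes each contribute an F1 of zero. This is why macro-F1 is the more informative headline metric on a heavily skewed label distribution.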

Keywords :

Arabic NLP, fact-checking, information retrieval, misinformation detection, natural language inference, retrieval-augmented verification, social-media integrity.

Copyrights & License

Ahmad Alfaqehi, Khalid Aljuaid, Abdullah Sheikh, 2026

This work is licensed under a Creative Commons Attribution 4.0 International License.

Article Details

Issue: Vol 9 No 6 (2026): Volume 09 Issue 06 June 2026
Page No.: 3567-3578
Published : 29 June 2026
Section: Physical, Chemical Sciences, Engineering & Technology
DOI: https://doi.org/10.47191/ijcsrr/V9-i6-62

How to Cite :

Ahmad Alfaqehi, Khalid Aljuaid, Abdullah Sheikh (2026). Explainable, Evidence-Based Verification of Arabic Claims via Multi-Source Retrieval and Cross-Lingual NLI. International Journal of Current Science Research and Review, 9(6), 3567-3578. Retrieved from https://ijcsrr.org/single-view/?id=26765&pid=26445
