Improving the Concept Search of Quran Using Linguistic Semantic Resources and Deep Learning

The Quran is the supreme source of spiritual wisdom and knowledge, used for recitation, analysis and guidance by Muslims and non-Muslims alike. A variety of techniques and tools from information retrieval and Natural Language Processing (NLP) have been developed for reciting and searching the Arabic text of the Holy Quran and its English translations. This research aims to improve concept search of the Quran. It introduces a framework containing four modules: Word2Vector, Latent Semantic Indexing (LSI), Latent Dirichlet Allocation (LDA) and Quranic WordNet. The Quranic WordNet module is based on a database built from the widely used WordNet for the English translation of the Quran. The data set for these modules consists of English translations of the Holy Quran by 8 different authors. Several NLP techniques were applied to these translation files. The four modules perform concept search on the Holy Quran, returning results that are conceptually correct. The results are promising in terms of recall and precision and are compared with a gold standard called Mushaf Al Tajweed. The proposed framework can be easily extended to other sources of Islamic knowledge, e.g. Hadeeth and Fiqh, without manual intervention.
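To make the idea of concept search and its evaluation concrete, the following is a minimal sketch, not the framework described above. It assumes a tiny hand-written synonym table standing in for a WordNet-style resource, a few invented toy passages (not taken from any translation), and a hypothetical gold set; the names `SYNONYMS`, `PASSAGES`, `concept_search` and `precision_recall` are illustrative only.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical synonym table standing in for a WordNet-style resource.
SYNONYMS: Dict[str, Set[str]] = {
    "charity": {"charity", "alms", "giving"},
    "patience": {"patience", "perseverance", "endurance"},
}

# Toy passages invented for illustration; not taken from any translation.
PASSAGES: Dict[str, str] = {
    "p1": "give alms to those in need",
    "p2": "show perseverance in hardship",
    "p3": "charity purifies wealth",
    "p4": "travel through the land",
}


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def concept_search(query: str) -> Set[str]:
    """Return ids of passages sharing a term with the synonym-expanded query."""
    terms: Set[str] = set()
    for token in tokenize(query):
        # Fall back to the token itself when no synonyms are known.
        terms |= SYNONYMS.get(token, {token})
    return {pid for pid, text in PASSAGES.items() if terms & set(tokenize(text))}


def precision_recall(retrieved: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Compute precision and recall of a retrieved set against a gold set."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


results = concept_search("charity")
gold = {"p1", "p3"}  # hypothetical gold-standard annotation for the query
precision, recall = precision_recall(results, gold)
```

Here the query "charity" also matches the passage containing "alms" even though the literal query word is absent, which is the behaviour that distinguishes concept search from keyword search. A real system would replace the synonym table with a lexical resource or learned embeddings and the gold set with an annotated standard.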