Princeton University Library uses the Romanization Tables approved by the American Library Association and the Library of Congress (ALA-LC) to transliterate Arabic, Azerbaijani*, Hebrew, Judeo-Arabic, Ladino, Ottoman Turkish, Persian, Syriac, Urdu**, and Yiddish into Roman script.
If you are having trouble locating works in your Main Catalog search, check to be sure you are using the correct transliteration. Links to PDFs of the current ALA-LC approved Romanization Tables for selected languages are listed below.
The relative value of script vs Romanization
Searching in Arabic script is not always reliable in the current WorldCat database.
Even so, searching in Arabic script is more reliable than searching by Romanization, because the pronunciation of Arabic words varies from one country to another, and so does the way those words were Romanized.
Specific Arabic characters that are problematic in WorldCat
The diacritics of Romanized characters are not consistent. Records mix ALA Romanization in both Unicode and non-Unicode character sets, along with diacritics that do not follow ALA Romanization. The non-Unicode characters can be detected because they use two characters to form one Romanized letter with a diacritic (for example, ā stored as a plus a separate macron ¯), whereas Unicode uses a single character (ā). For example: Daqāʼiq al-ʻArabīyah versus Daqāʼiq al-ʻArabīyah. The two look alike, but you can discover the difference by erasing the character with the Backspace key: in the two-character form, the first keystroke typically removes only the diacritic. This difference does not affect the search results.
The problematic characters, in both lowercase and uppercase forms, are: ā á ḍ ḥ ṣ ṭ ū ẓ Ā Ḍ Ḥ Ī Ṣ Ṭ Ū Ẓ
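The two-character versus one-character difference described above can be checked programmatically. The following is a minimal sketch, assuming the two-character form is stored as a base letter followed by a combining macron (U+0304). The function name `describe_encoding` and the sample strings are illustrative only and are not part of WorldCat or any catalog system.

```python
import unicodedata


def describe_encoding(text: str) -> str:
    """Report how the diacritics in a Romanized string are stored."""
    # Combining marks (such as U+0304, the combining macron) have a
    # non-zero canonical combining class.
    if any(unicodedata.combining(ch) for ch in text):
        return "decomposed"   # base letter + separate diacritic
    # If decomposing the string changes it, it held single-character letters.
    if unicodedata.normalize("NFD", text) != text:
        return "precomposed"  # one character per letter with diacritic
    return "plain"            # no diacritics at all


# Assumed sample data: the same title stored in the two ways.
decomposed = "Daqa\u0304\u02bciq al-\u02bbArabi\u0304yah"
precomposed = "Daq\u0101\u02bciq al-\u02bbArab\u012byah"

print(describe_encoding(decomposed))
print(describe_encoding(precomposed))
# Normalizing to NFC makes the two strings compare equal.
print(unicodedata.normalize("NFC", decomposed) == precomposed)
```

Normalizing a query to one form before comparing strings is one way to treat the two spellings as the same title in local tools such as spreadsheets or scripts.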
Use or non-use of Arabic stop words
The handling of al-taarif (the Arabic definite article, al-) is a very controversial issue. Some library systems, such as Millennium, omit al-taarif automatically so that titles can be retrieved with or without it. It may be useful to adopt the same practice in WorldCat.
It is advisable not to skip Arabic stop words (such as min, ilá, ʻalá) in a Title search. You can skip them in a Keyword search. As for waw atf (the conjunction "and"), it is advisable to separate it with a space from the following word if that word includes al-taarif, because during indexing the system first disregards any stop word, such as waw atf, and then disregards al-taarif in a second step.
For example: الحقيقة و الخيال
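The two-step behavior described above can be sketched as follows. This is a hypothetical illustration only, not WorldCat's actual indexing code: the names `STOP_WORDS`, `DEFINITE_ARTICLE`, and `index_tokens` are assumptions, and the stop list contains only waw atf for simplicity.

```python
# Assumed stop list: only the conjunction waw, written as a separate token.
STOP_WORDS = {"و"}
# The definite article (al-taarif) as it appears at the start of a word.
DEFINITE_ARTICLE = "ال"


def index_tokens(title: str) -> list[str]:
    """Return the tokens a two-step indexer of this kind would keep."""
    tokens = title.split()
    # Step 1: drop stop words that stand alone as tokens.
    kept = [t for t in tokens if t not in STOP_WORDS]
    # Step 2: strip a leading definite article from each remaining token.
    result = []
    for t in kept:
        if t.startswith(DEFINITE_ARTICLE) and len(t) > len(DEFINITE_ARTICLE):
            t = t[len(DEFINITE_ARTICLE):]
        result.append(t)
    return result


# waw separated by a space: it is dropped, then the article is stripped.
print(index_tokens("الحقيقة و الخيال"))
# waw attached to the word: the token is not a stop word and does not
# start with the article, so it is indexed unchanged.
print(index_tokens("الحقيقة والخيال"))
```

Under these assumptions, the spaced form is indexed under the bare word, while the attached form keeps both the waw and the article, which would explain why the two spellings retrieve different sets of records.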
In a Title search, the number of hits with al-taarif is not always equal to the number of hits without it.
Examples of discrepancies in the number of hits in Title search results due to al-taarif:
الخيال = 409
والخيال = 84
و الخيال = 33
خيال = 471