Text Mining Resources

Library Resources For Text Mining

This guide describes text and data mining resources that are freely available or available through the Princeton University Library. The Sources tab indicates whether text mining is available for databases and resources licensed by Princeton University Library, as well as for freely available resources. In general, licensed databases do not allow automated searching or downloading. Please follow the provider's preferred mechanism for supplying raw data, as indicated in this guide.

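Just because you can scrape something doesn't mean you should. Text data from library resources should be obtained only through the database interface or an API. For websites, check the robots.txt file at the site's root, which lists the paths that automated crawlers are asked not to access.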
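As a rough illustration of the robots.txt check, the sketch below uses Python's standard-library urllib.robotparser to test whether a given URL may be fetched. The helper name is_allowed, the user agent "my-research-bot", the example.org URLs, and the sample rules are all hypothetical and chosen only for demonstration. Passing this check does not by itself mean scraping is permitted: license terms and the provider's stated policy still apply.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's contents as a list of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt contents, for demonstration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for url in (
        "https://example.org/public/page.html",
        "https://example.org/private/data.html",
    ):
        print(url, is_allowed(SAMPLE_ROBOTS_TXT, "my-research-bot", url))
```

To check a live site instead of a sample string, the same class can download the file directly: call set_url() with the address of the site's robots.txt, then read(), then can_fetch().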

Flowchart for whether to scrape a website.