Scraping Links With PHP

By Justin Laing
2008-01-06


Is Scraping Content Legal?

There is no easy answer to this question. Many organizations scrape content from all over the web: Google, Yahoo, Microsoft, and many others. These companies get away with it partly under fair use and partly because site owners want to be included in the search results. However, there have been copyright infringement rulings against these companies.

The real answer is that it depends on whom you scrape and what you do with the content. Basic copyright law gives authors an automatic copyright on everything they create, but the same law permits fair use of copyrighted material. Fair use covers purposes such as criticism, comment, news reporting, teaching (including multiple copies for classroom use), scholarship, or research. Even these uses could be considered copyright infringement in some circumstances, so be careful before you claim “fair use” as your defense!

Here are a few sources that have granted you the right to use their content. They do require you to attribute the content to the author or to the URL you scraped it from:

  • Wikipedia - GNU Free Documentation License
  • Open Directory Project - Open Directory License
  • Creative Commons - Creative Commons Attribution 3.0
    Many sites publish their content under some form of the Creative Commons license. You can search for Creative Commons licensed works with Creative Commons Search. Remember that it’s your responsibility to verify the copyright rules for anything you use, even material found through Creative Commons Search.


Originally posted on Makebeta

