Protect images from copyright theft
You may be wondering how to prevent your images from being scraped, leeched or otherwise saved and reused without your permission.
By its nature, the web is an open platform, making it almost impossible to totally prevent your images from being saved by a visitor. Once an image has been received by a user's browser, it exists in their browser cache and can be accessed.
However, Sirv provides multiple ways to either block users from accessing your files or make it harder for them (and bots) to save your copyrighted images.
Hotlinking to a file on another website without consent is known as "leeching". Any file hosted on your Sirv account (or any other server) has a URL, meaning that another website could display your images simply by embedding the image in their page with the full image URL.
To prevent this, you can restrict which domains can display your images. Each request that Sirv receives will be checked against your whitelist of domains.
1. Go to the Settings page of your Sirv account.
2. In the Domain Restriction section, click "Edit":
3. Click "Active":
4. Under Whitelisted domains, enter the domain name(s) that you wish to serve images to. Only domains listed here will be allowed to display your images. Any other domain will receive a 403 error (forbidden).
5. For additional protection, you can untick "Allow direct requests". This will deny empty referrer requests, to stop anyone from directly requesting your image URLs.
You can also untick the option to allow search engine bots to index your images, to stop them showing in Google Images results (and other search engines).
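The check described in the steps above can be pictured with a small sketch. This is only an illustration of the general idea, not Sirv's actual implementation: the function name `check_request`, the matching of subdomains and the example domains are all assumptions.

```python
from urllib.parse import urlparse


def check_request(referer, whitelist, allow_direct=True):
    """Return 200 if the request may be served, 403 otherwise.

    Hypothetical helper: `whitelist` is a set of lowercase domain names,
    `allow_direct` mirrors the "Allow direct requests" option.
    """
    # An empty referrer means the image URL was requested directly.
    if not referer:
        return 200 if allow_direct else 403

    host = (urlparse(referer).hostname or "").lower()

    # Assumption: a whitelisted domain also covers its subdomains.
    allowed = any(host == d or host.endswith("." + d) for d in whitelist)
    return 200 if allowed else 403


whitelist = {"example.com"}
print(check_request("https://www.example.com/gallery", whitelist))
print(check_request("https://another-site.test/", whitelist))
print(check_request(None, whitelist, allow_direct=False))
```

Note that the referrer header is sent by the browser and can be forged by a script, which is why this kind of restriction stops casual hotlinking rather than determined scrapers.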
Referrer restriction will help you lock down your images to trusted domains. However, your images will still be served to your website, so they can still be saved by humans or scraped by bots. That is practically impossible to prevent but you can make it harder for them...
Sirv lets you create signed URLs that expire, which prevents an image (or other file) from being requested in any other variation.
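To show how an expiring signed URL works in general, here is a minimal sketch using an HMAC signature. It is not Sirv's signing scheme: the parameter names `expires` and `signature`, the secret key and the helper functions are all hypothetical. The point is that the signature covers the path and every parameter, so changing any of them (or waiting past the expiry time) invalidates the URL.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qsl, urlencode, urlparse

SECRET = b"replace-with-a-private-key"  # hypothetical secret, never sent to the browser


def _message(path, params):
    # Canonical form: the path plus all parameters except the signature, sorted by name.
    return (path + "?" + urlencode(sorted(params.items()))).encode()


def sign_url(path, params, expires_at, secret=SECRET):
    """Return path?params&expires=...&signature=... (illustrative format)."""
    signed = dict(params, expires=str(expires_at))
    signature = hmac.new(secret, _message(path, signed), hashlib.sha256).hexdigest()
    return path + "?" + urlencode({**signed, "signature": signature})


def verify_url(url, now=None, secret=SECRET):
    """Return True only if the URL is unmodified and has not expired."""
    now = time.time() if now is None else now
    parsed = urlparse(url)
    params = dict(parse_qsl(parsed.query))
    signature = params.pop("signature", "")
    try:
        expires_at = int(params["expires"])
    except (KeyError, ValueError):
        return False
    if now > expires_at:
        return False
    expected = hmac.new(secret, _message(parsed.path, params), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how much of the signature matched.
    return hmac.compare_digest(signature.encode(), expected.encode())


url = sign_url("/images/photo.jpg", {"w": "400"}, expires_at=int(time.time()) + 3600)
print(verify_url(url))                            # valid for the next hour
print(verify_url(url.replace("w=400", "w=4000")))  # a different variation is rejected
```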
Scraping is a technique performed by bots to go through a website page by page and save information from it. To protect your images from being scraped by a bot, identify the bots and block them from accessing your files.
A very determined scraper could try to avoid detection of its activity, but you can successfully discourage bots by making it hard for them. It's important to do this in a way that distinguishes between genuine traffic and scraping traffic, to avoid blocking real users and search engines.
Rate limiting is something that can be configured on your server, to limit the number of web pages served to IPs displaying suspicious activity. Learn more in this Guide to preventing Webscraping.
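As a rough idea of what rate limiting does, the sketch below keeps a sliding window of recent requests per IP address and refuses requests once a limit is reached. It is a simplified, in-memory illustration with a hypothetical class name and limits; real web servers and firewalls provide their own rate limiting settings, which are usually the better choice.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per IP within the last `window_seconds`."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # IP -> timestamps of recent requests

    def allow(self, ip, now=None):
        now = time.monotonic() if now is None else now
        hits = self._hits[ip]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: the server would refuse this page
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(max_requests=3, window_seconds=60)
for i in range(5):
    print(i, limiter.allow("203.0.113.5"))
```

Choosing the limit is the hard part: it should sit well above what a real visitor or search engine needs, so that only abnormal traffic is slowed down.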
Sirv blocks all known scraping bots automatically. However, new bots can be created and existing bots can camouflage their activity. To stay ahead of the bots, check the top referral domains shown on your Analytics page, to identify any IPs and user agents that are requesting an abnormally high number of files. Investigate their authenticity and report them to Sirv if you believe there may be misuse.
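If you also have access to your own server logs, the same kind of check can be sketched in a few lines. The input format here, a list of (IP, user agent) pairs, and the threshold are assumptions made for illustration; this is not an export format provided by Sirv.

```python
from collections import Counter


def flag_heavy_clients(requests, threshold):
    """Return ((ip, user_agent), count) pairs that exceed `threshold`, busiest first."""
    counts = Counter((ip, agent) for ip, agent in requests)
    return [(client, n) for client, n in counts.most_common() if n > threshold]


# Hypothetical log sample: one client requests far more files than the other.
log = [("203.0.113.5", "BadBot/1.0")] * 500 + [("198.51.100.7", "Mozilla/5.0")] * 12
for (ip, agent), n in flag_heavy_clients(log, threshold=100):
    print(ip, agent, n)
```

A high count alone does not prove misuse, so treat the result as a list of candidates to investigate.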
Your Top 10 referral domains table looks like this:
Bots should follow the rules in your website's robots.txt file. Bad bots won't but good bots will.
Tell bots not to index any of your images by disallowing specific file type extensions in your robots.txt file:
User-agent: *
Disallow: *.jpg
Disallow: *.png
Disallow: *.gif
If you would like major search engines such as Google Images to index your images, you can disallow all bots by default and specifically allow certain bots, like so (Slurp is Yahoo's bot):
User-agent: Googlebot
Allow: /

User-agent: Googlebot-image
Allow: /

User-agent: Bingbot
Allow: /

User-agent: msnbot
Allow: /

User-agent: Slurp
Allow: /

User-agent: *
Disallow: /
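Before publishing rules like these, it can help to check how a well-behaved bot would read them. The sketch below uses Python's standard `urllib.robotparser` module on the allow-list rules above; the helper name `can_fetch`, the example URL and the bot name "SomeScraper" are made up for illustration. The standard-library parser matches paths by prefix, so this sketch is suited to the allow/disallow groups shown here rather than to wildcard patterns such as *.jpg.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: Googlebot-image
Allow: /

User-agent: Bingbot
Allow: /

User-agent: msnbot
Allow: /

User-agent: Slurp
Allow: /

User-agent: *
Disallow: /
"""


def can_fetch(robots_txt, user_agent, url):
    """Return True if a bot that obeys robots.txt may request `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


image_url = "https://example.com/images/photo.jpg"  # hypothetical URL
for bot in ("Googlebot", "Bingbot", "SomeScraper"):
    print(bot, can_fetch(ROBOTS_TXT, bot, image_url))
```

This only tells you what a compliant bot will do. As noted above, bad bots ignore robots.txt entirely.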
Brand your images by placing a visible logo or text watermark on your images. This is easy with Sirv. The location, style, opacity and size are all customizable. Watermarks can be applied either via the URL or a profile (recommended if you have many images).
Credit: Will Copestake & Sidetracked
To avoid obscuring your images with a visible watermark, you can apply a hidden watermark. Digimarc provides image watermarking that humans cannot see but software can. Sirv can automatically apply Digimarc to your images, which will then be identified by Digimarc in future, if they are served to another website. Digimarc is a paid, third-party service starting at $99/year. Create a Digimarc account.
One of the many optimizations Sirv makes to serve images as fast as possible is to remove metadata. If your metadata contains copyright or other important information, you can set Sirv to keep the metadata in your optimized images. It won't stop people using images without your permission, but it can help people find out who owns the image so they can ask to license it from you.
Simply set the "Strip image metadata" option to "Disabled" in your Default profile: