Protect images from copyright theft
You may be wondering how to prevent your images from being scraped, leeched or otherwise saved and reused without your permission.
By its nature, the web is an open platform, so it is almost impossible to completely prevent a visitor from saving your images. Once an image has been received by a user's browser, it sits in the browser cache, where anyone who knows how to access it can easily save a copy.
However, Sirv provides multiple ways to either block users from accessing your files or make it harder for them (and bots) to save your copyrighted images.
Hotlinking to a file on another website without consent is known as "leeching". Your files on Sirv are open to be requested by any site, so they could be embedded anywhere. This is the default for all web servers - the web is open for all.
To prevent this, you can enable Sirv's domain restriction feature, to specify which domains should be allowed to display your images. Domain restriction can also block direct requests and search engine bots.
The following image is from an account using domain restriction, so it returns a 403 Forbidden error:
Domain restriction checks the referrer of each request, so the file is served only to pages on the whitelisted domains that you choose. However, the images served in the web pages of those domains can still be saved by the recipient or scraped by bots, so read on to learn what other protections you can use.
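To make the idea concrete, here is a minimal sketch of a referrer-based allowlist check. It is not Sirv's implementation; the domain names, the `status_for_request` function and the `allow_direct` flag are assumptions chosen purely for illustration.

```python
from typing import Optional, Set
from urllib.parse import urlsplit

# Hypothetical allowlist; in practice these would be your own domains.
ALLOWED_DOMAINS: Set[str] = {"example.com", "shop.example.com"}


def status_for_request(
    referer: Optional[str],
    allowed: Set[str] = ALLOWED_DOMAINS,
    allow_direct: bool = False,
) -> int:
    """Return the HTTP status a referrer-restricted file server might send."""
    if not referer:
        # No Referer header: a direct request (or a bot that omits it).
        return 200 if allow_direct else 403
    host = (urlsplit(referer).hostname or "").lower()
    return 200 if host in allowed else 403
```

A request embedded in a page on `example.com` would get 200, while the same image hotlinked from any other site would get 403. The Referer header can be forged by a determined client, which is one reason to combine this with the other protections described here.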
If you're using Sirv's image zoom feature, you can protect your big zoomable images by enabling image tiling. Sirv will serve your image sliced into lots of little square images, so that the user's browser never receives the large image as a single file. This is very effective when combined with domain restriction (described above), which prevents a person from editing the URL to request the full-size image.
Another protective feature of Sirv's image zoom is that the right-click "Save image" is disabled, making it harder for a user to download an image.
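As a rough illustration of the tiling idea, the sketch below computes the pixel boxes that a large image would be cut into. The 256-pixel tile size and the `tile_boxes` function are assumptions for illustration only; Sirv's actual tile size and tile URL scheme are not described here.

```python
from typing import Iterator, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def tile_boxes(width: int, height: int, tile_size: int = 256) -> Iterator[Box]:
    """Yield the crop box of every tile, row by row.

    Tiles on the right and bottom edges are clipped to the image bounds,
    so they may be smaller than tile_size.
    """
    for top in range(0, height, tile_size):
        for left in range(0, width, tile_size):
            right = min(left + tile_size, width)
            bottom = min(top + tile_size, height)
            yield (left, top, right, bottom)
```

For a 600 x 300 pixel image this yields six boxes, and a visitor would have to collect and reassemble every one of them to reconstruct the original.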
Sirv can refuse to serve files unless the URL contains a special token, known as a JSON Web Token (JWT). Tokens expire after an amount of time that you choose. They can also contain image parameters (such as watermarks), which cannot be removed. This is a very powerful tool that would normally be complicated to implement, but Sirv has made it very accessible for developers.
Follow the steps to create JSON Web Tokens.
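For readers unfamiliar with how an expiring signed token works, here is a minimal sketch of an HS256 JSON Web Token built with only the Python standard library. The claim name `params`, the secret and the helper names are hypothetical; the claims Sirv expects and the way the token is attached to the URL are defined in Sirv's own JWT instructions, so treat this only as an illustration of the mechanism.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(secret: str, signing_input: str) -> str:
    digest = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256)
    return _b64url(digest.digest())


def make_token(secret: str, claims: dict, ttl_seconds: int,
               now: Optional[int] = None) -> str:
    """Create a signed token that expires ttl_seconds after `now`."""
    issued = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {**claims, "exp": issued + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    return signing_input + "." + _sign(secret, signing_input)


def verify_token(secret: str, token: str,
                 now: Optional[int] = None) -> Optional[dict]:
    """Return the payload if the signature is valid and not expired."""
    current = int(time.time()) if now is None else now
    try:
        header_b64, payload_b64, signature = token.split(".")
        expected = _sign(secret, header_b64 + "." + payload_b64)
        if not hmac.compare_digest(expected.encode(), signature.encode()):
            return None
        payload = json.loads(_b64url_decode(payload_b64))
    except ValueError:
        return None  # malformed token
    if payload.get("exp", 0) <= current:
        return None  # expired
    return payload
```

Because the payload is covered by the signature, anything placed in it (such as a watermark setting) cannot be changed or removed without invalidating the token.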
Sirv makes it easy to brand your images with a logo or text watermark. Choose the location, style, opacity and size. Watermarks can be applied either via the URL or a profile (recommended if you have many images).
Credit: Will Copestake & Sidetracked
Watermarks could be stripped from an image by editing the URL, so to prevent this, you can apply a strict profile. This profile will apply to all images in a folder (and its subfolders). Enable it from the Domains page of your Sirv account.
To avoid obscuring your images with a visible watermark, you can apply a hidden watermark. Digimarc is a third-party service that provides image watermarking that humans cannot see but software can. Sirv can automatically apply Digimarc to your images, which Digimarc can then identify in the future. Digimarc is designed for Enterprise usage, with typical costs starting from $2000/month. Contact Digimarc to discuss your requirements.
One of the many optimizations Sirv makes to serve images as fast as possible is to remove metadata. If your metadata contains copyright or other important information, you can set Sirv to keep the metadata in your optimized images. It won't stop people using images without your permission, but it can help people find out who owns the image so they can ask to license it from you.
Simply set the "Strip image metadata" option to "Disabled" in your Default profile:
Scraping is a technique performed by bots to go through a website page by page and save information from it. To protect your images from being scraped by a bot, identify the bots and block them from accessing your files.
A very determined scraper could try to avoid detection of its activity, but you can successfully discourage bots by making it hard for them. It's important to do this in a way that distinguishes between genuine traffic and scraping traffic, to avoid blocking real users and search engines.
Rate limiting is something that can be configured on your server, to limit the number of web pages served to IPs displaying suspicious activity. Learn more in this Guide to preventing Webscraping.
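As a sketch of what rate limiting means in practice, the following sliding-window limiter allows each IP a fixed number of requests per time window. The class name and the limits are assumptions for illustration; most web servers and CDNs offer their own built-in rate limiting that you would configure instead of writing this yourself.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_ip: str, now: float) -> bool:
        hits = self._hits[client_ip]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: serve an error instead
        hits.append(now)
        return True
```

A server would call `allow()` with the client's IP and the current time for every page request, and respond with an error such as 429 Too Many Requests when it returns `False`.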
Sirv blocks all known scraping bots. However, new bots can be created and existing bots can camouflage their activity. To stay ahead of the bots, check the Browsers tab shown on your Analytics page, to identify any IPs and user agents that are requesting an abnormally high number of files. Investigate their authenticity and report them to Sirv if you believe there may be misuse.
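If you also have raw request logs, a check along these lines could help you spot clients requesting an abnormally high number of files. The input format (a list of IP and user agent pairs), the function name and the thresholds are all hypothetical; this is not a Sirv API, just one possible way to define "abnormally high".

```python
from collections import Counter
from statistics import median
from typing import Iterable, List, Tuple

Client = Tuple[str, str]  # (ip_address, user_agent)


def flag_heavy_clients(
    requests: Iterable[Client],
    factor: float = 10.0,
    min_requests: int = 100,
) -> List[Tuple[Client, int]]:
    """Return clients whose request count far exceeds the typical client.

    A client is flagged when it made at least min_requests requests and
    more than `factor` times the median count across all clients.
    """
    counts = Counter(requests)
    if not counts:
        return []
    typical = median(counts.values())
    flagged = [
        (client, n)
        for client, n in counts.items()
        if n >= min_requests and n > factor * typical
    ]
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

Anything this flags still needs a manual look, since a legitimate search engine crawler can also produce a high request count.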
Your Top 10 browsers table looks like this:
Bots should follow the rules in your website's robots.txt file. Bad bots won't, but good bots will. Disallowing your images there will prevent them from being indexed and shown in Google Image search results and other image search engines.
To tell bots not to index any of your images, disallow specific file type extensions in your robots.txt file:
User-agent: *
Disallow: *.jpg
Disallow: *.png
Disallow: *.gif
If you would like major search engines such as Google Images to index your images, you can disallow all bots by default and specifically allow certain bots, like so (Slurp is Yahoo's bot):
User-agent: Googlebot
Allow: /

User-agent: Googlebot-image
Allow: /

User-agent: Bingbot
Allow: /

User-agent: msnbot
Allow: /

User-agent: Slurp
Allow: /

User-agent: Pinterestbot
Allow: /

User-agent: *
Disallow: /
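If you want to check how a well-behaved bot would interpret rules like these before you publish them, Python's standard `urllib.robotparser` module can evaluate them. The sketch below uses a shortened version of the allowlist, and the image URL and the `SomeScraper` user agent are made up for illustration. Note that this parser does not understand wildcard patterns such as `*.jpg`, so it is only suitable for testing path-based rules like the ones here.

```python
from urllib.robotparser import RobotFileParser

# Shortened version of the allowlist: named bots allowed, all others blocked.
RULES = """\
User-agent: Googlebot
Allow: /

User-agent: Bingbot
Allow: /

User-agent: *
Disallow: /
"""


def may_fetch(user_agent: str, url: str, rules: str = RULES) -> bool:
    """Return True if a rule-abiding bot with this user agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    image = "https://example.com/images/photo.jpg"  # hypothetical URL
    for agent in ("Googlebot", "Bingbot", "SomeScraper"):
        print(agent, may_fetch(agent, image))
```

Keep in mind that this only shows what a rule-abiding bot would do; a bad bot can simply ignore robots.txt.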