Dealing with Scraping from a Blogger’s Point of View

All About Romance was a victim of plagiarism a week or so ago. A new blogger had taken some reviews of theirs and posted them as her own. This is actually more common than you may think. Many times, blog sites called scrapers will take your RSS feed and republish it entirely under a different name, generating ad income from your work. It’s bad and shouldn’t be done.

Fortunately for bloggers, we have the same rights as other owners of copyright. Under the DMCA, an internet hosting site is protected from being sued for copyright infringement only if it acts quickly to respond to complaints of infringement. In fact, some argue that sites are too quick to take down content even when the content is not infringing.

Because of that, you need to follow each web hosting service’s rules for notifying it of infringement. As an example, Go Daddy.com’s copyright policy requires a notice to include the following:

An electronic signature of the copyright owner, or a person authorized to act on behalf of the owner, of an exclusive copyright that has allegedly been infringed.

Identification of the copyrighted work claimed to have been infringed, or, if multiple copyrighted works at a single online site are covered by a single notification, a representative list of such works on that site.

Identification of the material that is claimed to be infringing or to be the subject of infringing activity and that is to be removed or access to which is to be disabled, and information reasonably sufficient to permit Go Daddy to locate the material.

Information reasonably sufficient to permit Go Daddy to contact the Complaining Party, such as an address, telephone number, and, if available, an electronic mail address at which the Complaining Party may be contacted.

A statement that the Complaining Party has a good faith belief that use of the material in the manner complained of is not authorized by the copyright owner, its agent, or the law.

A statement that the information in the notification is accurate, and under penalty of perjury, that the Complaining Party is the owner, or is authorized to act on behalf of the owner, of an exclusive right that is allegedly infringed.

Google/Blogger’s DMCA takedown procedure is as follows:

Digital Millennium Copyright Act – Blogger
You can file a notice of infringement with us through our online form. Submitting a notice online ensures the quickest handling and processing of your request.

To submit a notice by fax and/or mail, please use the following format (including section numbers):

1. Identify in sufficient detail the copyrighted work that you believe has been infringed upon. This post must include identification of the specific posts, as opposed to entire sites. Posts must be referenced by the permalink of the post. For example, "The copyrighted work at issue is the text that appears on http://example.com/test/2006_01_01.html#2106."

2. Identify the material that you claim is infringing the copyrighted work listed in item #1 above.

YOU MUST IDENTIFY EACH POST BY PERMALINK OR DATE THAT ALLEGEDLY CONTAINS THE INFRINGING MATERIAL. The permalink for a post is usually found by clicking on the timestamp of the post. For example, "The blog where my copyrighted work is published on is http://copyright.blogspot.com/archives/2006_01_02_example.html."

3. Provide information reasonably sufficient to permit Google to contact you (email address is preferred).

4. Include the following statement: “I have a good faith belief that use of the copyrighted material described above on the allegedly infringing web pages is not authorized by the copyright owner, its agent, or the law.”

5. Include the following statement: “I swear, under penalty of perjury, that the information in the notification is accurate and that I am the copyright owner or am authorized to act on behalf of the owner of an exclusive right that is allegedly infringed.”

6. Sign the paper.

7. Send the written communication to the following address:

Google, Inc.
Attn: Google Legal Support, Blogger DMCA Complaints
1600 Amphitheatre Parkway
Mountain View, CA 94043

OR fax to:

, Attn: Blogger Legal Support, DMCA Complaints

This is a tedious, step-by-step process, but if you follow the instructions, the web hosting services are quick to take down the infringing material.
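Because the notice is so formulaic, the numbered Blogger format above lends itself to a small script. What follows is only a rough sketch, not an official tool: the `Notice` class, the `build_notice` function, and the sample e-mail address and signer name are all made up for illustration, and the two statements are copied from items 4 and 5 above. Other hosts may want different wording, so check their policy first.

```python
from dataclasses import dataclass
from typing import List

# Statements copied from items 4 and 5 of the Blogger format quoted above.
GOOD_FAITH = (
    "I have a good faith belief that use of the copyrighted material "
    "described above on the allegedly infringing web pages is not "
    "authorized by the copyright owner, its agent, or the law."
)
PERJURY = (
    "I swear, under penalty of perjury, that the information in the "
    "notification is accurate and that I am the copyright owner or am "
    "authorized to act on behalf of the owner of an exclusive right that "
    "is allegedly infringed."
)


@dataclass
class Notice:
    """Hypothetical container for the details a takedown notice needs."""
    original_permalinks: List[str]    # where your own posts live
    infringing_permalinks: List[str]  # where the copies were found
    contact_email: str
    signer_name: str


def build_notice(n: Notice) -> str:
    """Assemble the notice text, keeping the section numbers."""
    if not n.original_permalinks or not n.infringing_permalinks:
        # The format asks for specific posts, not entire sites.
        raise ValueError("each post must be identified by permalink")
    lines = ["1. The copyrighted work at issue is the text that appears on:"]
    lines += [f"   {url}" for url in n.original_permalinks]
    lines.append("2. The material that infringes the work listed in item 1 appears on:")
    lines += [f"   {url}" for url in n.infringing_permalinks]
    lines.append(f"3. I can be contacted at: {n.contact_email}")
    lines.append(f"4. {GOOD_FAITH}")
    lines.append(f"5. {PERJURY}")
    lines.append(f"6. Signed: {n.signer_name}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder values; the two URLs are the examples from the quoted policy.
    print(build_notice(Notice(
        original_permalinks=["http://example.com/test/2006_01_01.html#2106"],
        infringing_permalinks=[
            "http://copyright.blogspot.com/archives/2006_01_02_example.html"
        ],
        contact_email="owner@example.com",
        signer_name="Example Owner",
    )))
```

You would still need to sign the printed notice and send it by the host's preferred route, or paste the text into its online form.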

Scraping won’t stop, but there are tools we can use to lessen its effects. Using these tools is a blogger’s first and best line of defense.

Jane Litte is the founder of Dear Author, a lawyer, and a lover of pencil skirts. She spends her downtime reading romances and writing about them. Her TBR pile is much larger than the one shown in the picture and not as pretty. You can reach Jane by email at jane @ dearauthor dot com

32 Comments

  1. Tweets that mention Dealing with Scraping from a Blogger’s Point of View | Dear Author: Romance Novel Reviews, Industry News, and Commentary -- Topsy.com
    May 02, 2010 @ 04:09:38

    [...] This post was mentioned on Twitter by dearauthor. dearauthor said: New post: Dealing with Scraping from a Blogger's Point of View http://bit.ly/d9iyxi [...]

  2. tracy sharp
    May 02, 2010 @ 05:17:27

    This is really interesting. Thanks for posting it.

  3. Jayne
    May 02, 2010 @ 06:16:16

    I was pissed when I heard about the theft of the AAR reviews. That’s their hard work being ripped off. And from what I read elsewhere, the person who did it didn’t seem all that concerned that what she did was wrong.

  4. Joanne
    May 02, 2010 @ 06:35:01

    It must be frustrating since it can happen over and over.

    That particular blog that stole from AAR was closed but someone else will do almost the same thing. Or many someones. It’s only by accident that these scraping (I prefer stealing) sites are discovered and reported, no?

    Someone has to report their suspicions to the blog being copied and then action is taken?

  5. Ros
    May 02, 2010 @ 07:56:37

    I feel like there’s a difference between scraping and plagiarism. Both are morally wrong, of course, but I think that the appropriate response is different. A scraping site that uses some automated process to nick other people’s content is, I think, doomed to very quick failure. No one reads those sites. So yes, you could go through the DMCA procedure but my feeling is that it’s probably a waste of time and effort. None of your readership is going to be lost to the scraping blog.

    But a site which purports to be providing its own reviews or other content, but which is actually lifting that from elsewhere – yes, that could damage your readership if it’s done well. And yes, I definitely think it’s worth pursuing the individuals involved in that sort of theft.

  6. Mireya
    May 02, 2010 @ 08:00:16

    As co-owner of a reviews newsletter and its related website, allow me to add that the problem of reviews lifting or plagiarism does not only pertain to blogger reviewers. There is more to reviews plagiarism than scraping, the same way that there are more types of reviewers out there than blog reviewers. Sadly, in the case of reviews websites, it is a bit more complicated to report incidents of plagiarism and it can often take longer to clear things up. Either way, anyone reviewing (blog or not) should be aware of (1) the potential for it happening and (2) that there are ways to report this type of incident.

  7. Shiloh Walker
    May 02, 2010 @ 08:09:10

    Many times, blog sites called scrapers will take your RSS feed and republish it entirely under a different name, generating ad income from your work.

    Okay, any time somebody rips off another person’s hard work, it aggravates the hell out of me, but THIS is one thing I never even considered. I ignore ads on blogs and very often click away when I see them because they slow things down, but I never even thought about the ad revenue. So double bad. :(

  8. Heidi Cullinan
    May 02, 2010 @ 09:17:35

    I had no idea this was even an issue. God, sometimes the world makes me tired.

  9. JulieB
    May 02, 2010 @ 10:27:06

    I’ve noticed that hosts are quick to take down infringing content without a DMCA if the content is used as a vehicle to distribute malware. A couple of years ago I had a problem with malware sites scraping my blog content and found that a short, factual note to the host’s abuse address was enough to get the site taken down within 24 hours.

    It goes without saying that if you’re checking a possibly infringing site, be sure you’re practicing safe computing.

  10. Robin
    May 02, 2010 @ 10:47:47

    Although it’s been mentioned a couple of times, not a lot of attention has been focused on the fact that the blog in question was using the same name as Kristie J.’s long and well-established Romance blog. IMO that’s just as significant as the review lifting.

    Because if someone who is not a legitimate blogger is using content and name recognition from legitimate and well-established venues, you have to wonder why. Ad content is one reason, but receiving ARCs has been suggested and that sounds likely to me, too.

    No question that this is an extremely frustrating situation, but at least there is direct recourse for bloggers having their content lifted and/or infringed. Most of the time the folks scraping/lifting content are not part of the Rom blogging community, so I don’t have that sense of moral outrage that I likely would if trusted Rom bloggers were plagiarizing. But I definitely think it’s important for bloggers to know how to get the scraped/infringed content removed, and as ambivalent as I am in general about the DMCA, in this case it offers an important protection for bloggers.

  11. Jane
    May 02, 2010 @ 10:56:34

    @Ros One of the sites we found scraping content was discovered because their links (using our content) were ranked pretty high in Google results. I find it can be pretty harmful but the best thing to do is utilize the web tools that we have.

  12. Ann Marie
    May 02, 2010 @ 10:57:25

    How does one find out that their blog has been scraped, other than accidentally? I already don’t have time to read the blogs I subscribe to. ;)

  13. Jane
    May 02, 2010 @ 11:04:56

    @Ann Marie We’ve discovered it by link backs (the scrapers don’t change the links) and reports from other readers, and occasionally the sites pop up in search results.

  14. JulieB
    May 02, 2010 @ 11:09:00

    @Ann Marie: I have a couple of Google Alerts set. I also have a line of text in each entry that I can search for. I can hide it in the entry on my blog (make the text the same color as the background), but it’ll show up on RSS feeds. I can set a Google Alert on that. It’s not infallible, but it works well enough.

  15. Sarah
    May 02, 2010 @ 13:30:11

    Thanks for posting about this important issue. My blog content was recently scraped by another site; everything from the past 6 weeks had been stolen, and the other site included a copyright notice claiming it was theirs. I discovered it through my blog stats, and a guest blogger told me he was getting hits from the offending blog. Their ISP made it clear they wouldn’t look into the matter without a formal notice of DMCA copyright infringement, which I sent. I got a note back within 20 minutes that the site had been taken down. The procedure was tedious, but it got the job done.

  16. Catherine Delors
    May 02, 2010 @ 13:41:16

    Excellent idea, Julie. Thank you!

  17. FARfetched
    May 02, 2010 @ 15:58:57

    I got hit with a variation on this theme: someone pulled an episode of a long story on my blog and posted it translated into German. If I were sure it was a crappy machine translation, I’d have complained but I don’t know German well enough to say.

  18. Shiloh Walker
    May 02, 2010 @ 19:53:19

    @ Robin, You know, the ARC angle is one I’d never really even thought about. O_O

  19. dirtywhitecandy
    May 03, 2010 @ 02:50:54

    Thank you so much for this informative post. Although we put our writing, chapters, novel excerpts, stories on our blogs for free, the time we took to write them isn’t free, and it’s a basic human right to get credit for it! Grrr, blood boiling…

  20. Monday Morning Stepback: Stupidly Happy Edition « Read React Review
    May 03, 2010 @ 05:36:43

    [...] of us know by now that AAR and Ramblings on Romance were plagiarized. Jane of Dear Author has written a post detailing the recourse a blogger has if she is [...]

  21. DS
    May 03, 2010 @ 07:31:32

    The only time I accidentally clicked on a link that turned out to be a site with scraped posts, I was too worried about viruses and malware to think to report it to the genuine site.

    Of course with so many review sites online now it’s hard to determine who is legit and who isn’t.

  22. JulieB
    May 03, 2010 @ 09:32:09

    @Catherine Delors: I should have noted (shame on me!) that I got that idea from Plagiarism Today. They have some wonderful resources for dealing with content scrapers and plagiarists.

  23. viv
    May 03, 2010 @ 12:52:24

    Out of curiosity, exactly how is scraping different than the piracy issue you were ‘forced’ to blog about?

    I do appreciate that you might perceive a loss of income to the scraper, but to quote yourself:

    3) It is not the readers' job to a) get authors better royalties and b) ensure authors to make a living writing. Authors have no guaranteed right to earn a living off a writing. Some do and some don't. Just like anyone else out there who is working. No one has the right to earn a living doing a particularly thing.

    For the record, I do think it is unsavory and unethical to acquire other people’s work and distribute it without their permission. I would certainly not be inclined to visit those sites.

    I do applaud that you are not lumping all blog owners into this group, and attacking them wholesale, as you perceived authors to be doing to you. I just wish you had treated authors that had actually supported you with more respect.

    So I suppose in the end, I must paraphrase your own question to authors: What do you reasonably expect us, your readers, to do about scraping? Why should we show any more care about your dilemma than you showed authors?

  24. Jane
    May 03, 2010 @ 13:03:21

    Hi @viv. Thanks for your comments. I don’t expect readers to do anything. I shared this information for other bloggers in case they weren’t aware that they could have this information taken down. Authors can and should use the DMCA to get copyrighted information taken down as well. As I stated in the blog post, it can be tedious but also effective.

    I also appreciate that you are an avid follower of Dear Author. We certainly appreciate your readership and support.

    As for treating authors with “respect”, I was responding to comments made in that thread that vilified readers for sharing digital books, equating reader sharing with felonious activity. Perhaps when some authors start treating digital readers with respect and aren’t so quick to accuse readers of being thieves and engaging in a campaign to humiliate and shame readers for sharing, I can, in return, afford those authors respect.

  25. viv
    May 03, 2010 @ 14:11:31

    @Jane:
    I actually did go re-read that article today before I posted, mostly to make certain that my memory of your anger and disdain of authors that you showed in the piracy blog post was accurate.

    There were a number of authors in the original owner’s copyright post that did clearly conflate sharing (a legitimate and protected practice under many terms of service) and piracy (unauthorized distribution of content not too dissimilar to scraping).

    There were a few authors who were clearly horrified even at the concept of legitimate sharing, and did absolutely nothing good for the impression that you received, that all authors automatically equated sharing with stealing.

    I understand your aggravation, and growing disdain of those authors who behaved in that fashion. Considering the language they used, both in that post, and in the original twitter event that inspired the post, I do not blame you in the least in getting the impression that those authors believed that e-book readers were inherently without ethics or discretion when it came to where they got those books. As a fellow e-book reader looking over those particular posts, I got the same feeling.

    The problem I had was that there were two specific incidents in the copyrights post where authors came out in support of your position and you treated them like something you stepped in. Those were the ones I linked to.

    I actually thought both Meljean and Karen Templeton were fairly clear that they did not support the sharing = piracy camp.

    Now considering how long the thread had gone on before Karen came on, I can see how you saw her attempt to engage in the conversation as an attack on your position, regardless of how clear she tried to make it that to her, sharing did not equate piracy.

    She did include her own anecdotal experience that e-books were easier to pirate than print versions. You made it clear in your conversation with her that it was for this ‘sin’ you deemed her an enemy. You made it clear to her (350) that bringing up piracy at all was a derailment of the topic at hand.

    I can see how bringing up the ease of piracy in the reader copyright discussion could help build the confusion/equivalence between sharing and piracy that you were trying to break down. I can certainly see how increasingly frustrated you got as the ‘but piracy is bad’ kept coming around.

    Trying to keep the topic on what is legally allowed the reader is your right and privilege as the blog owner. In this particular case though, the original twitter post that inspired you happened because of a conflation of piracy and sharing. It was the entire reasoning behind your outrage – that sharing is not piracy.

    That still brings in piracy, what it is and is not, as a part of the conversation. I actually thought that Karen made it fairly clear in a later post that she was trying to explain why she thought that sharing and piracy were being conflated in this panicked way. Considering the source of your post, I would have thought this actually fit with the topic at hand.

    That she attempted to lighten the mood with humor that was unwelcome is unfortunate, but I see nothing in her posts that shows you disrespect.

    Looking at the amount of arguing you had to do in that thread, I can appreciate lashing out at a perceived attack, but I thought your treatment of Meljean was uncalled for.

    Meljean posted all of once; to say that she did not believe this way, and hoped her statement would help with your perception that all authors thought that because someone was an e-book reader, they were a thief.

    You verbally bitchslapped her. She said nothing about piracy, did nothing to implicate that people that shared books were bad, and you pretty much spit on her. To say the least, this still leaves a bad taste in my mouth.

    Your last paragraph gives the impression that because some authors have shown disrespect, no author deserves respect from you. This is emphasized with the two examples I gave above. Does that mean we should treat you with the same respect we would give a scraper?

  26. Jane
    May 03, 2010 @ 15:08:40

    @viv – I’ll have to go back and re-read my comments to Karen Templeton, but my response to Meljean was a joke and she knew it. I’m sorry that you were offended on her behalf but I assure you that I was not verbally bitchslapping her nor would she think that.

    I cannot control how you perceive me, through your own personal lens. If you want to view me as a scraper, that is your choice. And if you choose to deliberately misread my comments to suit your perception, that is also your choice.

  27. viv
    May 03, 2010 @ 16:52:29

    …You know, considering how high the emotions were on that thread, I never considered that you were speaking in jest to Meljean. You are correct, that does change quite a bit on the context of that exchange.

    I suppose a lot of this is how each of us perceives the world through our own lens.

    I don’t think you are a scraper, or have the same ethics/behaviors of a scraper, and I do apologize for giving that impression.

    I can only infer intention from what I read, in the context that they were presented, just as anyone else. That’s pretty much why blogs exist. I like to think part of why you allow commenting to the extent you do is at least partially so we can find common ground between our individual perceptions, and overcome perceptional failures.

    I am perfectly willing to admit that the high passions of the copyright post could very well have colored my perceptions of your comments in that post. I actually thought that most of your responses were perfectly reasonable, with the exception of the examples I already gave.

    I am very glad to find I misread the context of the interaction that troubled me the most. I do hope I misinterpreted your response to Karen as well. It is the context of those interactions, after all, that led me to believe that when you said ‘those authors’, you actually meant ‘all authors’.

    Thank you for taking the time to address my concerns here, even though the connection is admittedly dubious.

  28. MaryK
    May 03, 2010 @ 19:09:31

    @Jane: @viv: If anyone cares, I read the comment to Meljean as a joke.

  29. Meljean
    May 04, 2010 @ 01:01:36

    I definitely took it as a joke :-)

    Jane, thanks for the info on scraping. My blog content has been pitiful lately. Now I know how to quickly add new posts! I’m eyeing DA’s review of Meredith Duran’s latest in particular, because I’m dying to read it … but, alas, no time. I can pretend, though, and I’m sure no one would notice.

    Or maybe the review for Patience? I’d probably get more hits with that one.

    Decisions, decisions…

  30. Anion
    May 04, 2010 @ 17:02:42

    Just stepping in late here to say that Livejournal has a special policy for bot/scraping blogs. You don’t have to go through the DMCA legal phrasing, you just report it as a bot and they take it down. It’s a lovely simple process.

  31. Bonnie
    May 05, 2010 @ 11:45:50

    @FARfetched I speak/read fluent German and would gladly have a look if you wanted to post links here.

  32. Stumbling Over Chaos :: It’s far too late to come up with a remotely clever linkity title
    May 06, 2010 @ 01:03:40

    [...] What should you do if someone’s been scraping (ie, plagiarizing) your blog? [...]
