Romance, Historical, Contemporary, Paranormal, Young Adult, Book reviews, industry news, and commentary from a reader's point of view

Author Frequently Asked Questions Regarding Text to Speech Functionality

At the suggestion of Peter Brantley of the Internet Archive, I offer up this Frequently Asked Questions for Authors regarding the Text To Speech (TTS) functionality that is the subject of debate. This may be an evolving document as more people provide input so that it adequately addresses the issues. Please feel free to offer suggestions and/or revisions in the comments section.


Q: I’ve heard that there is some debate over Kindle’s Text to Speech Function. What is it and should I be concerned?

A: When Amazon released its Kindle 2 in February, it announced that it had included the ability for every document/book/written work on the Kindle to be read aloud using a robotic voice (either male or female). You can hear a sample of it here as read by Wil Wheaton. The TTS functionality was switched “on” as a default. The Authors Guild objected to this on the basis that the right to read a book out loud was an audio right, a derivative right of authors under the Copyright Law.

Q: What exactly is the legal argument that Authors Guild is making?

A: Ideas cannot be copyrighted, only their expression. For authors, that expression is the written work — a book, short story, novella, etc. For each written expression/work, there are rights the creator has to control how the work is copied, to whom it’s distributed, how it’s distributed and in what form, and the like. These rights are part of the creator/author’s copyright, which is their ownership interest in the written work, which is the intellectual property of the author. Authors can and do sell some of these rights to publishers, which is what allows the written work to reach a broader audience — the publisher then possesses the right and exercises it in producing, printing, and distributing the author’s work. While the author still owns the expression itself, the content, she does not necessarily retain every right she has in the process of bringing that content to a larger audience.

Rights are traditionally understood to be the following and can be found in Section 106 of the Copyright Act:

  • Reproduction: the copying of the work in a fixed form such as the print or ebook version.
  • Derivative work: works produced which are based on the original work.
  • Distribution: the selling, leasing or otherwise transmission of the work from the copyright holder to anyone else.
  • Public Display: show a copy of the work in public.
  • Public Performance: a performance of the work in public.

These rights can be broken down and sold in as many pieces as the author can envision and others are willing and able to pay for. Some authors have been able to sell their digital rights separately from their print rights (a split of the reproduction right). Many authors make deals based on geography, selling first their home territory rights (e.g., North American rights) and then selling foreign rights (a split of the distribution right).

The Authors Guild’s argument is that the audio right is a derivative one and that TTS is infringing on the derivative audio right. From the Copyright Act:

A "derivative work" is a work based upon one or more preexisting works, such as a translation, musical arrangement, dramatization, fictionalization, motion picture version, sound recording, art reproduction, abridgment, condensation, or any other form in which a work may be recast, transformed, or adapted. A work consisting of editorial revisions, annotations, elaborations, or other modifications, which, as a whole, represent an original work of authorship, is a "derivative work".

Q: This sounds right to me because I do have audio rights for which I get a separate royalty when and if my book is made into an audiobook and sold.

A: The fact that an audio book is a derivative right doesn’t end the analysis. First, the TTS technology itself is not infringing because it can be used for substantially legitimate purposes such as increasing accessibility to the vision and learning impaired. Second, the transcription of text to voice is not a derivative work. It does not recast, transform or adapt the work into something else. The ability of a computer to read aloud to the purchaser of an ebook is not different than a parent reading to his child or a daughter reading to her vision impaired grandmother. The Authors Guild is arguing that the ebook with TTS enabled is essentially two rights: the reproduction of the work PLUS the derivative audio version. Instead, the digital copy with TTS enabled is simply one digital copy that could employ existing technology to have the text voiced.

Authors do have the right to control the public performance of their works as specified above. According to the Copyright Act:

To perform or display a work "publicly" means -

(1) to perform or display it at a place open to the public or at any place where a substantial number of persons outside of a normal circle of a family and its social acquaintances is gathered; or

(2) to transmit or otherwise communicate a performance or display of the work to a place specified by clause (1) or to the public, by means of any device or process, whether the members of the public capable of receiving the performance or display receive it in the same place or in separate places and at the same time or at different times.

In order for the Kindle TTS to violate an author’s right to control the public performance of her work, the Kindle would have to be used where a substantial number of people outside the family are gathered. A reader employing technology to have words read aloud to him or her, particularly in the privacy of the home, does not constitute a public performance. In most cases, the Kindle TTS function is not a performance in any way covered under the Copyright Act; it is merely a convenience for the reader to have an electronic rendering of the words in the same way a car’s GPS system might read an address or a street name to a driver.

Q: I’m concerned that if a reader can employ a read aloud feature in her home or her car or while traveling or working out that audiobook sales will decline. Shouldn’t I protect myself against that?

A: First and most importantly, the individuals who stand to gain the most from TTS being enabled are those with vision and learning impairments. Currently those with vision impairments do not have the ability to read every book published. The aim of authors should be to get their books into the hands of every individual who is willing to purchase it in formats that meet the consumer’s abilities. Individuals with vision or learning impairments could be greatly aided by the TTS.

Second, the existing TTS technology is not on par with audiobook performances. Individuals who are buying audiobooks for performance reasons would not be satisfied with TTS. As evidenced by the Wil Wheaton example above, the difference in performance quality between the computer-generated reading and a specifically recorded audiobook is substantial. You could listen to the Kindle 2 and iPod Shuffle perform a scene from Star Trek II. The TTS is a much inferior product to the audiobook. At Open eBook, there is another comparison between the TTS voice and a performance. As the Open eBook article notes, the disabling of TTS actually makes the book unreadable for those who are vision or learning impaired.

The Authors Guild says that TTS technology is improving by leaps and bounds and that the sales of audiobooks are in jeopardy. The Authors Guild’s argument fails in two ways. First, copyright laws deal only with existing infringement, not possibly infringing uses at some point in the future. Second, copyright infringement occurs when someone makes a copy (or infringes on another one of the exclusive rights granted to creators), not when someone loses money. For example, the VCR certainly changed the economic structure of the movie and television industry, but the question of whether the VCR infringed on copyright depended on details of its use. Likewise, a person retains copyright in works even if she has no intention of making a commercial profit on them.

Third, TTS is a ubiquitous technology. It exists in various forms in nearly every piece of electronics we own. Our phones contain TTS capabilities in the form of voice dialing. There are programs that will do the reverse of TTS by taking audio and transcribing it to text such as programs that will take your voicemail and create texts that are emailed to you. TTS exists on nearly every computer and soon will be incorporated into many other devices including but not limited to digital ebook readers.

Q: What should I do if I want my books to be enabled?

A: It depends. First, we have to operate under the erroneous premise, adopted by Amazon, that TTS is part of an author’s bundle of audio rights. Getting the TTS enabled would require the rightsholder (whether that is the author or publisher) to contact Amazon with two things:

a) Proof of the right of ownership over the audio rights.

b) A request to enable TTS.

Jane Litte is the founder of Dear Author, a lawyer, and a lover of pencil skirts. She spends her downtime reading romances and writing about them. Her TBR pile is much larger than the one shown in the picture and not as pretty. You can reach Jane by email at jane @ dearauthor dot com


  1. sallahdog
    May 24, 2009 @ 06:45:10

    there is a big difference between a robotic voice reading a book and a skilled reading of an audiobook. I am not a huge ebook fan, but spend wayyyy too much money with… It wouldn’t make me run out and buy a kindle and forgo an audiobook…

    It would be nice for someone like my mom, with cataracts, since a lot of the books or magazines she likes are not available in audio version


  2. TerryS
    May 24, 2009 @ 10:10:49

    Thank you for this concise reprisal of the problem and, more importantly, keeping this issue in the foreground. I consider TTS being turned off to be a major problem. In fact, if I know an author supports the Author’s Guild stance, no matter how much I may like an author, for me they become an automatic “do not buy” in any format. I have not purchased any books from Amazon since they “caved” to the Author’s Guild demand and disabled TTS on the Kindle…and I don’t even own a Kindle.

    Why? Access rights for the blind and the learning disabled to all books is more important. Did you know the fastest growing population of the blind are baby boomers? Also accidents, illness and war injuries occur every day. Even as we say it will never happen to me, the truth is diminished vision or blindness or disability affecting how we read could be the future of any one of us.

    I personally have no need and absolutely no interest in using TTS in place of an audio book. How fortunate I am my vision allows me unimpeded access to any reading material in the world. It’s my choice if I want to listen to an audio book, and even then my choices are limited. There isn’t now and never will be an audio book available for every print book. Why do we want to limit selection and narrow the world for those who have no choice but to depend on TTS?


  3. Silver James
    May 24, 2009 @ 10:17:01

    Do library reading services for the blind (those who read books to tape in order for blind patrons to check out) pay extra for this “right”?

    As an author, my goal is to get my book (once it is released) into the hands of as many readers as I can. As the mother of a daughter who has undergone cornea transplant surgery, cataract and lens implant, two separate glaucoma procedures, and laser surgery on the retina, I want her ability to “read” a book to remain as hassle free as possible.

    To me, this is an ADA (Americans with Disabilities Act) situation. I talk to my Garmin all the time. *He* may sound Australian, but no matter how many times he says, “Recalculating”, the inflection never changes. How long does a vision or learning impaired reader have to wait for books to come out in audio (or be read on tape through the auspices of a library for the blind)? I would think that this ability would help drive the sale of books – new releases and back lists.

    I need coffee as this has started to ramble so I’ll shut up here. Bottom line? TTS seems like a good idea to me as I think it would improve sales.


  4. Dear Author FAQ on text to speech debate « Malle Vallik's Blog
    May 24, 2009 @ 11:01:53

    [...] Author FAQ on text to speech debate Jane from Dear Author has started an FAQ on the text to speech debate.  For authors who want to know more about TTS and the debate do check [...]

  5. Aloe I
    May 24, 2009 @ 12:08:11

    This is absurdly biased. For all the legal posturing, your whole argument comes down to the fact that the TTS doesn’t sound like a real person. But in five years TTS may sound like Morgan Freeman, at which point an author’s audio rights will be worthless.

    You write:
    “the transcription of text to voice is not a derivative work. It does not recast, transform or adapt the work into something else.”

    This is fundamentally wrong. The ability of a computer to read a text file is transformation – and comparing it to a parent reading to a child is specious. By your definition, an actor could broadcast themselves reading a book aloud on television and it wouldn’t be infringing.

    It is surprising to see someone so appalled at #amazonfail lining up to help amazon steal revenue from authors.


  6. Robin
    May 24, 2009 @ 12:26:41

    @Aloe I: IMO your comment demonstrates the central fallacy of the AG’s position, namely that a right exists on the basis of “what could someday be.” The law is not sympathetic to such a position, demanding that in order to have a legal claim, a party must a) show injury, and b) show damages (otherwise known as the requirements of standing and ripeness).

    The AG appears predatory to me, and overreaching, in the way it continues to attempt to carve out a right based on potentialities. For one thing, TTS isn’t Morgan Freeman interpreting a book (that’s more akin to audiobooks, and they are covered under derivative rights), and even if the TTS sounded dead on human, why does that automatically create a derivative right under existing copyright law? You say that it is “specious” to compare a Kindle reading a book to a parent reading to a child, but why? You offer no reasoning for that dismissal, nor has anyone else I’ve seen who makes that denial. IMO the argument is actually stronger that TTS is less performative than the parent reading to the child, because the parent is more likely to assume different voices or tones while reading to a child. How does the argument go, exactly, in the other direction? And if your position is the one we should adopt, what does that mean for software programs, for example, where a person can speak into a microphone and have a computer write out the words? What happens to the analogy when you switch the positions of speaker and text?

    Beyond all that, though, that is, beyond the question of what might be but ISN’T, there is also the way the Author’s Guild has, IMO, conflated two very different things in making its arguments, namely the so-called market analysis and the pretense of a legal wrong. In other words, it seems to be saying that because this kind of thing can hurt authors’ marketability, it’s a violation of copyright. However, profitability is not guaranteed by copyright law. While a court may do a market analysis to determine whether there has been unlawful infringement of a copyright, copyright itself is not a protector of profit; it is a right to control certain aspects of a work that is limited in scope and duration. And the limitations of *scope* are just as important and real as the limitations in *time*.


  7. Aloe I
    May 24, 2009 @ 14:10:16

    Robin: I think TTS is transformative out of the box. I use Morgan Freeman to illustrate the point.

    That said: tucked in both your and Jane’s legal claims about copyright’s inability to legislate “what would someday be” is a tacit acknowledgment that if the TTS did sound like Morgan Freeman it would be transformative. And my point is that if you acknowledge that a better sounding bot would be transformative, then TTS is transformative even if it sounds like a tin robot. After all, I might prefer robo-lilt to Morgan Freeman.

    I’m calling the argument about parents reading to their children specious because it is an image used to evoke a world where the Author’s Guild wants to prevent parents from reading to their children (oh my!). But there is a world of difference between the noncommercial reading of a book to a child and a wholesale grab of audio rights by Amazon.

    Frankly, if there is any company that is positioning itself to control access to literature, it’s amazon — aided by the digital evangelists who are its eager helpmeets.


  8. Jane
    May 24, 2009 @ 14:23:35

    @Aloe I My point is that TTS is not a derivative product. The act of a mother reading to her daughter is more of a transformative act than a computer device employing technology to read text aloud. There is a two layered component here. First, the technology itself is not abetting infringement. Second, the use of the technology is not transformative enough to create a derivative product and therefore it is not infringing.

    You argue that it is transformative, but under the definitional code provisions, the text to speech doesn’t apply. First, a derivative work must be in fixed form. TTS creates no fixed form of the original work unlike an audio recording, be it digital or otherwise. Second, if it is simply a performance (ephemeral in time) then it must meet the public criteria, which it does not. Merely arguing that the work is transformative is not enough for the TTS to become infringing, although I would argue that it is not transformative.

    I don’t disagree that Amazon is positioning itself to become the dominant market player for literature. I’ve argued that time and again here at Dear Author. However, the TTS is an issue far broader than the reaches of Amazon. It’s perfectly in line with my position as a consumer advocate.


  9. Robin
    May 24, 2009 @ 14:41:57

    @Aloe I:

    That said: tucked in both your and Jane's legal claims about copyright's inability to legislate “what would someday be” is a tacit acknowledgment that if the TTS did sound like Morgan Freeman it would be transformative.

    Not so. Here’s exactly what I said:

    For one thing, TTS isn't Morgan Freeman interpreting a book (that's more akin to audiobooks, and they are covered under derivative rights), and even if the TTS sounded dead on human, why does that automatically create a derivative right under existing copyright law?

    I do not think it’s legally sound to insist that if TTS were recognizable as a human voice, it would automatically create a derivative right. Note the language in the law:

    A “derivative work” is a work based upon one or more preexisting works, such as a translation, musical arrangement, dramatization, fictionalization, motion picture version, sound recording, art reproduction, abridgment, condensation, or any other form in which a work may be recast, transformed, or adapted.[emphasis mine]

    IMO authors who are taking AG’s side in this are doing so against their own best interests, especially if they are doing so simply to spite Amazon. You don’t think that if Amazon had come out against TTS at the beginning, AG wouldn’t have been trumpeting their wisdom and slapping their metaphorical backs in solidarity? This is not about Amazon, and it’s not about taking legitimate rights away from authors. But as long as authors see it that way, they will, IMO, be laboring against their own best interests and against the rights they do have in favor of those they do not.

    What is so ironic here to me is that the DMCA has already stripped away long-standing rights that readers have had, most notably their rights under the first sale doctrine. And because of that, ebooks represent probably the one market that, if developed to its potential, could ostensibly wipe out a legitimate secondary market for books (now represented by the UBS, paperback swap, eBay, etc.) and a substantial drain on author royalties/publisher profits. And yet so many authors seem overwhelmed by a fear of piracy as an excuse not to expand the digital market (because, of course, no one scans in paper copies of books for download).

    It continues to flabbergast me, frankly, that the way in which digital books actually inhibit the legitimate legal rights of readers is totally ignored by authors and publishers when readers advocate for greater digitalization of publishing. I definitely see it as one of those ‘forest, meet trees’ situation, but more than that, I see it as a fundamental misunderstanding of IP rights and the general nature of IP. For example, how many authors know exactly what rights they have ceded to publishers, let alone the extent and limitations of those grants?

    I doubt I will ever truly understand why authors are so willing to unconditionally support corporate publishers — who are, IMO, the real rights hounds here — over and against their own interests and the interests of readers. Again, IMO this is indicative of the way in which some authors misunderstand the nature of the rights they hold, overestimate their control over rights that are limited, undervalue those they have that are valuable, claim rights they don’t have, and ignore those they do. And in the case of TTS, I do not believe that AG is doing anything to remedy that.


  10. Aloe I
    May 24, 2009 @ 15:05:36

    Jane – thanks for the response, but I still find fault in this line of reasoning.

    I’m happy to acknowledge that the copyright law as it stands now isn’t going to give a satisfactory answer to this issue. Whether TTS falls under fixed form is one to be battled out by lawyers I suppose (I would argue that a digital product that can be sold and revoked is fixed form – just because it doesn’t live on a hard disk doesn’t mean it doesn’t exist, or isn’t fixed).

    But aside from the technicalities of whether or not TTS is infringement under current copyright law, what I don’t understand is why you wouldn’t be advocating protecting author’s rights in this case?

    There is a moral argument here — and one that is surprisingly clear if we stop assuming that all things that lead towards free content are good.

    If I can license audio rights for a book, and amazon sells an audio version of my book without compensating me, Amazon has, as Harlan Ellison might say, got their fingers in my pocket. It doesn’t make a blind bit of difference if the voice sounds like a tin-robot or a heavenly choir. I’m the creator. I have a right to sell my audio rights. I don’t see how you can square that with your position.

    You seem to be dancing around the issue of whether TTS is an audio book — and your claim seems to be that without the performative imbuing of personality it isn’t. But just because it’s a computer reading it and not Morgan Freeman doesn’t mean it’s not being performed. Video games routinely use much higher quality TTS programs to “perform” text. And honestly, we know that it’s likely that in our lifetime TTS will sound like Morgan Freeman. So the notion that the AG don’t understand the technology is insulting. They understand exactly where this will go — which is why the argument that “it’s not a performance because it sounds like a robot” is particularly galling.


  11. Jane
    May 24, 2009 @ 17:19:04

    @Aloe I I’m not advocating for protecting author’s rights in this case because I don’t believe that TTS is a derivative audio work. The audiobook is a derivative audio work, fixed in form. The spoken text is not, nor is it a public performance. Therefore TTS is not violating any right of an author. Believing that, I fail to see how this is a “moral issue”.

    My claim isn’t dancing around anything. I’ve specifically stated in previous comments and in this current one that I don’t believe that TTS is a derivative work, nor does it constitute a public performance. In fact, even if the text to speech function was superior to the audio book, I would make the same argument.


  12. Steve Weber
    May 24, 2009 @ 18:42:51

    Why did Amazon reverse themselves on this, about a week after this feature was introduced? My hunch is, someone at the company (belatedly) realized this would go to court very quickly, and that Amazon would lose. It’s a slam-dunk. Read the Copyright law.

    Look at it this way: Let’s suppose that a publisher’s books aren’t available in audiobook. The publisher doesn’t want to gamble on producing audiobooks — they’re not sure if the sales will cover the cost of production. Instead, the publisher downloads the books on a Kindle, records Kindle’s text-to-speech rendition of the books, then begins selling those files as audiobooks.

    How do you think Amazon would like them apples?


  13. Robin
    May 24, 2009 @ 18:49:57

    Seriously, who among you that thinks TTS is a derivative right will actually make an argument to support that position? Not just an assertion, but a substantive, legally-grounded analysis?


  14. Aloe I
    May 24, 2009 @ 19:17:27

    Jane – If that’s your position then you really should start this FAQ (which is ostensibly for the edification of authors) with an acknowledgment of your position. You’re presenting this as if it’s an unbiased legal overview, when in fact, you have a very clear position: “I'm not advocating for protecting author's rights in this case because I don't believe that TTS is a derivative audio work.”

    I suspect that most authors will not share that point of view.

    Robin – it doesn’t take a legally-grounded analysis to show to reasonable people that Kindle’s TTS books are audio books. They are books. They are audio. It’s only the defenders of TTS who find themselves knotted up using copyright law to define TTS as something other than audio books.

    I’d challenge anyone to come up with a technical explanation of why TTS isn’t an audiobook (particularly now that Jane has admitted that the performative aspect of the work isn’t the point.) And if your argument comes down to the definition of “fixed works” in the digital age, I’d wager that Occam’s razor, and common sense, will sway the courts.

    And again, I don’t understand why Amazon and the consumer are more important here than the authors who, frankly, are behind it all.


  15. Jane
    May 24, 2009 @ 19:24:14

    @Aloe I I have to confess I’m totally confused by what you are saying here. It is in the text of the post that I don’t believe that TTS is a derivative audio work. I don’t care that authors as a monolithic group do not agree with me. It is what I believe to be the correct interpretation of the law.

    Both Robin and I have laid out the position why, legally, TTS isn’t a) a derivative work and b) a public performance. Fixed works is actually an issue that the courts have grappled with in the digital age and it doesn’t apply to text to speech.

    Why shouldn’t the consumer be important? As Robin says, the consumers rights have been eroded to a far greater degree than the authors under digital publishing.


  16. Robin
    May 24, 2009 @ 19:34:15

    @Aloe I:

    Robin – it doesn't take a legally-grounded analysis to show to reasonable people that Kindle's TTS books are audio books. They are books. They are audio. It's only the defenders of TTS who find themselves knotted up using copyright law to define TTS as something other than audio books.

    AG’s position from the beginning has been that TTS is a violation of the author’s copyright, and therefore they have posed it as an inherently legal issue. Authors are concerned because they are being told by AG one of their legal rights is being violated. How can anything *but* a legal analysis be necessary and sufficient here?


  17. Aloe I
    May 24, 2009 @ 19:37:19

    Robin ,

    It’s a book that you listen to on headphones. It doesn’t take a lawyer to identify that as an audiobook. Although evidently it does take a lawyer to explain why it isn’t one.


  18. kirsten saell
    May 24, 2009 @ 23:12:34

    I'd challenge anyone to come up with a technical explanation of why TTS isn't an audiobook (particularly now that Jane has admitted that the performative aspect of the work isn't the point.)

    TTS isn’t a book of any kind. It’s a reading tool.

    Like other tools, such as VCRs and TiVo, it can be used in ways that are infringing (such as using TTS to read an ebook to a large public gathering or over a radio broadcast), and in ways that are not infringing (like plugging in your earbuds and listening in private). IIRC, VCRs were found to be not inherently in violation of copyright because the benefits of their non-infringing uses outweigh their potential infringing uses.

    An ebook is an ebook. It’s text in digital format. That’s all. If the technology exists for computers to interpret this text in robotic voice, a facsimile of Morgan Freeman, or to the tune of a Verdi opera, it’s still, in the form it is distributed to the consumer, only an ebook.

    If we’re debating the definition of transformative as it applies (or doesn’t) to TTS, doesn’t the same definition apply to teachers reading to children in the classroom? Or a conversion program that takes a PDF file and turns it into an epub file? One could argue that the act of reading itself, even in total silence and privacy, is a transformative act. Once the words leave the page and enter the reader’s brain, they become more than just the words.

    Since we’re pretending the technology will evolve to the point where TTS will sound like Morgan Freeman, why not take the argument a couple hundred years further into the future. If Commander Data reads a book aloud to Captain Picard, is he in violation of copyright simply because he’s a machine? IMO, a computer’s TTS is nothing more than eyes and a mouth. It doesn’t transform the work. It reads it.

    I’m an author. I’m in favor of TTS. Perhaps I would feel differently if, like many authors, I made a pathetic 6-8% royalty on my ebooks, and a larger percentage on audio. But the 30-40% I earn on each ebook I sell more than compensates me for the small fraction of readers who prefer or need to listen to their books. Either way, I’m making a decent buck off the sale. Readers have access to my work, and I have access to their dollars.

    If the Authors’ Guild got off their butts and started to push for better ebook royalties for all authors, that’s a campaign I could get behind. And if they were successful, I would imagine all this TTS nonsense would likely go away, like it should.

    In the meantime, there’s nothing stopping publishers from disabling TTS on specific ebooks should they wish to. In those cases, vision impaired readers will have to wait until a charity somewhere decides the book they want is worth the expense of recording or rendering in Braille, and that book will be provided at huge cost to a lot of people, including the reader–and the author will get no royalty at all.

    Yup, that’s a fine solution right there. Uh huh.


  19. Areader
    May 24, 2009 @ 23:54:38

    @Stephen. Amazon backed down because it was in their interests to back down. They saw a way to make money by DRM’ing the books up the wazoo. If they don’t already I’m sure publishers will have to pay for this, and if you ever want to enable TTS, which is now non-standard you’ll probably have to pay Amazon for the privilege.


  20. Maili
    May 25, 2009 @ 01:47:48

    @Aloe I

    I'd challenge anyone to come up with a technical explanation of why TTS isn't an audiobook

    I suspect you’re not usually an audiobook reader (CD/digital/cassette) and haven’t used TTS much because otherwise you wouldn’t ask that or even mention Morgan Freeman.

    The biggest technical difference between TTS and an audiobook: the voice actor. That’s one.

    The next difference is TTS relies on a program, not an actor, to read text. Your mention of using Morgan Freeman’s voice in TTS still doesn’t make it an audiobook. He can only offer one thing to a TTS engine: his voice. Not his vocal performance, just his voice.

    Already over a hundred years old, a typical TTS engine still hasn’t managed to master several issues, due to the limitations of its source-filter-recognition engine. It still can’t handle/identify technical expressions, pronunciations, emotion inflections, grammar styles, foreign words, brand names, unlisted words, heteronyms, accents, abbreviations, and certain names. And there’s the speech synthesis issue and there is a time lag when the recognition engine searches and finds a word in its vocabulary.

    With all that in mind, the general recognition of Morgan Freeman’s distinctive voice will be reduced (his voice in TTS format will reveal he can’t use his voice acting ability at all).

    From a reader’s point of view, TTS is severely inferior to an audiobook. Not surprising because TTS is practically a vocal transliteration of the written word. This is why TTS is seen as a reading aid.

    If the TTS engine and its recognition engine somehow improved during the next twenty years, it would still not be recognised as an audiobook because it relies on a program – not a voice actor – to read text.

    In short, the voice actor is the biggest technical difference between TTS and an audiobook.

    Frankly, I find the opposition to TTS surprising – and perhaps, bizarre – because it’s akin to having a person arguing an ebook device shouldn’t have a Font Size function because it’s infringing on author’s Large Print rights.


  21. Maili
    May 25, 2009 @ 02:05:04

    @Steve Weber

    Why did Amazon reverse themselves on this, about a week after this feature was introduced? My hunch is, someone at the company (belatedly) realized this would go to court very quickly, and that Amazon would lose. It's a slam-dunk. Read the Copyright law.

    Because Amazon realised it would affect their working relationships with certain organisations and audiobook companies if they chose to ignore their objections. Audible is an Amazon subsidiary, and Amazon is looking to corner the digital audiobook market. That’s why Amazon quickly backed down when Authors’ Guild and others raised objections.

    Amazon knows the digital audiobook market is one of the fastest growing fields of the digital age (its sales performance is better than the ebook market’s performance, apparently) and they don’t want to miss out on it. And they would if they overruled those objections.

    The cynic in me says it’s nothing to do with copyright laws. It’s to do with wanting to protect Amazon itself.


  22. dotty
    May 25, 2009 @ 02:37:26

    If the authors are concerned about TTS being a substitute for audiobooks, perhaps they should read some of the reviews at Audible.
    The narrator of the audio book is considered so important by many many customers of Audible, that if a narrator is not considered good then customers will not purchase any more books voiced by that narrator, no matter who the author is. They have awards for the best narrators and each book has an excerpt so customers can listen to the book to see if they like the voice.
    I find it strange that (some) authors don’t value these narrators of their books. I have personally become hooked on certain authors (J.D. Robb/Nora Roberts) because of the outstanding reading skills of the voice actors.
    In my opinion, to suggest that TTS is a substitute for these wonderful voice actors is like saying cardboard cutouts can act parts in movies (although some actors…). There is just no comparison.
    And the most disturbing thing of all is that people who have sight problems are having opportunities taken away from them. It’s a shame; after all, one presumes the blind person who wishes to utilize TTS on their ebook reader will purchase the book to listen to, so it’s not like the author is being deprived of their royalties. Shame.


  23. Courtney Milan
    May 25, 2009 @ 06:04:19

    TTS is a display technology, not independent content.

    If you sell an e-book, you’re giving the person license to display that work in private. So they can read it on their computer, or download it to a device. They can change the size of the text or render it in high contrast colors. They can even get a digital projector and send the text onto their living room wall and read it that way. I just don’t care. Those are all automated changes in display, and the devices that render them as such are devices that provide display methods, not actual content.

    Another automated change in display, one that I think is implied in the ebook license, is the license to convert the text from visual information to tactile information–thus, for instance, this device here can be used to transform ebooks into braille encodings. Again, this is not actual content; it’s a way to deliver information.

    TTS isn’t any different. It blindly translates e-book text into the aural equivalent. It is nothing more than a different display.

    Audiobooks, on the other hand, are a lot more, and here’s why: The person reading the audio book UNDERSTANDS what is written on the page, and PERFORMS the reading because of it. That’s the transformative part of an audio book: The story is being run through someone else’s brain, and it takes on an additional cast that you could not get anywhere else. Take a step back and think about that–the reason audio books are transformative is because the person who is reading it aloud thinks about the text of the book. It is not read blindly. It is not read stupidly. It even pronounces all the words right.

    Sorry guys, this cannot be done by computer. Not now. Not in five years. Not in twenty years. This can’t be done by computer until the computer understands the book it’s reading, which it won’t do until we have artificially intelligent computers. We don’t. We aren’t even freaking close.

    If we ever have artificially intelligent computers–ones that can think about the text and transform it–it’s the intelligence that will make the resulting audio work transformational. The read-aloud bit . . . that’s just display.


  24. Imogen Howson
    May 25, 2009 @ 09:00:06

    I suspect that most author's will not share that point of view.

    I’m an author–published in ebooks only, so far–and I share the point of view that TTS is not a derivative audio work. IMO, TTS is to books what subtitles are to films–and I’ve never heard anyone argue that subtitles are akin to a “book of the film” and therefore should be handled as separate rights.

    Also, TTS is not a performance, no matter who it sounds like. I suppose if someone hooks their Kindle to a microphone and broadcasts it over the net, that would count as a public performance, but that’s quite different from someone using it at home/in their car.

    I’m very unhappy with the idea of limiting vision-impaired readers’ access to ebooks. Ebooks are already less accessible than print books (you have to have at least a computer and internet access to read them, and you can’t borrow them or buy them secondhand like you can print books)–I cringe at the idea of making them even less accessible by taking away the TTS facility.


  25. Imogen Howson
    May 25, 2009 @ 09:06:15

    If Commander Data reads a book aloud to Captain Picard, is he in violation of copyright simply because he's a machine?

    Ha. :-)

    And the font-size/Large Print right analogy is a really good one.


  26. kirsten saell
    May 25, 2009 @ 09:32:57

    Frankly, I find the opposition to TTS surprising – and perhaps, bizarre – because it's akin to having a person arguing an ebook device shouldn't have a Font Size function because it's infringing on author's Large Print rights.

    It’s akin to telling a reader he can’t wear glasses when reading, because the words are transformed when they pass through the lenses.

    Sorry guys, this cannot be done by computer. Not now. Not in five years. Not in twenty years. This can't be done by computer until the computer understands the book it's reading, which it won't do until we have artificially intelligent computers. We don't. We aren't even freaking close.

    I would still argue that even a computer with AI equivalent to a human reading a book aloud could still not be considered any different than a human doing the same thing–it might be a performance, and it might be transformative, but infringement would still depend on whether the AI was reading to a single person or small group in privacy, or delivering its rendition to large numbers of people.

    This whole thing is ridiculous. I have no objection to avarice (gimme gimme gimme)–but I think in this case it’s misguided and ultimately self-defeating. Ebooks are still a largely untapped market, but that’s changing, and I’d rather get behind a form of publishing that sells a book that can be accessed by a broad readership for an often very reasonable price and puts a tidy percentage in my pocket, than one whose growth will always be limited by the high costs of production. Again, I assert that if NY authors were making 40% off every ebook they sell, these objections would likely go away–especially since ebooks often cost so much less and are therefore more accessible to readers with limited budgets.

    If we’re to consider people who need their books read to them–those with vision impairments, severe dyslexia, etc–authors would be so much better off if they pushed for a better royalty on ebooks and forgot all this TTS nonsense. Audio books are expensive, and charities simply cannot keep up with demand. That leaves those readers either waiting (sometimes forever) for the books they want to be converted by a charity, or purchasing enormously expensive audiobooks. I’d rather have a blind person purchase every one of my books in digital (mmm, money….) than have them buy only one in audio because of money constraints, or a charity’s audio version, for which I get no royalty at all.

    And still, even if the technology was perfect, I still can’t say I’d have any real objection to it. Like VCRs, as long as the technology is used in non-infringing ways, it’s fine by me. Once that ebook is in the reader’s hands and their money is in my pocket, they can consume it any way they want–change the text to purple Wingdings, make it three inches high, have their significant other read it aloud. Or if the technology exists, have their computer sing it in vibrato, or transform it into electrical impulses and complex proteins and jack the whole thing directly into their brains. Makes no difference to me, because the ebook they purchased is still only an ebook. Text in digital form.


  27. Chicklet
    May 25, 2009 @ 11:42:18

    To me, TTS is an adaptive device, akin to not only large-print books, but even the ramp that allows wheelchair users access to their neighborhood library. TTS is what allows blind or limited-vision users, or anyone with learning disabilities, to read any book that is commercially available in electronic form, without having to wait for a publisher to release an audiobook or for a volunteer to record the book for a Services for the Blind organization.


  28. Courtney Milan
    May 25, 2009 @ 12:35:24

    I would still argue that even a computer with AI equivalent to a human reading a book aloud could still not be considered any different than a human doing the same thing–it might be a performance, and it might be transformative, but infringement would still depend on whether the AI was reading to a single person or small group in privacy, or delivering its rendition to large numbers of people.

    Yeah, consider that proviso added on to what I said, with agreement. I’m just trying to say the doomsday scenario is set at the point where there are machines smart enough to be also writing our books, at which point authors have a bigger problem than Kindle TTS rights.

    And–let’s set aside the NY versus e-published author thing–I’m NY published, and I don’t think TTS infringes. Neil Gaiman is NY published, and he doesn’t think TTS infringes. (Neil Gaiman is much cooler than I am, so I realize the inherent arrogance of pairing us like that.) How many NY-published authors really have an opinion or (even) care?

    Quite frankly, I’m happy if people read my book–no matter how they end up reading it.


  29. kirsten saell
    May 25, 2009 @ 13:32:22

    Quite frankly, I'm happy if people read my book–no matter how they end up reading it.

    That is something I wholeheartedly agree with. :)

    I only bring up NY versus e because many of the arguments I’ve seen from people who insist TTS damages authors revolve around the fact that, sale for sale, audiobooks put more money in traditional authors’ pockets than ebooks do.

    It stands to reason, though, that a higher royalty on ebooks, paired with a low price that would have more people able to purchase (because frankly, if I were vision-impaired and had to choose between four TTS-enabled ebooks or one audiobook, I know where my money would be going), would neutralize or even reverse any potential monetary loss. I want as many readers as possible to have affordable, legal access to my books. I want to make money. I’m having serious trouble finding a relevant downside to TTS.

    I’ve never purchased an audiobook. Never had the desire to. But if I did have the desire to, the prices would certainly put severe limitations on the number of them I could purchase. Asking vision-impaired/learning disabled people to pay those kinds of prices or wait indefinitely on someone else’s charity is deeply unfair. Especially when TTS itself is no more inherently infringing than a VCR. Arguing that a tool like TTS shouldn’t be available because it might be used for infringement is like saying kitchen knives should be banned because people are sometimes stabbed with them.


  33. Andrys
    Nov 08, 2011 @ 09:49:59

    The law, though, is that if one qualifies for needing text-to-speech to read the book, then the text-to-speech version must be offered. I can find out where you apply for that if you’d like.

