Podcast Transcripts and the Mechanical Turk

One of the recurring themes at ETECH last week was the “mechanical Turk”. In his introductory keynote, Bruce Sterling suggested that the artificial intelligence (AI) dream had slowed down the development of computer science in general, because research has focused on emulating humans with machines instead of complementing them. Tim O’Reilly talked about IA (intelligence augmentation) as opposed to AI. According to Wikipedia:

The Turk was a famous hoax which purported to be a chess-playing automaton first constructed and unveiled in 1769 by Wolfgang von Kempelen (1734-1804)

Mechanical Turk

In other words, the mechanical Turk is about putting a human intelligence inside the machine.

New web services are now applying this very simple principle to tasks where humans are much better than computers. In fact, Amazon.com is offering a new open platform for the development of third-party web services. It’s called… guess what? The “Amazon Mechanical Turk”, under the tagline “Artificial Artificial Intelligence”.

More interesting to you, dear readers of Broadcasting 2.0, is that some people have built a podcast transcription service called castingwords.com, and it is actually based on Amazon’s Turk system. For 42 cents a minute, castingwords.com will transcribe almost any podcast (in English) within 24 hours with the help of Amazon’s “tested” transcription Turks.

And why would podcast transcripts be useful? I think that they would mainly help increase the overall “granularity” of the “podcastsphere”. This, in turn, would make finding, remixing and sharing much easier.

Right now, search results usually lead to full podcast files whose durations range from a few minutes to over an hour. These searches normally operate on podcast names or short descriptions. As a consequence, a search for interviews with a specific politician (for example) would result in many hours of listening, because there is no good mechanism to locate specific content inside a podcast itself.

With good podcasts, “chapters” are very handy here, but if you’re like me, your favorite podcast has no chapters. In my case, it’s a 2.5-hour French-language CBC daily podcast called “Indicatif Présent”. Can you imagine? 12.5 listening hours per week. Do I have time to listen to all this? No. Would I like to be able to locate stuff more precisely here? Absolutely. Why? Because I NEED to be able to skip what’s not interesting to me.

Along with podcast tagging and content “markers”, transcripts would also support very important functions like remixing and sharing.

To me, remixing is the capability that I need to aggregate my personal podcast stream based on podcast segments that I get from different sources. Remixing here is the ability to collate 10 minutes from one show here with 2 minutes from another one there with 15 minutes of music with…, and so on. Again, I can’t do that easily with my favorite CBC show right now. Podcast users, and probably most of us in the future, will want that flexibility. There is too much good content out there.
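To make the idea a bit more concrete, here is a minimal sketch of what a remixed stream could look like as data, assuming each segment can be addressed by a source file and a start and end time. The `Clip` class and the file names are hypothetical, made up for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    """One segment of a remix: a source file plus start and end offsets."""
    source: str          # hypothetical file name or URL
    start_seconds: int
    end_seconds: int

    @property
    def duration_seconds(self) -> int:
        return self.end_seconds - self.start_seconds


def total_minutes(playlist):
    """Total running time of the remix, in minutes."""
    return sum(clip.duration_seconds for clip in playlist) / 60


# 10 minutes from one show, 2 minutes from another, 15 minutes of music.
playlist = [
    Clip("show-a.mp3", 600, 1200),
    Clip("show-b.mp3", 0, 120),
    Clip("music.mp3", 0, 900),
]

print(total_minutes(playlist))  # 27.0
```

The hard part is not this bookkeeping but knowing where the interesting segments start and end, which is exactly what transcripts and markers would provide.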

Finally, transcripts alone may not be the solution, but we need mechanisms to annotate (or tag) media content like we do for photos (Flickr.com) or bookmarks (del.icio.us). Good annotation makes content easier to retrieve as well as to share. Very often, I find myself having to write down podcast timing information in order to retrieve specific segments or share them with friends. That’s not convenient at all.
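As a rough illustration of what a transcript with timing information would make possible, here is a minimal sketch. It assumes the transcript is available as a list of timestamped segments; that format, the `Segment` class and the sample text are my own assumptions, not something any particular transcription service is known to deliver.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A piece of transcript text and the offset where it starts."""
    start_seconds: int
    text: str


def find_segments(transcript, keyword):
    """Return the segments whose text mentions the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [seg for seg in transcript if needle in seg.text.lower()]


def as_timestamp(seconds):
    """Format an offset in seconds as H:MM:SS."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


# Made-up sample data standing in for a real transcript.
transcript = [
    Segment(0, "Welcome to the show."),
    Segment(754, "Our guest today talks about the mechanical Turk."),
    Segment(5130, "Back to the Turk and chess automatons."),
]

for seg in find_segments(transcript, "turk"):
    print(as_timestamp(seg.start_seconds), seg.text)
```

With something like this, the timing information I currently write down by hand would simply fall out of a keyword search.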

Coming back to our mechanical Turk and my favorite CBC podcast. With castingwords.com, the whole transcript of a single show would amount to roughly $63 (42 cents/min × 150 min). So is there a reason why CBC can’t do it right away? (They could even do it themselves if they wished.) What is $63 in the budget of a 2.5-hour public radio show?
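The estimate is simple arithmetic. A minimal sketch, assuming the 42 cents per minute rate quoted earlier, a 150-minute show and five shows a week; the function names are mine, and the rate is a parameter so that other prices can be plugged in.

```python
def transcription_cost_cents(minutes: int, rate_cents_per_minute: int) -> int:
    """Cost of transcribing one show, in whole cents to avoid rounding issues."""
    return minutes * rate_cents_per_minute


def format_dollars(cents: int) -> str:
    """Render an amount in cents as a dollar string."""
    return f"${cents // 100}.{cents % 100:02d}"


show_minutes = 150    # a 2.5-hour show
shows_per_week = 5    # assumed: one show per weekday

per_show = transcription_cost_cents(show_minutes, 42)
per_week = per_show * shows_per_week

print(format_dollars(per_show))  # $63.00
print(format_dollars(per_week))  # $315.00
```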

Observations of my own behavior make me think that it all comes down to this: either they do it or they won’t get my attention!


  1. Web Tasarim Adana

    This article helped me. Thank you, web tasarim adana

  2. Jos Schuurmans

    CastingWords.com now charges $2.50 per minute for delivery within 24 hours. That’s considerably steeper than 42 (euro?) cents a minute.

    Totally agree with your problem description, though. In order to search, scan or mix parts of audio content, we sorely need transcripts and annotations.

    My example: I listened to this great interview on Tech Nation the other day, and I would really like to quote parts of that conversation on my blog and elsewhere. I would also love to have a transcript in my archive or perhaps on a service like Evernote so that it would come up in future search results from my own “stuff”.
    (http://itc.conversationsnetwork.org/shows/detail4460.html)

    If I was interviewed on Tech Nation – or if I was the broadcaster – I would make damn sure to provide transcripts. Not only would it be a great service to listeners of the show; it would also be a great asset for search optimization and findability.

    Apparently there is fairly decent speech-to-text software on the market (http://lifehacker.com/230009/call-for-help-podcast-to-transcript).

    I’ve been wondering why there isn’t an open source software project addressing this need. Or is there?

  3. Chloe Anderson

    I enjoy podcasting on my desktop PC. It really helps me share my ideas and thoughts over the internet.