Add podcast feature to website #992
I think we can afford a separate PR per audio recap. The integration with the newsletter is slick, I like it! I’ve listened to a couple minutes, sounds good. I’ll let others chime in more, for now.
There are some folks doing this (speaker diarization). I can dig a bit deeper after some more feedback.
Wow! I took a quick look at the preview, and I'm impressed! I left a few minor comments.
I'm in favor of submitting additional shows in batches (or individually). Maybe work from the present backwards so we can show the feature off sooner rather than later (but I think that's up to whoever's doing the work).
Oh, can we maybe also say something about the licensing on the podcast? Y'all have been doing it, so I think it's up to you to choose a license (even if you want it proprietary), but I think we should explicitly mention the copyright either on the individual podcast pages or on the /en/podcast/ page.
@bitschmidty: I’d be open to a permissive license, e.g. CC BY. Given our mission it makes sense to me that we allow commercial use, but I’d be okay to restrict it to share-alike (CC BY-SA), or even No Derivatives (CC BY-ND), if you would prefer it not to be remixed. On second thought, I see that the Bitcoin Optech content is generally licensed under the MIT license; that’s fine with me, too.
FWIW, I'd prefer to see Optech move to a CC license, as I don't think MIT is really designed for natural-language content (even if it does mention documentation). My preference would be CC-BY-SA as I've always been a copyleft guy. But when it comes to the podcast, I again think that's a decision for y'all.
License: I'd like to allow folks to reuse with attribution. "putting the headphones icon at the end of an item": Agreed! Transcripts: I had a vendor submit a free sample transcription to compare. See attached for quality. They estimate £84.00/hr of audio for a turnaround of 6+ days. This is the team that transcribes the What Bitcoin Did Podcast, so they are somewhat familiar with the Bitcoin jargon.
On a quick skim, that looks really good to me. Doing the quick math at £84/hr with a rough upper bound of two hours per week, 51 newsletters per year, and a 1.21 exchange rate comes to ~$10k/year. That doesn't sound like much to me compared to the benefit of getting top-quality transcripts of conversations with subject matter experts about topics important to the future development of Bitcoin and LN. If they have a lower rate for slower turnaround, or if 6 days isn't fast enough for us, we could do the initial transcription for new episodes using software and then update with the human-transcribed version later.
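For anyone who wants to check the estimate, here is a minimal sketch of the arithmetic. The £84/hr rate comes from the vendor quote above; the function and parameter names are only illustrative, and the 1.21 GBP-to-USD rate is the one assumed in the comment, not a live figure.

```python
def annual_transcription_cost_usd(
    rate_gbp_per_audio_hour: float = 84.00,  # vendor estimate quoted in the thread
    audio_hours_per_episode: float = 2.0,    # rough upper bound per weekly recap
    episodes_per_year: int = 51,             # one recap per newsletter
    gbp_to_usd: float = 1.21,                # exchange rate assumed in the thread
) -> float:
    """Return the estimated yearly transcription cost in US dollars."""
    cost_gbp = rate_gbp_per_audio_hour * audio_hours_per_episode * episodes_per_year
    return cost_gbp * gbp_to_usd


if __name__ == "__main__":
    print(f"${annual_transcription_cost_usd():,.2f} per year")
```

With these inputs the total lands a little above $10k, which matches the rounded figure in the comment. Since two hours per week is an upper bound, the real cost would likely be lower.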
Force-pushed from bec4b15 to e3bd55e.
Pushed updates:
I think we are ready to go live with this if all looks good. Edit: preview of news 239 podcast page: https://deploy-preview-992--bitcoinops.netlify.app/en/podcast/2023/02/23/
ACK with one non-critical suggestion.
I reviewed the diff, visited each of the edited pages, clicked a bunch of links, tested all of the podcast links I could, read a bunch of the transcript, and listened to a few segments using the JS player.
This is absolutely incredible. I'm just blown away. Thank you so much @bitschmidty for both the idea to do this and the follow-through. ❤️ ❤️ ❤️
en/publications.md (outdated)
{% else %}
{:.center}
Recent publications from our [blog posts][] and [newsletters][].
Now that we have the list, I think this "Recent publications..." line is redundant and should be removed.
Pushed changes for:
ACK ab113e1 with enthusiasm!
I’d like to second @harding's enthusiasm: this is amazing work, thanks a lot @bitschmidty. I clicked through the preview a bit, tried a bunch of links for jumping to parts of the transcript and to timestamps in the audio, and added the podcast to my podcast app. It all worked great. I just have a tiny nit that you can feel free to ignore if you don’t share it.
ACK ab113e1
Force-pushed from fb3471f to 84543ac.
Took @xekyo's suggestion, squashed, and merged! 🚀
At bitcointranscripts we developed a transcription pipeline that streamlines the AI-generation, human review, and publication of technical Bitcoin transcripts. I believe that incorporating Newsletter recaps into this system would be a perfect synergy. Adding the Newsletter recaps into our transcription pipeline would lead to faster transcript turnaround times while providing a different path to the community to delve into the topics and engage with the discussions. I already pushed yesterday's recap into the pipeline. The AI-generated transcript is now available, awaiting claim by a reviewer at review.btctranscripts.com. Once claimed, it will be reviewed, edited, and submitted for evaluation. Upon approval, the finalized transcript will be accessible at the original link. If that sounds interesting, I would love to explore what we need to do to integrate with your existing workflow.
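To make the stages described above easier to follow, here is a minimal sketch of the workflow as a linear sequence of states. This is only an illustration of the steps named in the comment (AI-generated, claimed, submitted for evaluation, published); the `Stage` enum and `advance` helper are hypothetical names, not the actual bitcointranscripts implementation.

```python
from enum import Enum, auto


class Stage(Enum):
    """Hypothetical stages of a transcript, following the comment above."""
    AI_GENERATED = auto()  # AI transcript available, awaiting claim by a reviewer
    CLAIMED = auto()       # a reviewer has claimed it and is reviewing/editing
    SUBMITTED = auto()     # reviewed and edited, submitted for evaluation
    PUBLISHED = auto()     # approved; finalized transcript is accessible


# Allowed forward transitions, in the order the comment describes them.
NEXT_STAGE = {
    Stage.AI_GENERATED: Stage.CLAIMED,
    Stage.CLAIMED: Stage.SUBMITTED,
    Stage.SUBMITTED: Stage.PUBLISHED,
}


def advance(stage: Stage) -> Stage:
    """Move a transcript to its next stage, or raise if it is already final."""
    if stage not in NEXT_STAGE:
        raise ValueError(f"{stage.name} is a final stage")
    return NEXT_STAGE[stage]


if __name__ == "__main__":
    stage = Stage.AI_GENERATED
    while stage in NEXT_STAGE:
        print(stage.name)
        stage = advance(stage)
    print(stage.name)
```

Two of these transitions wait on a person: claiming depends on a volunteer reviewer, and publishing depends on an evaluator approving the submission. Those are the steps where turnaround time is hardest to predict.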
@kouloumos What I do currently for context, including audio editing:
Pros of the proposed btctranscripts setup: quicker initial transcription, involvement of community, no $ cost. Open to others' feedback, @murchandamus @harding
I must admit I do like the predictable turn-around and high quality of our current pipeline. I would imagine (and at a cursory glance seem to be correct) that the initial automatic transcript would not measure up to our human transcription, and I would also guess that the improved transcript would take an unpredictable time on btctranscripts. That implies to me that we would first have a lower quality transcript and touch it up later again when we get the improved version. Overall, it seems like it would be higher touch to go with the new proposed route, and I’m not sure I see major upsides beyond a lower cost. @kouloumos, could you perhaps compare the advantages and disadvantages as you see them? While I don’t know how much the transcription costs, it seems to me that the current approach works well, but then @bitschmidty is doing all the work for that these days, so if @bitschmidty prefers what you propose, I’m happy to roll with it.
Strongly agree with this:
Having high quality transcripts is really important to me but the method we use to get there isn't something I care much about (so long as it's ethical). I would like to continue hosting transcripts on our site as I want to incorporate them with our topics index and other site features in the future (when I finish the transition to Hugo), although I don't mind other sites also hosting them.
Thank you all for your feedback and insights. After reflecting on your comments, I let some time pass to observe the review process for the AI-generated newsletter transcript we added. Unfortunately, it took a month for the transcript to be claimed and reviewed by a human, which is less than ideal. Moreover, there's a bottleneck in the evaluation stage, primarily because I'm currently managing these evaluations myself. Given these observations, I agree about the unpredictability in both the timing and quality of reviews. It's clear that now is not the right time to incorporate Newsletter recaps into our system. However, I'm optimistic about future improvements that could address these concerns:
I'm aware that our current system doesn't yet match the predictability and quality of your existing setup. However, I believe the proposed improvements could align our turnaround times and maintain high standards, with the added benefits of: automatic timestamp inclusion, quicker initial transcription, involvement of community, minimal $ cost and streamlined processes therefore minimal involvement required by you. Here’s how I envision the revised process:
We aren’t there yet, but your feedback is invaluable. I plan to revisit this proposal once we've implemented the necessary changes and feel confident in the enhanced system. In the meantime, any further input you have would be greatly appreciated. Thank you again for considering this integration and for the thoughtful discussion.
Preview link of first episode
This PR adds an Optech Podcast to the site. Some notes:
In addition to feedback on approach and features, remaining todos: