ActivityPub Viewer

A small tool to view real-world ActivityPub objects as JSON! Enter a URL or username from Mastodon or a similar service below, and we'll request it from the server with the right Accept header and show you the underlying object.
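As a rough illustration of what "the right Accept header" can look like, here is a minimal Python sketch. It assumes the `application/activity+json` media type, which ActivityPub servers commonly answer to; the page does not say which exact value this tool sends. The helper names (`build_request`, `fetch_object`, `webfinger_url`) are hypothetical, and resolving a username through WebFinger is an assumption about how a lookup like this could work, not a description of this tool's internals.

```python
import json
import urllib.parse
import urllib.request

# Assumed media type for requesting the ActivityPub representation of a resource.
ACTIVITY_JSON = "application/activity+json"


def build_request(url: str) -> urllib.request.Request:
    """Build a GET request that asks for ActivityPub JSON instead of HTML."""
    return urllib.request.Request(url, headers={"Accept": ACTIVITY_JSON})


def fetch_object(url: str, timeout: float = 10.0) -> dict:
    """Fetch and decode the object. This performs a real network request."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return json.load(resp)


def webfinger_url(handle: str) -> str:
    """Turn a handle such as '@alice@example.social' into a WebFinger lookup URL.

    Hypothetical helper: it only builds the URL and does not contact the server.
    """
    user, host = handle.lstrip("@").split("@", 1)
    query = urllib.parse.urlencode({"resource": f"acct:{user}@{host}"})
    return f"https://{host}/.well-known/webfinger?{query}"
```

Calling `fetch_object("https://lemmy.dbzer0.com/post/41533749")` would request the post shown on this page, provided the server is reachable and still serves it.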

{
  "@context": [
    "https://join-lemmy.org/context.json",
    "https://www.w3.org/ns/activitystreams"
  ],
  "type": "Page",
  "id": "https://lemmy.dbzer0.com/post/41533749",
  "attributedTo": "https://lemmy.dbzer0.com/u/andrew0",
  "to": [
    "https://beehaw.org/c/foss",
    "https://www.w3.org/ns/activitystreams#Public"
  ],
  "name": "Open Source Text-to-Speech and Speech-to-Text on Android?",
  "cc": [],
  "content": "<p>Hello everyone! I am interested in replacing the Google <em>Speech Recognition and Synthesis</em> app on Android. For Speech-to-Text (STT), I’ve tried <a href=\"https://github.com/woheller69/whisperIME\" rel=\"nofollow\">Whisper</a> and <a href=\"https://gitlab.futo.org/keyboard/voiceinput\" rel=\"nofollow\">FUTO</a>, and settled on the latter because it seemed to be more versatile. Also, FUTO seems to have some decent recognition, but not yet capable of handling all the languages that I want. Regardless, so far happy with STT. The only annoyance I have is that it does not appear as an option in the settings for Speech recognition :(</p>\n<p>However, I can’t seem to find any replacements that have good Text-to-Speech (TTS) quality. I tried <a href=\"https://github.com/espeak-ng/espeak-ng\" rel=\"nofollow\">espeak-ng</a> and <a href=\"https://github.com/RHVoice/RHVoice\" rel=\"nofollow\">RHVoice</a>, but both have robotic outputs.</p>\n<p>Given the recent advancements in AI, I was expecting that there would be ways to incorporate open source TTS models like <a href=\"https://huggingface.co/onnx-community/Kokoro-82M-v1.0-ONNX\" rel=\"nofollow\">Kokoro</a> to generate speech on the go. Nevertheless, I could not really find any such apps so far.</p>\n<p>Has anyone managed to completely replace the Google app with (an)other privacy-focused FOSS app(s)?</p>\n",
  "mediaType": "text/html",
  "source": {
    "content": "Hello everyone! I am interested in replacing the Google *Speech Recognition and Synthesis* app on Android. \nFor Speech-to-Text (STT), I've tried [Whisper](https://github.com/woheller69/whisperIME) and [FUTO](https://gitlab.futo.org/keyboard/voiceinput), and settled on the latter because it seemed to be more versatile. Also, FUTO seems to have some decent recognition, but not yet capable of handling all the languages that I want. Regardless, so far happy with STT. The only annoyance I have is that it does not appear as an option in the settings for Speech recognition :(\n\nHowever, I can't seem to find any replacements that have good Text-to-Speech (TTS) quality. I tried [espeak-ng](https://github.com/espeak-ng/espeak-ng) and [RHVoice](https://github.com/RHVoice/RHVoice), but both have robotic outputs. \n\nGiven the recent advancements in AI, I was expecting that there would be ways to incorporate open source TTS models like [Kokoro](https://huggingface.co/onnx-community/Kokoro-82M-v1.0-ONNX) to generate speech on the go. Nevertheless, I could not really find any such apps so far. \n\nHas anyone managed to completely replace the Google app with (an)other privacy-focused FOSS app(s)?",
    "mediaType": "text/markdown"
  },
  "attachment": [],
  "sensitive": false,
  "published": "2025-04-05T16:54:56.790983Z",
  "updated": "2025-04-05T17:12:43.432618Z",
  "audience": "https://beehaw.org/c/foss",
  "tag": [
    {
      "href": "https://lemmy.dbzer0.com/post/41533749",
      "name": "#foss",
      "type": "Hashtag"
    }
  ]
}
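To show how an object like the one above might be read once it has been fetched, here is a small Python sketch. The `summarize` and `as_list` helpers are hypothetical names introduced for illustration, and `post` is a trimmed copy of the object with the long `content` and `source` fields left out. The check for public addressing relies on the `https://www.w3.org/ns/activitystreams#Public` collection appearing in `to` or `cc`, as it does here.

```python
# Special ActivityStreams collection used to address an object to everyone.
PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def as_list(value):
    """Normalize a property that may be missing, a single value, or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def summarize(obj: dict) -> dict:
    """Pull a few commonly useful fields out of an ActivityPub object."""
    recipients = as_list(obj.get("to")) + as_list(obj.get("cc"))
    hashtags = [
        tag.get("name")
        for tag in as_list(obj.get("tag"))
        if isinstance(tag, dict) and tag.get("type") == "Hashtag"
    ]
    return {
        "type": obj.get("type"),
        "title": obj.get("name"),
        "author": obj.get("attributedTo"),
        "public": PUBLIC in recipients,
        "hashtags": hashtags,
    }


# Trimmed copy of the object shown on this page (long text fields omitted).
post = {
    "type": "Page",
    "id": "https://lemmy.dbzer0.com/post/41533749",
    "attributedTo": "https://lemmy.dbzer0.com/u/andrew0",
    "to": [
        "https://beehaw.org/c/foss",
        "https://www.w3.org/ns/activitystreams#Public",
    ],
    "name": "Open Source Text-to-Speech and Speech-to-Text on Android?",
    "cc": [],
    "tag": [
        {
            "href": "https://lemmy.dbzer0.com/post/41533749",
            "name": "#foss",
            "type": "Hashtag",
        }
    ],
}

summary = summarize(post)
print(summary)
```

Normalizing with `as_list` is a defensive choice: ActivityStreams properties such as `to` and `tag` may hold either a single value or an array, so code that assumes a list can break on other servers' output.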