-
@shellfly I'm facing a similar problem. Were you able to figure this out?
-
@breadface Unfortunately, I haven't found a working way to do it. For now I'm doing WebRTC through third-party software such as LiveKit and Agora, which provide official integrations with the OpenAI Realtime API. The integration code is all open source and might be helpful for researching this issue, but I haven't had time to dig into it yet.
-
Sorry I didn't respond to this sooner. @shellfly @breadface can you share a file that has problems? My guess is non-20 ms page sizes. I will fix it and get the changes landed in Pion!
-
@Sean-Der I got a similar answer about the 20 ms page size on Stack Overflow, but I couldn't find a way to modify the page size in Go, so I didn't try it.
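For anyone checking whether their pages really are 20 ms long, here is a minimal sketch of the arithmetic. Ogg Opus granule positions count samples on a 48 kHz clock, so the difference between two consecutive granule positions gives the amount of audio in a page. The function name `pageDuration` and the granule values in `main` are made up for illustration; in a real program the values would come from the page headers returned by the Ogg reader (assumed here to expose a granule position field).

```go
package main

import (
	"fmt"
	"time"
)

// opusGranuleRate is the clock Ogg Opus uses for granule positions (48 kHz),
// regardless of the sample rate of the original input.
const opusGranuleRate = 48000

// pageDuration returns how much audio a page carries, given the granule
// position of the previous page and of the current page.
// (Hypothetical helper, not part of any library.)
func pageDuration(prevGranule, curGranule uint64) time.Duration {
	samples := curGranule - prevGranule
	return time.Duration(samples) * time.Second / opusGranuleRate
}

func main() {
	// Made-up granule positions standing in for successive page headers.
	granules := []uint64{960, 1920, 4800, 5760}

	var prev uint64
	for i, g := range granules {
		fmt.Printf("page %d: %v\n", i, pageDuration(prev, g))
		prev = g
	}
}
```

If the printed durations are not all 20 ms, sleeping for each page's actual duration before writing the next one (instead of a fixed 20 ms) is one way to keep playback speed right, assuming the receiver handles the longer pages.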
-
In my experience working with text-to-speech output, it is better to receive the output in PCM format. You can then resample it to 48 kHz stereo and stream it at a 20 ms pace. If this requirement isn't met, the audio won't play in the browser. If you receive the audio output in Ogg format, it is most likely 24 kHz mono. To resample it, you have to decode it and encode it again. I made a PR that adds a TTS-to-WebRTC example for your reference. cc @Sean-Der, it might need a review before merging. Thanks
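The resample-and-frame step described above could look roughly like the sketch below. It assumes the TTS output is 16-bit PCM at 24 kHz mono, doubles the rate with simple linear interpolation (a real resampler would filter properly), duplicates each sample into both channels, and cuts the result into 20 ms frames. The names `upsample2xStereo` and `splitFrames` are hypothetical. Encoding each frame to Opus and writing it to the track is not shown, since that depends on the encoder library you choose.

```go
package main

import "fmt"

const (
	outRate  = 48000 // target sample rate in Hz
	channels = 2     // stereo
	frameMs  = 20    // frame length in milliseconds
	// Interleaved samples in one 20 ms stereo frame at 48 kHz: 1920.
	samplesPerFrame = outRate * frameMs / 1000 * channels
)

// upsample2xStereo converts 24 kHz mono PCM to 48 kHz interleaved stereo.
// It inserts the midpoint between neighbouring samples (linear interpolation)
// and copies every output sample to both channels.
func upsample2xStereo(mono []int16) []int16 {
	out := make([]int16, 0, len(mono)*2*channels)
	for i, s := range mono {
		next := s // the last sample has no neighbour, so repeat it
		if i+1 < len(mono) {
			next = mono[i+1]
		}
		mid := int16((int32(s) + int32(next)) / 2)
		out = append(out, s, s, mid, mid)
	}
	return out
}

// splitFrames cuts interleaved PCM into 20 ms frames and zero-pads the last one.
func splitFrames(pcm []int16) [][]int16 {
	var frames [][]int16
	for start := 0; start < len(pcm); start += samplesPerFrame {
		frame := make([]int16, samplesPerFrame)
		copy(frame, pcm[start:]) // copy stops at the shorter of the two slices
		frames = append(frames, frame)
	}
	return frames
}

func main() {
	// One second of silence at 24 kHz, standing in for real TTS output.
	mono := make([]int16, 24000)
	frames := splitFrames(upsample2xStereo(mono))
	fmt.Println(len(frames), "frames of", len(frames[0]), "samples each")
}
```

Each frame would then be encoded and written once every 20 ms, for example from a `time.Ticker` loop, so the receiver gets audio at real-time pace.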
-
I'm using a text-to-speech service to generate streaming OGG data, and I want to send the streaming output into a WebRTC audio track. I checked the play-from-disk example in the repo, and the code below is what I have written so far.
The problem is that I can send data to the client, but I can't hear any audio on the client.
Other things I have tested:
Changing oggreader.NewWith(data) to oggreader.NewWith(f), where f is a file opened from disk. There is sound on the client, but the playback speed is weird.
It seems to be something about the Ogg options, but I'm not familiar with them. Any idea how I can get this code to work?