Releases: joinly-ai/joinly
Releases · joinly-ai/joinly
v0.5.0
Added
- new tool to access screenshares and video snapshots inside the meeting for vision-enabled models (#128, #131)
- client changes to allow multimodal tool results (see below)
Improvements
- reduce deepgram transcription latency (#132)
- add speech event delay to avoid unwanted interruptions (#133)
- improve hybrid VAD (#134)
joinly-client (v0.1.15)
- new and more detailed system prompt and instructions (#116, #135)
- support for google as LLM provider (#118)
include
option for tool loading (#120)- custom post tool-call callbacks (#123, #125)
- improved tool result parsing including capabilities for multimodal results (#126)
- message history filtering for large tool results (#127)
- update README example (#130)
v0.4.2
Improvements
- reduce audio format conversions in pipeline (#113)
- hybrid VAD implementation between webrtc and silero (#114)
Fixed
- unmute automatically on start (#112)
- round start/end times in transcript for consistency (#109)
- keep webrtc package in lite build (#111)
joiny-client (v0.1.8)
- add direct
list_tools
andsession
access to joinly client (#104) - split system prompt and custom instructions (#105)
- use one mcp cient per server, instead of stateless proxy (#106)
- add message history limit to reduce token usage (#107)
- change date formatting to ISO (#108)
Others
- refactor to use type keyword (#103)
v0.4.1
Improvements
- upgrade kokoro-onnx to 0.4.9 (#88)
- add model improvemet program opt-out for deepgram (#90, #97)
- default to
--client
if meeting url is given (#93) - set default logging to info level for cli (#96)
- remove suffix for platform error messages (#98)
Fixed
joinly-client (v0.1.7)
- fix openai issues with openai>=1.99.2 (#91)
- fix gpt-5 usage by disabling sampling settings (#94)
- wrap tool calls for direct usage with client (#99)
- add debug logging for agent (#100)
- improve agent for gpt-5 (#101)
Others
- update environment names in release workflow (#87)
v0.4.0
New: joinly-client (v0.1.5)
- complete rewrite of all client functionality, now using a single
joinly-client
package (#84) - support for usage in code and via cli
Added
- LLM token usage tracking for a session (#75)
- STT/TTS API usage tracking, characters or audio minutes (#80)
- setting session configurations via
joinly-client
Improvements
- event bus for transcript updates (#65)
- live segment transcription at segment level (#63)
- improve lite image by removing unnecessary dependencies (#79)
- unify logging of live segments (#82)
--env-file
cli option to specify a custom environment file (#74)
Fixed
- set explicit logging level only for own logger (#78)
- correctly fail on unsupported meeting platforms (#77)
- fix fatal server crash on session errors (#64)
Others
v0.3.3
Improvements
- add health check endpoint to MCP server (#66)
- improve default voice selection for ElevenLabs TTS (#68)
- adapt logging levels for less noise in the logs (#69)
- update release workflow pipeline and update cuda image tag (#70)
- add lite image variant without local model weights (#71)
Fixed
- mark zoom waiting room as a successful join to fix potential timeouts (#67)
v0.3.2
Improvements
- allow setting session-specific settings from the client (e.g., which STT/TTS), this will be further improved in the next release with client improvements (#57)
- remove redundant leave on exit (#59, #62)
- remove browser agent (#56)
Fixed
- zoom additional passcode handling (#61)
- deepgram misses first word (#60)
- resource subscribe flow (#58)
- enforce maximum message length (#55)
- make opening menu panels more robust and remove deprecated timeout (#54)
- change last segment timing to start in example (#53)
- simplify chat timestamps (#52)
- await meeting provider join before initializing transcript (#51)
v0.3.1
Added
- ElevenLabs TTS support via
--tts elevenlabs
(#47) - new setting
--lang <language_code>
to set the language for TTS and STT (depends on support of services) (#46)
Improvements
- streamline speech controller implementation (#41)
- improve error handling and interrupts in speech controller (#41)
- force leave by closing page on failed leave action (#35)
- auto leave on session tear down (#38)
- set docker logging to plain format (#39)
Fixed
- handle exceptions during agent invocation (#45)
- log speech-to-text exceptions (#44)
- ensure aligned segment timestamps in transcript (#40)
- fail on failed deepgram connection (#37)
- propagate ProviderNotSupportedError (#36)
- stop adding a segment for an interrupted speech without any spoken text (#48)
- fix no new segment error due to compact transcript after interruption (#49)
v0.3.0
Added
- add
get_transcript
tool for fetching the meeting transcript with timestamp filters (#21) - real-time speaker attribution for the transcript, in core app and all platforms (#27)
- new tool
get_participants
to retrieve the current meeting participants with available meta-data (e.g., host, muted/unmuted) (#28)
Improvements
- better internal meeting time measurement with more accurate start and end times (#18)
- shared meeting clock object for synchronized internal time handling (#22)
- add speech through
speak_text
tool to the meeting transcript (not included intranscript://live
resource, but inget_transcript
) (#23, #24) - length-based TTS pre-chunking for better performance with long texts (#25)
- more compact transcripts by merging nearby segments of the same speaker for better LLM handling (#26)
- browser action improvements, for more robustness and some fixes (#30, #33, #34)
- teams live platform support using the existing teams platform actions (#31)