I’ve recently been dissecting how the Aqara G400 Doorbell works and have come up with a proof-of-concept integration that implements 2-way audio in Home Assistant. No cloud, no hub, fully local.
Disclaimer
This is NOT a complete, production-ready integration for the Aqara G400 Doorbell. It is a proof of concept designed to test whether fully local two-way audio, independent of any hub/app/cloud connection, is possible with the device. It is recommended to use the doorbell in Home Assistant via the official HomeKit integration.
Please do not install this integration and expect it to provide a fully functioning doorbell; it is purely a proof of concept for 2-way audio in Home Assistant.
Features
- Video streaming via RTSP (H.264)
- Two-way audio via go2rtc backchannel — speak through the doorbell from the HA dashboard
- Doorbell press detection via UDP multicast
- Audio file playback — play pre-recorded messages through the doorbell speaker
- Fully local — all communication stays on your LAN
I hope to add more features and make it run more smoothly in the near future.
The end goal is to get the Aqara 2-way audio backchannel implemented natively in go2rtc.
And now for the technical bit.
How it works
The Aqara G400 LAN talk protocol was reverse-engineered from the Aqara Android app. It uses four independent network channels:
| Channel | Protocol | Port | Purpose |
|---|---|---|---|
| Video/Audio stream | RTSP over TCP | 8554 | H.264 video + AAC audio from camera |
| Voice control | TCP | 54324 | Session management (start/stop/heartbeat) |
| Voice audio | UDP (RTP) | 54323 | AAC-LC ADTS frames to doorbell speaker |
| Doorbell press | UDP multicast | 230.0.0.1:10008 | Button press notification |
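As a rough illustration of the doorbell press channel, here is a minimal Python sketch that joins the multicast group from the table and hands every datagram to a callback. The group address and port come from the table above; the function names (`membership_request`, `open_listener`, `listen`) are made up for this example, and since the payload layout is not described here, the datagram is passed on as raw bytes rather than decoded.

```python
import socket

MCAST_GROUP = "230.0.0.1"  # from the channel table
MCAST_PORT = 10008         # from the channel table


def membership_request(group: str, iface: str = "0.0.0.0") -> bytes:
    """Build the ip_mreq structure: group address followed by local interface."""
    return socket.inet_aton(group) + socket.inet_aton(iface)


def open_listener() -> socket.socket:
    """Open a UDP socket bound to the doorbell port and join the group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", MCAST_PORT))
    sock.setsockopt(
        socket.IPPROTO_IP,
        socket.IP_ADD_MEMBERSHIP,
        membership_request(MCAST_GROUP),
    )
    return sock


def listen(sock: socket.socket, on_datagram) -> None:
    """Block forever, passing (sender_ip, raw_bytes) to the callback."""
    while True:
        data, (addr, _port) = sock.recvfrom(2048)
        # ASSUMPTION: payload format is unknown here, so it is not parsed.
        on_datagram(addr, data)
```

Calling `listen(open_listener(), lambda ip, data: print(ip, data.hex()))` and pressing the button should be enough to see whether notifications reach the machine; multicast often needs IGMP snooping or VLAN settings checked if nothing arrives.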
Control channel (TCP 54324)
Uses a custom LmLocalPacket binary format:
Magic (0xFEEF) | Type (1B) | Payload Length (2B) | Payload (N) | CRC-16 (2B)
- START_VOICE (type 0): Opens session, payload is epoch milliseconds
- STOP_VOICE (type 1): Closes session
- ACK (type 2): Response from camera (0 = success)
- HEARTBEAT (type 3): Sent every 5 seconds to keep session alive
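CRC is a reflected CRC-16 (polynomial 0x8408, init 0xFFFF, final XOR 0xFFFF). Note that these parameters match the variant usually catalogued as CRC-16/X-25; strict CRC-16/KERMIT uses the same polynomial with init 0x0000 and no final XOR.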
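To make the framing concrete, here is a minimal Python sketch of building and parsing these packets. Several details are my assumptions rather than confirmed protocol facts: multi-byte fields are big-endian, the CRC covers everything before it (magic, type, length, payload), and the START_VOICE timestamp is an 8-byte integer. The helper names are hypothetical. The CRC routine takes `init` and `xor_out` as parameters, so it is easy to try the strict KERMIT settings (both 0) if the camera rejects the defaults.

```python
import struct
import time

MAGIC = 0xFEEF
START_VOICE, STOP_VOICE, ACK, HEARTBEAT = 0, 1, 2, 3


def crc16_reflected(data: bytes, init: int = 0xFFFF, xor_out: int = 0xFFFF) -> int:
    """Bitwise reflected CRC-16 with polynomial 0x8408."""
    crc = init
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x8408 if crc & 1 else crc >> 1
    return crc ^ xor_out


def build_packet(ptype: int, payload: bytes = b"") -> bytes:
    """Frame a payload as Magic | Type | Length | Payload | CRC-16."""
    # ASSUMPTION: big-endian fields; CRC covers magic, type, length and payload.
    body = struct.pack(">HBH", MAGIC, ptype, len(payload)) + payload
    return body + struct.pack(">H", crc16_reflected(body))


def parse_packet(raw: bytes) -> tuple[int, bytes]:
    """Return (type, payload); raise ValueError on a malformed packet."""
    if len(raw) < 7:
        raise ValueError("packet too short")
    magic, ptype, length = struct.unpack(">HBH", raw[:5])
    if magic != MAGIC or len(raw) != 7 + length:
        raise ValueError("bad magic or length")
    (crc,) = struct.unpack(">H", raw[-2:])
    if crc != crc16_reflected(raw[:-2]):
        raise ValueError("CRC mismatch")
    return ptype, raw[5:-2]


def start_voice_packet() -> bytes:
    """Build a START_VOICE packet carrying the current epoch milliseconds."""
    # ASSUMPTION: timestamp encoded as an 8-byte big-endian unsigned integer.
    return build_packet(START_VOICE, struct.pack(">Q", int(time.time() * 1000)))
```

A session would then send `start_voice_packet()` over TCP 54324, wait for an ACK, send `build_packet(HEARTBEAT)` every 5 seconds, and finish with `build_packet(STOP_VOICE)`.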
Audio channel (UDP 54323)
RTP (RFC 3550) with payload type 97 (dynamic AAC):
- Codec: AAC-LC ADTS, 16 kHz, mono, 32 kbps
- Frame size: 1024 samples (64 ms per frame)
- RTP timestamp clock: 16 kHz
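The numbers above translate into RTP packets roughly as follows. This is a sketch under assumptions: each ADTS frame is placed directly after the 12-byte RTP header with no extra AU header, the SSRC value is arbitrary, and the function names are my own. The header layout itself is the standard one from RFC 3550, and the timestamp advances by 1024 per frame because the clock runs at the 16 kHz sample rate.

```python
import struct

RTP_VERSION = 2
PAYLOAD_TYPE = 97          # dynamic payload type used for AAC
CLOCK_RATE = 16000         # RTP timestamp clock in Hz
SAMPLES_PER_FRAME = 1024   # one AAC-LC frame = 1024 / 16000 s = 64 ms


def build_rtp_packet(adts_frame: bytes, seq: int, timestamp: int, ssrc: int) -> bytes:
    """Prefix one ADTS frame with a fixed 12-byte RTP header."""
    byte0 = RTP_VERSION << 6   # no padding, no extension, zero CSRCs
    byte1 = PAYLOAD_TYPE       # marker bit left clear
    header = struct.pack(
        "!BBHII",
        byte0,
        byte1,
        seq & 0xFFFF,              # 16-bit sequence number wraps
        timestamp & 0xFFFFFFFF,    # 32-bit timestamp wraps
        ssrc & 0xFFFFFFFF,
    )
    # ASSUMPTION: the raw ADTS frame follows the header unchanged.
    return header + adts_frame


def packetize(frames, ssrc: int = 0x12345678):
    """Yield RTP packets for consecutive ADTS frames."""
    seq = 0
    timestamp = 0
    for frame in frames:
        yield build_rtp_packet(frame, seq, timestamp, ssrc)
        seq += 1
        timestamp += SAMPLES_PER_FRAME
```

When sending to UDP 54323, pacing the packets one every 64 ms (`SAMPLES_PER_FRAME / CLOCK_RATE` seconds) keeps playback close to real time.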
Available Services
| Service | Description |
|---|---|
| `aqara_doorbell.talk_start` | Open a voice session with the doorbell |
| `aqara_doorbell.talk_stop` | Close the active voice session |
| `aqara_doorbell.talk_audio_file` | Play an AAC audio file through the doorbell speaker |
Full details, install instructions, dashboard cards, and example automations are in the GitHub repository:





