音视频入门基础RTP专题18FFmpeg源码中,获取RTP的音频信息的实现上

2025-03-13 约 2940 字预计阅读 6 分钟

https://bing.ee123.net/img/rand?artid=146157021

音视频入门基础：RTP专题（18）——FFmpeg源码中，获取RTP的音频信息的实现（上）

由于本文篇幅较长，分为上、下两篇。

一、引言

通过FFmpeg命令可以获取到SDP描述的RTP流的的音频压缩编码格式、音频压缩编码格式的profile、音频采样率、通道数信息：

ffmpeg -protocol_whitelist "file,rtp,udp" -i XXX.sdp

而由《》可以知道，SDP协议中，a=rtpmap属性和a=fmtp属性中的config参数都会重复包含音频压缩编码格式、音频采样率、通道数音频这些信息。所以FFmpeg到底获取的是哪个地方的音频信息呢，本文为大家揭开谜底。

二、音频压缩编码格式

FFmpeg获取SDP描述的RTP流的音频压缩编码格式，是从SDP的a=rtpmap属性获取的。比如SDP中某一行的内容为：

a=rtpmap:97 MPEG4-GENERIC/48000/2

FFmpeg识别到上述“a=rtpmap”这个后，会把后面的字符串“MPEG4-GENERIC”提取出来，检测是否存在相应的音视频压缩编码格式。

具体可以参考：《》。

当识别到上述“a=rtpmap”这个后，sdp_parse_line函数中会调用sdp_parse_rtpmap函数：

else if (av_strstart(p, "rtpmap:", &p) && s->nb_streams > 0) {
            /* NOTE: rtpmap is only supported AFTER the 'm=' tag */
            get_word(buf1, sizeof(buf1), &p);
            payload_type = atoi(buf1);
            rtsp_st = rt->rtsp_streams[rt->nb_rtsp_streams - 1];
            if (rtsp_st->stream_index >= 0) {
                st = s->streams[rtsp_st->stream_index];
                sdp_parse_rtpmap(s, st, rtsp_st, payload_type, p);
            }
            s1->seen_rtpmap = 1;
            if (s1->seen_fmtp) {
                parse_fmtp(s, rt, payload_type, s1->delayed_fmtp);
            }
        } 

sdp_parse_rtpmap函数中又会调用ff_rtp_handler_find_by_name函数：

/* parse the rtpmap description: <codec_name>/<clock_rate>[/<other params>] */
static int sdp_parse_rtpmap(AVFormatContext *s,
                            AVStream *st, RTSPStream *rtsp_st,
                            int payload_type, const char *p)
{
//...
    if (par->codec_id == AV_CODEC_ID_NONE) {
            const RTPDynamicProtocolHandler *handler =
            ff_rtp_handler_find_by_name(buf, par->codec_type);
    //...
    }
//...
}

ff_rtp_handler_find_by_name函数定义如下，可以看到其内部调用了rtp_handler_iterate函数：

const RTPDynamicProtocolHandler *ff_rtp_handler_find_by_name(const char *name,
                                                       enum AVMediaType codec_type)
{
    void *i = 0;
    const RTPDynamicProtocolHandler *handler;
    while (handler = rtp_handler_iterate(&i)) {
        if (handler->enc_name &&
            !av_strcasecmp(name, handler->enc_name) &&
            codec_type == handler->codec_type)
            return handler;
    }
    return NULL;
}

rtp_handler_iterate函数定义如下，可以看到该函数内部遍历了全局数组rtp_dynamic_protocol_handler_list：

static const RTPDynamicProtocolHandler *rtp_handler_iterate(void **opaque)
{
    uintptr_t i = (uintptr_t)*opaque;
    const RTPDynamicProtocolHandler *r = rtp_dynamic_protocol_handler_list[i];

    if (r)
        *opaque = (void*)(i + 1);

    return r;
}

rtp_dynamic_protocol_handler_list数组定义如下：

static const RTPDynamicProtocolHandler *const rtp_dynamic_protocol_handler_list[] = {
    /* rtp */
    &ff_ac3_dynamic_handler,
    &ff_amr_nb_dynamic_handler,
    &ff_amr_wb_dynamic_handler,
    &ff_dv_dynamic_handler,
    &ff_g726_16_dynamic_handler,
    &ff_g726_24_dynamic_handler,
    &ff_g726_32_dynamic_handler,
    &ff_g726_40_dynamic_handler,
    &ff_g726le_16_dynamic_handler,
    &ff_g726le_24_dynamic_handler,
    &ff_g726le_32_dynamic_handler,
    &ff_g726le_40_dynamic_handler,
    &ff_h261_dynamic_handler,
    &ff_h263_1998_dynamic_handler,
    &ff_h263_2000_dynamic_handler,
    &ff_h263_rfc2190_dynamic_handler,
    &ff_h264_dynamic_handler,
    &ff_hevc_dynamic_handler,
    &ff_ilbc_dynamic_handler,
    &ff_jpeg_dynamic_handler,
    &ff_mp4a_latm_dynamic_handler,
    &ff_mp4v_es_dynamic_handler,
    &ff_mpeg_audio_dynamic_handler,
    &ff_mpeg_audio_robust_dynamic_handler,
    &ff_mpeg_video_dynamic_handler,
    &ff_mpeg4_generic_dynamic_handler,
    &ff_mpegts_dynamic_handler,
    &ff_ms_rtp_asf_pfa_handler,
    &ff_ms_rtp_asf_pfv_handler,
    &ff_qcelp_dynamic_handler,
    &ff_qdm2_dynamic_handler,
    &ff_qt_rtp_aud_handler,
    &ff_qt_rtp_vid_handler,
    &ff_quicktime_rtp_aud_handler,
    &ff_quicktime_rtp_vid_handler,
    &ff_rfc4175_rtp_handler,
    &ff_svq3_dynamic_handler,
    &ff_theora_dynamic_handler,
    &ff_vc2hq_dynamic_handler,
    &ff_vorbis_dynamic_handler,
    &ff_vp8_dynamic_handler,
    &ff_vp9_dynamic_handler,
    &gsm_dynamic_handler,
    &l24_dynamic_handler,
    &opus_dynamic_handler,
    &realmedia_mp3_dynamic_handler,
    &speex_dynamic_handler,
    &t140_dynamic_handler,
    /* rdt */
    &ff_rdt_video_handler,
    &ff_rdt_audio_handler,
    &ff_rdt_live_video_handler,
    &ff_rdt_live_audio_handler,
    NULL,
};

rtp_dynamic_protocol_handler_list数组中的每个元素都把SDP协议中的和具体的音视频压缩编码格式对应起来，比如“MPEG4-GENERIC”对应AAC：

const RTPDynamicProtocolHandler ff_mpeg4_generic_dynamic_handler = {
    .enc_name           = "mpeg4-generic",
    .codec_type         = AVMEDIA_TYPE_AUDIO,
    .codec_id           = AV_CODEC_ID_AAC,
    .priv_data_size     = sizeof(PayloadContext),
    .parse_sdp_a_line   = parse_sdp_line,
    .close              = close_context,
    .parse_packet       = aac_parse_packet,
};

X-MP3-draft-00"对应MP3ADU：

static const RTPDynamicProtocolHandler realmedia_mp3_dynamic_handler = {
    .enc_name   = "X-MP3-draft-00",
    .codec_type = AVMEDIA_TYPE_AUDIO,
    .codec_id   = AV_CODEC_ID_MP3ADU,
};

所以sdp_parse_rtpmap函数中执行语句：ff_rtp_handler_find_by_name(buf, par->codec_type)后，会通过遍历全局数组rtp_dynamic_protocol_handler_list，找到SDP协议中的对应的音视频压缩编码格式（比如“MPEG4-GENERIC”对应的是AAC），将其赋值给st->codecpar->codec_id（即AVCodecParameters的codec_id）：

/* parse the rtpmap description: <codec_name>/<clock_rate>[/<other params>] */
static int sdp_parse_rtpmap(AVFormatContext *s,
                            AVStream *st, RTSPStream *rtsp_st,
                            int payload_type, const char *p)
{
//...
    if (par->codec_id == AV_CODEC_ID_NONE) {
            const RTPDynamicProtocolHandler *handler =
            ff_rtp_handler_find_by_name(buf, par->codec_type);
    //...
    }
//...
}

然后在sdp_parse_line函数外部，通过avcodec_parameters_to_context函数将AVCodecParameters的codec_id赋值给AVCodecContext的codec_id：

int avcodec_parameters_to_context(AVCodecContext *codec,
                                  const AVCodecParameters *par)
{
//...
    codec->codec_id   = par->codec_id;
//...
}

然后在dump_stream_format函数中，通过avcodec_string函数中的语句：codec_name = avcodec_get_name(enc->codec_id) 拿到AVCodecContext的codec_id对应的音频压缩编码格式名称。最后再在dump_stream_format函数中将音频压缩编码格式打印出来：

void avcodec_string(char *buf, int buf_size, AVCodecContext *enc, int encode)
{
//...
    codec_name = avcodec_get_name(enc->codec_id);
//...
}

所以FFmpeg获取SDP描述的RTP流的音频压缩编码格式，是从SDP的“a=rtpmap”这一行获取的：

三、音频压缩编码格式的profile

音频压缩编码格式还有附带的profile（规格）。比如音频压缩编码格式为AAC，根据《ISO14496-3-2009.pdf》第124页，还有AAC Main、AAC LC、AAC SSR、AAC LTP这几种规格：

FFmpeg获取SDP描述的RTP流的音频压缩编码格式的profile，获取的是a=fmtp属性中的config参数中的信息，即AudioSpecificConfig中的audioObjectType。由《》可以知道，Audio Specific Config中存在一个占5位或11位的audioObjectType属性，表示音频对象类型：

0: Null

1: AAC Main

2: AAC LC (Low Complexity)

3: AAC SSR (Scalable Sample Rate)

4: AAC LTP (Long Term Prediction)

5: SBR (Spectral Band Replication)

6: AAC Scalable

7: TwinVQ

8: CELP (Code Excited Linear Prediction)

9: HXVC (Harmonic Vector eXcitation Coding)

10: Reserved

11: Reserved

12: TTSI (Text-To-Speech Interface)

13: Main Synthesis

14: Wavetable Synthesis

15: General MIDI

16: Algorithmic Synthesis and Audio Effects

17: ER (Error Resilient) AAC LC

18: Reserved

19: ER AAC LTP

20: ER AAC Scalable

21: ER TwinVQ

22: ER BSAC (Bit-Sliced Arithmetic Coding)

23: ER AAC LD (Low Delay)

24: ER CELP

25: ER HVXC

26: ER HILN (Harmonic and Individual Lines plus Noise)

27: ER Parametric

28: SSC (SinuSoidal Coding)

29: PS (Parametric Stereo)

30: MPEG Surround

31: (Escape value)

32: Layer-1

33: Layer-2

34: Layer-3

35: DST (Direct Stream Transfer)

36: ALS (Audio Lossless)

37: SLS (Scalable LosslesS)

38: SLS non-core

39: ER AAC ELD (Enhanced Low Delay)

40: SMR (Symbolic Music Representation) Simple

41: SMR Main

42: USAC (Unified Speech and Audio Coding) (no SBR)

43: SAOC (Spatial Audio Object Coding)

44: LD MPEG Surround

45: USAC

由《》可以知道，

FFmpeg源码中使用decode_audio_specific_config_gb函数来读取AudioSpecificConfig的信息。decode_audio_specific_config_gb函数中会调用ff_mpeg4audio_get_config_gb函数，而ff_mpeg4audio_get_config_gb函数中，通过语句：c->object_type = get_object_type(gb) 获取AudioSpecificConfig的audioObjectType属性。执行decode_audio_specific_config_gb函数后，m4ac指向的变量会得到从AudioSpecificConfig中解码出来的属性：

static inline int get_object_type(GetBitContext *gb)
{
    int object_type = get_bits(gb, 5);
    if (object_type == AOT_ESCAPE)
        object_type = 32 + get_bits(gb, 6);
    return object_type;
}

然后在decode_audio_specific_config_gb函数外部，通过aac_decode_frame_int函数将上一步得到的audioObjectType属性赋值给AVCodecContext的profile：

static int aac_decode_frame_int(AVCodecContext *avctx, AVFrame *frame,
                                int *got_frame_ptr, GetBitContext *gb,
                                const AVPacket *avpkt)
{
//...
    // The AV_PROFILE_AAC_* defines are all object_type - 1
    // This may lead to an undefined profile being signaled
    ac->avctx->profile = ac->oc[1].m4ac.object_type - 1;
//...
}

然后在dump_stream_format函数中，通过avcodec_string函数中的语句：profile = avcodec_profile_name(enc->codec_id, enc->profile)拿到上一步中得到的AVCodecContext的profile。最后再在dump_stream_format函数中将profile打印出来：

void avcodec_string(char *buf, int buf_size, AVCodecContext *enc, int encode)
{
//...
    profile = avcodec_profile_name(enc->codec_id, enc->profile);
//...
}

所以FFmpeg获取SDP描述的RTP流的的音频压缩编码格式的profile，获取的是a=fmtp属性中的config参数中的信息，即AudioSpecificConfig中的audioObjectType：