WO2013189030A1 - Monophonic or stereo audio coding method - Google Patents

Monophonic or stereo audio coding method

Info

Publication number
WO2013189030A1
Authority
WO
WIPO (PCT)
Prior art keywords
encoding
layer
stereo
enhancement layer
mono
Prior art date
Application number
PCT/CN2012/077155
Other languages
French (fr)
Chinese (zh)
Inventor
王磊
闫建新
Original Assignee
深圳广晟信源技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳广晟信源技术有限公司
Priority to PCT/CN2012/077155 (WO2013189030A1)
Priority to CN201280000961.1A (CN104170007B)
Publication of WO2013189030A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/18 Vocoders using multiple modes
    • G10L19/24 Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding

Definitions

  • This invention relates to the field of audio coding processing, and more particularly to a method of encoding mono or stereo.
  • AAC-SSR (Advanced Audio Coding-Scalable Sampling Rate)
  • AAC-SSR is specified in both MPEG-4 Part 3 and MPEG-2 Part 7; its encoding architecture is similar to Sony's proprietary ATRAC (Adaptive Transform Acoustic Coding) encoding.
  • the coding scheme first divides the input digital audio signal into four frequency bands through a 4-band polyphase quadrature filter bank (PQF, Polyphase Quadrature Filter), and then applies to each of the four bands either one 256-point MDCT (512-sample window) or eight 32-point MDCTs (64-sample window).
  • the coding scheme can also reduce the data rate by removing the upper PQF bands; reducing the number of bands layers the bitstream, yielding different bit rates and sampling rates.
  • the advantage of this coding scheme is that long-block or short-block MDCT can be selected independently in each frequency band, so high frequencies can use short-block coding for better time resolution while low frequencies use long-block coding for high frequency resolution.
  • however, because of aliasing between the four PQF bands, the coding efficiency of the transform-domain coefficients in the adjacent parts decreases.
  • the present invention provides a method for encoding mono or stereo audio, comprising: dividing a mono or stereo audio signal into a base layer and at least one enhancement layer; encoding the base layer using mp3, AAC, SBR, PS, and/or DRA coding; and encoding the at least one enhancement layer using mp3, AAC, SBR, PS, DRA, residual coding, a partially parametric coding algorithm, and/or a parametric coding algorithm.
  • dividing the mono or stereo audio signal into a base layer and one enhancement layer is performed in one of the following ways: based on the frequency band, where the base layer is the low-frequency coded portion of the mono or stereo signal and the enhancement layer is the high-frequency coded portion; based on the channel, for a stereo signal, where the base layer transmits the left channel or the sum channel and the enhancement layer transmits the right channel or the difference channel; based on parametric stereo coding, for a stereo signal, where the base layer transmits a single channel obtained by downmixing the left and right channels and the enhancement layer transmits the parametric stereo information; or based on a residual layered structure.
  • the foregoing base layer and/or at least one enhancement layer are respectively coded by using a bandwidth extension algorithm.
  • the step of separately encoding the base layer and the enhancement layer obtained from the residual layered structure comprises: supplementing the low-frequency coded portion of the base layer with the low-frequency residual of the enhancement layer; and adjusting the bandwidth extension parameters of the base layer with the bandwidth extension correction parameters of the enhancement layer.
  • the base layer contains the coded low-frequency portion of the downmixed channel together with the bandwidth extension and parametric stereo coding information; the enhancement layer transmits the residual coding of the low-frequency portion.
  • the base layer transmits the coded low-frequency portion of the downmixed mono signal; the enhancement layer transmits the low-frequency residual coding information together with the bandwidth extension and parametric stereo coding information.
  • the step of encoding the base layer includes: encoding according to a code rate requirement of the base layer, and putting the obtained encoded data into a base layer transmission; comparing the original audio with the restored audio of the base layer decoding to obtain a residual signal.
  • the step of encoding the enhancement layer is to encode the residual signal as an enhancement layer.
  • the dividing the mono or stereo audio signal into a base layer, the first enhancement layer and the second enhancement layer is: dividing the mono or stereo audio signal into a base layer, a first enhancement layer, and a second enhancement layer, wherein the base layer is a mono or stereo low frequency coding portion; the first enhancement layer is a mono or stereo intermediate frequency coding portion; and the second enhancement layer is a mono or stereo high frequency coding portion.
  • the mono or stereo audio signal is divided into a base layer and at least one enhancement layer based on the residual layered structure; the step of encoding the base layer includes: encoding according to the code rate required of the base layer and placing the resulting full-band, basic-quality coded data in the base layer for transmission; and comparing the original audio with the audio recovered by decoding the base layer to obtain a first-level residual signal.
  • the step of encoding the first enhancement layer and/or the second enhancement layer includes: encoding the first-level residual signal as the data of the first enhancement layer; removing the signal recovered by decoding the first enhancement layer from the first-level residual signal that was input to the first enhancement layer encoder, to obtain a second-level residual signal; encoding the second-level residual signal as the data of the second enhancement layer; and so on, obtaining each next-level residual signal from the previous-level residual signal and encoding it as the data of the next enhancement layer, until all enhancement layers have been encoded.
  • the step of encoding the base layer comprises: performing an MDCT on the time-domain data x[n] at the encoding end to obtain the spectral coefficients X[k]; dividing the frequency-domain coefficients into a plurality of subbands, and dividing each spectral coefficient belonging to subband b by a quantization step size $\Delta_b$; rounding the quotient to the nearest integer (nint) to obtain the quantized spectral coefficient; and transmitting each quantization step size and quantized spectral coefficient to the decoder.
  • the step of separately encoding the at least one enhancement layer comprises: performing an MDCT on the time-domain data x[n] at the encoding end to obtain the spectral coefficients X[k]; dividing the frequency-domain coefficients into a plurality of subbands, and dividing each spectral coefficient belonging to subband b by a quantization step size $\Delta_b$; rounding the quotient to the nearest integer (nint) to obtain the quantized spectral coefficient; transmitting each quantization step size and quantized spectral coefficient to the decoding end; and restoring the inverse-quantized spectral coefficient $\hat{X}[k]$ as $\Delta_b$ times the quantized spectral coefficient.
  • the residual spectral coefficients (the original spectral coefficients minus the inverse-quantized ones) are divided into multiple subbands; each residual spectral coefficient belonging to subband c is divided by a residual spectral coefficient quantization step size, and the quotient is rounded to the nearest integer (nint) to obtain the quantized residual spectral coefficient; the residual spectral coefficient quantization step size and the quantized residual spectral coefficient are transmitted to the decoding end.
  • the present invention layers mono or stereo audio coarsely, generally into only 2 or 3 layers; it is simple to implement and achieves more efficient compression without the various constraints of fine-grained layering.
  • the best overall sound quality can be obtained by flexibly controlling the quality of each layer; channel coding requirements are also easy to meet.
  • FIG. 1 is a schematic diagram of layering a mono or stereo according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of a coding process of an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of layering an audio signal based on a layered structure of a frequency band according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of layering an audio signal based on a layered structure of a channel according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of layering an audio signal based on a layered structure of parametric stereo coding according to an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of a layered structure according to an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of layering an audio signal based on a hierarchical structure of residuals according to an embodiment of the present invention.
  • FIG. 8 is a schematic diagram of a two-layer structure based on a residual layer when the base layer has a bandwidth extension algorithm according to an embodiment of the present invention.
  • FIG. 9 is a schematic diagram of a two-layer structure based on a residual layer when the enhancement layer has a bandwidth extension algorithm according to an embodiment of the present invention.
  • FIG. 10 is a schematic diagram of a two-layer architecture based on a residual layer with bandwidth extension and bandwidth extension correction in an enhancement layer according to an embodiment of the present invention.
  • FIG. 11 is a schematic structural diagram of layering a stereo audio signal according to an embodiment of the present invention.
  • FIG. 12 is a schematic structural diagram of layering a stereo audio signal according to an embodiment of the present invention.
  • FIG. 14 is a schematic diagram of another audio layered multi-layer structure according to an embodiment of the present invention.
  • FIG. 15 is a schematic diagram of an audio layered structure according to an embodiment of the present invention.
  • FIG. 16 is a simplified schematic diagram of the DRA algorithm according to an embodiment of the present invention.
  • FIG. 17 is a schematic diagram of a DRA core residual coding algorithm according to an embodiment of the present invention.
  • FIG. 18 is a schematic diagram of a layered structure of stereo audio according to an embodiment of the present invention.
  • the method for encoding mono or stereo in this embodiment includes:
  • Step S1: dividing the mono or stereo audio signal into a base layer and at least one enhancement layer;
  • Step S2: encoding the base layer using mp3, AAC, SBR, PS, and/or DRA coding;
  • Step S3: encoding the at least one enhancement layer using mp3, AAC, SBR, PS, DRA, residual coding, a partially parametric coding algorithm, and/or a parametric coding algorithm.
  • the present invention provides a series of different layering schemes.
  • the present invention divides the mono or stereo audio signal into a basic layer and an enhancement layer based on the frequency band, sequentially from low frequency to high frequency.
  • the audio coding information of each frequency band is placed in the base layer and the enhancement layer.
  • the base layer is the low frequency encoding portion of the mono or stereo sound; the enhancement layer is the mono or stereo high frequency encoding portion.
  • the high-frequency part can be coded with the same algorithm as the low-frequency part, or with a parametric method such as a bandwidth extension algorithm.
  • the base layer generally adopts a conventional coding algorithm such as mp3, AAC, or DRA.
  • the enhancement layer can likewise use a conventional coding algorithm, a partially parametric coding algorithm such as intensity stereo, or a parametric coding algorithm such as bandwidth extension.
  • the advantage of the band-based layering scheme is that it guarantees the quality of the low frequencies.
  • referring to the channel-based layered structure shown in FIG. 4, the audio signal is layered.
  • the present invention divides the stereo audio signal into a base layer and an enhancement layer based on the channel: the base layer transmits the left channel or the sum channel.
  • the enhancement layer transmits the right channel or the difference channel.
  • the bandwidth extension algorithm can be applied to any single channel, such as the left channel or the sum channel, to improve subjective sound quality at low bit rates and to ensure wideband quality.
  • the present invention divides a stereo audio signal into a base layer and an enhancement layer based on parametric stereo coding: the base layer transmits a single channel obtained by downmixing the left and right channels, and the enhancement layer transmits the parametric stereo information.
  • when each layer is coded under this layering scheme, the base layer may carry the low-frequency portion of the downmixed channel coded with the bandwidth extension algorithm; the enhancement layer transmits the parametric stereo information and may also carry the high-frequency portion of the downmixed channel coded by the bandwidth extension algorithm.
  • the layering scheme and the coding scheme can achieve higher quality at a low bit rate.
  • the audio signal is layered.
  • the present invention divides the mono or stereo audio signal into a basic layer and an enhancement layer based on the residual differential layer structure.
  • the steps of encoding the base layer and the enhancement layer include:
  • Step S21: encode according to the code rate required of the base layer, and place the resulting coded data in the base layer for transmission;
  • Step S22: compare the original audio with the audio recovered by decoding the base layer to obtain a residual signal.
  • Step S3: encode the residual signal as the enhancement layer.
  • that is, conventional encoding is first performed according to the code rate required of the first layer, and the coded data is transmitted in the base layer; the original audio is then compared with the audio recovered by decoding the base layer to obtain the residual signal (in either the time domain or the transform domain), and the residual signal is encoded as the enhancement layer.
  • the audio signal can be layered in a plurality of hierarchical structures.
  • FIG. 8 shows the residual-based two-layer structure when the base layer uses a bandwidth extension algorithm.
  • FIG. 9 shows the residual-based two-layer structure when the enhancement layer uses a bandwidth extension algorithm.
  • FIG. 10 shows the residual-based two-layer structure in which the base layer uses bandwidth extension and the enhancement layer carries a bandwidth extension correction.
  • the low-frequency coded portion of the base layer is supplemented with the low-frequency residual of the enhancement layer to obtain a more accurate low-frequency portion, and the bandwidth extension parameters of the base layer are adjusted by the bandwidth extension correction parameters of the enhancement layer to better restore the high-frequency portion of each channel.
  • the base layer contains the coded low-frequency portion of the downmixed channel together with the bandwidth extension and parametric stereo coding information, and the enhancement layer transmits the residual coding of the low-frequency portion.
  • the base layer transmits the coded low-frequency portion of the downmixed mono signal.
  • the enhancement layer transmits the low-frequency residual coding information together with the bandwidth extension and parametric stereo coding information.
  • the layered structure of the audio signal is simple, and the coding efficiency is improved.
  • the present invention also proposes that, in addition to a two-layer structure of a base layer and an enhancement layer, the audio signal can be divided into a multi-layer structure of a base layer and a plurality of enhancement layers.
  • FIG. 13 is a schematic diagram of an audio multi-layer structure that divides a mono or stereo audio signal into a base layer, a first enhancement layer, and a second enhancement layer, wherein the base layer is the mono or stereo low-frequency coding portion.
  • the present invention may further divide a mono or stereo audio signal into a base layer and at least one enhancement layer based on the residual differential layer structure.
  • the step S2 of encoding the base layer includes:
  • Step S21 Encoding according to the code rate requirement of the base layer, and putting the obtained full-band basic quality coded data into the base layer transmission;
  • Step S22 Compare the original audio with the audio restored by the base layer decoding to obtain a first-level residual signal.
  • the step S3 of encoding the first enhancement layer and/or the second enhancement layer includes:
  • Step S31: encode the first-level residual signal as the data of the first enhancement layer;
  • Step S32: remove the signal recovered by decoding the first enhancement layer from the first-level residual signal that was input to the first enhancement layer encoder, to obtain a second-level residual signal;
  • Step S33: encode the second-level residual signal as the data of the second enhancement layer;
  • Step S34: obtain each next-level residual signal from the previous-level residual signal in turn, and encode it as the data of the next enhancement layer, until all enhancement layers have been encoded.
  • the present invention can layer and encode audio signals in two, three, four, or more layers; generally no more than four layers are used, to keep the layering and encoding process simple.
  • a specific example of the present invention is given here. FIG. 15 shows a schematic diagram of an audio layered structure, in which the DRA core coding module implements the DRA algorithm according to the standard GB/T 22726-2008; in the present invention it refers specifically to mono and stereo DRA coding. A simplified diagram of the DRA algorithm is shown in FIG. 16. To describe this patent clearly, the decoding end is also briefly described; the decoding-end modules are shown in the dashed box of the figure.
  • Step S211: at the encoding end, perform an MDCT on the time-domain data x[n] to obtain the spectral coefficients X[k]; step S212: divide the frequency-domain coefficients into a plurality of subbands, and divide each spectral coefficient belonging to subband b by a quantization step size;
  • Step S214: transmit each quantization step size and quantized spectral coefficient to the decoding end by various means. The steps of decoding the base layer at the decoding end are:
  • Step S4: use the quantization step size and the quantized spectral coefficient transmitted in step S214 to restore the inverse-quantized spectral coefficient $\hat{X}[k]$.
  • apply the IMDCT to the inverse-quantized spectral coefficients $\hat{X}[k]$ to obtain the inverse-quantized time-domain data.
  • the above SBR coding module is implemented in accordance with the standard "ISO/IEC 14496-3:2001/Amd.1:2003,
  • the present invention further provides an example of separately encoding at least one enhancement layer based on the above coding of the base layer.
  • the DRA core residual coding module used in this embodiment is the intermediate module shown in FIG. 16.
  • as the schematic diagram of the DRA core residual coding algorithm in FIG. 17 shows, the base layer is completely identical to the coding end of FIG. 18, that is, fully compatible.
  • the implementation of the base layer is as above.
  • the implementation steps of the at least one enhancement layer coding in this embodiment are as follows:
  • the following enhancement-layer steps are added after step 3 of the base layer:
  • Step S317: divide the residual spectral coefficients into a plurality of subbands, divide each spectral coefficient belonging to subband c by a residual spectral coefficient quantization step size, and round the quotient to the nearest integer (nint) to obtain the quantized residual spectral coefficient. Step S318: transmit the residual spectral coefficient quantization step size and the quantized residual spectral coefficient to the decoding end.
  • the process of decoding the at least one enhancement layer at the decoding end is as follows:
  • Step S42: use the residual spectral coefficient quantization step size and the quantized residual spectral coefficient passed in step S34 to restore the inverse-quantized residual spectral coefficient $\hat{E}[k]$.
  • Step S43: add the inverse-quantized spectral coefficient obtained in step S41 and the inverse-quantized residual spectral coefficient obtained in step S42 to obtain the enhanced inverse-quantized spectral coefficient, $\hat{X}_a[k] = \hat{X}[k] + \hat{E}[k]$.
  • Step S52: apply the IMDCT to the enhanced inverse-quantized spectral coefficient $\hat{X}_a[k]$ to obtain the inverse-quantized time-domain data x[n].
  • the present invention further takes as an example a total coding rate of 48 kbps, with the audio signal divided by the residual layered structure into two layers of 24 kbps each, to describe the steps of separately coding the base layer and the at least one enhancement layer in this embodiment.
  • Step S201: encode the base layer with a coding rate of 24 kbps at a coding bandwidth of 48 kbps, obtaining the quantization step sizes, the quantized spectral coefficients, and the SBR code stream at the 24 kbps code rate;
  • Step S301: at the encoding end, multiply the quantized spectral coefficients by the quantization step size to obtain the inverse-quantized spectral coefficients $\hat{X}[k]$ at a coding rate of 24 kbps.
  • Step S302: subtract the inverse-quantized spectral coefficient $\hat{X}[k]$ from the original spectral coefficient X[k] to obtain the residual signal spectral coefficient E[k].
  • Step S303: quantize the residual signal spectral coefficient E[k] at a 24 kbps code rate, using a quantization method consistent with or similar to the quantization of the spectral coefficients; the resulting quantization step sizes and quantized residual spectral coefficients of the residual signal are transmitted to the decoding end.
  • the invention also proposes that, if only stereo coding is performed, the encoding of the base layer and the at least one enhancement layer can be implemented with the following embodiment in addition to the above implementation. An advantage of this embodiment over the previous one is that higher quality can be obtained when the total stereo code rate is low.
  • in the stereo audio layering structure of this embodiment, the two stereo channels are downmixed into one channel and encoded by PS, where the PS coding is implemented in accordance with the standard ISO/IEC 14496-3:2001/Amd. 2:2004, "Parametric Coding for High Quality Audio".
  • the DRA downmix channel coding is the same as the base layer coding principle and the procedure in FIG. 16; and the coding principle of the enhancement layer in this embodiment is the same as the DRA downmix channel residual coding, and therefore will not be described again.
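
The quantization and residual steps of the DRA embodiment above can be written compactly as follows. This is a sketch only: several symbols are lost in the extracted text, so $X_q[k]$ (quantized coefficient), $E_q[k]$ (quantized residual), and $\Delta_c$ (residual step size of subband c) are assumed notation, not the patent's own.

```latex
\begin{align}
X_q[k] &= \operatorname{nint}\!\left(\frac{X[k]}{\Delta_b}\right), &
\hat{X}[k] &= \Delta_b\, X_q[k], \\
E[k] &= X[k] - \hat{X}[k], \\
E_q[k] &= \operatorname{nint}\!\left(\frac{E[k]}{\Delta_c}\right), &
\hat{E}[k] &= \Delta_c\, E_q[k], \\
\hat{X}_a[k] &= \hat{X}[k] + \hat{E}[k].
\end{align}
```

Under these definitions $X[k] - \hat{X}_a[k] = E[k] - \hat{E}[k]$, and because nint rounds to the nearest integer, $|E[k] - \hat{E}[k]| \le \Delta_c / 2$. A decoder that receives only the base layer reconstructs $\hat{X}[k]$; one that also receives the enhancement layer reduces the error to at most half the residual step size.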

Abstract

The present invention provides a monophonic or stereo audio coding method. The method comprises: dividing a monophonic or stereo audio signal into a basic layer and at least one enhanced layer; coding the basic layer by using a coding mode of mp3, AAC, SBR, PS and/or DRA; and coding the at least one enhanced layer by using a coding mode of mp3, AAC, SBR, PS, DRA, residual coding, a coding algorithm of part of parameters and/or a coding algorithm of parameters respectively. In the present invention, rough layering is performed on monophonic or stereo audio, where merely 2 or 3 layers are divided; in this manner, compression with a higher efficiency can be ensured in an easy manner, free of all kinds of technical constraints in a fine layering technology. The optimal comprehensive sound quality can be obtained by flexibly controlling the quality of each layer of sound track, and channel coding requirements can be easily met.

Description

Method for encoding mono or stereo audio
Technical Field
The present invention relates to the field of audio coding, and more particularly to a method for encoding mono or stereo audio.
In layered audio coding, lossy digital audio coding methods and lossless audio coding techniques based on fine-grained layering already exist. Examples include ISO/IEC 14496-3 MPEG-4 BSAC (Bit-Sliced Arithmetic Coding), a coding method similar to MPEG-4 BSAC adopted in AVS (Audio Video coding Standard Workgroup of China), and the lossless enhancement layer approach of MPEG-4 SLS (Scalable Lossless Coding). All of these divide the audio into fine layers and encode each layer separately. However, fine-grained layering suffers from low coding efficiency, a complicated structure, and high processing-logic complexity.
The prior art also includes a coarse (non-fine) layered coding scheme: both MPEG-4 Part 3 and MPEG-2 Part 7 provide the scalable sampling rate coding algorithm AAC-SSR (Advanced Audio Coding-Scalable Sampling Rate). It was first proposed by Sony, and its encoding architecture is similar to Sony's proprietary ATRAC (Adaptive Transform Acoustic Coding) encoding. The scheme first divides the input digital audio signal into four frequency bands through a 4-band polyphase quadrature filter bank (PQF, Polyphase Quadrature Filter), and then applies to each of the four bands either one 256-point MDCT (512-sample window) or eight 32-point MDCTs (64-sample window). The scheme can also reduce the data rate by removing the upper PQF bands; reducing the number of bands layers the bitstream, yielding different bit rates and sampling rates. The advantage of this scheme is that long-block or short-block MDCT can be selected independently in each frequency band, so high frequencies can use short-block coding for better time resolution while low frequencies use long-block coding for high frequency resolution. However, because of aliasing between the four PQF bands, the coding efficiency of the transform-domain coefficients in the adjacent parts decreases.
Summary of the Invention
To solve the above technical problem, the present invention provides a method for encoding mono or stereo audio, comprising: dividing a mono or stereo audio signal into a base layer and at least one enhancement layer; encoding the base layer using mp3, AAC, SBR, PS, and/or DRA coding; and encoding the at least one enhancement layer using mp3, AAC, SBR, PS, DRA, residual coding, a partially parametric coding algorithm, and/or a parametric coding algorithm.
Preferably, dividing the mono or stereo audio signal into a base layer and one enhancement layer is performed in one of the following ways: based on the frequency band, where the base layer is the low-frequency coded portion of the mono or stereo signal and the enhancement layer is the high-frequency coded portion; based on the channel, for a stereo signal, where the base layer transmits the left channel or the sum channel and the enhancement layer transmits the right channel or the difference channel; based on parametric stereo coding, for a stereo signal, where the base layer transmits a single channel obtained by downmixing the left and right channels and the enhancement layer transmits the parametric stereo information; or based on a residual layered structure.
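
The channel-based division with sum and difference channels can be illustrated with a minimal sketch. The function names and the 1/2 scaling are assumptions for illustration; the text does not specify how the sum and difference channels are scaled.

```python
def split_sum_difference(left, right):
    """Split stereo samples into a base layer (sum channel) and an
    enhancement layer (difference channel). The 1/2 scaling is assumed."""
    base = [(l + r) / 2 for l, r in zip(left, right)]
    enhancement = [(l - r) / 2 for l, r in zip(left, right)]
    return base, enhancement


def reconstruct(base, enhancement=None):
    """Rebuild (left, right). With only the base layer, both outputs are
    the sum channel, i.e. a mono rendition of the stereo signal."""
    if enhancement is None:
        return list(base), list(base)
    left = [m + s for m, s in zip(base, enhancement)]
    right = [m - s for m, s in zip(base, enhancement)]
    return left, right


if __name__ == "__main__":
    base, enh = split_sum_difference([1.0, 0.5, -0.25], [0.0, 0.5, 0.75])
    print(reconstruct(base))       # base layer only: mono fallback
    print(reconstruct(base, enh))  # both layers: original left/right
```

A receiver that gets only the base layer still plays a usable mono signal, which is the point of putting the sum channel rather than a single side channel in the base layer.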
优选地, 上述对基本层和 /或至少一增强层, 分别采用带宽扩展算法进行 编码。  Preferably, the foregoing base layer and/or at least one enhancement layer are respectively coded by using a bandwidth extension algorithm.
优选地,上述对于基于残差分层结构划分得到的基本层和一增强层分别编 码的步骤包括: 根据增强层低频残差对基本层低频编码部分进行补充; 通过增 强层带宽扩展修正参数对基本层带宽扩展参数进行调整。  Preferably, the step of separately encoding the base layer and the enhancement layer obtained by dividing the residual difference layer structure comprises: supplementing the base layer low frequency coding part according to the enhancement layer low frequency residual; and modifying the parameter to the base layer by using the enhancement layer bandwidth extension. The bandwidth extension parameters are adjusted.
Preferably, when the audio signal is stereo, the base layer contains the coded low-frequency part of the downmixed channel together with the bandwidth extension and parametric stereo coding information, and the enhancement layer carries the residual coding of the low-frequency part.
Preferably, when the audio signal is stereo, the base layer carries the coded low-frequency part of the downmixed mono signal, and the enhancement layer carries the low-frequency residual coding information together with the bandwidth extension and parametric stereo coding information.
Preferably, encoding the base layer comprises: encoding according to the bit-rate requirement of the base layer and placing the resulting coded data in the base layer for transmission; and comparing the original audio with the audio recovered by decoding the base layer to obtain a residual signal. Encoding the enhancement layer then consists of encoding the residual signal as the enhancement layer.
Preferably, the mono or stereo audio signal is divided by frequency band into a base layer, a first enhancement layer and a second enhancement layer, where the base layer is the low-frequency coded part of the mono or stereo signal, the first enhancement layer is the mid-frequency coded part, and the second enhancement layer is the high-frequency coded part.
Preferably, the mono or stereo audio signal is divided into a base layer and at least one enhancement layer based on the residual layered structure. Encoding the base layer comprises: encoding according to the bit-rate requirement of the base layer and placing the resulting full-band, basic-quality coded data in the base layer for transmission; and comparing the original audio with the audio recovered by decoding the base layer to obtain a first-level residual signal. Encoding the first enhancement layer and/or the second enhancement layer comprises: encoding the first-level residual signal as the data of the first enhancement layer; subtracting the signal recovered by decoding the first enhancement layer from the first-level residual signal that was input to the first enhancement layer encoder, to obtain a second-level residual signal; encoding the second-level residual signal as the data of the second enhancement layer; and, in the same way, deriving each next-level residual signal from the previous one and encoding it as the data of the next enhancement layer, until all enhancement layers have been encoded.
Preferably, encoding the base layer comprises: at the encoder, applying an MDCT to the time-domain data $x[n]$ to obtain the spectral coefficients $X[k]$; dividing the frequency-domain coefficients into a plurality of subbands and dividing each spectral coefficient belonging to subband $b$ by a quantization step size $\Delta_b$; rounding the quotient to the nearest integer (nint) to obtain the quantized spectral coefficient

$$\hat{X}[k] = \mathrm{nint}\!\left(\frac{X[k]}{\Delta_b}\right);$$

and transmitting each quantization step size $\Delta_b$ and the quantized spectral coefficients $\hat{X}[k]$ to the decoder.
Preferably, encoding the at least one enhancement layer comprises: at the encoder, applying an MDCT to the time-domain data $x[n]$ to obtain the spectral coefficients $X[k]$; dividing the frequency-domain coefficients into a plurality of subbands and dividing each spectral coefficient belonging to subband $b$ by a quantization step size $\Delta_b$; rounding the quotient to the nearest integer (nint) to obtain the quantized spectral coefficient

$$\hat{X}[k] = \mathrm{nint}\!\left(\frac{X[k]}{\Delta_b}\right);$$

transmitting each quantization step size $\Delta_b$ and the quantized spectral coefficients $\hat{X}[k]$ to the decoder; recovering the dequantized spectral coefficients from $\Delta_b$ and $\hat{X}[k]$,

$$\tilde{X}[k] = \Delta_b \cdot \hat{X}[k];$$

subtracting the dequantized spectral coefficients from the original spectral coefficients to obtain the residual spectral coefficients

$$E[k] = X[k] - \tilde{X}[k];$$

dividing the residual spectral coefficients into a plurality of subbands, dividing each coefficient belonging to subband $c$ by a residual quantization step size $\Delta_c$, and rounding to the nearest integer (nint) to obtain the quantized residual spectral coefficient

$$\hat{E}[k] = \mathrm{nint}\!\left(\frac{E[k]}{\Delta_c}\right);$$

and transmitting the residual quantization step size $\Delta_c$ and the quantized residual spectral coefficients $\hat{E}[k]$ to the decoder.
The present invention applies coarse layering to mono or stereo audio, generally using only two or three layers. It is simple to implement, achieves more efficient compression, and avoids the constraints imposed by fine-grained layering techniques. The quality of the channels in each layer can be controlled flexibly to obtain the best overall sound quality, and channel coding requirements are easy to meet.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of layering a mono or stereo signal according to an embodiment of the present invention;
FIG. 2 is a flowchart of the encoding process according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of layering an audio signal with a band-based layered structure according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of layering an audio signal with a channel-based layered structure according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of layering an audio signal with a layered structure based on parametric stereo coding according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a layered structure according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of layering an audio signal with a residual-based layered structure according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a two-layer residual-based structure in which the base layer has a bandwidth extension algorithm, according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of a two-layer residual-based structure in which the enhancement layer has a bandwidth extension algorithm, according to an embodiment of the present invention;
FIG. 10 is a schematic diagram of a two-layer residual-based structure in which the base layer has bandwidth extension and the enhancement layer has bandwidth extension correction, according to an embodiment of the present invention;
FIG. 11 is a schematic diagram of one structure for layering a stereo audio signal according to an embodiment of the present invention;
FIG. 12 is a schematic diagram of another structure for layering a stereo audio signal according to an embodiment of the present invention;
FIG. 13 is a schematic diagram of one multi-layer audio structure according to an embodiment of the present invention;
FIG. 14 is a schematic diagram of another multi-layer audio structure according to an embodiment of the present invention;
FIG. 15 is a schematic diagram of an audio layered structure according to an embodiment of the present invention;
FIG. 16 is a simplified schematic diagram of the DRA algorithm according to an embodiment of the present invention;
FIG. 17 is a schematic diagram of the DRA core residual coding algorithm according to an embodiment of the present invention;
FIG. 18 is a schematic diagram of a stereo audio layered structure according to an embodiment of the present invention.
Detailed Description
To explain the technical content, structural features, objects and effects of the present invention in detail, embodiments are described below with reference to the accompanying drawings.
Referring to the layering diagram of FIG. 1 and the encoding flowchart of FIG. 2, the method of this embodiment for encoding a mono or stereo signal comprises:
Step S1: divide the mono or stereo audio signal into a base layer and at least one enhancement layer;
Step S2: encode the base layer using mp3, AAC, SBR, PS and/or DRA coding;
Step S3: encode each of the at least one enhancement layer using mp3, AAC, SBR, PS, DRA, residual coding, a partially parametric coding algorithm and/or a parametric coding algorithm.
Based on the above embodiment, the present invention provides a series of different layering schemes.
Referring to the band-based layered structure of FIG. 3, the present invention divides the mono or stereo audio signal by frequency band into a base layer and an enhancement layer, placing the coded audio information of each band into the base layer and the enhancement layer in order from low frequency to high frequency. The base layer is the low-frequency coded part of the mono or stereo signal; the enhancement layer is the high-frequency coded part.
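The band-based split described above can be sketched as follows. This is a minimal illustration that operates on a list of spectral coefficients; the function names and the `cutoff_bin` parameter are hypothetical and do not come from the patent, and no actual codec (mp3, AAC, DRA) is applied to either band.

```python
def split_by_band(spectrum, cutoff_bin):
    """Split spectral coefficients into a low band (base layer)
    and a high band (enhancement layer) at index cutoff_bin."""
    if not 0 <= cutoff_bin <= len(spectrum):
        raise ValueError("cutoff_bin out of range")
    return list(spectrum[:cutoff_bin]), list(spectrum[cutoff_bin:])


def merge_bands(base, enhancement=None, total_bins=None):
    """Decoder side: rebuild the spectrum from the base layer alone,
    or from base plus enhancement layer when the latter was received."""
    if enhancement is None:
        # Only the base layer arrived: zero-fill the missing high band.
        missing = (total_bins if total_bins is not None else len(base)) - len(base)
        return list(base) + [0.0] * missing
    return list(base) + list(enhancement)
```

A decoder that receives only the base layer still produces a valid, band-limited signal, which is the property the band-based scheme relies on.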
Under this layering scheme, the high-frequency part may be coded with the same algorithm as the low-frequency part, or with a parametric method such as a bandwidth extension algorithm. The base layer generally uses a conventional coding algorithm such as mp3, AAC or DRA; the enhancement layer may likewise use a conventional coding algorithm, a partially parametric coding algorithm such as intensity stereo, or a parametric coding algorithm such as bandwidth extension. The advantage of band-based layering is that the quality of the low frequencies is guaranteed.
Referring to the channel-based layered structure of FIG. 4, the present invention divides the stereo audio signal by channel into a base layer and an enhancement layer: the base layer carries the left channel or the sum channel, and the enhancement layer carries the right channel or the difference channel.
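As an illustration of the sum/difference variant of channel-based layering, the sketch below forms a sum channel for the base layer and a difference channel for the enhancement layer. The scaling by 1/2 is an assumption made here so that the transform is exactly invertible; the patent does not specify a scaling, and the function names are hypothetical.

```python
def to_sum_diff(left, right):
    """Base layer: sum channel; enhancement layer: difference channel.
    The 1/2 scaling is an assumption, not taken from the patent."""
    sum_ch = [(l + r) / 2 for l, r in zip(left, right)]
    diff_ch = [(l - r) / 2 for l, r in zip(left, right)]
    return sum_ch, diff_ch


def from_sum_diff(sum_ch, diff_ch):
    """Recover left and right when both layers are available."""
    left = [s + d for s, d in zip(sum_ch, diff_ch)]
    right = [s - d for s, d in zip(sum_ch, diff_ch)]
    return left, right
```

With only the base layer, a decoder can play the sum channel as mono; with both layers it recovers the original stereo pair.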
Under this layering scheme, a bandwidth extension algorithm may optionally be applied to any single channel, such as the left channel or the sum channel, which improves subjective sound quality at low bit rates and preserves wideband quality.
Referring to the layered structure based on parametric stereo coding shown in FIG. 5, the present invention divides the stereo audio signal into a base layer and an enhancement layer based on parametric stereo coding: the base layer carries a single channel downmixed from the left and right channels, and the enhancement layer carries the parametric stereo information.
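The idea of a downmix in the base layer plus stereo parameters in the enhancement layer can be sketched with a deliberately simplified model. The example below uses a single broadband inter-channel level difference as the only parameter; this is an assumption for illustration and is far simpler than the PS tool of ISO/IEC 14496-3, which works per frequency band and carries additional parameters. All names are hypothetical.

```python
import math


def ps_encode(left, right, eps=1e-12):
    """Return (downmix, level_difference_db).
    downmix goes to the base layer, the parameter to the enhancement layer."""
    downmix = [(l + r) / 2 for l, r in zip(left, right)]
    energy_l = sum(l * l for l in left) + eps
    energy_r = sum(r * r for r in right) + eps
    level_diff_db = 10 * math.log10(energy_l / energy_r)
    return downmix, level_diff_db


def ps_decode(downmix, level_diff_db=None):
    """Upmix the downmix using the level difference when available."""
    if level_diff_db is None:
        # Base layer only: play the downmix on both channels.
        return list(downmix), list(downmix)
    ratio = 10 ** (level_diff_db / 20)   # amplitude ratio left/right
    gain_l = 2 * ratio / (1 + ratio)
    gain_r = 2 / (1 + ratio)
    return [gain_l * d for d in downmix], [gain_r * d for d in downmix]
```

This toy model reconstructs the channels exactly only when left and right are in-phase scaled copies of the same signal; in general it restores the level balance but not the waveform.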
Referring to the layered structure of FIG. 6, when each layer is coded under this scheme, the base layer carries the single channel downmixed from the left and right channels, with a bandwidth extension algorithm optionally used for its low-band part; the enhancement layer carries the parametric stereo information and may optionally also carry the high-frequency part of the downmixed channel coded by the bandwidth extension algorithm. This layering and coding scheme gives relatively high quality at low bit rates.
Referring to the residual-based layered structure of FIG. 7, the present invention divides the mono or stereo audio signal into a base layer and an enhancement layer based on the residual layered structure.
Under this layered structure, encoding the base layer and the enhancement layer comprises:
Step S21: encode according to the bit-rate requirement of the base layer and place the resulting coded data in the base layer for transmission;
Step S22: compare the original audio with the audio recovered by decoding the base layer to obtain a residual signal;
Step S3: encode the residual signal as the enhancement layer.
Specifically, when mono or stereo audio is encoded, conventional encoding is first performed according to the bit-rate requirement of the first layer and the coded data is transmitted in the base layer; the original audio is then compared with the audio recovered by decoding the base layer to obtain a residual signal (in either the time domain or the transform domain), and the residual signal is in turn encoded as the enhancement layer.
Further, the audio signal may be layered using a variety of layered structures. Examples are: the two-layer residual-based structure of FIG. 8, in which the base layer has a bandwidth extension algorithm; the two-layer residual-based structure of FIG. 9, in which the enhancement layer has a bandwidth extension algorithm; and the two-layer residual-based structure of FIG. 10, in which the base layer has bandwidth extension and the enhancement layer has bandwidth extension correction. In the structure of FIG. 10, the low-frequency residual in the enhancement layer supplements the low-frequency coded part of the base layer, giving a more accurate low-frequency part, and the bandwidth extension correction parameters in the enhancement layer adjust the base-layer bandwidth extension parameters so that the high-frequency part of each channel is recovered more accurately. When a stereo audio signal is layered as in FIG. 11, the base layer contains the coded low-frequency part of the downmixed channel together with the bandwidth extension and parametric stereo coding information, and the enhancement layer carries the residual coding of the low-frequency part. In the alternative stereo layering of FIG. 12, the base layer carries the coded low-frequency part of the downmixed mono signal, and the enhancement layer carries the low-frequency residual coding information together with the bandwidth extension and parametric stereo coding information.
With the residual layered structure of this embodiment, the layering of the audio signal is structurally simple and coding efficiency is improved.
The present invention further proposes that, besides the two-layer structure of one base layer and one enhancement layer, the audio signal may be divided into a multi-layer structure of one base layer and several enhancement layers.
Referring to the multi-layer audio structure of FIG. 13, the mono or stereo audio signal is divided by frequency band into a base layer, a first enhancement layer and a second enhancement layer, where the base layer is the low-frequency coded part of the mono or stereo signal, the first enhancement layer is the mid-frequency coded part, and the second enhancement layer is the high-frequency coded part.
Referring to the alternative multi-layer audio structure of FIG. 14, the present invention may also divide the mono or stereo audio signal into a base layer and at least one enhancement layer based on the residual layered structure.
Under this multi-layer structure, step S2 of encoding the base layer comprises:
Step S21: encode according to the bit-rate requirement of the base layer and place the resulting full-band, basic-quality coded data in the base layer for transmission;
Step S22: compare the original audio with the audio recovered by decoding the base layer to obtain a first-level residual signal.
Step S3 of encoding the first enhancement layer and/or the second enhancement layer comprises:
Step S31: encode the first-level residual signal as the data of the first enhancement layer;
Step S32: subtract the signal recovered by decoding the first enhancement layer from the first-level residual signal that was input to the first enhancement layer encoder, to obtain a second-level residual signal;
Step S33: encode the second-level residual signal as the data of the second enhancement layer;
Step S34: in the same way, derive each next-level residual signal from the previous one and encode it as the data of the next enhancement layer, until all enhancement layers have been encoded.
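The cascade of steps S21 to S34 can be sketched as a loop in which each layer codes whatever the previous layers left behind. In this illustration a plain uniform quantizer stands in for each layer's codec, which is an assumption made for brevity; the function names and the `steps` list of per-layer step sizes are hypothetical.

```python
def quantize(values, step):
    """Stand-in codec: uniform quantization to integer indices.
    Python's round() rounds halves to even, which differs slightly
    from a conventional nint but does not change the idea."""
    return [round(v / step) for v in values]


def dequantize(indices, step):
    return [i * step for i in indices]


def encode_layers(signal, steps):
    """steps[0] is the base-layer step size; each following entry
    belongs to one enhancement layer, in order."""
    layers = []
    residual = list(signal)
    for step in steps:
        indices = quantize(residual, step)        # code the current residual
        layers.append(indices)
        decoded = dequantize(indices, step)       # what the decoder would see
        residual = [r - d for r, d in zip(residual, decoded)]  # next-level residual
    return layers


def decode_layers(layers, steps):
    """Sum the contributions of however many layers were received."""
    out = [0.0] * len(layers[0])
    for indices, step in zip(layers, steps):
        out = [o + i * step for o, i in zip(out, indices)]
    return out
```

Decoding only the first n layers gives a coarser reconstruction, and each additional layer reduces the error to at most half of that layer's step size.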
The present invention can layer and encode an audio signal in two, three, four or more layers; generally no more than four layers are used, to keep the layering and encoding process simple.
A specific example of the present invention is given here. Referring to the audio layered structure of FIG. 15, the DRA core coding module is the standard DRA algorithm implemented according to standard GB/T 22726-2008; in the present invention it refers specifically to mono and stereo DRA coding. A simplified schematic of the DRA algorithm is shown in FIG. 16. To describe this patent clearly, the decoder is also described briefly; the decoder modules are shown in the dashed boxes of FIG. 16.
In this embodiment, the base layer is encoded as follows:
Step S211: at the encoder, apply an MDCT to the time-domain data $x[n]$ to obtain the spectral coefficients $X[k]$;
Step S212: divide the frequency-domain coefficients into a plurality of subbands, and divide each spectral coefficient belonging to subband $b$ by a quantization step size $\Delta_b$;
Step S213: round the quotient to the nearest integer (nint) to obtain the quantized spectral coefficient

$$\hat{X}[k] = \mathrm{nint}\!\left(\frac{X[k]}{\Delta_b}\right);$$

Step S214: transmit each quantization step size $\Delta_b$ and the quantized spectral coefficients $\hat{X}[k]$ to the decoder by any suitable means.
At the decoder, the base layer is decoded as follows:
Step S4: using the quantization step size $\Delta_b$ and the quantized spectral coefficients $\hat{X}[k]$ transmitted in step S214, recover the dequantized spectral coefficients

$$\tilde{X}[k] = \Delta_b \cdot \hat{X}[k];$$

Step S51: apply an IMDCT to the dequantized spectral coefficients $\tilde{X}[k]$ to obtain the dequantized time-domain data $\tilde{x}[n]$.
The SBR coding module is implemented according to the standard "ISO/IEC 14496-3:2001/Amd.1:2003, Bandwidth Extension". In this patent, placing SBR in the base layer gives higher quality at lower bit rates. Because the implementation of SBR is unrelated to this patent and the SBR coding module is optional, it is not described in detail here.
The present invention further provides an example in which, building on the base-layer coding above, the at least one enhancement layer is encoded. The DRA core residual coding module used in this embodiment is the middle module shown in FIG. 16. As can be seen from the DRA core residual coding algorithm of FIG. 17, the base layer is identical to, and therefore fully compatible with, the encoder of FIG. 18. The base layer is implemented as described above. The at least one enhancement layer of this embodiment is encoded as follows:
After step 3 of the base-layer procedure above, the following enhancement-layer encoding steps are added:
Step S311: at the encoder, apply an MDCT to the time-domain data $x[n]$ to obtain the spectral coefficients $X[k]$;
Step S312: divide the frequency-domain coefficients into a plurality of subbands, and divide each spectral coefficient belonging to subband $b$ by a quantization step size $\Delta_b$;
Step S313: round the quotient to the nearest integer (nint) to obtain the quantized spectral coefficient

$$\hat{X}[k] = \mathrm{nint}\!\left(\frac{X[k]}{\Delta_b}\right);$$

Step S314: transmit each quantization step size $\Delta_b$ and the quantized spectral coefficients $\hat{X}[k]$ to the decoder;
Step S315: recover the dequantized spectral coefficients from $\Delta_b$ and $\hat{X}[k]$,

$$\tilde{X}[k] = \Delta_b \cdot \hat{X}[k];$$

Step S316: subtract the dequantized spectral coefficients $\tilde{X}[k]$ from the original spectral coefficients $X[k]$ to obtain the residual spectral coefficients

$$E[k] = X[k] - \tilde{X}[k];$$

Step S317: divide the residual spectral coefficients $E[k]$ into a plurality of subbands, divide each coefficient belonging to subband $c$ by a residual quantization step size $\Delta_c$, and round to the nearest integer (nint) to obtain the quantized residual spectral coefficient

$$\hat{E}[k] = \mathrm{nint}\!\left(\frac{E[k]}{\Delta_c}\right);$$

Step S318: transmit the residual quantization step size $\Delta_c$ and the quantized residual spectral coefficients $\hat{E}[k]$ to the decoder.
At the decoder, the at least one enhancement layer is decoded as follows:
Step S41: using the quantization step size $\Delta_b$ and the quantized spectral coefficients $\hat{X}[k]$ transmitted in step S214, recover the dequantized spectral coefficients

$$\tilde{X}[k] = \Delta_b \cdot \hat{X}[k];$$

Step S42: using the residual quantization step size $\Delta_c$ and the quantized residual spectral coefficients $\hat{E}[k]$ transmitted in step S318, recover the dequantized residual spectral coefficients

$$\tilde{E}[k] = \Delta_c \cdot \hat{E}[k];$$

Step S43: add the dequantized spectral coefficients $\tilde{X}[k]$ from step S41 and the dequantized residual spectral coefficients $\tilde{E}[k]$ from step S42 to obtain the enhanced dequantized spectral coefficients

$$\tilde{X}_a[k] = \tilde{X}[k] + \tilde{E}[k];$$

Step S52: apply an IMDCT to the enhanced dequantized spectral coefficients $\tilde{X}_a[k]$ to obtain the dequantized time-domain data $\tilde{x}[n]$.
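The quantization and residual steps above can be sketched on a short list of spectral coefficients. The sketch assumes the MDCT has already been applied and omits the IMDCT, entropy coding and bitstream packing; the `band_edges` layout (subband b covers indices `band_edges[b]` to `band_edges[b+1] - 1`, starting at 0) and all function names are hypothetical, and for simplicity the residual uses the same subband boundaries as the base layer.

```python
import math


def nint(x):
    """Round to the nearest integer, halves away from zero."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def quantize_subbands(coeffs, band_edges, steps):
    """Divide each coefficient by its subband's step size and round."""
    out = []
    for b, step in enumerate(steps):
        for k in range(band_edges[b], band_edges[b + 1]):
            out.append(nint(coeffs[k] / step))
    return out


def dequantize_subbands(indices, band_edges, steps):
    """Multiply each quantized index by its subband's step size."""
    out = []
    for b, step in enumerate(steps):
        for k in range(band_edges[b], band_edges[b + 1]):
            out.append(step * indices[k])
    return out


def encode_two_layers(X, band_edges, base_steps, res_steps):
    X_hat = quantize_subbands(X, band_edges, base_steps)          # S312, S313
    X_tilde = dequantize_subbands(X_hat, band_edges, base_steps)  # S315
    E = [x - xt for x, xt in zip(X, X_tilde)]                     # S316
    E_hat = quantize_subbands(E, band_edges, res_steps)           # S317
    return X_hat, E_hat


def decode_two_layers(X_hat, E_hat, band_edges, base_steps, res_steps):
    X_tilde = dequantize_subbands(X_hat, band_edges, base_steps)  # S41
    if E_hat is None:
        return X_tilde                                            # base layer only
    E_tilde = dequantize_subbands(E_hat, band_edges, res_steps)   # S42
    return [a + b for a, b in zip(X_tilde, E_tilde)]              # S43
```

Because the residual step sizes are smaller than the base step sizes, adding the enhancement layer brings each reconstructed coefficient closer to the original than the base layer alone.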
The present invention further takes, as an example, a total bit rate of 48 kbps with the audio signal divided by the residual layered structure into two layers of 24 kbps each, to describe in detail how the base layer and the at least one enhancement layer are encoded in this embodiment.
Step S201: with the coding bandwidth of 48 kbps, encode the base layer at a bit rate of 24 kbps, obtaining the quantization step sizes $\Delta_b$ and the quantized spectral coefficients $\hat{X}[k]$ for the 24 kbps rate, together with the SBR bitstream;
Step S301: at the encoder, multiply the quantized spectral coefficients $\hat{X}[k]$ by the quantization step size $\Delta_b$ to obtain the dequantized spectral coefficients $\tilde{X}[k]$ at the 24 kbps rate;
Step S302: subtract the dequantized spectral coefficients $\tilde{X}[k]$ from the original spectral coefficients $X[k]$ to obtain the residual spectral coefficients $E[k]$;
Step S303: quantize the residual spectral coefficients $E[k]$ at a bit rate of 24 kbps, using a quantization method that may be the same as or similar to that used for $X[k]$, to obtain the residual quantization step sizes $\Delta_c$ and the quantized residual spectral coefficients $\hat{E}[k]$, and transmit them to the decoder.
The present invention also proposes that, when only stereo is layered and encoded, the base layer and the at least one enhancement layer may be encoded as in the next embodiment instead of the embodiment above. Compared with the previous embodiment, its advantage is higher quality when the total stereo bit rate is very low.
As shown in the stereo audio layered structure of FIG. 18, this embodiment downmixes the two stereo channels into one channel and applies PS coding, where PS coding is implemented according to the standard ISO/IEC 14496-3:2001/Amd.2:2004, "Parametric Coding for High Quality Audio". The DRA downmix-channel coding follows the same principle and steps as the base-layer coding of FIG. 16, and the enhancement-layer coding of this embodiment follows the same principle as the DRA downmix-channel residual coding, so neither is described again.
The method of the present invention for encoding a mono or stereo signal achieves the stated objects and effects by the means disclosed above. The foregoing, however, discloses only preferred embodiments of the present invention and does not limit its scope; other equivalent modifications or variations of the present invention fall within the scope of the claims.

Claims

1. A method for encoding a mono or stereo signal, characterized by comprising:
dividing the mono or stereo audio signal into a base layer and at least one enhancement layer;
encoding the base layer using mp3, AAC, SBR, PS and/or DRA coding; and
encoding each of the at least one enhancement layer using mp3, AAC, SBR, PS, DRA, residual coding, a partially parametric coding algorithm and/or a parametric coding algorithm.
2. The method for encoding mono or stereo audio according to claim 1, characterized in that dividing the mono or stereo audio signal into a base layer and an enhancement layer comprises:
dividing the mono or stereo audio signal into a base layer and an enhancement layer by frequency band, the base layer being the low-frequency coded part of the mono or stereo signal and the enhancement layer being the high-frequency coded part of the mono or stereo signal; or
dividing the stereo audio signal into a base layer and an enhancement layer by channel, the base layer carrying the left channel or the sum channel and the enhancement layer carrying the right channel or the difference channel; or
dividing the stereo audio signal into a base layer and an enhancement layer by parametric stereo coding, the base layer carrying a single channel obtained by downmixing the left and right channels and the enhancement layer carrying the parametric stereo information; or
dividing the mono or stereo audio signal into a base layer and an enhancement layer according to a residual layered structure.
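For illustration only, and not as part of the claims, the channel-based split into a sum channel and a difference channel can be sketched as follows. The function names and the 1/2 scaling are assumptions made for this example; the claim itself only states which channel each layer carries.

```python
def to_sum_difference(left, right):
    """Return (sum_channel, difference_channel) for two equal-length sample lists.

    The 1/2 scaling is an assumed convention, chosen so that both output
    channels stay in the same range as the inputs.
    """
    if len(left) != len(right):
        raise ValueError("channels must have the same length")
    sum_channel = [(l + r) / 2 for l, r in zip(left, right)]
    diff_channel = [(l - r) / 2 for l, r in zip(left, right)]
    return sum_channel, diff_channel


def from_sum_difference(sum_channel, diff_channel):
    """Recover (left, right) when both layers have been received."""
    left = [m + s for m, s in zip(sum_channel, diff_channel)]
    right = [m - s for m, s in zip(sum_channel, diff_channel)]
    return left, right
```

A decoder that receives only the base layer would play the sum channel as mono; one that also receives the enhancement layer can call `from_sum_difference` to restore both channels.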
3. The method for encoding mono or stereo audio according to claim 2, characterized in that the base layer and/or the at least one enhancement layer are each encoded using a bandwidth extension algorithm.
4. The method for encoding mono or stereo audio according to claim 2, characterized in that the step of separately encoding the base layer and the enhancement layer obtained by division according to the residual layered structure comprises:
supplementing the low-frequency coded part of the base layer with the low-frequency residual of the enhancement layer; and
adjusting the bandwidth extension parameters of the base layer by means of the bandwidth extension correction parameters of the enhancement layer.
5. The method for encoding mono or stereo audio according to claim 2, characterized in that:
when the audio signal is stereo, the base layer contains the coded low-frequency part of the downmixed channel together with bandwidth extension and parametric stereo coding information; and
the enhancement layer carries the residual coding of the low-frequency part.
6. The method for encoding mono or stereo audio according to claim 2, characterized in that:
when the audio signal is stereo, the base layer carries the coded low-frequency part of the downmixed mono signal; and
the enhancement layer carries the residual coding information of the low-frequency part together with bandwidth extension and parametric stereo coding information.
7. The method for encoding mono or stereo audio according to claim 2, characterized in that the step of encoding the base layer comprises:
encoding according to the bit rate required of the base layer and placing the resulting coded data in the base layer for transmission; and
comparing the original audio with the audio recovered by decoding the base layer to obtain a residual signal; and
the step of encoding the enhancement layer is encoding the residual signal as the enhancement layer.
8. The method for encoding mono or stereo audio according to claim 1, characterized in that dividing the mono or stereo audio signal into a base layer, a first enhancement layer and a second enhancement layer comprises:
dividing the mono or stereo audio signal by frequency band into a base layer, a first enhancement layer and a second enhancement layer, wherein the base layer is the low-frequency coded part of the mono or stereo signal, the first enhancement layer is the mid-frequency coded part of the mono or stereo signal, and the second enhancement layer is the high-frequency coded part of the mono or stereo signal.
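For illustration only, and not as part of the claims, a band-based three-layer split of a block of spectral coefficients might look like the sketch below. The layer names, the bin boundaries `low_end` and `mid_end`, and the zero-filling of bands that were not received are all assumptions of this example.

```python
LAYER_ORDER = ("base", "enhancement_1", "enhancement_2")


def split_into_band_layers(coeffs, low_end, mid_end):
    """Split coefficients into low, mid and high bands at the given bin indices."""
    if not 0 <= low_end <= mid_end <= len(coeffs):
        raise ValueError("band boundaries must satisfy 0 <= low_end <= mid_end <= len(coeffs)")
    return {
        "base": list(coeffs[:low_end]),
        "enhancement_1": list(coeffs[low_end:mid_end]),
        "enhancement_2": list(coeffs[mid_end:]),
    }


def merge_band_layers(received, total_bins):
    """Rebuild a spectrum from the layers received, in order, zero-filling the rest."""
    spectrum = []
    for name in LAYER_ORDER:
        if name not in received:
            break  # assumed policy: stop at the first missing layer
        spectrum.extend(received[name])
    spectrum.extend([0.0] * (total_bins - len(spectrum)))
    return spectrum
```

With this layout a receiver holding only the base layer reproduces the low band, and each further layer widens the reproduced bandwidth.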
9. The method for encoding mono or stereo audio according to claim 1, characterized in that the mono or stereo audio signal is divided into a base layer and at least one enhancement layer according to a residual layered structure; and
the step of encoding the base layer comprises:
encoding according to the bit rate required of the base layer and placing the resulting full-band, basic-quality coded data in the base layer for transmission; and
comparing the original audio with the audio recovered by decoding the base layer to obtain a first-stage residual signal; and
the step of encoding the first enhancement layer and/or the second enhancement layer comprises:
encoding the first-stage residual signal as the data of the first enhancement layer;
removing, from the first-stage residual signal input to the first enhancement layer encoder, the signal recovered by decoding the first enhancement layer, to obtain a second-stage residual signal;
encoding the second-stage residual signal as the data of the second enhancement layer; and
successively obtaining each next-stage residual signal from the preceding-stage residual signal and encoding it as the data of the next enhancement layer, until all enhancement layers have been encoded.
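For illustration only, and not as part of the claims, the cascade of residual stages can be sketched with a uniform scalar quantizer standing in for each layer's codec. That substitution, the function names and the step sizes are assumptions of this example; the claims allow any of the listed codecs per layer.

```python
import math


def nint(x):
    """Round to the nearest integer (ties go upward; an assumed convention)."""
    return math.floor(x + 0.5)


def encode_layers(signal, steps):
    """Encode `signal` into len(steps) layers; steps[0] belongs to the base layer."""
    residual = list(signal)
    layers = []
    for step in steps:
        indices = [nint(r / step) for r in residual]          # encode this stage
        decoded = [i * step for i in indices]                 # what a decoder recovers
        residual = [r - d for r, d in zip(residual, decoded)] # next-stage residual
        layers.append(indices)
    return layers


def decode_layers(layers, steps):
    """Sum the contributions of however many layers were received."""
    out = [0.0] * len(layers[0])
    for indices, step in zip(layers, steps):
        out = [o + i * step for o, i in zip(out, indices)]
    return out
```

Decoding only the first n layers (`decode_layers(layers[:n], steps[:n])`) gives an approximation whose error is bounded by half the step size of layer n, so each extra layer refines the result.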
10. The method for encoding mono or stereo audio according to any one of claims 1 to 9, characterized in that the step of encoding the base layer comprises:
at the encoder, applying an MDCT to the time-domain data $x[n]$ to obtain the spectral coefficients $X[k]$;
dividing the frequency-domain coefficients into a plurality of subbands, and dividing the spectral coefficients belonging to subband $b$ by a quantization step size $\Delta_b$;
rounding the quotient to the nearest integer (nint) to obtain the quantized spectral coefficients $\hat{X}[k]$:
$$\hat{X}[k] = \mathrm{nint}\!\left(\frac{X[k]}{\Delta_b}\right)$$
transmitting each quantization step size $\Delta_b$ and the quantized spectral coefficients $\hat{X}[k]$ to the decoder.
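For illustration only, and not as part of the claims, the per-subband quantization step can be sketched as below. The spectral coefficients are assumed to have been computed already by an MDCT; the `band_edges` layout, the function names and the tie-breaking rule of `nint` are assumptions of this example.

```python
import math


def nint(x):
    """Round to the nearest integer (ties go upward; an assumed convention)."""
    return math.floor(x + 0.5)


def quantize_subbands(spectrum, band_edges, steps):
    """Quantize each subband b, covering bins band_edges[b]..band_edges[b+1]-1,
    with its own step size steps[b]."""
    if len(band_edges) != len(steps) + 1:
        raise ValueError("need one more band edge than step sizes")
    quantized = []
    for b, step in enumerate(steps):
        for k in range(band_edges[b], band_edges[b + 1]):
            quantized.append(nint(spectrum[k] / step))
    return quantized
```

For example, with two subbands of two bins each and step sizes 2.0 and 0.5, the spectrum `[4.2, -3.1, 0.6, 0.2]` quantizes to the integer indices `[2, -2, 1, 0]`, which are sent to the decoder together with the two step sizes.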
11. The method for encoding mono or stereo audio according to claim 10, characterized in that the step of separately encoding the at least one enhancement layer comprises:
at the encoder, applying an MDCT to the time-domain data $x[n]$ to obtain the spectral coefficients $X[k]$;
dividing the frequency-domain coefficients into a plurality of subbands, dividing the spectral coefficients belonging to subband $b$ by a quantization step size $\Delta_b$, and rounding the quotient to the nearest integer (nint) to obtain the quantized spectral coefficients $\hat{X}[k]$:
$$\hat{X}[k] = \mathrm{nint}\!\left(\frac{X[k]}{\Delta_b}\right)$$
transmitting each quantization step size $\Delta_b$ and the quantized spectral coefficients $\hat{X}[k]$ to the decoder;
recovering the inverse-quantized spectral coefficients $\tilde{X}[k]$ from the quantization step size $\Delta_b$ and the quantized spectral coefficients $\hat{X}[k]$:
$$\tilde{X}[k] = \Delta_b \cdot \hat{X}[k]$$
subtracting the inverse-quantized spectral coefficients $\tilde{X}[k]$ from the original spectral coefficients $X[k]$ to obtain the residual spectral coefficients $E[k]$:
$$E[k] = X[k] - \tilde{X}[k]$$
dividing the residual spectral coefficients $E[k]$ into a plurality of subbands, dividing the residual spectral coefficients belonging to subband $c$ by a residual quantization step size $\Delta_c$, and rounding the quotient to the nearest integer (nint) to obtain the quantized residual spectral coefficients $\hat{E}[k]$:
$$\hat{E}[k] = \mathrm{nint}\!\left(\frac{E[k]}{\Delta_c}\right)$$
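For illustration only, and not as part of the claims, the base-layer quantization followed by residual quantization can be sketched as below. For brevity the example uses a single subband per layer, so there is one base step size and one residual step size; that simplification, the function names and the tie-breaking rule are assumptions of this example.

```python
import math


def quantize(values, step):
    """nint(value / step) for every coefficient (ties go upward; assumed)."""
    return [math.floor(v / step + 0.5) for v in values]


def dequantize(indices, step):
    """Inverse quantization: index times step size."""
    return [i * step for i in indices]


def encode_two_layers(spectrum, base_step, residual_step):
    """Return (base indices, residual indices) for one block of coefficients."""
    base_q = quantize(spectrum, base_step)
    base_rec = dequantize(base_q, base_step)                # inverse-quantized spectrum
    residual = [x - r for x, r in zip(spectrum, base_rec)]  # E[k] = X[k] - dequantized X[k]
    residual_q = quantize(residual, residual_step)
    return base_q, residual_q


def decode_two_layers(base_q, residual_q, base_step, residual_step):
    """Add the dequantized residual to the dequantized base layer."""
    base_rec = dequantize(base_q, base_step)
    res_rec = dequantize(residual_q, residual_step)
    return [b + r for b, r in zip(base_rec, res_rec)]
```

A decoder that receives only the base indices reconstructs the spectrum to within half of `base_step`; adding the residual layer tightens that bound to half of `residual_step`.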
PCT/CN2012/077155 2012-06-19 2012-06-19 Monophonic or stereo audio coding method WO2013189030A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2012/077155 WO2013189030A1 (en) 2012-06-19 2012-06-19 Monophonic or stereo audio coding method
CN201280000961.1A CN104170007B (en) 2012-06-19 2012-06-19 To monophonic or the stereo method encoded

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2012/077155 WO2013189030A1 (en) 2012-06-19 2012-06-19 Monophonic or stereo audio coding method

Publications (1)

Publication Number Publication Date
WO2013189030A1 (en) 2013-12-27

Family

ID=49768020

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2012/077155 WO2013189030A1 (en) 2012-06-19 2012-06-19 Monophonic or stereo audio coding method

Country Status (2)

Country Link
CN (1) CN104170007B (en)
WO (1) WO2013189030A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105900170B (en) * 2014-01-07 2020-03-10 哈曼国际工业有限公司 Signal quality based enhancement and compensation of compressed audio signals
CN110556118B (en) * 2018-05-31 2022-05-10 华为技术有限公司 Coding method and device for stereo signal
CN114708874A (en) 2018-05-31 2022-07-05 华为技术有限公司 Coding method and device for stereo signal
CN111768793B (en) * 2020-07-11 2023-09-01 北京百瑞互联技术有限公司 LC3 audio encoder coding optimization method, system and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1623185A (en) * 2002-03-12 2005-06-01 诺基亚有限公司 Efficient improvement in scalable audio coding
CN1905010A (en) * 2005-07-29 2007-01-31 索尼株式会社 Apparatus and method for encoding audio data, and apparatus and method for decoding audio data
US20070208557A1 (en) * 2006-03-03 2007-09-06 Microsoft Corporation Perceptual, scalable audio compression
CN101167126A (en) * 2005-04-28 2008-04-23 松下电器产业株式会社 Audio encoding device and audio encoding method
CN101206860A (en) * 2006-12-20 2008-06-25 华为技术有限公司 Method and apparatus for encoding and decoding layered audio
CN101800048A (en) * 2009-02-10 2010-08-11 数维科技(北京)有限公司 Multi-channel digital audio coding method based on DRA coder and coding system thereof

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5171256B2 (en) * 2005-08-31 2013-03-27 パナソニック株式会社 Stereo encoding apparatus, stereo decoding apparatus, and stereo encoding method
WO2008062990A1 (en) * 2006-11-21 2008-05-29 Samsung Electronics Co., Ltd. Method, medium, and system scalably encoding/decoding audio/speech

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHOU, HONG ET AL.: "The Scalability Audio Coding", AUDIO ENGINEERING, September 2000 (2000-09-01), pages 3 - 7 *

Also Published As

Publication number Publication date
CN104170007B (en) 2017-09-26
CN104170007A (en) 2014-11-26

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12879437

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS (EPO F1205N DATED 26-05-2015)

122 Ep: pct application non-entry in european phase

Ref document number: 12879437

Country of ref document: EP

Kind code of ref document: A1