WO2000045378A2 - Efficient spectral envelope coding using variable time/frequency resolution and time/frequency switching

Publication number
WO2000045378A2
Authority
WO
WIPO (PCT)
Prior art keywords
time
frequency
resolution
signal
spectral envelope
Application number
PCT/SE2000/000158
Other languages
French (fr)
Other versions
WO2000045378A3 (en)
Inventor
Lars Gustaf Liljeryd
Kristofer KJÖRLING
Per Ekstrand
Fredrik Henn
Original Assignee
Lars Gustaf Liljeryd
Kjoerling Kristofer
Per Ekstrand
Fredrik Henn
Priority claimed from SE9900256A (SE9900256D0)
Application filed by Lars Gustaf Liljeryd, Kjoerling Kristofer, Per Ekstrand, Fredrik Henn
Priority to AU25856/00A (AU2585600A)
Priority to US09/763,128 (US6978236B1)
Publication of WO2000045378A2
Priority to RU2002111665/09A (RU2236046C2)
Priority to ES00968271T (ES2223591T3)
Priority to BRPI0014642A (BRPI0014642B1)
Priority to CNB008136025A (CN1172293C)
Priority to JP2001528974A (JP4035631B2)
Priority to DE60012198T (DE60012198T2)
Priority to PT00968271T (PT1216474E)
Priority to EP00968271A (EP1216474B1)
Priority to AU78212/00A (AU7821200A)
Priority to PCT/SE2000/001887 (WO2001026095A1)
Priority to DK00968271T (DK1216474T3)
Priority to AT00968271T (ATE271250T1)
Publication of WO2000045378A3
Priority to HK03101398.3A (HK1049401B)
Priority to JP2005292384A (JP4628921B2)
Priority to JP2005292388A (JP4334526B2)
Priority to US11/246,283 (US7181389B2)
Priority to US11/246,284 (US7191121B2)

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/06 Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band

Abstract

The present invention provides a new method and an apparatus for spectral envelope encoding. The invention teaches how to perform and signal compactly a time/frequency mapping of the envelope representation, and further, encode the spectral envelope data efficiently using adaptive time/frequency directional coding. The method is applicable to both natural audio coding and speech coding systems and is especially suited for coders using SBR [WO 98/57436] and other high frequency reconstruction methods.

Description

EFFICIENT SPECTRAL ENVELOPE CODING USING VARIABLE TIME/FREQUENCY RESOLUTION AND TIME/FREQUENCY SWITCHING
TECHNICAL FIELD
The present invention relates to a new method and apparatus for efficient coding of spectral envelopes in audio coding systems. The method may be used both for natural audio coding and speech coding and is especially suited for coders using SBR [WO 98/57436] and other high frequency reconstruction methods.
BACKGROUND OF THE INVENTION
Audio source coding techniques can be divided into two classes: natural audio coding and speech coding. Natural audio coding is commonly used for music or arbitrary signals at medium bitrates, and generally offers wide audio bandwidth. Speech coders are basically limited to speech reproduction but can on the other hand be used at very low bitrates, albeit with low audio bandwidth. In both classes, the signal is generally separated into two major signal components, the "spectral envelope" and the corresponding "residual" signal. Throughout the following description, the term "spectral envelope" refers to the coarse spectral distribution of the signal in a general sense, e.g. filter coefficients in a linear prediction based coder or a set of time-frequency averages of subband samples in a subband coder. The term "residual" refers to the fine spectral distribution in a general sense, e.g. the LPC error signal or subband samples normalized using the above time-frequency averages. "Envelope data" refers to the quantized and coded spectral envelope, and "residual data" to the quantized and coded residual. At medium and high bitrates, the residual data constitutes the main part of the bitstream while the envelope data is merely a fraction. At very low bitrates, the envelope data constitutes a comparably larger part of the bitstream. Hence, it is indeed important to represent the spectral envelope compactly when using lower bitrates.
Older prior art audio coders and most speech coders use static, relatively short, time segments in the generation of envelope data to achieve good temporal resolution. However, this prevents optimal utilisation of the frequency domain masking known from psycho-acoustics. To improve coding gain through the use of narrow filterbands with steep slopes, and still achieve good temporal resolution during transient passages, modern audio coders employ adaptive window switching, i.e. they switch time segment lengths depending on the signal statistics. Clearly a minimum usage of the short segments is a prerequisite for maximum coding gain. Unfortunately, long transition windows are needed to alter the segment lengths, limiting the switching flexibility.
The spectral envelope is a function of two variables: time and frequency. The encoding can be done by exploiting redundancy in either direction of the time/frequency plane. Generally, coding of the spectral envelope is performed in the frequency direction using delta coding (DPCM), linear prediction (LPC), or vector quantization (VQ).
SUMMARY OF THE INVENTION
The present invention provides a new method and an apparatus for spectral envelope encoding. The invention teaches how to perform and signal compactly a time/frequency mapping of the envelope representation, and further, encode the spectral envelope data efficiently using adaptive time/frequency direction coding. In the absence of transients, i.e. for quasi-stationary signals, a time/frequency grid with low temporal and high frequency resolution is used as default. In the vicinity of transients, the temporal resolution is increased at the expense of frequency resolution. The invention describes two schemes for signalling of the time and frequency resolution used. One scheme allows arbitrary selection of instantaneous resolution by explicit signalling of time segment borders and frequency resolutions, whereas the other exploits the fact that transients are separated at least by a minimum time, T_nmin, in order to reduce the required number of control bits. In the encoder, a transient detector decides whether the current granule contains a transient, and if so, determines the position of the onset of the transient. The position within the granule is encoded and sent to the decoder. Both the encoder and decoder share rules that specify the time/frequency distribution of the spectral envelope samples, given a certain combination of subsequent control signals, ensuring an unambiguous decoding of the envelope data. The rules can be realised as a book of tables explicitly specifying the division of the current granule in terms of samples in the time/frequency plane. The variable time/frequency resolution method is also applicable to envelope encoding based on prediction. Instead of grouping of subband samples, predictor coefficients are generated for time segments of varying lengths according to the system. Different predictor orders may be used for transient and quasi-stationary (tonal) segments.
The present invention presents a new and efficient method for scalefactor redundancy coding. A Dirac pulse in the time domain transforms to a constant in the frequency domain, and a Dirac in the frequency domain, i.e. a single sinusoid, corresponds to a signal with constant magnitude in the time domain. Simplified, on a short term basis, the signal shows less variation in one domain than the other. Hence, using prediction or delta coding, coding efficiency is increased if the spectral envelope is coded in either time- or frequency-direction depending on the signal characteristics.
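The direction switching described above can be illustrated with a minimal Python sketch. It is not the coder specified by the invention: the function names, the "T"/"F" flag values, and the cost measure (sum of absolute delta magnitudes as a crude stand-in for entropy-coded size) are all assumptions made for illustration. The frequency-direction start value is left out of the cost comparison for simplicity.

```python
def delta_freq(env):
    """Delta-code along frequency: start value followed by differences."""
    return [env[0]] + [env[k] - env[k - 1] for k in range(1, len(env))]


def delta_time(env, prev_env):
    """Delta-code along time: differences against the previous envelope."""
    return [a - b for a, b in zip(env, prev_env)]


def cost(deltas):
    """Hypothetical cost model: sum of absolute delta magnitudes."""
    return sum(abs(d) for d in deltas)


def encode_envelope(env, prev_env=None):
    """Return (flag, deltas), choosing the direction with the lower cost."""
    freq = delta_freq(env)
    if prev_env is None:
        return "F", freq  # no previous envelope, time direction impossible
    time = delta_time(env, prev_env)
    # Assumption: the start value freq[0] is ignored in the comparison.
    if cost(time) < cost(freq[1:]):
        return "T", time
    return "F", freq


def decode_envelope(flag, deltas, prev_env=None):
    """Invert encode_envelope using the transmitted direction flag."""
    if flag == "T":
        return [p + d for p, d in zip(prev_env, deltas)]
    env = [deltas[0]]
    for d in deltas[1:]:
        env.append(env[-1] + d)
    return env


# Sparse tonal envelope, nearly unchanged over time: time direction wins.
prev_tonal = [0, 40, 0, 0, 38, 0]
tonal = [0, 41, 0, 0, 37, 0]
# Flat, transient-like envelope after a quiet one: frequency direction wins.
prev_quiet = [5, 3, 1, 0, 0, 0]
flat = [30, 31, 30, 29, 30, 30]
```

In this toy model the tonal envelope costs 2 in the time direction against 156 in the frequency direction, while the flat envelope costs 4 in the frequency direction against 171 in the time direction, which mirrors the argument made in the text.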
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention will now be described by way of illustrative examples, not limiting the scope or spirit of the invention, with reference to the accompanying drawings, in which:
Figs. 1a - 1b illustrate uniform and non-uniform sampling in time, respectively, of the spectral envelope.
Figs. 2a - 2c illustrate transient detector look-ahead and granule interdependency.
Figs. 3a - 3f illustrate segments with different time and frequency resolutions, and the corresponding control signals.
Fig. 4 illustrates time/frequency switched envelope coding.
Fig. 5 is a block diagram of an encoder using the envelope coding according to the invention.
Fig. 6 is a block diagram of a decoder using the envelope coding according to the invention.
DESCRIPTION OF PREFERRED EMBODIMENTS
The below-described embodiments are merely illustrative of the principles of the present invention for efficient envelope coding. It is understood that modifications and variations of the arrangements and the details described herein will be apparent to others skilled in the art. It is the intent, therefore, to be limited only by the scope of the impending patent claims and not by the specific details presented by way of description and explanation of the embodiments herein.
Generation of Envelope Data
Most audio and speech coders have in common that both envelope data and residual data are transmitted and combined during the synthesis at the decoder. Two exceptions are coders employing PNS ["Improving Audio Codecs by Noise Substitution", D. Schultz, JAES, vol. 44, no. 7/8, 1996], and coders employing SBR. In case of SBR, considering the highband, only the spectral coarse structure needs to be transmitted since a residual signal is reconstructed from the lowband. This puts higher demands on how to generate envelope data, in particular due to lack of "timing" information contained in the original residual signal. This problem will now be demonstrated by means of an example.
Fig. 1 shows the time/frequency representation of a musical signal where sustained chords are combined with sharp transients with mainly high frequency contents. In the lowband the chords have high energy and the transient energy is low, whereas the opposite is true in the highband. The envelope data that is generated during time intervals where transients are present is dominated by the high intermittent transient energy. In the SBR process in the decoder, the spectral envelope of the transposed signal is estimated using the same instantaneous time/frequency resolution as used for the analysis of the original highband. An equalization of the transposed signal is then performed, based on dissimilarities in the spectral envelopes. E.g., amplification factors in an envelope adjusting filterbank are calculated as the quotients between original signal and transposed signal scalefactors. For this kind of signal, a problem arises. The transposed signal has the same chord to transient energy ratio as the lowband. The gains needed in order to adjust the transposed transients to the correct level thus cause the transposed chords to be amplified relative to the original highband level for the full duration of the envelope data containing transient energy. These momentarily too loud chord fragments are perceived as pre- and post-echoes to the transient, see Fig. 1a. This kind of distortion will hereinafter be referred to as "gain induced pre- and post-echoes". The phenomenon can be eliminated by constantly updating the envelope data at such a high rate that the time between an update and an arbitrarily located transient is guaranteed to be short enough not to be resolved by the human hearing. However, this approach would drastically increase the amount of data to be transmitted and is thus not practical.
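The gain problem described above can be made concrete with a small numeric sketch in Python. All energy values, the eight-slot layout, and the function name are invented for illustration; the only element taken from the text is that each gain is the quotient between the original and the transposed envelope value of a segment.

```python
def segment_gains(original, transposed, borders):
    """Energy gain per segment: original energy sum over transposed energy sum."""
    gains = []
    for start, stop in zip(borders[:-1], borders[1:]):
        gains.append(sum(original[start:stop]) / sum(transposed[start:stop]))
    return gains


# Hypothetical highband energies per time slot (arbitrary units).
CHORD_ORIGINAL, TRANSIENT_ORIGINAL = 1, 100      # weak chord, strong transient
CHORD_TRANSPOSED, TRANSIENT_TRANSPOSED = 10, 1   # lowband-like energy ratio
ONSET = 5                                        # slot holding the transient

original = [CHORD_ORIGINAL] * 8
original[ONSET] += TRANSIENT_ORIGINAL
transposed = [CHORD_TRANSPOSED] * 8
transposed[ONSET] += TRANSIENT_TRANSPOSED

# One long segment: a single gain is applied to all eight slots.
coarse = segment_gains(original, transposed, [0, 8])

# Borders placed around the transient: the chord slots get their own gains.
fine = segment_gains(original, transposed, [0, 5, 6, 8])
```

With these numbers the single long segment yields a gain of 108/81, roughly 1.33, so the transposed chord ends up at about 13 units in every slot instead of the original 1 unit. With borders around the onset, the chord-only segments receive a gain of 0.1 and only the transient slot is boosted.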
Therefore a new envelope data generation scheme is presented. The principal solution is to maintain a low update rate during tonal passages, which make up the majority of typical programme material, and by means of a transient detector localize the transient positions, and update the envelope data close to the leading flanks, see Fig. 1b. This eliminates gain induced pre-echoes. In order to represent the decay of the transients well, the update rate is momentarily increased in a time interval after the transient start. This eliminates gain induced post-echoes. The time segmenting during the decay is not as crucial as finding the start of the transient, as will be explained later. In order to compensate for the smaller time steps, a lower frequency resolution can be used during the transient, keeping the data size within limits. A non-uniform sampling in time and frequency as outlined above is applicable to both subband coders and linear prediction based coders.
Some prior art coders employ variable time/frequency resolution as well. In case of subband coders, this is commonly achieved through switching of the filterbank size. Such a change in size cannot take place immediately; so-called transition windows are needed, and thus the update points cannot be chosen freely. When using SBR, the filterbank can be designed to meet both the highest temporal and highest frequency resolution needed. Thus the varying time and frequency sampling can be obtained by grouping the subband samples from a fixed filterbank in different ways. In other words, by keeping the filterbank size constant, high frequency resolution or high time resolution can be obtained instantaneously. In case of prediction based coders, no elaborate time/frequency resolution switching schemes are known from prior art.
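The grouping of subband samples from a fixed filterbank can be sketched as follows in Python. This is only an illustration under assumptions: the input is a hypothetical matrix of subband energies indexed as energy[time slot][subband], the envelope value of a tile is taken to be the plain average of the energies it covers, and the function and parameter names are not from the source.

```python
def group_envelope(energy, time_borders, bands_per_scalefactor):
    """Average subband energies over rectangular time/frequency tiles.

    energy[t][k] is the energy of subband k in time slot t.
    time_borders lists the segment borders, e.g. [0, 4, 8].
    bands_per_scalefactor is the number of subbands merged per value.
    """
    num_bands = len(energy[0])
    envelope = []
    for t0, t1 in zip(time_borders[:-1], time_borders[1:]):
        row = []
        for k0 in range(0, num_bands, bands_per_scalefactor):
            k1 = min(k0 + bands_per_scalefactor, num_bands)
            tile = [energy[t][k] for t in range(t0, t1) for k in range(k0, k1)]
            row.append(sum(tile) / len(tile))
        envelope.append(row)
    return envelope


# Toy data: 8 time slots, 4 subbands, energy[t][k] = 4 * t + k.
energy = [[4 * t + k for k in range(4)] for t in range(8)]

# One long segment with full frequency resolution (4 values).
tonal_grid = group_envelope(energy, [0, 8], 1)

# Two shorter segments with halved frequency resolution (still 4 values).
transient_grid = group_envelope(energy, [0, 4, 8], 2)
```

Both calls operate on the same filterbank output and produce the same number of envelope values. Only the grouping changes, which is why the resolution can be switched from one segment to the next without transition windows.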
Typical coders operate on a block basis, where every block represents a fixed time interval. Those blocks will be referred to as "granules". Let a granule have a length of q time quantization steps, hereinafter called "subgranules". In applications where delay restrictions are non-critical, as in point to multipoint broadcasting, a transient detector look-ahead can be employed on the encoder side. With this additional information, envelope data spanning across borders of granules can be used. This enables a more flexible selection of time/frequency resolutions, and facilitates constant bitrate operation, since parts of the payload can be moved between consecutive granules. Referring to Fig 2, the granules are divided into eight subgranules. The transient detector operates on windows with the same timespan as a granule, each overlapping 50% of two consecutive granules; that is, the transient detector look-ahead is half a granule. The transient detector has detected a transient in subgranule 6 at time n-1, and a transient in subgranule 7 at time n. With these values as input to the time/frequency resolution controlling algorithms, the corresponding time/frequency grid for granule n might be as shown in Fig 2c. As seen from the figure, subgranule 7 of the granule at time n-1 is included in the time/frequency grid of granule n. Moreover, it is possible to use an analysis by synthesis approach, i.e. having a decoder in the encoder to assess the most beneficial time/frequency sampling.
Control Signalling
In order to correctly interpret the received envelope data, the segment borders and frequency resolutions (number of coefficients or scalefactors) must be signalled. If a non-uniform sampling according to Fig 2 is to be employed, the problem of envelope data spanning over the granule borders must be dealt with. Furthermore, the signalling must be flexible enough to cover all combinations of interest, without generating too large an amount of control data.
Theoretically, transients can occur within a granule in C combinations, ranging from no transient at all to q transients, where C is given by
$C = \sum_{k=0}^{q} \binom{q}{k} = 2^q$ (Eq 1)
In order to signal C states, log2(C) = log2(2^q) = q bits are required, corresponding to one bit per subgranule. If different frequency resolutions are to be used in the segments, even more bits might be required in order to signal the frequency resolution chosen. However, in low bitrate applications the number of control signal bits must be kept at a minimum. The first step towards an efficient signalling is to employ two time sampling modes, uniform and non-uniform sampling in time. The uniform mode is used during quasi-stationary passages, and employs high frequency resolution and relatively long time segments, both of which are predefined. Hence this mode does not require any signalling of segment borders or frequency resolutions. One bit is sufficient to signal the time sampling mode to the decoder. The non-uniform mode is used during transient passages and requires additional signalling. Two such signalling systems are proposed by the present invention.
The first system, hereinafter referred to as the "border-signalling system", uses one bit per subgranule to signal whether or not a segment border is present at the left border of the subgranule. Envelope data corresponding to a segment is always sent in the granule in which the segment starts. This means that the number of envelopes transmitted in a granule equals the number of left borders in the granule, i.e. the bit sum of the q border bits. The segment frequency resolutions are signalled with dynamically allocated control bits, e.g. one bit per envelope. Again, this number of bits is derived from the q border bits.
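A minimal sketch of the border-signalling control data follows. The bit order (mode bit, then border bits, then resolution flags), the convention that a mode bit of 0 means non-uniform sampling, and the function names `encode_control` and `decode_control` are assumptions made for illustration; the text only fixes the number of bits in each group.

```python
def encode_control(border_bits, low_freq_flags):
    """Pack the non-uniform mode bit, q border bits and one resolution flag
    per envelope. The number of flags must equal the bit sum of the borders."""
    n_env = sum(border_bits)
    if len(low_freq_flags) != n_env:
        raise ValueError("need exactly one resolution flag per left border")
    return [0] + list(border_bits) + list(low_freq_flags)  # 0 = non-uniform (assumed)

def decode_control(bits, q):
    """Recover segment start positions and resolution flags. The number of
    flags to read is derived from the q border bits, as in the text."""
    if bits[0] != 0:
        raise ValueError("uniform mode carries no border data")
    border_bits = bits[1:1 + q]
    n_env = sum(border_bits)
    flags = bits[1 + q:1 + q + n_env]
    starts = [i for i, b in enumerate(border_bits) if b]
    return starts, flags

# Eight subgranules, segments starting at subgranules 5 and 6, both low resolution.
control = encode_control([0, 0, 0, 0, 0, 1, 1, 0], [1, 1])
starts, flags = decode_control(control, q=8)
print(len(control), starts, flags)
```

The decoder never needs an explicit envelope count, since it can be derived from the border bits.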
Some examples of grouping of subgranules into time segments are given in Fig 3, where the subgranules are numbered from 000 to 111. L denotes low frequency resolution and H denotes high resolution. In the example the number of scalefactors or coefficients in a high resolution segment is assumed to be two times that of a low resolution segment. Fig 3a shows a reference system, constantly using the highest possible time and frequency resolution. The relative data matrix size is one by definition, and obviously no control signal bits are needed in this system. If no transient is present in or next to a specific granule, the granule is divided into two segments of equal length and the envelope representations are calculated using high frequency resolution. If the two envelope representations do not differ by more than a certain amount, only one set of high resolution envelope data is sent. Those cases are illustrated by Figs 3b and 3c, where the control signal "Uniform" tells that uniform sampling in time is used, and the signal "LowTime" indicates whether one or two envelopes are sent. Hence, the control signal overhead is two bits. The - symbol means that the signal is not transmitted. Figs 3d - 3f show some cases where a transient, denoted by T, is present. The border-signalling system uses 8 bits to signal subgranule left borders, and a varying number of bits to signal the frequency resolution within the subgranules. Those signals are called "Borders" and "LowFreq" respectively. The "TranPos" signal is not part of this system, and will be explained later. The right border of the last segment in a granule equals the first left border in the subsequent granule. P means that the corresponding envelope data was sent in the previous granule, Fig 3f. The signalling overhead varies between 12 and 13 bits in Figs 3d - 3f. Notice that the transient cases d and f generate the same data matrix size as the non-transient case b. Furthermore, it is possible to design a scheme that keeps the matrix size constant, if desired. For typical programme material, the system has a performance similar to that of the reference system, at data matrix sizes of only 0.125 to 0.375 times the reference size. Hence a major data reduction is achieved when using the dynamic selection of time and frequency resolution according to the present invention.
The second system, hereinafter referred to as the "position-signalling system", is intended for very low bitrate applications and utilizes some musical signal properties in order to reduce the number of control signal bits. As will be shown below, many of the states described by Eq 1 are not very likely, and would also generate too large amounts of envelope data to be practical at a limited bitrate. According to the present invention, the following simplifications can be made with little or no sacrifice of quality for practical signals:
1. Only the transient start position needs to be transmitted. The time and frequency grouping around this position can be handled by employing a set of rules in the encoder and decoder, which are based on the properties of typical transients.
2. There exists a fixed minimum time-span between consecutive transients, i.e. transients cannot be arbitrarily close to one another. It is thus possible to introduce a blocking time in the transient detection/signalling system, reducing the number of states.
The minimum time-span between consecutive transients in music programme material can be estimated in the following way. In musical notation, the rhythmic "pulse" is described by a time signature expressed as a fraction A/B, where A denotes the number of "beats" per bar and 1/B is the type of note corresponding to one beat, for example a 1/4 note, commonly referred to as a quarter note. Let t denote the tempo in Beats Per Minute (BPM). The time per note of type 1/C is then given by
$T_n = \frac{60}{t} \cdot \frac{B}{C}$ [s] (Eq 2)
Most music pieces fall within the 70 - 160 BPM range, and in 4/4 time signature the fastest rhythmical patterns are for most practical cases made up of 1/32 notes (32nd notes). This yields a minimum time Tnmin = (60/160)*(4/32) = 47 ms. Of course shorter time periods than this may occur, but such fast sequences (>21 tones per second) almost take on the character of a buzz and need not be fully resolved.
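The arithmetic of Eq 2 can be checked with a few lines of code. The function name `note_duration_s` and its parameter names are chosen here for illustration only.

```python
def note_duration_s(tempo_bpm, beat_note, note):
    """Duration in seconds of a 1/note note when one beat is a 1/beat_note
    note and the tempo is tempo_bpm beats per minute (Eq 2)."""
    return (60.0 / tempo_bpm) * (beat_note / note)

# Fastest practical case from the text: 160 BPM, quarter-note beat, 32nd notes.
t_nmin = note_duration_s(160, 4, 32)
notes_per_second = 1.0 / t_nmin
print(f"{t_nmin * 1000:.1f} ms per note, {notes_per_second:.1f} notes per second")
```

The result is just under 47 ms per note, or slightly more than 21 notes per second, in line with the figures quoted above.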
The necessary time resolution Tq must also be established. In some cases a transient original signal has its main energy in the highband to be reconstructed. This means that the encoded spectral envelope must carry all the "timing" information. The desired timing precision thus determines the resolution needed for encoding of leading flanks. Tq is much smaller than the minimum note period Tnmin, since small time deviations within the period can clearly be heard. In most cases however, the transient has significant energy in the lowband. The above described gain induced pre-echoes must fall within the so called pre- or backward masking time Tm of the human auditory system in order to be inaudible. Hence Tq must satisfy two conditions:
$T_q \ll T_{nmin}$ (Eq 3), $T_q < T_m$ (Eq 4)
Obviously Tm < Tnmin (otherwise the notes would be so fast that they could not be resolved), and according to ["Modeling the Additivity of Nonsimultaneous Masking", Hearing Res., vol 80, pp 105-118 (1994)], Tm amounts to 10-20 ms. Since Tnmin is in the 50 ms range, a reasonable selection of Tq according to Eq 3 means that the second condition is also met. Of course the precision of the transient detection in the encoder and the time resolution of the analysis/synthesis filterbank must also be considered when selecting Tq.
Tracking of trailing flanks is less crucial, for several reasons. First, the note-off position has little or no effect on the perceived rhythm. Second, most instruments do not exhibit sharp trailing flanks, but rather a smooth decay curve, i.e. a well defined note-off time does not exist. Third, the post- or forward masking time is substantially longer than the pre-masking time. According to the present invention, the above transient start information can be used for implicit signalling of segment borders and frequency resolutions immediately after/between transients. This will now be described, again referring to Fig 3, assuming a granule length selected according to 8Tq <= Tnmin, i.e. a maximum of one transient is likely to occur within a granule. In this position-signalling system the "Borders" and "LowFreq" signals are replaced by a single signal, "TranPos", consisting of three bits. When a transient is present, the position within the granule is signalled by "TranPos", see Figs 3d - 3f. This value, in combination with the control signals of the preceding granule, determines the time/frequency grid used for the current granule. These grids are described by rules or tables that are available to both the encoder and decoder. Given the common tables and the control signals "Uniform" and either "LowTime" or "TranPos" of the current and the previous granule, unambiguous decoding of the envelope data is ensured.
To put the saving obtained by the use of the position-signalling system instead of the border-signalling system into perspective, a hypothetical low bitrate envelope encoder is studied. Assume granules of length 16Tq <= Tnmin, an average number of scalefactors per granule of 40 and an average number of bits per scalefactor of 3 due to lossless coding. The average number of segments in granules containing transients, n, is assumed to be 3. For transients, the signalling overheads are Bborder = 1 + q + n = 1 + 16 + 3 = 20 bits and Bposition = 1 + ceil{log2(16)} = 1 + 4 = 5 bits. Thus the saving is around 20 - 5 = 15 bits, corresponding to about 5 scalefactors or 12.5% of the envelope data, i.e. it is significant at such low bitrates.
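The overhead comparison can be reproduced as follows. The function names are hypothetical; the formulas and the numbers (q = 16, n = 3, 40 scalefactors per granule, 3 bits per scalefactor) are those of the hypothetical encoder studied above.

```python
import math

def border_overhead_bits(q, n_segments):
    """Mode bit + one border bit per subgranule + one resolution bit per segment."""
    return 1 + q + n_segments

def position_overhead_bits(q):
    """Mode bit + enough bits to address one of q transient positions."""
    return 1 + math.ceil(math.log2(q))

q, n = 16, 3
bits_per_scalefactor, scalefactors_per_granule = 3, 40

saving_bits = border_overhead_bits(q, n) - position_overhead_bits(q)
saving_scalefactors = saving_bits / bits_per_scalefactor
saving_fraction = saving_scalefactors / scalefactors_per_granule
print(saving_bits, saving_scalefactors, saving_fraction)
```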
Time/Frequency Switched Scalefactor Encoding
Utilising a time to frequency transform it can be shown that a pulse in the time domain corresponds to a flat spectrum in the frequency domain, and a "pulse" in the frequency domain, i.e. a single sinusoid, corresponds to a quasi-stationary signal in the time domain. In other words a signal usually shows more transient properties in one domain than the other. In a spectrogram, i.e. a time/frequency matrix display, this property is evident, and can advantageously be used when coding spectral envelopes.
A tonal stationary signal can have a very sparse spectrum not suitable for delta coding in the frequency-direction, but well suited for delta coding in the time-direction, and vice versa. This is displayed in Fig 4. Throughout the following description a vector of scalefactors calculated at time n0 represents the spectral envelope
$Y(k, n_0) = [a_1, a_2, a_3, \ldots, a_k, \ldots, a_N]$ (Eq 5)
where $a_1, \ldots, a_N$ are the amplitude values for different frequencies. Common practice is to code the difference between adjacent values in the frequency-direction at a given time, which yields
$D(k, n_0) = [a_2 - a_1, a_3 - a_2, \ldots, a_N - a_{N-1}]$ (Eq 6)
In order to be able to decode this, the start value a1 needs to be transmitted. As stated above, this delta-coding scheme can prove to be most inefficient if the spectrum only contains a few stationary tones. This can result in a delta coding yielding a higher bit rate than regular PCM coding. In order to deal with this problem, a time/frequency switching method, hereinafter referred to as T/F-coding, is proposed. The scalefactors are quantized and coded both in the time- and frequency-direction. For both cases, the required number of bits is calculated for a given coding error, or the error is calculated for a given number of bits. Based upon this, the most beneficial coding direction is selected. As an example, DPCM and Huffman redundancy coding can be used. Two vectors are calculated, Df and Dt,
$D_f(k, n_0) = [a_2(n_0) - a_1(n_0), a_3(n_0) - a_2(n_0), \ldots, a_N(n_0) - a_{N-1}(n_0)]$ (Eq 7)
$D_t(k, n_0) = [a_1(n_0) - a_1(n_0 - 1), a_2(n_0) - a_2(n_0 - 1), \ldots, a_N(n_0) - a_N(n_0 - 1)]$ (Eq 8)
The corresponding Huffman tables, one for the frequency direction and one for the time direction, state the number of bits required in order to code the vectors. The coded vector requiring the least number of bits to code represents the preferable coding direction. The tables may initially be generated using some minimum distance as a time/frequency switching criterion.
Start values are transmitted whenever the spectral envelope is coded in the frequency direction, but not when it is coded in the time direction, since they are available at the decoder through the previous envelope. The proposed algorithm also requires extra information to be transmitted, namely a time/frequency flag indicating in which direction the spectral envelope was coded. The T/F algorithm can advantageously be used with several different coding schemes of the scalefactor-envelope representation apart from DPCM and Huffman, such as ADPCM, LPC and vector quantisation. The proposed T/F algorithm gives significant bitrate reduction for the spectral-envelope data, up to around 20% reduction compared to commonly used delta-coding techniques. If the number of scalefactors per octave is constant, it is possible to delta code on an octave basis instead of delta coding adjacent scalefactors.
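The direction selection can be sketched as below. The Huffman tables of the actual scheme are not given in the text, so this sketch substitutes a signed Exp-Golomb code length as a stand-in bit cost; that substitution, the 6-bit start value cost and all function names are assumptions made for illustration only.

```python
def code_len(delta):
    """Stand-in bit cost of one integer delta: length of a signed
    Exp-Golomb codeword (NOT the Huffman tables of the real scheme)."""
    u = 2 * abs(delta) - (1 if delta > 0 else 0)  # map signed to unsigned
    return 2 * (u + 1).bit_length() - 1

def delta_freq(env):
    """Differences between adjacent scalefactors at one time instant."""
    return [b - a for a, b in zip(env[:-1], env[1:])]

def delta_time(env, prev):
    """Differences between current and previous envelope, band by band."""
    return [c - p for c, p in zip(env, prev)]

def choose_direction(env, prev, start_bits=6):
    """Return (direction, bits). The frequency direction also pays for the
    start value; the time direction reuses the previous envelope."""
    bits_f = start_bits + sum(code_len(d) for d in delta_freq(env))
    bits_t = sum(code_len(d) for d in delta_time(env, prev))
    return ("time", bits_t) if bits_t < bits_f else ("freq", bits_f)

# Sparse tonal spectrum that barely changes over time: time direction wins.
tonal = choose_direction([0, 20, 0, 0, 19, 0, 0, 0], [0, 20, 0, 0, 18, 0, 0, 0])
# Flat spectrum that jumps in level at a transient: frequency direction wins.
onset = choose_direction([30, 30, 31, 30, 30, 29, 30, 30], [2] * 8)
print(tonal, onset)
```

The one-bit time/frequency flag is the same in both cases and is therefore left out of the comparison.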
Practical implementations
An example of the encoder side of the invention is shown in Fig 5. The analogue input signal is fed to an A/D-converter 501, forming a digital signal. The digital audio signal is fed to a perceptual audio encoder 502, where source coding is performed. In addition, the digital signal is fed to a transient detector 503 and to an analysis filterbank 504, which splits the signal into its spectral equivalents (subband signals). The transient detector could operate on the subband signals from the analysis bank, but for generality purposes it is here assumed to operate on the digital time domain samples directly. The transient detector divides the signal into granules and determines, according to the invention, whether subgranules within the granules are to be flagged as transient. This information is sent to the envelope grouping block 505, which specifies the time/frequency grid to be used for the current granule. According to the grid, the block combines the uniformly sampled subband signals to form the non-uniformly sampled envelope values. As an example, these values might be the average or maximum energy for the subband samples combined. The envelope values are, together with the grouping information, fed to the envelope encoder block 506. This block decides in which direction (time or frequency) to encode the envelope values. The resulting signals, i.e. the output from the audio encoder, the wideband envelope information, and the control signals, are fed to the multiplexer 507, forming a serial bitstream that is transmitted or stored.
The decoder side of the invention is shown in Fig 6. The demultiplexer 601 restores the signals and feeds the appropriate part to an audio decoder 602, which produces a low band digital audio signal. The envelope information is fed from the demultiplexer to the envelope decoding block 603, which, by use of control data, determines in which direction the current envelope is coded and decodes the data. The low band signal from the audio decoder is routed to the transposition module 604, which generates a replicated high band signal consisting of one or several harmonics from the low band signal. The high band signal is fed to an analysis filterbank 606, which is of the same type as on the encoder side. The subband signals are combined in the scalefactor grouping unit 607. By use of control data from the demultiplexer, the same type of combination and time/frequency distribution of the subband samples is adopted as on the encoder side. The envelope information from the demultiplexer and the information from the scalefactor grouping unit are processed in the gain control module 608. The module computes gain factors to be applied to the subband samples before recombination in the synthesis filterbank block 609. The output from the synthesis filterbank is thus an envelope adjusted high band audio signal. This signal is added to the output from the delay unit 605, which is fed with the low band audio signal. The delay compensates for the processing time of the high band signal. Finally, the obtained digital wideband signal is converted to an analogue audio signal in the digital to analogue converter 610.

Claims

1. A method for spectral envelope coding in a source coding system where said system comprises an encoder representing all operations performed prior to storage or transmission, and a decoder representing all operations performed after storage or transmission, characterised by at said encoder, perform a statistical analysis of the input signal, based on the outcome of said analysis, select the instantaneous time and frequency resolution to be used in the spectral envelope representation, using said resolution, generate data representing said spectral envelope, transmit said data together with a control signal describing said resolution, and at said decoder, using said control signal and said data in the synthesis of the output signal.
2. A method according to claim 1, characterised in that said instantaneous time and frequency resolution is obtained by grouping of elements in a time/frequency representation of said input signal, and calculating a scalefactor for every one of said groups.
3. A method according to claim 2, characterised in that said time/frequency representation is generated by a filterbank.
4. A method according to claim 3, characterised in that said filterbank is of fixed size.
5. A method according to claim 1, characterised in that said data is generated by a linear predictor.
6. A method according to claim 1, characterised in that said analysis employs a transient detector.
7. A method according to claim 6, characterised in that said instantaneous resolution is switched from a default combination of higher frequency resolution and lower time resolution to a combination of lower frequency resolution and higher time resolution at the onset of a transient.
8. A method according to claim 1, characterised in that said control signal describes positions within a granule of constant update rate, generated by said analysis, and said instantaneous resolution is chosen based on the positions within current and neighbouring granules, by the use of rules available to both said encoder and said decoder.
9. A method according to claim 8, characterised in that at most one position per granule is signalled.
10. A method according to claim 1, characterised in that said control signal describes borders within a granule of constant update rate, said instantaneous resolution is signalled once per border, and that one set of data is sent per border within a granule.
11. A method according to claim 2, characterised in that said scalefactors are coded both in the time and frequency direction, the momentarily most beneficial direction is determined, said most beneficial direction is used for said transmission.
12. A method according to claim 11, characterised in that the direction which generates the least coding error for a given number of bits is chosen.
13. A method according to claim 11, characterised in that the direction which generates the least number of bits for a given coding error is chosen.
14. A method according to claim 13, characterised in that lossless coding is employed and separate tables are used for said time and frequency directions, in particular where said tables are used for selection of coding direction.
15. An apparatus for encoding of a spectral envelope of a signal to be decoded by a decoder, characterised by means for performing a statistical analysis of the input signal, means for selection of the instantaneous time and frequency resolution to be used in a spectral envelope representation of said input signal, based on the outcome of said analysis, means for generation of data representing said spectral envelope, using said resolution, and means for transmission of said data together with a control signal describing said resolution.
16. An apparatus for decoding of a spectral envelope of a signal encoded by an encoder, characterised by means for interpretation of a received control signal in order to determine the instantaneous time and frequency resolution used in a spectral envelope representation of an encoded signal, means for decoding of received envelope data based on said spectral envelope representation, using said control signal, and means for using said decoded envelope data in the synthesis of the output signal.
PCT/SE2000/000158 1999-01-27 2000-01-26 Efficient spectral envelope coding using variable time/frequency resolution and time/frequency switching WO2000045378A2 (en)

Priority Applications (19)

Application Number Priority Date Filing Date Title
AU25856/00A AU2585600A (en) 1999-01-27 2000-01-26 Efficient spectral envelope coding using variable time/frequency resolution and time/frequency switching
US09/763,128 US6978236B1 (en) 1999-10-01 2000-01-26 Efficient spectral envelope coding using variable time/frequency resolution and time/frequency switching
DK00968271T DK1216474T3 (en) 1999-10-01 2000-09-29 Effective spectral envelope curve coding using variable time / frequency resolution
AT00968271T ATE271250T1 (en) 1999-10-01 2000-09-29 CODING THE ENVELOPE OF THE SPECTRUM USING VARIABLE TIME/FREQUENCY RESOLUTION
DE60012198T DE60012198T2 (en) 1999-10-01 2000-09-29 ENCODING THE CORD OF THE SPECTRUM BY VARIABLE TIME / FREQUENCY RESOLUTION
PCT/SE2000/001887 WO2001026095A1 (en) 1999-10-01 2000-09-29 Efficient spectral envelope coding using variable time/frequency resolution and time/frequency switching
BRPI0014642A BRPI0014642B1 (en) 1999-10-01 2000-09-29 spectral envelope coding using variable time-frequency resolution and time-frequency shifting
CNB008136025A CN1172293C (en) 1999-10-01 2000-09-29 Efficient spectral envelope coding using variable time/frequency resolution and time/frequency switching
JP2001528974A JP4035631B2 (en) 1999-10-01 2000-09-29 Efficient spectral envelope coding using variable time / frequency resolution and time / frequency switching
RU2002111665/09A RU2236046C2 (en) 1999-10-01 2000-09-29 Effective encoding of spectrum envelope with use of variable resolution in time and frequency and switching time/frequency
PT00968271T PT1216474E (en) 1999-10-01 2000-09-29 EFFICIENT CODE OF SPECIAL ENVELOPE USING RESOLUTION TIME / VARIABLE FREQUENCY
EP00968271A EP1216474B1 (en) 1999-10-01 2000-09-29 Efficient spectral envelope coding using variable time/frequency resolution
AU78212/00A AU7821200A (en) 1999-10-01 2000-09-29 Efficient spectral envelope coding using variable time/frequency resolution and time/frequency switching
ES00968271T ES2223591T3 (en) 1999-10-01 2000-09-29 EFFECTIVE CODIFICATION OF SPECIAL ENVELOPE USING A RESOLUTION TIME / VARIABLE FREQUENCY.
HK03101398.3A HK1049401B (en) 1999-10-01 2003-02-24 Effective spectral envelope coding method and coding/encoding apparatus thereof
JP2005292388A JP4334526B2 (en) 1999-10-01 2005-10-05 Efficient spectral envelope coding using variable time / frequency resolution and time / frequency switching
JP2005292384A JP4628921B2 (en) 1999-10-01 2005-10-05 Efficient spectral envelope coding using variable time / frequency resolution and time / frequency switching
US11/246,283 US7181389B2 (en) 1999-10-01 2005-10-11 Efficient spectral envelope coding using variable time/frequency resolution and time/frequency switching
US11/246,284 US7191121B2 (en) 1999-10-01 2005-10-11 Efficient spectral envelope coding using variable time/frequency resolution and time/frequency switching

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
SE9900256-0 1999-01-27
SE9900256A SE9900256D0 (en) 1999-01-27 1999-01-27 Method and apparatus for improving the efficiency and sound quality of audio encoders
SE9903552A SE9903552D0 (en) 1999-01-27 1999-10-01 Efficient spectral envelope coding using dynamic scalefactor grouping and time / frequency switching
SE9903552-9 1999-10-01

Related Child Applications (3)

Application Number Title Priority Date Filing Date
US09763128 A-371-Of-International 2000-01-26
US11/246,284 Division US7191121B2 (en) 1999-10-01 2005-10-11 Efficient spectral envelope coding using variable time/frequency resolution and time/frequency switching
US11/246,283 Division US7181389B2 (en) 1999-10-01 2005-10-11 Efficient spectral envelope coding using variable time/frequency resolution and time/frequency switching

Publications (2)

Publication Number Publication Date
WO2000045378A2 true WO2000045378A2 (en) 2000-08-03
WO2000045378A3 WO2000045378A3 (en) 2000-11-16

Family

ID=26663488

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SE2000/000158 WO2000045378A2 (en) 1999-01-27 2000-01-26 Efficient spectral envelope coding using variable time/frequency resolution and time/frequency switching

Country Status (3)

Country Link
AU (1) AU2585600A (en)
SE (1) SE9903552D0 (en)
WO (1) WO2000045378A2 (en)

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002093560A1 (en) * 2001-05-10 2002-11-21 Dolby Laboratories Licensing Corporation Improving transient performance of low bit rate audio coding systems by reducing pre-noise
US7246065B2 (en) 2002-01-30 2007-07-17 Matsushita Electric Industrial Co., Ltd. Band-division encoder utilizing a plurality of encoding units
US7283954B2 (en) 2001-04-13 2007-10-16 Dolby Laboratories Licensing Corporation Comparing audio using characterizations based on auditory events
EP1603117A3 (en) * 2001-07-10 2008-02-06 Coding Technologies Sweden AB Efficient and scalable parametric stereo coding for low bitrate audio coding applications
US7461002B2 (en) 2001-04-13 2008-12-02 Dolby Laboratories Licensing Corporation Method for time aligning audio signals using characterizations based on auditory events
US7508947B2 (en) 2004-08-03 2009-03-24 Dolby Laboratories Licensing Corporation Method for combining audio signals using auditory scene analysis
US7610205B2 (en) 2002-02-12 2009-10-27 Dolby Laboratories Licensing Corporation High quality time-scaling and pitch-scaling of audio signals
WO2010003543A1 (en) * 2008-07-11 2010-01-14 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for calculating bandwidth extension data using a spectral tilt controlling framing
US7711123B2 (en) 2001-04-13 2010-05-04 Dolby Laboratories Licensing Corporation Segmenting audio signals into auditory events
US8041578B2 (en) 2006-10-18 2011-10-18 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Encoding an information signal
US8126721B2 (en) 2006-10-18 2012-02-28 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Encoding an information signal
US8170882B2 (en) 2004-03-01 2012-05-01 Dolby Laboratories Licensing Corporation Multichannel audio coding
US8280743B2 (en) 2005-06-03 2012-10-02 Dolby Laboratories Licensing Corporation Channel reconfiguration with side information
US8417532B2 (en) 2006-10-18 2013-04-09 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Encoding an information signal
US8605911B2 (en) 2001-07-10 2013-12-10 Dolby International Ab Efficient and scalable parametric stereo coding for low bitrate audio coding applications
WO2014118179A1 (en) * 2013-01-29 2014-08-07 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoders, audio decoders, systems, methods and computer programs using an increased temporal resolution in temporal proximity of onsets or offsets of fricatives or affricates
US9043200B2 (en) 2005-04-13 2015-05-26 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Adaptive grouping of parameters for enhanced coding efficiency
US9761237B2 (en) 2001-11-29 2017-09-12 Dolby International Ab High frequency regeneration of an audio signal with synthetic sinusoid addition
RU2650031C2 (en) * 2013-08-29 2018-04-06 Долби Интернэшнл Аб Frequency band table design for high frequency reconstruction algorithms
US10157623B2 (en) 2002-09-18 2018-12-18 Dolby International Ab Method for reduction of aliasing introduced by spectral envelope adjustment in real-valued filterbanks

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5504832A (en) * 1991-12-24 1996-04-02 Nec Corporation Reduction of phase information in coding of speech
US5581653A (en) * 1993-08-31 1996-12-03 Dolby Laboratories Licensing Corporation Low bit-rate high-resolution spectral envelope coding for audio encoder and decoder
US5737718A (en) * 1994-06-13 1998-04-07 Sony Corporation Method, apparatus and recording medium for a coder with a spectral-shape-adaptive subband configuration
WO1998057436A2 (en) * 1997-06-10 1998-12-17 Lars Gustaf Liljeryd Source coding enhancement using spectral-band replication
US5852806A (en) * 1996-03-19 1998-12-22 Lucent Technologies Inc. Switched filterbank for use in audio signal coding

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5504832A (en) * 1991-12-24 1996-04-02 Nec Corporation Reduction of phase information in coding of speech
US5581653A (en) * 1993-08-31 1996-12-03 Dolby Laboratories Licensing Corporation Low bit-rate high-resolution spectral envelope coding for audio encoder and decoder
US5737718A (en) * 1994-06-13 1998-04-07 Sony Corporation Method, apparatus and recording medium for a coder with a spectral-shape-adaptive subband configuration
US5852806A (en) * 1996-03-19 1998-12-22 Lucent Technologies Inc. Switched filterbank for use in audio signal coding
WO1998057436A2 (en) * 1997-06-10 1998-12-17 Lars Gustaf Liljeryd Source coding enhancement using spectral-band replication

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BOSI M. ET AL.: 'Time versus frequency resolution in a low-rate, high quality audio transform coder' IEEE ASSP WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS, FINAL PROGRAM AND PAPER SUMMARIES 1991, pages 0_81 - 0_82 *
PRINCEN J. ET AL.: 'Audio coding with signal adaptive filterbanks' INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP-95 vol. 5, 1995, pages 3071 - 3074 *

Cited By (72)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7711123B2 (en) 2001-04-13 2010-05-04 Dolby Laboratories Licensing Corporation Segmenting audio signals into auditory events
US8195472B2 (en) 2001-04-13 2012-06-05 Dolby Laboratories Licensing Corporation High quality time-scaling and pitch-scaling of audio signals
US7283954B2 (en) 2001-04-13 2007-10-16 Dolby Laboratories Licensing Corporation Comparing audio using characterizations based on auditory events
US9165562B1 (en) 2001-04-13 2015-10-20 Dolby Laboratories Licensing Corporation Processing audio signals with adaptive time or frequency resolution
US7461002B2 (en) 2001-04-13 2008-12-02 Dolby Laboratories Licensing Corporation Method for time aligning audio signals using characterizations based on auditory events
US8488800B2 (en) 2001-04-13 2013-07-16 Dolby Laboratories Licensing Corporation Segmenting audio signals into auditory events
CN1312662C (en) * 2001-05-10 2007-04-25 杜比实验室特许公司 Improving transient performance of low bit rate audio coding systems by reducing pre-noise
US7313519B2 (en) 2001-05-10 2007-12-25 Dolby Laboratories Licensing Corporation Transient performance of low bit rate audio coding systems by reducing pre-noise
WO2002093560A1 (en) * 2001-05-10 2002-11-21 Dolby Laboratories Licensing Corporation Improving transient performance of low bit rate audio coding systems by reducing pre-noise
CN1758338B (en) * 2001-07-10 2010-11-17 杜比国际公司 Efficient and scalable parametric stereo coding for low bitrate audio coding applications
US8081763B2 (en) 2001-07-10 2011-12-20 Coding Technologies Ab Efficient and scalable parametric stereo coding for low bitrate audio coding applications
US10297261B2 (en) 2001-07-10 2019-05-21 Dolby International Ab Efficient and scalable parametric stereo coding for low bitrate audio coding applications
US10540982B2 (en) 2001-07-10 2020-01-21 Dolby International Ab Efficient and scalable parametric stereo coding for low bitrate audio coding applications
EP2015292A1 (en) * 2001-07-10 2009-01-14 Dolby Sweden AB Efficient and scalable parametric stereo coding for low bitrate audio coding applications
US8014534B2 (en) 2001-07-10 2011-09-06 Coding Technologies Ab Efficient and scalable parametric stereo coding for low bitrate audio coding applications
US9799340B2 (en) 2001-07-10 2017-10-24 Dolby International Ab Efficient and scalable parametric stereo coding for low bitrate audio coding applications
US8059826B2 (en) 2001-07-10 2011-11-15 Coding Technologies Ab Efficient and scalable parametric stereo coding for low bitrate audio coding applications
US8073144B2 (en) 2001-07-10 2011-12-06 Coding Technologies Ab Stereo balance interpolation
US9865271B2 (en) 2001-07-10 2018-01-09 Dolby International Ab Efficient and scalable parametric stereo coding for low bitrate applications
US8116460B2 (en) * 2001-07-10 2012-02-14 Coding Technologies Ab Efficient and scalable parametric stereo coding for low bitrate audio coding applications
US9799341B2 (en) 2001-07-10 2017-10-24 Dolby International Ab Efficient and scalable parametric stereo coding for low bitrate applications
US9792919B2 (en) 2001-07-10 2017-10-17 Dolby International Ab Efficient and scalable parametric stereo coding for low bitrate applications
EP1603117A3 (en) * 2001-07-10 2008-02-06 Coding Technologies Sweden AB Efficient and scalable parametric stereo coding for low bitrate audio coding applications
CN101887724B (en) * 2001-07-10 2012-05-30 杜比国际公司 Decoding method for encoding power spectral envelope
US10902859B2 (en) 2001-07-10 2021-01-26 Dolby International Ab Efficient and scalable parametric stereo coding for low bitrate audio coding applications
US8243936B2 (en) 2001-07-10 2012-08-14 Dolby International Ab Efficient and scalable parametric stereo coding for low bitrate audio coding applications
US8605911B2 (en) 2001-07-10 2013-12-10 Dolby International Ab Efficient and scalable parametric stereo coding for low bitrate audio coding applications
US9818418B2 (en) 2001-11-29 2017-11-14 Dolby International Ab High frequency regeneration of an audio signal with synthetic sinusoid addition
US9761237B2 (en) 2001-11-29 2017-09-12 Dolby International Ab High frequency regeneration of an audio signal with synthetic sinusoid addition
US11238876B2 (en) 2001-11-29 2022-02-01 Dolby International Ab Methods for improving high frequency reconstruction
US10403295B2 (en) 2001-11-29 2019-09-03 Dolby International Ab Methods for improving high frequency reconstruction
US9812142B2 (en) 2001-11-29 2017-11-07 Dolby International Ab High frequency regeneration of an audio signal with synthetic sinusoid addition
US9792923B2 (en) 2001-11-29 2017-10-17 Dolby International Ab High frequency regeneration of an audio signal with synthetic sinusoid addition
US9779746B2 (en) 2001-11-29 2017-10-03 Dolby International Ab High frequency regeneration of an audio signal with synthetic sinusoid addition
US9761236B2 (en) 2001-11-29 2017-09-12 Dolby International Ab High frequency regeneration of an audio signal with synthetic sinusoid addition
US9761234B2 (en) 2001-11-29 2017-09-12 Dolby International Ab High frequency regeneration of an audio signal with synthetic sinusoid addition
US7246065B2 (en) 2002-01-30 2007-07-17 Matsushita Electric Industrial Co., Ltd. Band-division encoder utilizing a plurality of encoding units
US7610205B2 (en) 2002-02-12 2009-10-27 Dolby Laboratories Licensing Corporation High quality time-scaling and pitch-scaling of audio signals
US10157623B2 (en) 2002-09-18 2018-12-18 Dolby International Ab Method for reduction of aliasing introduced by spectral envelope adjustment in real-valued filterbanks
US9691404B2 (en) 2004-03-01 2017-06-27 Dolby Laboratories Licensing Corporation Reconstructing audio signals with multiple decorrelation techniques
US10460740B2 (en) 2004-03-01 2019-10-29 Dolby Laboratories Licensing Corporation Methods and apparatus for adjusting a level of an audio signal
US9704499B1 (en) 2004-03-01 2017-07-11 Dolby Laboratories Licensing Corporation Reconstructing audio signals with multiple decorrelation techniques and differentially coded parameters
US9715882B2 (en) 2004-03-01 2017-07-25 Dolby Laboratories Licensing Corporation Reconstructing audio signals with multiple decorrelation techniques
US11308969B2 (en) 2004-03-01 2022-04-19 Dolby Laboratories Licensing Corporation Methods and apparatus for reconstructing audio signals with decorrelation and differentially coded parameters
US9691405B1 (en) 2004-03-01 2017-06-27 Dolby Laboratories Licensing Corporation Reconstructing audio signals with multiple decorrelation techniques and differentially coded parameters
US9672839B1 (en) 2004-03-01 2017-06-06 Dolby Laboratories Licensing Corporation Reconstructing audio signals with multiple decorrelation techniques and differentially coded parameters
US9779745B2 (en) 2004-03-01 2017-10-03 Dolby Laboratories Licensing Corporation Reconstructing audio signals with multiple decorrelation techniques and differentially coded parameters
US9640188B2 (en) 2004-03-01 2017-05-02 Dolby Laboratories Licensing Corporation Reconstructing audio signals with multiple decorrelation techniques
US8170882B2 (en) 2004-03-01 2012-05-01 Dolby Laboratories Licensing Corporation Multichannel audio coding
US10796706B2 (en) 2004-03-01 2020-10-06 Dolby Laboratories Licensing Corporation Methods and apparatus for reconstructing audio signals with decorrelation and differentially coded parameters
US9697842B1 (en) 2004-03-01 2017-07-04 Dolby Laboratories Licensing Corporation Reconstructing audio signals with multiple decorrelation techniques and differentially coded parameters
US10403297B2 (en) 2004-03-01 2019-09-03 Dolby Laboratories Licensing Corporation Methods and apparatus for adjusting a level of an audio signal
US10269364B2 (en) 2004-03-01 2019-04-23 Dolby Laboratories Licensing Corporation Reconstructing audio signals with multiple decorrelation techniques
US8983834B2 (en) 2004-03-01 2015-03-17 Dolby Laboratories Licensing Corporation Multichannel audio coding
US7508947B2 (en) 2004-08-03 2009-03-24 Dolby Laboratories Licensing Corporation Method for combining audio signals using auditory scene analysis
US9043200B2 (en) 2005-04-13 2015-05-26 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Adaptive grouping of parameters for enhanced coding efficiency
US8280743B2 (en) 2005-06-03 2012-10-02 Dolby Laboratories Licensing Corporation Channel reconfiguration with side information
US8041578B2 (en) 2006-10-18 2011-10-18 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Encoding an information signal
US8417532B2 (en) 2006-10-18 2013-04-09 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Encoding an information signal
US8126721B2 (en) 2006-10-18 2012-02-28 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Encoding an information signal
AU2009267529B2 (en) * 2008-07-11 2011-03-03 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method for calculating bandwidth extension data using a spectral tilt controlling framing
TWI457914B (en) * 2008-07-11 2014-10-21 Fraunhofer Ges Forschung Apparatus and method for calculating bandwidth extension data using a spectral tilt controlled framing
RU2443028C2 (en) * 2008-07-11 2012-02-20 Фраунхофер-Гезелльшафт цур Фёрдерунг дер ангевандтен Apparatus and method for calculating bandwidth extension data using a spectral tilt controlled framing
WO2010003543A1 (en) * 2008-07-11 2010-01-14 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for calculating bandwidth extension data using a spectral tilt controlling framing
US8788276B2 (en) 2008-07-11 2014-07-22 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method for calculating bandwidth extension data using a spectral tilt controlled framing
WO2014118179A1 (en) * 2013-01-29 2014-08-07 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoders, audio decoders, systems, methods and computer programs using an increased temporal resolution in temporal proximity of onsets or offsets of fricatives or affricates
US10438596B2 (en) 2013-01-29 2019-10-08 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Audio encoders, audio decoders, systems, methods and computer programs using an increased temporal resolution in temporal proximity of onsets or offsets of fricatives or affricates
EP3680899A1 (en) * 2013-01-29 2020-07-15 FRAUNHOFER-GESELLSCHAFT zur Förderung der angewandten Forschung e.V. Audio encoders, audio decoders, systems, methods and computer programs using an increased temporal resolution in temporal proximity of onsets or offsets of fricatives or affricates
RU2651425C2 (en) * 2013-01-29 2018-04-19 Фраунхофер-Гезелльшафт Цур Фердерунг Дер Ангевандтен Форшунг Е.Ф. Audio encoders, audio decoders, systems, methods and computer programs using increased time resolution in time neighborhood of appearances or disappearances of fricative consonants and affricates
US11205434B2 (en) 2013-01-29 2021-12-21 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Audio encoders, audio decoders, systems, methods and computer programs using an increased temporal resolution in temporal proximity of onsets or offsets of fricatives or affricates
EP3279894A1 (en) * 2013-01-29 2018-02-07 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoders, audio decoders, systems, methods and computer programs using an increased temporal resolution in temporal proximity of onsets or offsets of fricatives or affricates
RU2650031C2 (en) * 2013-08-29 2018-04-06 Долби Интернэшнл Аб Frequency band table design for high frequency reconstruction algorithms

Also Published As

Publication number Publication date
WO2000045378A3 (en) 2000-11-16
AU2585600A (en) 2000-08-18
SE9903552D0 (en) 1999-10-01

Similar Documents

Publication Publication Date Title
US6978236B1 (en) Efficient spectral envelope coding using variable time/frequency resolution and time/frequency switching
EP1730725B1 (en) Efficient coding of digital audio spectral data using spectral similarity
WO2000045378A2 (en) Efficient spectral envelope coding using variable time/frequency resolution and time/frequency switching
US5819215A (en) Method and apparatus for wavelet based data compression having adaptive bit rate control for compression of digital audio or other sensory data
US6502069B1 (en) Method and a device for coding audio signals and a method and a device for decoding a bit stream
US8924201B2 (en) Audio encoder and decoder
EP1904999B1 (en) Frequency segmentation to obtain bands for efficient coding of digital media
EP2056294B1 (en) Apparatus, Medium and Method to Encode and Decode High Frequency Signal
US7548853B2 (en) Scalable compressed audio bit stream and codec using a hierarchical filterbank and multichannel joint coding
JP4220461B2 (en) Method and apparatus for generating upsampled signals of temporally discrete speech signals
JP3071795B2 (en) Subband coding method and apparatus
US10255928B2 (en) Apparatus, medium and method to encode and decode high frequency signal
RU2752127C2 (en) Improved quantizer
EP1423847A1 (en) Reconstruction of high frequency components
JPH0629859A (en) Method for encoding of digital input signal
AU6216498A (en) Audio coding method and apparatus
WO2009059632A1 (en) An encoder
CN114550732A (en) Coding and decoding method and related device for high-frequency audio signal
AU2011205144B2 (en) Scalable compressed audio bit stream and codec using a hierarchical filterbank and multichannel joint coding

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
AK Designated states

Kind code of ref document: A3

Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

WWE WIPO information: entry into national phase

Ref document number: 09763128

Country of ref document: US

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 EP: PCT application non-entry in European phase