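Panning (audio)
Computer science
Similarity measure
Frequency domain
Coherence (philosophical gambling strategy)
Measure (data warehouse)
Similarity (geometry)
Heuristic
Fourier transform
Artificial intelligence
Channel (broadcasting)
Pattern recognition (psychology)
Computer vision
Mathematics
Image (mathematics)
Data mining
Telecommunications
Mathematical analysis
Scaling
Statistics
Petroleum engineering
Engineering
Lens (geology)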
Authors
Carlos Avendaño, Jean-Marc Jot
Abstract
A series of upmixing techniques for generating multichannel audio from stereo recordings are proposed. The techniques use a common analysis framework based on a comparison between the short-time Fourier transforms of the left and right stereo signals. An interchannel coherence measure is used to identify time-frequency regions consisting mostly of ambience components, which can then be weighted via a nonlinear mapping function, and extracted to synthesize ambience signals. A similarity measure is used to identify the panning coefficients of the various sources in the mix in the time-frequency plane, and different heuristic mapping functions are applied to unmix (extract) one or more sources, and perceptually based functions to repan the signals into an arbitrary number of channels. We illustrate the application of the various techniques in the design of a two-to-five channel upmix system.
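The two analysis measures named in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the formulas, so everything here is an assumption chosen for illustration: the similarity is taken as twice the cross-spectrum magnitude over the summed channel energies, the coherence as a normalized cross-spectrum smoothed over time with a forgetting factor `lam`, and the helper names (`stft`, `similarity`, `coherence`) and parameters (`frame`, `hop`, `eps`) are hypothetical. The nonlinear and heuristic mapping functions and the repanning step are not covered.

```python
import numpy as np

def stft(x, frame=1024, hop=256):
    """Hann-windowed short-time Fourier transform, shape (frames, bins)."""
    window = np.hanning(frame)
    n_frames = 1 + (len(x) - frame) // hop
    frames = np.stack(
        [x[m * hop : m * hop + frame] * window for m in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)

def similarity(XL, XR, eps=1e-12):
    """Assumed similarity per time-frequency bin, in [0, 1].

    Close to 1 when both channels carry equal energy in a bin,
    0 when the bin is present in only one channel (hard-panned).
    """
    cross = np.abs(XL * np.conj(XR))
    return 2.0 * cross / (np.abs(XL) ** 2 + np.abs(XR) ** 2 + eps)

def coherence(XL, XR, lam=0.9, eps=1e-12):
    """Assumed interchannel coherence with recursive time smoothing.

    lam is a forgetting factor (hypothetical default); values near 1
    average over more frames.
    """
    n_bins = XL.shape[1]
    p_ll = np.zeros(n_bins)
    p_rr = np.zeros(n_bins)
    p_lr = np.zeros(n_bins, dtype=complex)
    out = np.empty(XL.shape)
    for m in range(XL.shape[0]):
        # Running estimates of the auto- and cross-spectra per bin.
        p_ll = lam * p_ll + (1 - lam) * np.abs(XL[m]) ** 2
        p_rr = lam * p_rr + (1 - lam) * np.abs(XR[m]) ** 2
        p_lr = lam * p_lr + (1 - lam) * XL[m] * np.conj(XR[m])
        out[m] = np.abs(p_lr) / (np.sqrt(p_ll * p_rr) + eps)
    return out
```

Under these assumptions, a single source panned with gains 0.8 and 0.6 gives a similarity of about 0.96 in every bin and a coherence near 1, while two independent noise signals give a noticeably lower average coherence, which is the kind of cue the abstract describes for locating ambience-dominated regions.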