Selecting among competing models of talker adaptation: Attention, cognition, and memory in speech processing efficiency

Cognition. 2020 Jul;204:104393. S0010-0277(20)30212-2. doi: 10.1016/j.cognition.2020.104393. Epub 2020-07-17.

Selecting among competing models of talker adaptation: Attention, cognition, and memory in speech processing efficiency.

Alexandra M Kapadia
Tyler K Perrachione

PMID: 32688132 DOI: 10.1016/j.cognition.2020.104393.

Abstract

Phonetic variability across talkers imposes additional processing costs during speech perception, often measured by performance decrements between single- and mixed-talker conditions. However, models differ in their predictions about whether accommodating greater phonetic variability (i.e., more talkers) imposes greater processing costs. We measured speech processing efficiency in a speeded word identification task, in which we manipulated the number of talkers (1, 2, 4, 8, or 16) listeners heard. Word identification was less efficient in every mixed-talker condition compared to the single-talker condition, but the magnitude of this performance decrement was not affected by the number of talkers. Furthermore, in a condition with uniform transition probabilities between two talkers, word identification was more efficient when the talker was the same as the prior trial compared to trials when the talker switched. These results support an auditory streaming model of talker adaptation, where processing costs associated with changing talkers result from attentional reorientation.