Query-by-Example Spoken Term Detection For OOV Terms

ABSTRACT

The goal of Spoken Term Detection (STD) technology is to allow open-vocabulary search over large collections of speech content. In this paper, we address cases where the search terms of interest (queries) are acoustic examples. These are provided either by identifying a region of interest in a speech stream or by speaking the query term. Queries often relate to named entities and foreign words, which typically have poor coverage in the vocabulary of Large Vocabulary Continuous Speech Recognition (LVCSR) systems. Throughout this paper, we focus on query-by-example search for such out-of-vocabulary (OOV) query terms. We build upon a finite state transducer (FST) based search and indexing system to address query-by-example search for OOV terms by representing both the query and the index as phonetic lattices from the output of an LVCSR system. We provide results comparing different representations and generation mechanisms for both queries and indexes built with word units and with combined word and subword units. We also present a two-pass method that uses query-by-example search, with the best hit identified in an initial pass as the example, to augment the STD search results. The results demonstrate that query-by-example search can yield significantly better performance: an Actual Term-Weighted Value (ATWV) of 0.479, compared to a baseline ATWV of 0.325 that uses reference pronunciations for OOVs. Further improvements can be obtained with the proposed two-pass approach and with filtering using the expected unigram counts from the LVCSR system's lexicon.
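
The abstract reports results as ATWV without defining the metric. The sketch below shows one way to compute a term-weighted value. It assumes the definition commonly used in the NIST 2006 STD evaluation (one minus the per-term average of miss probability plus a weighted false-alarm probability, with a weight of 999.9); the paper's exact scoring setup is not given here. The names `TermCounts` and `atwv` and the counts in the usage note are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable

# Assumption: false-alarm weight from the NIST 2006 STD evaluation definition.
BETA = 999.9


@dataclass
class TermCounts:
    """Detection counts for one query term (hypothetical container)."""
    n_true: int      # reference occurrences of the term
    n_correct: int   # system detections that match a reference occurrence
    n_spurious: int  # system detections that match nothing (false alarms)


def atwv(counts: Iterable[TermCounts], speech_seconds: float,
         beta: float = BETA) -> float:
    """Term-weighted value at the system's actual decision threshold.

    Terms with no reference occurrences are skipped, because their miss
    probability is undefined.
    """
    total_cost = 0.0
    n_terms = 0
    for c in counts:
        if c.n_true == 0:
            continue
        p_miss = 1.0 - c.n_correct / c.n_true
        # Non-target trials are approximated as one per second of speech.
        p_fa = c.n_spurious / (speech_seconds - c.n_true)
        total_cost += p_miss + beta * p_fa
        n_terms += 1
    if n_terms == 0:
        raise ValueError("no term has reference occurrences")
    return 1.0 - total_cost / n_terms
```

Under these assumptions, a system that finds every occurrence with no false alarms scores 1.0, and a system that returns nothing scores 0.0. For example, `atwv([TermCounts(10, 10, 0), TermCounts(4, 2, 0)], 3600.0)` averages a miss probability of 0 and 0.5 and gives 0.75. On this scale, the reported move from 0.325 to 0.479 is an absolute gain of 0.154, or roughly 47% relative to the baseline.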

INTRODUCTION

The fast-growing availability of recorded speech calls for efficient and scalable solutions to index and search this data. Spoken Term Detection (STD) is a key technology aimed at open-vocabulary search over large collections of spoken documents. A common approach to STD is to employ a large vocabulary continuous speech recognition (LVCSR) system to obtain word lattices and to extend classical Information Retrieval techniques to those lattices. Such approaches have been shown to be very accurate for well-resourced tasks. A significant challenge in the STD task is the search for queries containing OOV terms. As queries often relate to named entities and foreign words, they typically have poor coverage in the LVCSR system's vocabulary, and hence searching through word lattices will not return any results. Common approaches to overcome this problem consist of searching sub-word lattices.
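
The OOV problem described above can be illustrated with a toy sketch. It assumes a single 1-best hypothesis rather than a lattice, and exact phone matching rather than the FST-based search the paper builds on. The utterance, the query term, and all pronunciations are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical 1-best recognizer output as (word, phones) pairs. The spoken
# name "Nokia" is assumed to be out of vocabulary, so the recognizer emits
# similar-sounding in-vocabulary words instead.
HYPOTHESIS: List[Tuple[str, List[str]]] = [
    ("the", ["dh", "ah"]),
    ("no", ["n", "ow"]),
    ("key", ["k", "iy"]),
    ("a", ["ah"]),
    ("phone", ["f", "ow", "n"]),
]


def build_word_index(hyp: List[Tuple[str, List[str]]]) -> Dict[str, List[int]]:
    """Map each recognized word to the positions where it occurs."""
    index: Dict[str, List[int]] = {}
    for pos, (word, _) in enumerate(hyp):
        index.setdefault(word, []).append(pos)
    return index


def phone_stream(hyp: List[Tuple[str, List[str]]]) -> List[str]:
    """Flatten the hypothesis into a single phone sequence."""
    return [p for _, phones in hyp for p in phones]


def find_phone_sequence(stream: List[str], query: List[str]) -> List[int]:
    """Return the start offsets where the query phones occur contiguously."""
    n = len(query)
    return [i for i in range(len(stream) - n + 1) if stream[i:i + n] == query]


# The word-level search misses the OOV term entirely.
word_hits = build_word_index(HYPOTHESIS).get("nokia", [])

# The phone-level search finds it across the substituted words.
phone_hits = find_phone_sequence(phone_stream(HYPOTHESIS),
                                 ["n", "ow", "k", "iy", "ah"])
```

Here `word_hits` is empty because "nokia" never appears as a recognized word, while `phone_hits` contains one offset, since the phones survive in the substituted words. A real system would search lattices with scores and tolerate phone errors, which exact matching does not.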

Year: 2009

Publisher: IEEE

By: Carolina Parada, Abhinav Sethy, Bhuvana Ramabhadran

File information: English / 6 pages / size: 190 KB
