Forest-based Search Algorithms in Parsing and Machine Translation

Oklahoma Sports Fan

Google Tech Talks, March 14, 2008

ABSTRACT

Many problems in Natural Language Processing (NLP) involve an efficient search for the best derivation among exponentially many candidates, especially in parsing and machine translation. In these cases, the "packed forest" provides a compact representation of the huge search space, one on which efficient inference algorithms based on Dynamic Programming (DP) are possible. In this talk we address two important open problems within this framework: exact k-best inference, which is often used in NLP pipelines such as parse reranking and MT rescoring, and approximate inference when the search space is too big for exact search.

We first present a series of fast and exact k-best algorithms on forests, which are orders of magnitude faster than previously used methods on state-of-the-art parsers such as Collins (1999). We then extend these algorithms to approximate search when the forests are too big for exact inference. We discuss two particular instances of this new method: forest rescoring for MT decoding with integrated language models, and forest reranking for discriminative parsing. In the former, our methods run orders of magnitude faster than conventional beam search on both state-of-the-art phrase-based and syntax-based systems, with the same level of search error or translation quality. In the latter, faster search also leads to better learning: our approximate decoding makes whole-Treebank discriminative training practical and results in the best accuracy to date for parsers trained on the Treebank.

This talk includes joint work with David Chiang (USC Information Sciences Institute).

Liang Huang (2008). Forest Reranking: Discriminative Parsing with Non-Local Features. Proceedings of ACL 2008 (to appear). http://www.cis.upenn.edu/~lhua...
Liang Huang and David Chiang (2007). Forest Rescoring: Faster Decoding with Integrated Language Models. Proceedings of ACL 2007. http://www.cis.upenn.edu/~lhua...
Liang Huang and David Chiang (2005). Better k-best Parsing. Proceedings of IWPT 2005. http://www.cis.upenn.edu/~lhua...

Speaker: Liang Huang

Liang Huang is a final-year PhD student at the University of Pennsylvania, co-supervised by Aravind Joshi and Kevin Knight (USC/ISI). He is mainly interested in the theoretical aspects of computational linguistics, in particular efficient algorithms in parsing and machine translation, generic dynamic programming, and formal properties of synchronous grammars. He also works on applying computational linguistics to structural biology.
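To make the k-best idea from the abstract more concrete, here is a minimal sketch of k-best search over a tiny packed forest. It is not the algorithm presented in the talk. It assumes a forest stored as a dictionary that maps each node to its incoming hyperedges, each a pair of a cost and a tuple of child nodes, and it returns only the k lowest total costs rather than the derivations themselves. The names `kbest_edge`, `kbest`, `forest`, and `order` are hypothetical and chosen for illustration.

```python
import heapq

def kbest_edge(weight, tail_lists, k):
    """Return up to k smallest values of weight + one cost from each sorted list."""
    if any(not lst for lst in tail_lists):
        return []
    start = (0,) * len(tail_lists)

    def cost(idx):
        return weight + sum(lst[i] for lst, i in zip(tail_lists, idx))

    heap = [(cost(start), start)]  # frontier of candidate index vectors
    seen = {start}
    out = []
    while heap and len(out) < k:
        c, idx = heapq.heappop(heap)
        out.append(c)
        # Expand the popped candidate by advancing one child at a time.
        for pos in range(len(idx)):
            nxt = idx[:pos] + (idx[pos] + 1,) + idx[pos + 1:]
            if nxt[pos] < len(tail_lists[pos]) and nxt not in seen:
                seen.add(nxt)
                heapq.heappush(heap, (cost(nxt), nxt))
    return out

def kbest(forest, order, root, k):
    """Compute the k lowest derivation costs at root.

    forest: node -> list of (cost, tuple_of_child_nodes)  (assumed layout)
    order:  nodes in bottom-up (topological) order, leaves first
    """
    best = {}
    for node in order:
        candidates = []
        for weight, tails in forest[node]:
            candidates.extend(kbest_edge(weight, [best[t] for t in tails], k))
        best[node] = heapq.nsmallest(k, candidates)
    return best[root]

# Toy forest with made-up integer costs (lower is better).
forest = {
    "A": [(1, ()), (4, ())],
    "B": [(2, ()), (3, ())],
    "S": [(0, ("A", "B")), (5, ("A",))],
}
order = ["A", "B", "S"]
top3 = kbest(forest, order, "S", 3)
```

The heap holds only the frontier of index combinations for each hyperedge, so the sketch avoids enumerating every combination of child derivations when k is small.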

Channel: People & Blogs
Uploaded: March 18, 2008 at 9:07 am
Author: googletechtalks

Length: 01:19
Rating: 4.79
Views: 6268

Tags: education  engedu  google  googletechtalks  talk  talks  techtalk  techtalks  


Oklahoma Sports Fan © 2007 All Rights Reserved.