
Web Corpus Construction (Synthesis Lectures on Human Language Technologies)




Product Description

The World Wide Web constitutes the largest existing source of texts written in a great variety of languages. A feasible and sound way of exploiting this data for linguistic research is to compile a static corpus for a given language. This approach has several advantages: (i) Working with such corpora obviates the problems encountered when using Internet search engines in quantitative linguistic research (such as non-transparent ranking algorithms). (ii) Creating a corpus from web data is virtually free. (iii) The size of corpora compiled from the WWW may exceed by several orders of magnitude the size of language resources offered elsewhere. (iv) The data is locally available to the user, and it can be linguistically post-processed and queried with the tools the user prefers. This book addresses the main practical tasks in the creation of web corpora up to giga-token size. Among these tasks are the sampling process (i.e., web crawling) and the usual cleanups, including boilerplate removal and removal of duplicated content. Linguistic processing, and the problems that the different kinds of noise in web corpora cause for it, are also covered. Finally, the authors show how web corpora can be evaluated and compared to other corpora (such as traditionally compiled corpora).

For additional material please visit the companion website: sites.morganclaypool.com/wcc

Table of Contents: Preface / Acknowledgments / Web Corpora / Data Collection / Post-Processing / Linguistic Processing / Corpus Evaluation and Comparison / Bibliography / Authors' Biographies

Additional Information

Manufacturer: Morgan & Claypool Publishers
Part Number: black & white illustrations
Publisher: Morgan & Claypool Publishers
Studio: Morgan & Claypool Publishers
MPN: black & white illustrations
EAN: 9781608459834
Item Weight: 0.58 pounds
Item Size: 0.33 x 9.25 x 9.25 inches
Package Weight: 0.75 pounds
Package Size: 7.5 x 0.33 x 0.33 inches


Brand: Morgan & Claypool Publishers
Condition: New
Lead Time: 1 - 2 Business Days
Availability: In Stock
$40.00



Related Best Sellers


By Apress
mpn: 607 black & white illustrations, biograp, ean: 9781484209592, isbn: 1484209591,
Python Data Analytics will help you tackle the world of data acquisition and analysis using the power of the Python language. At the heart of this book lies the coverage of pandas, an open source, BSD-licensed library providing high-performance, easy...

By Apress
ean: 9781484235874, isbn: 1484235878,
Gain an accelerated introduction to domain-specific languages in R, including coverage of regular expressions. This compact, in-depth book shows you how DSLs are programming languages specialized for a particular purpose, as opposed to general purpos...

By Addison-Wesley Professional
ean: 9780134546926, isbn: 013454692X,
Statistical Computation for Programmers, Scientists, Quants, Excel Users, and Other Professionals. Using the open source R language, you can build powerful statistical models to answer many of your most challenging questions. R has traditionally b...

By Technics Publications
ean: 9781634621304, isbn: 9781634621304,
Master how to use the Julia language to solve business critical data science challenges. After covering the importance of Julia to the data science community and several essential data science principles, we start with the basics including how to ins...

By MySQL Press
mpn: illustrations, ean: 9780672328701, isbn: 0672328704,
Written by the creators of MySQL and edited by one of the most highly respected MySQL authors, the MySQL Administrator's Guide and Language Reference is the official guide to installing MySQL, to setting up and administering MySQL databases, and to ...

By Pearson
mpn: Illustrations, ean: 9780201314519, isbn: 0201314517,
This work prepares students for the world of computing by giving them a solid foundation in the science of computer science, algorithms. By taking an algorithm-based approach to the subject, this introductory text seeks to help students grasp overall...

By Cambridge University Press
mpn: 23882043, ean: 9780521865715, isbn: 0521865719,
Class-tested and coherent, this groundbreaking new textbook teaches web-era information retrieval, including web search and the related areas of text classification and text clustering from basic concepts. Written from a computer science perspective ...

By Itzik Ben-Gan
mpn: 9780735685048, ean: 9780735685048, isbn: 0735685045,
T-SQL insiders help you tackle your toughest queries and query-tuning problems. Squeeze maximum performance and efficiency from every T-SQL query you write or tune. Four leading experts take an in-depth look at T-SQL’s internal architecture and offe...

By Joel Murach
ean: 9781890774967, isbn: 1890774960,
If you’re an application developer, or you’re training to be one, this 2016 edition of Murach’s classic SQL Server book is made for you. To start, it presents the SQL statements that you need to retrieve and update the data in a database. These ...

By George Grätzer
mpn: 53 black & white illustrations, 23 colou, ean: 9783319237954, isbn: 3319237950,
For over two decades, this comprehensive manual has been the standard introduction and complete reference for writing articles and books containing mathematical formulas. If the reader requires a streamlined approach to learning LaTeX for composing e...



© 2018 - translateth.is. All Rights Reserved.