
Web Information Retrieval (True PDF, EPUB)

English | PDF, EPUB | 2013 | 287 Pages | ISBN: 3642393136 | 14.85 MB

With the proliferation of huge amounts of (heterogeneous) data on the Web, the importance of information retrieval (IR) has grown considerably over the last few years. Big players in the computer industry, such as Google, Microsoft and Yahoo!, are the primary contributors of technology for fast access to Web-based information, and search capabilities are now integrated into most information systems, ranging from business management software and customer relationship systems to social networks and mobile phone applications.

Ceri and his co-authors aim to take their readers from the foundations of modern information retrieval to the most advanced challenges of Web IR. To this end, their book is divided into three parts. The first part addresses the principles of IR and provides a systematic and compact description of basic information retrieval techniques (including binary, vector space and probabilistic models as well as natural language search processing) before focusing on their application to the Web. Part two addresses the foundational aspects of Web IR by discussing the general architecture of search engines (with a focus on the crawling and indexing processes), describing link analysis methods (specifically PageRank and HITS), addressing recommendation and diversification, and finally presenting advertising in search (the main source of revenue for search engines). The third and final part describes advanced aspects of Web search, with each chapter providing a self-contained, up-to-date survey of current Web research directions. Topics in this part include meta-search and multi-domain search, semantic search, search in the context of multimedia data, and crowd search.

The book is ideally suited to courses on information retrieval, as it covers all Web-independent foundational aspects. Its presentation is self-contained and does not require prior background knowledge. It can also be used in classic courses on data management, allowing the instructor to cover both structured and unstructured data in various formats. Its classroom use is facilitated by a set of slides, which can be downloaded from search-computing.org.



