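Applying SOLR-236_collapsing.patch lets Solr collapse search results and remove duplicates, which makes it possible to offer something like Google's "other relevant results from this site" feature. The Solr collapsing patch collapses duplicate results by hashing a chosen field. Below I walk through applying the patch and trying a search.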
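An implementation of this feature already exists for Solr: the Solr 1.3 collapse patch, see https://issues.apache.org/jira/browse/SOLR-236 . I downloaded the newest attachment: https://issues.apache.org/jira/secure/attachment/12403590/SOLR-236_collapsing.patch .
Once it is downloaded, the patch has to be applied. First prepare a copy of the source in D:/apache-solr-1.3.0; if you do not have one, download it from http://archive.apache.org/dist/lucene/solr/1.3.0/apache-solr-1.3.0.zip. Put SOLR-236_collapsing.patch in the D:/apache-solr-1.3.0 directory. I know of two ways to apply a patch: the Linux patch tool (available on Windows through Cygwin), or Ant's patch task.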
Using patch from Cygwin on Windows:
D:\apache-solr-1.3.0>patch -p0 < SOLR-236_collapsing.patch
patching file src/test/org/apache/solr/search/TestDocSet.java
patching file src/java/org/apache/solr/search/CollapseFilter.java
patching file src/java/org/apache/solr/search/DocSet.java
patching file src/java/org/apache/solr/search/NegatedDocSet.java
patching file src/java/org/apache/solr/search/SolrIndexSearcher.java
patching file src/java/org/apache/solr/common/params/CollapseParams.java
patching file src/java/org/apache/solr/handler/component/CollapseComponent.java
Using Ant's patch task: save the following as patch-build.xml in the D:\apache-solr-1.3.0 directory:
<?xml version="1.0" encoding="UTF-8"?>
<project name="solr-patch" default="apply-patch" basedir=".">
  <target name="apply-patch" description="Apply a patch file. Set -Dpatch.file">
    <patch patchfile="${patch.file}" strip="0"/>
  </target>
</project>
Apply the patch with Ant:
D:\apache-solr-1.3.0>ant -Dpatch.file=SOLR-236_collapsing.patch -f patch-build.xml
Buildfile: patch-build.xml

apply-patch:
[patch] patching file src/test/org/apache/solr/search/TestDocSet.java
[patch] patching file src/java/org/apache/solr/search/CollapseFilter.java
[patch] patching file src/java/org/apache/solr/search/DocSet.java
[patch] patching file src/java/org/apache/solr/search/NegatedDocSet.java
[patch] patching file src/java/org/apache/solr/search/SolrIndexSearcher.java
[patch] patching file src/java/org/apache/solr/common/params/CollapseParams.java
[patch] patching file src/java/org/apache/solr/handler/component/CollapseComponent.java

BUILD SUCCESSFUL
Total time: 0 seconds
With the source patched, build it with Ant:
D:\apache-solr-1.3.0>ant dist
The compiled Solr can be found in the D:/apache-solr-1.3.0/dist directory. To run it in Tomcat, save the following as TOMCAT_HOME/conf/Catalina/localhost/solr.xml:
<Context docBase="D:\apache-solr-1.3.0\dist\apache-solr-1.3.0.war" reloadable="true">
  <Environment name="solr/home" type="java.lang.String" value="D:\apache-solr-1.3.0\example\solr" override="true"/>
</Context>
Edit D:\apache-solr-1.3.0\example\solr\conf\solrconfig.xml so that Solr supports collapse.
Define the search component, near QueryComponent:
<searchComponent name="collapse" class="org.apache.solr.handler.component.CollapseComponent"/>
Define a handler that uses the search component above:
<requestHandler name="collapse" class="solr.SearchHandler">
  <!-- default values for query parameters -->
  <lst name="defaults">
    <str name="echoParams">explicit</str>
  </lst>
  <arr name="components">
    <str>collapse</str>
    <str>debug</str>
  </arr>
</requestHandler>
Install and start Tomcat, then post some data to it; the official example data will do. Run:
D:\apache-solr-1.3.0\example\exampledocs>java -Durl=http://localhost:8080/solr/update -Dcommit=yes -jar post.jar *.xml
SimplePostTool: version 1.2
SimplePostTool: WARNING: Make sure your XML documents are encoded in UTF-8, other encodings are not currently supported
SimplePostTool: POSTing files to http://localhost:8080/solr/update..
SimplePostTool: POSTing file hd.xml
SimplePostTool: POSTing file ipod_other.xml
SimplePostTool: POSTing file ipod_video.xml
SimplePostTool: POSTing file mem.xml
SimplePostTool: POSTing file monitor.xml
SimplePostTool: POSTing file monitor2.xml
SimplePostTool: POSTing file mp500.xml
SimplePostTool: POSTing file sd500.xml
SimplePostTool: POSTing file solr.xml
SimplePostTool: POSTing file spellchecker.xml
SimplePostTool: POSTing file utf8-example.xml
SimplePostTool: POSTing file vidcard.xml
SimplePostTool: COMMITting Solr index changes..
Check http://localhost:8080/solr/admin/stats.jsp to see whether the documents were indexed. They were, so now try a query.
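The request URL itself is not shown in this post. As a rough sketch, the parameters echoed in the response header (q, qt, fl, indent, collapse, collapse.field, collapse.threshold) can be assembled into a query string like this. The /solr/select path and the helper name collapse_query_url are assumptions for illustration, not something taken from the original.

```python
from urllib.parse import urlencode


def collapse_query_url(base_url, field, threshold=1, q="*:*", fl="id"):
    """Build a query URL carrying the collapse parameters echoed in the response.

    base_url is an assumption: the post does not show the request path.
    """
    params = [
        ("q", q),
        ("qt", "collapse"),            # the request handler named "collapse"
        ("collapse", "true"),          # enable the collapse component
        ("collapse.field", field),     # field whose value identifies duplicates
        ("collapse.threshold", str(threshold)),
        ("fl", fl),
        ("indent", "on"),
    ]
    return base_url + "?" + urlencode(params)


# Hypothetical base URL; adjust host, port and path to your deployment.
url = collapse_query_url("http://localhost:8080/solr/select", "popularity")
print(url)
```

Opening the printed URL in a browser (or fetching it with any HTTP client) should return an XML response of the shape shown next, provided the handler is reachable at that path.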
Result:
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params">
      <str name="collapse.field">popularity</str>
      <str name="fl">id</str>
      <str name="collapse.threshold">1</str>
      <str name="indent">on</str>
      <str name="q">*:*</str>
      <str name="qt">collapse</str>
      <str name="collapse">true</str>
    </lst>
  </lst>
  <lst name="collapse_counts">
    <str name="field">popularity</str>
    <lst name="doc">
      <int name="SP2514N">4</int>
      <int name="F8V7067-APL-KIT">1</int>
      <int name="MA147LL/A">1</int>
      <int name="TWINX2048-3200PRO">1</int>
      <int name="VS1GB400C3">3</int>
      <int name="1">10</int>
    </lst>
    <lst name="count">
      <int name="6">4</int>
      <int name="1">1</int>
      <int name="10">1</int>
      <int name="5">1</int>
      <int name="7">3</int>
      <int name="0">10</int>
    </lst>
    <str name="debug">HashDocSet(6) Time(ms): 0/0/0/0</str>
  </lst>
  <result name="response" numFound="6" start="0">
    <doc>
      <str name="id">SP2514N</str>
    </doc>
    <doc>
      <str name="id">F8V7067-APL-KIT</str>
    </doc>
    <doc>
      <str name="id">MA147LL/A</str>
    </doc>
    <doc>
      <str name="id">TWINX2048-3200PRO</str>
    </doc>
    <doc>
      <str name="id">VS1GB400C3</str>
    </doc>
    <doc>
      <str name="id">1</str>
    </doc>
  </result>
</response>
The collapse_counts part of the output is:
<lst name="collapse_counts">
  <str name="field">popularity</str>
  <lst name="doc">
    <int name="SP2514N">4</int>
    …
  </lst>
  <lst name="count">
    <int name="6">4</int>
    <int name="1">1</int>
    <int name="10">1</int>
    <int name="5">1</int>
    <int name="7">3</int>
    <int name="0">10</int>
  </lst>
  <str name="debug">HashDocSet(6) Time(ms): 0/0/0/0</str>
</lst>
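The entries under count (listed in the same order as result/doc) mean that there are 4 more results with popularity=6, 1 more result with popularity=1, and so on. This lets the user interface show a hint such as "N more similar results".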
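To illustrate how a client might turn collapse_counts into such hints, here is a minimal sketch that reads the per-document counts from the doc list and pairs them with the ids in result/doc. The SAMPLE string is a shortened copy of the response above, and the function name collapse_hints is made up for this example.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the response shown above (two documents only).
SAMPLE = """<response>
  <lst name="collapse_counts">
    <str name="field">popularity</str>
    <lst name="doc">
      <int name="SP2514N">4</int>
      <int name="F8V7067-APL-KIT">1</int>
    </lst>
  </lst>
  <result name="response" numFound="2" start="0">
    <doc><str name="id">SP2514N</str></doc>
    <doc><str name="id">F8V7067-APL-KIT</str></doc>
  </result>
</response>"""


def collapse_hints(xml_text):
    """Return (doc id, number of collapsed duplicates) for each result doc."""
    root = ET.fromstring(xml_text)
    per_doc = root.find("./lst[@name='collapse_counts']/lst[@name='doc']")
    counts = {}
    if per_doc is not None:
        # Each child is <int name="DOC_ID">COUNT</int>.
        counts = {e.get("name"): int(e.text) for e in per_doc}
    hints = []
    for doc in root.findall("./result/doc"):
        doc_id = doc.find("./str[@name='id']").text.strip()
        hints.append((doc_id, counts.get(doc_id, 0)))
    return hints


for doc_id, extra in collapse_hints(SAMPLE):
    print(f"{doc_id}: {extra} more similar results")
```

Documents that do not appear in the doc list are reported with a count of 0, so the interface can simply omit the hint for them.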
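The parameters used are:
# enable the collapse component
collapse=true
# the field to hash on when removing duplicate content
collapse.field=popularity
# the maximum number of documents with the same field value allowed in the results
collapse.threshold=1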
There are other parameters as well; see the org.apache.solr.common.params.CollapseParams class.
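As a conceptual illustration of how collapse.field and collapse.threshold interact, the sketch below keeps at most threshold documents per distinct field value and counts the rest. It is not the patch's actual implementation (that lives in the Java class CollapseFilter); the function name and the sample documents are hypothetical.

```python
def collapse(docs, field, threshold=1):
    """Keep at most `threshold` docs per distinct value of `field`.

    Returns (kept_docs, hidden_counts), where hidden_counts maps each field
    value to the number of documents that were collapsed away.
    """
    kept_per_value = {}   # field value -> docs kept so far
    hidden_counts = {}    # field value -> docs collapsed away
    kept = []
    for doc in docs:
        value = doc.get(field)
        if kept_per_value.get(value, 0) < threshold:
            kept_per_value[value] = kept_per_value.get(value, 0) + 1
            kept.append(doc)
        else:
            hidden_counts[value] = hidden_counts.get(value, 0) + 1
    return kept, hidden_counts


# Hypothetical documents, already in ranking order.
docs = [
    {"id": "a", "popularity": 6},
    {"id": "b", "popularity": 1},
    {"id": "c", "popularity": 6},
    {"id": "d", "popularity": 6},
]
kept, hidden = collapse(docs, "popularity", threshold=1)
print([d["id"] for d in kept])  # ['a', 'b']
print(hidden)                   # {6: 2}
```

With threshold=1 only the first document for each popularity value survives, and the hidden counts play the role of the "N more" numbers reported under collapse_counts.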
Original source: http://blog.chenlb.com/2009/04/apply-solr-collapsing-patch-remove-duplicate-result.html