How to set up site search using Solr
Solr is an open source enterprise search server based on the Lucene Java search library, with XML/HTTP and JSON APIs, hit highlighting, faceted search, caching, replication, and a web administration interface. It runs in a Java servlet container such as Tomcat.
To use Solr you therefore need a Java servlet container.
We suggest Jetty, as it is small, lightweight and simple to use.
Dependencies
On Debian you need the following packages:
apt-get install libmx4j-java libregexp-java libsablevm-classlib1-java libservlet2.4-java libtomcat5.5-java libxerces2-java sun-java5-jdk
Installing Jetty
Solr comes with Jetty in the example directory.
Installing Solr
Download Solr from the Solr website http://www.apache.org/dyn/closer.cgi/lucene/solr/
Unpack Solr and copy the Jetty example install to the Jetty home:
unzip apache-solr-1.1*.zip
cp -R apache-solr-1.1*/example /usr/share/jetty
Download the setup files for Solr (currently available for Debian only):
svn co https://svn.midgard-project.org/midgard/trunk/external-tools/indexer-backends/solr
cd solr
bash ./install-solr.sh
This installs the setup files in the correct places and sets the correct permissions.
Start Solr:
/etc/init.d/jetty start
Solr is now running and listening for requests on port 8983.
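To confirm that Solr really is answering on that port, a quick check along the following lines may help. This is only a sketch: the function names are made up for this example, curl must be installed, and the /solr/admin/ping path is an assumption based on the stock example configuration, so adjust it if your install differs.

```shell
#!/bin/sh
# Build the URL of Solr's ping handler.
# Host and port default to the values used in this guide; the
# /solr/admin/ping path is an assumption, not taken from the setup files.
solr_ping_url() {
    host="${1:-localhost}"
    port="${2:-8983}"
    printf 'http://%s:%s/solr/admin/ping\n' "$host" "$port"
}

# Return 0 if Solr answers the ping request successfully, non-zero otherwise.
solr_is_up() {
    url="$(solr_ping_url "${1:-}" "${2:-}")"
    # -s: quiet, -f: treat HTTP errors as failure, --max-time: do not hang
    curl -sf --max-time 5 -o /dev/null "$url"
}
```

For example, `solr_is_up localhost 8983 && echo "Solr is up"` prints the message only when the request succeeds.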
Create a topic with the Search component.
In the menu, choose website -> website configuration.
Set the following values:
- Indexer: Solr
- Hostname of indexer xmltcp service: localhost (or the host solr is running on)
- Port of indexer xmltcp service: 8983
Reindex your site by visiting /midcom-exec-midcom/reindex.php. This will take some time.
You should now be able to run searches on your site.
Security
In the addListener definition in jetty.xml, add the following:
<Set name="Host">localhost</Set>
This stops Jetty from accepting requests from outside the machine. If you still want to reach the admin interface from other machines, use firewall rules instead to hide the port from most users.
See http://wiki.apache.org/solr/SolrSecurity for more information.
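After restarting Jetty you may want to check which address the port is actually bound to. The sketch below reads `netstat -tln` output on standard input and succeeds only if every listener on the given port is bound to a loopback address. The function name is hypothetical, and the parsing assumes the Linux net-tools column layout (local address in the fourth column, state in the last).

```shell
#!/bin/sh
# Read `netstat -tln` style output on stdin.
# Exit 0 if at least one listener exists on the port and all of them are
# bound to loopback; exit 1 if the port is exposed or nothing listens on it.
port_is_local_only() {
    port="${1:-8983}"
    awk -v port="$port" '
        $NF == "LISTEN" && $4 ~ (":" port "$") {
            found = 1
            addr = $4
            # Strip the trailing ":port" to leave only the bind address.
            sub((":" port "$"), "", addr)
            if (addr != "127.0.0.1" && addr != "::1" && addr != "::ffff:127.0.0.1") {
                exposed = 1
            }
        }
        END {
            if (found && !exposed) exit 0
            exit 1
        }
    '
}
```

A possible use is `netstat -tln | port_is_local_only 8983 && echo "port 8983 is local only"`.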
Troubleshooting
"Authorization required"
If running midcom-exec-midcom/reindex.php gives "Authorization required" errors, modify the indexer_reindex_allowed_ips setting so that it includes the IP address the request comes from. Set it either in /etc/midgard/midcom.conf or in the host settings.
For the midcom.conf file, you need to add:
$GLOBALS['midcom_config_site']['indexer_reindex_allowed_ips'] = array('127.0.0.1','192.168.126.128','127.0.1.1');
"Indexer failed"
If you get an "indexer failed" error when reindexing the site, ensure that Solr's data directory exists and is writable:
mkdir /usr/share/jetty/solr/data
chown jetty /usr/share/jetty/solr/data
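A small check like the one below can tell the two failure cases apart before you reindex. It is a sketch only: the function name is invented for this example, the default path is the one used above, and the writability test applies to whichever user runs it, so run it as the user Jetty runs as.

```shell
#!/bin/sh
# Report whether the Solr data directory exists and is writable
# for the current user. Returns 0 when usable, 1 if missing, 2 if read-only.
check_solr_data_dir() {
    dir="${1:-/usr/share/jetty/solr/data}"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "not writable: $dir"
        return 2
    fi
    echo "ok: $dir"
}
```

Calling `check_solr_data_dir` with no argument checks /usr/share/jetty/solr/data; pass a different path if your Jetty home is elsewhere.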
