Open Source Content Management Framework

How to set up site search using Solr

  1. About Solr
  2. Using solr-tomcat5.5
  3. Using Jetty
    1. Dependencies
    2. Installing Jetty
    3. Installing Solr
    4. Security
    5. Troubleshooting
      1. "Authorization required"
      2. "Indexer failed"

About Solr

Solr is an open source enterprise search server based on the Lucene Java search library, with XML/HTTP and JSON APIs, hit highlighting, faceted search, caching, replication, and a web administration interface. It runs in a Java servlet container such as Tomcat.

To use Solr, you therefore need a Java servlet container; the sections below cover Tomcat and Jetty.

Using solr-tomcat5.5

On Ubuntu and similar distributions, the solr-tomcat5.5 package is available. Install it, then do the following:

  1. Make sure the Catalina process uses UTF-8 for URI parsing.
  2. Download the MidCOM Solr schema and copy it over /etc/solr/conf/schema.xml.
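
One way to check the first step is to look for the URIEncoding attribute, which Tomcat reads from the Connector element in its server.xml. The following is only a minimal sketch: the function name is hypothetical, and the path in the example comment is an assumption about where the package keeps the file.

```shell
# Hypothetical helper: succeeds if the given server.xml contains a
# URIEncoding="UTF-8" attribute (normally set on a <Connector> element).
has_utf8_uri_encoding() {
    grep -q 'URIEncoding="UTF-8"' "$1"
}

# Example (the path is an assumption, adjust it to your installation):
#   has_utf8_uri_encoding /etc/tomcat5.5/server.xml || echo "URIEncoding is not UTF-8"
```

If the check fails, add the attribute to the HTTP connector and restart Tomcat.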

Using Jetty

We suggest using Jetty, as it is small, lightweight and simple to use.

Dependencies

On Debian you need the following packages:

apt-get install libmx4j-java libregexp-java libsablevm-classlib1-java libservlet2.4-java libtomcat5.5-java libxerces2-java sun-java5-jdk

Installing Jetty

Solr comes with Jetty in the example directory.

Installing Solr

You must use version 1.1 of Solr.

  1. Download Solr from the Solr website http://www.apache.org/dyn/closer.cgi/lucene/solr/

  2. Unpack Solr and copy the Jetty example installation to the Jetty home directory.

    unzip apache-solr-1.1*.zip
    cp -R apache-solr-1.1*/example /usr/share/jetty

  3. Download the setup files for Solr (currently available only for Debian).

    svn co https://svn.midgard-project.org/midgard/trunk/external-tools/indexer-backends/solr
    cd solr
    bash ./install-solr.sh

This installs the setup files in the correct places and sets the correct permissions.

  4. Start Solr

    /etc/init.d/jetty start

Solr is now running and listening for requests on port 8983.
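
A quick way to confirm this from the command line is to request the Solr admin page. This is a sketch under assumptions: the function names are hypothetical, curl must be installed, and the /solr/admin/ path is the one used by the example setup shipped with Solr, so it may differ if you changed the webapp location.

```shell
# Hypothetical helper: builds the admin URL for a Solr instance.
# Defaults follow this guide: localhost and port 8983.
solr_admin_url() {
    printf 'http://%s:%s/solr/admin/' "${1:-localhost}" "${2:-8983}"
}

# Succeeds if the admin page answers with a successful HTTP status.
solr_is_up() {
    curl --silent --fail --output /dev/null "$(solr_admin_url "$@")"
}

# Example:
#   solr_is_up && echo "Solr is answering on port 8983"
```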

  1. Create a topic with the Search component.

  2. In the menu, choose website -> website configuration.

  3. Set the following values:

  • Indexer: Solr
  • Hostname of indexer xmltcp service: localhost (or the host solr is running on)
  • Port of indexer xmltcp service: 8983

  4. Reindex your site by visiting /midcom-exec-midcom/reindex.php. This will take some time.

  5. You should now be able to run searches on your site.

Security

In the addListener definition in jetty.xml, add the following:

<Set name="Host">localhost</Set>

This way Jetty does not listen to requests from the outside. If you still want to access the admin interface from other machines, use firewall scripts instead to hide the port from most users.

See http://wiki.apache.org/solr/SolrSecurity for more information.
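
To check that the Host setting took effect, you can inspect the listening sockets after restarting Jetty. The sketch below assumes `netstat -ltn` style output, where the fourth column is the local address; the function name is hypothetical.

```shell
# Hypothetical helper: reads `netstat -ltn` style lines on stdin and
# succeeds only if the port has at least one listener and every listener
# on that port is bound to a loopback address.
only_loopback() {
    awk -v port="$1" '
        $4 ~ (":" port "$") {
            seen = 1
            # Strip the trailing ":port" to get the bound address.
            addr = substr($4, 1, length($4) - length(port) - 1)
            if (addr != "127.0.0.1" && addr != "::1" && addr != "::ffff:127.0.0.1")
                bad = 1
        }
        END { exit ((seen && !bad) ? 0 : 1) }
    '
}

# Example:
#   netstat -ltn | only_loopback 8983 && echo "Jetty is only reachable locally"
```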

Troubleshooting

"Authorization required"

If you get "Authorization required" errors when running midcom-exec-midcom/reindex.php, modify the indexer_reindex_allowed_ips setting so that it includes the address you are connecting from. Set it either in /etc/midgard/midcom.conf or in the host settings.

For the midcom.conf file, you need to add:

$GLOBALS['midcom_config_site']['indexer_reindex_allowed_ips'] = array('127.0.0.1','192.168.126.128','127.0.1.1');

"Indexer failed"

If you get an "indexer failed" error when reindexing the site, ensure that Solr's data directory exists and is writable:

mkdir /usr/share/jetty/solr/data
chown jetty /usr/share/jetty/solr/data
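
To verify the result, a small check like the following can help. It is a sketch only: the function name is hypothetical, and the write test applies to whichever user runs it, so run it as the jetty user (root passes the write test regardless of ownership).

```shell
# Hypothetical helper: succeeds if the directory exists and is writable
# by the user running this check.
data_dir_ok() {
    [ -d "$1" ] && [ -w "$1" ]
}

# Example:
#   data_dir_ok /usr/share/jetty/solr/data || echo "data directory is missing or not writable"
```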