midcom_services_indexer_document_datamanager
This class eases the indexing of datamanager-driven documents. The user invoking the indexing must have full read permissions to the object; otherwise the NAP or Metadata objects will probably fail to load.
Basic indexing operation
This class uses a number of conventions, described below, to merge an existing, datamanager-driven document into an indexing-capable document. It requires the caller to instantiate the datamanager, as this class has no way of knowing where to take the schema database from.
Additional information is taken out of the Metadata record and the NAP record, both of which have to be available to the indexer.
The RI (the GUID) from the base class is left untouched.
Indexing field defaults:
Unless you specify anything else explicitly in the schema, the class will merge all text based fields together to form the content field of the index record, to allow for easy searching of the document. This will *not* include any metadata like keywords or summaries.
If the schema contains a field named abstract, it is also used as the abstract field for the indexing process. In the same way, fields named title or author are used for the index document's title or author respectively. The contents of abstract, title and author are also appended to the content field at the end of the object construction, which makes it easier to search over these fields.
If no abstract field is present, the first 200 characters of the content area are used instead.
Not all types can be indexed; check the documentation of the types in question for their indexing capabilities. In general, when the system has to index a non-text field, it uses the field's CSV representation for implicit conversion.
Metadata processing is done by the base class.
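The defaults above can be pictured with a small, self-contained sketch. It does not use the MidCOM API: the function name sketch_build_index_fields(), the plain array of already-converted text fields, and the newline separator are all assumptions made for illustration. It also assumes that the 200-character fallback is taken from the merged content before title, author and abstract are appended, which the text above does not state explicitly.

```php
<?php
// Hypothetical helper, not part of MidCOM: applies the default rules to a
// plain array of field name => text value.
function sketch_build_index_fields(array $text_fields): array
{
    $special = ['title', 'author', 'abstract'];
    $doc = ['title' => '', 'author' => '', 'abstract' => '', 'content' => ''];
    $parts = [];

    foreach ($text_fields as $name => $value) {
        if (in_array($name, $special, true)) {
            // Well-known field names map to the index field of the same name.
            $doc[$name] = $value;
        } else {
            // Every other text field is merged into the content field.
            $parts[] = $value;
        }
    }

    // Assumption: fields are joined with a newline.
    $body = implode("\n", $parts);
    $doc['content'] = $body;

    // Title, author and abstract taken from the schema are appended to the
    // content at the end, so that a search over the content also finds them.
    foreach ($special as $name) {
        if ($doc[$name] !== '') {
            $doc['content'] .= "\n" . $doc[$name];
        }
    }

    // No abstract field: fall back to the first 200 characters of the content.
    if ($doc['abstract'] === '') {
        $doc['abstract'] = substr($body, 0, 200);
    }

    return $doc;
}
```

Calling sketch_build_index_fields(['title' => 'Hello', 'body' => $long_text]) would yield a document whose abstract is cut from the start of $long_text and whose content ends with the title.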
NAP interaction:
NAP is used to determine the title of the object if the schema does not contain a field which is indexed as title. Otherwise, NAP is not yet required.
In case the NAP information cannot be retrieved, it uses the URL of the document to populate the title as a last resort.
Because of the performance drawbacks, you should avoid relying on this behavior. Instead, if you do not have a field called 'title' set to auto-indexing, set another field to the index method title (see below). Note that you should configure that field so that it cannot be left empty, as an empty field would again trigger the NAP fallback.
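The fallback order described in this section can be summarized in a few lines. The sketch below is standalone and does not call NAP; the function name sketch_resolve_title() and its parameters are hypothetical, and the NAP lookup is represented by a value that is null when the information could not be retrieved.

```php
<?php
// Hypothetical helper, not part of MidCOM: picks the title in the order
// schema field, NAP information, document URL.
function sketch_resolve_title(string $schema_title, ?string $nap_name, string $document_url): string
{
    // A non-empty title field from the schema always wins and avoids NAP.
    if ($schema_title !== '') {
        return $schema_title;
    }

    // An empty title triggers the (comparatively expensive) NAP fallback.
    if ($nap_name !== null && $nap_name !== '') {
        return $nap_name;
    }

    // Last resort: the URL of the document.
    return $document_url;
}
```

The sketch makes the cost argument visible: only the first branch returns without needing NAP, which is why a mandatory title field is the recommended configuration.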
Configurability using the Datamanager schema:
You can decorate datamanager fields with various directives influencing the indexing. See the Datamanager's schema documentation for details. Basically, you can choose from the following indexing methods using the key 'index_method' for each field:
Located in /midcom/services/indexer/document_datamanager.php (line 102)
midcom_services_indexer_document
   |
   --midcom_services_indexer_document_midcom
      |
      --midcom_services_indexer_document_datamanager
midcom_services_indexer_document_datamanager midcom_services_indexer_document_datamanager (midcom_helper_datamanager &$datamanager)
midcom_helper_datamanager $_datamanager = null (line 111)
The datamanager instance of the document we need to index.
This is passed by reference through the constructor.
Array $_schema = null (line 119)
The schema in use.
Inherited from midcom_services_indexer_document_midcom
midcom_services_indexer_document_midcom::$_metadata
Inherited from midcom_services_indexer_document
midcom_services_indexer_document::$abstract
midcom_services_indexer_document::$author
midcom_services_indexer_document::$component
midcom_services_indexer_document::$content
midcom_services_indexer_document::$created
midcom_services_indexer_document::$creator
midcom_services_indexer_document::$document_url
midcom_services_indexer_document::$edited
midcom_services_indexer_document::$editor
midcom_services_indexer_document::$indexed
midcom_services_indexer_document::$RI
midcom_services_indexer_document::$score
midcom_services_indexer_document::$security
midcom_services_indexer_document::$source
midcom_services_indexer_document::$title
midcom_services_indexer_document::$topic_guid
midcom_services_indexer_document::$topic_url
midcom_services_indexer_document::$type
midcom_services_indexer_document::$_fields
midcom_services_indexer_document::$_i18n
The constructor initializes the member variables and invokes _process_datamanager, which reads and processes the information from that instance.
The document is ready for indexing after construction. On any critical error, generate_error is triggered.
This function tries to convert the field $name into a date representation. Unixdate fields are used directly (local time is used, not GMT); other fields are parsed with strtotime.
Invalid strings that cannot be parsed by strtotime are stored as a "0" timestamp.
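As an illustration of this conversion rule, here is a minimal standalone sketch. The function name sketch_to_timestamp() and the boolean flag are hypothetical; the actual method works on a datamanager field identified by $name, which is not reproduced here.

```php
<?php
// Hypothetical helper, not part of MidCOM: converts a field value into a
// Unix timestamp following the rule described above.
function sketch_to_timestamp($value, bool $is_unixdate): int
{
    // Unixdate fields already hold a timestamp and are used directly.
    if ($is_unixdate) {
        return (int) $value;
    }

    // Everything else goes through strtotime(). In current PHP versions it
    // returns false on failure (releases before 5.1.0 returned -1 instead).
    $timestamp = strtotime((string) $value);

    // Unparseable strings are stored as a "0" timestamp.
    return ($timestamp === false) ? 0 : $timestamp;
}
```

A consequence of this rule is that an unparseable date is indistinguishable in the index from the start of the Unix epoch, so date fields are best validated before the document is indexed.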
Returns a textual representation of the corresponding field.
The actual behavior depends on the datatype. Text fields are accessed directly; for other fields, the CSV representation is used.
Text fields run through the html2text converter of the document base class.
Attention: This function accesses originally private datamanager members. It is the only possible way to access the CSV interface of individual fields.
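The distinction between text and non-text fields can be sketched as follows. This is an illustration only: sketch_text_representation() and the ['type' => ..., 'value' => ...] array layout are hypothetical, strip_tags() with html_entity_decode() merely stands in for the base class's html2text converter, and the comma join is a naive substitute for the datamanager's own CSV interface (it does no quoting).

```php
<?php
// Hypothetical helper, not part of MidCOM: returns a textual representation
// of a field given as ['type' => string, 'value' => mixed].
function sketch_text_representation(array $field): string
{
    if ($field['type'] === 'text') {
        // Text fields are read directly and reduced to plain text.
        // Assumption: tag stripping plus entity decoding approximates html2text.
        $plain = html_entity_decode(strip_tags((string) $field['value']), ENT_QUOTES, 'UTF-8');
        return trim($plain);
    }

    // Non-text fields use a CSV-style representation of their value(s).
    $values = is_array($field['value']) ? $field['value'] : [$field['value']];
    return implode(',', array_map('strval', $values));
}
```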
This helper will process the given field using the guidelines given in the class documentation.
Processes the information contained in the datamanager instance.
The function iterates over the fields in the schema, and processes them according to the rules given in the introduction.
If necessary, processes the available NAP information to fill in the title field.
In case the NAP information cannot be retrieved, it uses the URL of the document as a last resort.
Inherited From midcom_services_indexer_document_midcom
midcom_services_indexer_document_midcom::midcom_services_indexer_document_midcom()
midcom_services_indexer_document_midcom::_process_metadata()
midcom_services_indexer_document_midcom::_process_topic()
Inherited From midcom_services_indexer_document
midcom_services_indexer_document::midcom_services_indexer_document()
midcom_services_indexer_document::add_date()
midcom_services_indexer_document::add_date_pair()
midcom_services_indexer_document::add_keyword()
midcom_services_indexer_document::add_result()
midcom_services_indexer_document::add_text()
midcom_services_indexer_document::add_unindexed()
midcom_services_indexer_document::add_unstored()
midcom_services_indexer_document::datamanager_get_text_representation()
midcom_services_indexer_document::dump()
midcom_services_indexer_document::fields_to_members()
midcom_services_indexer_document::get_field()
midcom_services_indexer_document::get_field_record()
midcom_services_indexer_document::html2text()
midcom_services_indexer_document::is_a()
midcom_services_indexer_document::list_fields()
midcom_services_indexer_document::members_to_fields()
midcom_services_indexer_document::remove_field()
midcom_services_indexer_document::_add_field()
midcom_services_indexer_document::_set_type()
Documentation generated on Mon, 21 Nov 2005 18:15:01 +0100 by phpDocumentor 1.3.0RC3