This project will integrate the central CERN Search service into Drupal. A search box will be provided to search within the current site or globally across Drupal sites. In an advanced version of the project, a data extraction module will be developed to expose site information to the CERN Search backend in order to improve the search experience.
- November 2011 presentation to ENTICE
- March 2012 presentation to ENTICE
- November 2012 presentation to ENTICE
This project consists of two independent but related modules.
- This module contains all the functionality for the user interface between the CERN Search service and Drupal. It makes it possible to query Drupal content from Drupal itself while using the CERN Search service as the search backend.
- New CERN Search block. With this block you can send queries to the CERN Search service. By default, performing a query redirects you to the CERN Search site with your query, but it is also possible to enable complete results integration in your Drupal site.
- Search profiles. The module lets you enable search profiles. A search profile is the search context in which your query is performed. Three search profiles currently exist: This Site, All CERN, and Custom (the last one can be configured to contain content from different and external sites). For example, the "This Site" search profile searches only the content of the current Drupal site, while the "Custom" search profile searches all the content of the sites you configured.
- Complete results integration. If enabled, the CERN Search block shows results inside the same Drupal site, like the default Drupal search. The administrator can choose whether refiners are shown together with the results. The style of the results integration page can be customized with CSS. More information is available on the configuration page of the module.
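The three search profiles described above can be pictured as a small mapping from profile to the set of sites a query is restricted to. The sketch below is only an illustration in Python, not the module's code: the class, the function names, and the example.org hosts are all hypothetical, and matching by host name is an assumption about how a scope could be checked.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List
from urllib.parse import urlparse


@dataclass
class SearchProfile:
    """Hypothetical model of a search profile (not the module's data structure)."""
    name: str
    # Sites the query is restricted to; an empty list means "no restriction".
    sites: List[str] = field(default_factory=list)


def build_profiles(current_site: str, custom_sites: Iterable[str]) -> Dict[str, SearchProfile]:
    """Build the three profiles named in the text for one Drupal site."""
    return {
        "this_site": SearchProfile("This Site", [current_site]),
        "all_cern": SearchProfile("All CERN"),
        "custom": SearchProfile("Custom", list(custom_sites)),
    }


def in_scope(profile: SearchProfile, url: str) -> bool:
    """Return True if a result URL belongs to the profile's search context."""
    if not profile.sites:
        return True
    host = urlparse(url).netloc.lower()
    return any(host == urlparse(site).netloc.lower() for site in profile.sites)


profiles = build_profiles(
    "https://site-a.example.org",
    ["https://site-a.example.org", "https://site-b.example.org"],
)
```

With these assumed profiles, a page on site-b.example.org is outside "This Site" but inside "Custom", and every URL is inside "All CERN".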
CERN Search Index Tools:
- This module complements the previous one by allowing site administrators to send the Drupal site content to CERN Search to be indexed with rich metadata.
- Changes to the content of the site are detected by the module and sent to CERN Search in order to update or create the necessary documents.
- Content protection is possible through a specific permission created for this purpose: "Permissions for content indexed into CERN Search".
- Some of the metadata extracted are: author, creation time, modification time, title, body, comments, permissions, URL, alias, taxonomy, keywords, ...
- Taxonomy and Keyword extraction can be configured on the module's configuration page.
- The status of the index is shown on the module's configuration page, allowing site administrators to analyze the current status of the indexed documents.
- Anonymous web crawl indexing can be enabled in the module's configuration. This allows administrators to define a set of URLs that will be indexed as an anonymous user sees them. No structured data is indexed for those pages.
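One way to picture the "detect changes, then create or update documents" behaviour in the list above is a content fingerprint per node. This is a minimal Python sketch under stated assumptions: how the module actually detects changes is not described here, and the field names, the `build_document`, `fingerprint`, and `plan_action` helpers, and the use of a SHA-256 digest are all hypothetical.

```python
import hashlib
import json

# Field names loosely follow the metadata list above; the exact keys are assumed.
METADATA_FIELDS = (
    "author", "created", "changed", "title", "body", "comments",
    "permissions", "url", "alias", "taxonomy", "keywords",
)


def build_document(node: dict) -> dict:
    """Keep only the metadata fields that would be sent to the search backend."""
    return {name: node[name] for name in METADATA_FIELDS if name in node}


def fingerprint(document: dict) -> str:
    """Stable digest of a document, used to notice content changes."""
    payload = json.dumps(document, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def plan_action(node_id: int, document: dict, known: dict) -> str:
    """Return 'create', 'update' or 'skip' and remember the new digest."""
    digest = fingerprint(document)
    previous = known.get(node_id)
    if previous is None:
        action = "create"
    elif previous != digest:
        action = "update"
    else:
        action = "skip"
    known[node_id] = digest
    return action
```

Calling `plan_action` twice with an unchanged document yields "create" and then "skip"; editing the body in between yields "update" instead.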
Index structured content:
This module allows Drupal structured content to be indexed in CERN Search. This well-structured content provides more relevant results and enhanced refiner options when you search for Drupal content on CERN Search.
Index concrete anonymous pages:
In addition to structured content extraction, the module can export to CERN Search a set of URLs to be crawled by the CERN Search web crawler. The anonymous HTML view of each page will be indexed.
The module supports protected content indexing through the permission "Permissions for content indexed into CERN Search" (only roles configured on this permission will be able to see the content of the site as results on CERN Search). Future versions will allow individual content-type protection through the 'Content Access' module. Important note: since version 1.2, Content Access integration is included and the custom permission "Permissions for content indexed into CERN Search" was removed in favor of the default Drupal protection scheme.
Advanced field extraction:
The module also supports keyword and taxonomy extraction from your content.
Re-Index + Information:
At the bottom of the module's configuration page you can trigger a re-index of your content (this forces a new export on the next cron run for all the items selected in the configuration). There you can also see index status information for each node you are exporting to CERN Search, as well as information about the permissions applied to your content on CERN Search.
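The re-index behaviour described above (flag items now, export them on the next cron run, then report a status per node) can be sketched as a small state table. The Python below is only an illustration: the `IndexState` class, the two status values, and the `export` callback are hypothetical names, not the module's implementation.

```python
from enum import Enum


class Status(Enum):
    PENDING = "pending"   # waiting to be exported on the next cron run
    INDEXED = "indexed"   # already exported


class IndexState:
    """Hypothetical per-node index status table."""

    def __init__(self, node_ids):
        self.status = {nid: Status.PENDING for nid in node_ids}

    def request_reindex(self, selected=None):
        """Flag the selected nodes (or all nodes) for export on the next cron run."""
        targets = list(self.status) if selected is None else selected
        for nid in targets:
            if nid in self.status:
                self.status[nid] = Status.PENDING

    def cron_run(self, export):
        """Export every pending node and return the ids that were exported."""
        exported = []
        for nid, state in self.status.items():
            if state is Status.PENDING:
                export(nid)
                self.status[nid] = Status.INDEXED
                exported.append(nid)
        return exported
```

In this sketch, requesting a re-index does nothing by itself; the export only happens when `cron_run` is called, which mirrors the "on the next cron run" wording above.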
- Index structured content: