Wednesday, July 26, 2017

About SolrJ

What is SolrJ? When I first saw the name I thought it was something different, but after googling I learned that it is not separate from Solr: it is how we contact Solr from standalone Java code. It is simply the API that talks to Solr through its predefined methods.
If you have worked with Endeca, it is similar to the "Presentation API". What we saw in my previous posts is more like the Assembler API, or calling the Solr core directly for responses through query parameters.

Today I will show how to write a simple Java class that demonstrates SolrJ. You can use the other predefined methods in the same way as in the code below.

package com.mycommercesearch.solr;

import java.io.IOException;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;

public class TestSolrQuery {
 public static void main(String[] args) throws SolrServerException, IOException {
  // try-with-resources closes the client even when the query fails
  try (HttpSolrClient solrClient = new HttpSolrClient("http://localhost:8983/solr/refrence")) {
   SolrQuery query = new SolrQuery();
   // SolrJ encodes request parameters itself, so the query text is passed as-is
   query.setQuery("planes");
   query.setStart(0);
   query.set("defType", "edismax");

   // Any failure is propagated instead of leaving a null response behind
   QueryResponse response = solrClient.query(query);
   SolrDocumentList results = response.getResults();
   for (SolrDocument document : results) {
    System.out.println("###################");
    String id = (String) document.getFieldValue("id");
    String name = (String) document.getFieldValue("name");
    // Kept as Object: the Java type depends on the field type in the schema
    Object quantity = document.getFieldValue("quantity");
    String productVendor = (String) document.getFieldValue("productVendor");
    String description = (String) document.getFieldValue("description");

    System.out.println("id=" + id + ", name=" + name + ", quantity=" + quantity
      + ", productVendor=" + productVendor + ", description=" + description);
   }
  }
 }
}


Happy Learning !!!!!

Saturday, April 22, 2017

Defining multiple entity In Solr

When implementing search for a site, the data we need to process usually does not come from a single table with the same fields. How to index data from a database is covered in my previous post. This post deals only with data from multiple data sources or from different tables.

Navigate to db-data-config.xml and edit it. I am going to set up the customer data for search here.

<?xml version="1.0" encoding="UTF-8" ?>
<dataConfig>
<dataSource name="ds1" type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/classicmodels" user="root" password="root"/>
<dataSource name="ds2" type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/customerdata" user="root" password="root"/>

This is where the different data sources are defined. Here I have configured two data sources, one named ds1 and the other named ds2. Other kinds of sources, such as HSQL and XML, can also be defined for processing.

<document>
   <entity name="products" dataSource="ds1" pk="id" query="select * from products" deltaImportQuery="select * from products"
   deltaQuery="select * from products where last_modified > '${dataimporter.last_index_time}'">
     <field column="productCode" name="id"/>
     <field column="productName" name="name"/>
     <field column="productDescription" name="description"/>
     <field column="productLine" name="category"/>
   </entity>
    
In the products entity, the unique key is id, and the dataSource attribute names the data source to read from.

   <entity name="customers" dataSource="ds2"  pk="customerNumber" query="select * from customers" deltaImportQuery="select * from customers"
   deltaQuery="select * from customers where last_modified > '${dataimporter.last_index_time}'">
     <field column="customerNumber" name="id"/>
     <field column="customerName" name="customerName"/>
     <field column="contactLastName" name="contactLastName"/>
     <field column="contactFirstName" name="contactFirstName"/>
     <field column="phone" name="phone"/>
     <field column="addressLine1" name="addressLine1"/>
     <field column="addressLine2" name="addressLine2"/>
     <field column="city" name="city"/>
     <field column="state" name="state"/>
     <field column="postalCode" name="postalCode"/>
     <field column="country" name="country"/>
     <field column="salesRepEmployeeNumber" name="salesRepEmployeeNumber"/>
     <field column="creditLimit" name="creditLimit"/>
  </entity>
</document>
</dataConfig>


If you introduce a new entity, it must have a field called id, which is used to uniquely identify the records.

Querying for products


Querying for customers


The df parameter names the default field that Solr searches when the query does not specify one. Here refrence is the df value for products and customer is the df value for customers.


Errors:

2017-04-22 10:24:17.256 WARN  (Thread-14) [   x:refrence] o.a.s.h.d.SolrWriter Error creating document : SolrInputDocument(fields: [category=Ships, id=S72_3212, name=Pont Yacht, description=Measures 38 inches Long x 33 3/4 inches High. Includes a stand.
Many extras including rigging, long boats, pilot house, anchors, etc. Comes with 2 masts, all square-rigged, _version_=1565373662235721728])
org.apache.solr.common.SolrException: [doc=S72_3212] missing required field: city


When you hit this error, either set required="false" on the field or remove the required attribute altogether in managed-schema.xml, as shown below:

<field name="city" type="string" indexed="true" stored="true" required="false" multiValued="false" />

<field name="city" type="string" indexed="true" stored="true" multiValued="false" />


If you see the error below:

Solr Error Document is missing mandatory uniqueKey field id


It means your document does not have the id field, which is declared as the unique key by <uniqueKey>id</uniqueKey> in managed-schema.xml.


Happy learning !!!!

Sunday, April 9, 2017

Faceting in Solr

Faceting can be defined as grouping the search results by the values of their fields, so that users can narrow down their search. Solr comes with a simple implementation of it.


Parameter: Explanation
facet: If set to true, faceting is enabled for the current search.
facet.field: The field (or fields) for which facets are returned.
facet.prefix: Returns only the facet values that start with the given prefix.
facet.contains: Returns only the facet values that contain the given term.
facet.sort: Orders the facet values, either by count or by index (alphabetical) order.
facet.limit: Limits the number of facet values returned per field.
facet.offset: Returns the facet values starting from the given offset.
facet.mincount: Returns only the facet values whose count is at least the given minimum.
facet.missing: Also returns a count of the documents that match the query but have no value for the facet field.
facet.method: The algorithm Solr uses to compute the facets.
facet.range: The field on which range faceting is performed.
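The parameters above are ordinary request parameters, so a facet request can be sketched as a URL. The following is a minimal sketch using only the Java standard library; the class and method names are hypothetical, the core name refrence and the category field are taken from the other posts on this blog, and the chosen values (mincount 1, limit 5, sort by count) are just illustrative assumptions.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, not part of Solr or SolrJ.
class FacetUrlDemo {
    // Joins the parameters into a query string, URL-encoding each value.
    static String toQueryString(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : params.entrySet()) {
                if (sb.length() > 0) {
                    sb.append('&');
                }
                sb.append(e.getKey()).append('=')
                  .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always available
        }
        return sb.toString();
    }

    // Builds a select URL that asks for facets on the category field.
    static String facetQuery() {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("q", "*:*");               // match all documents
        p.put("facet", "true");          // switch faceting on
        p.put("facet.field", "category");
        p.put("facet.mincount", "1");    // hide values with no matching documents
        p.put("facet.limit", "5");       // at most five values
        p.put("facet.sort", "count");    // most frequent values first
        // Assumed host, port and core name.
        return "http://localhost:8983/solr/refrence/select?" + toQueryString(p);
    }

    public static void main(String[] args) {
        System.out.println(facetQuery());
    }
}
```

Pasting the printed URL into a browser should return the matching documents along with a facet section for category, provided the core and field exist in your setup.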

Saturday, April 8, 2017

Standard Query Parser Parameters

q Parameter

Whatever we ask Solr to search for is passed in the q parameter. Solr returns the search results that match this parameter.

Usage:


Specifying Terms for the Standard Query Parser

A query to the standard query parser is broken up into terms and operators.
There are two types of terms: single terms and phrases (a group of words enclosed in double quotes).

Wildcard Searches

A wildcard search matches every term that fits a pattern, and the same concept applies in Solr. A wildcard can be defined in two ways.

1) Single-character wildcard (?)

Usage:

Defined this way, Solr matches words such as texting and testing.


2) Multiple-character wildcard (*)

Usage:

Defined this way, Solr matches every word that fits the pattern, such as testing and tested.
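To make the two wildcard forms concrete, here is a small sketch that mimics their matching rules with a Java regular expression. It does not call Solr; the class is hypothetical, and the terms te?ting and test* are assumed examples chosen to fit the words mentioned above.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: mimics wildcard matching, does not talk to Solr.
class WildcardDemo {
    // Translates a wildcard term into a regex:
    // '?' matches exactly one character, '*' matches zero or more characters.
    static Pattern toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '?') {
                sb.append('.');
            } else if (c == '*') {
                sb.append(".*");
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString());
    }

    // True when the whole word fits the wildcard term.
    static boolean matches(String wildcard, String word) {
        return toRegex(wildcard).matcher(word).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("te?ting", "testing")); // single-character wildcard
        System.out.println(matches("te?ting", "texting"));
        System.out.println(matches("test*", "tested"));    // multiple-character wildcard
    }
}
```

The single-character form fits only words of the same length as the pattern, while the multiple-character form also fits the bare word test itself.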

Fuzzy Searches

A fuzzy search finds terms that are similar to the searched term. It is written by appending a tilde (~) to the term.

Usage:

Defined this way, Solr matches similar words such as beat and feat. We can also specify an edit distance, and Solr will then match only the terms within that distance.

Usage:

This matches only feat and beat, but not foat.
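The edit distance is the number of single-character changes needed to turn one word into another. The sketch below computes the plain Levenshtein distance, assuming the searched term is beat; it is a simplification for illustration and not the code Solr itself runs (Lucene's fuzzy matching also counts a swap of two neighbouring letters as one edit).

```java
// Hypothetical demo class: plain Levenshtein distance, not Solr's own implementation.
class EditDistanceDemo {
    // Counts the insertions, deletions and substitutions needed to turn a into b.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j characters of b takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            int[] curr = new int[b.length() + 1];
            curr[0] = i; // turning the first i characters of a into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            prev = curr;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("beat", "feat")); // one substitution
        System.out.println(distance("beat", "foat")); // two substitutions
    }
}
```

With a maximum edit distance of 1, feat is within reach of beat and foat is not, which is the behaviour described above.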

Proximity Searches

A proximity search matches documents in which the two words occur within the specified distance of each other.

Usage:

This returns results, because the two words fall within the specified distance.

Whereas:


This does not return results, since the two words are five words apart.

Range Searches

A range search returns only the results whose field values fall within the specified range.

Usage:

Boosting a Term with ^

We can use ^ to boost a term and make it more relevant.

Usage:

Boolean Operators in Solr

AND (&&)

NOT (!)

OR (||)

+

-
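Range, boost and boolean queries all go into the q parameter, and their special characters must be URL-encoded when the query is sent as a URL. The following is a minimal sketch with the Java standard library; the class name is hypothetical, the host and core name are assumptions, and the query strings are illustrative examples built from the name, quantity and category fields used elsewhere on this blog.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo class: only builds URLs, does not send them.
class QuerySyntaxDemo {
    // URL-encodes a query so that characters such as ':', '[', '^' and '+' survive transport.
    static String encode(String q) {
        try {
            return URLEncoder.encode(q, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Assumed host, port and core name.
    static String selectUrl(String q) {
        return "http://localhost:8983/solr/refrence/select?q=" + encode(q);
    }

    public static void main(String[] args) {
        // Range search: quantity between 10 and 100, both ends included
        System.out.println(selectUrl("quantity:[10 TO 100]"));
        // Boosting: a match in name counts twice as much as a match in description
        System.out.println(selectUrl("name:plane^2 description:plane"));
        // Boolean operators: word form and the equivalent + / - form
        System.out.println(selectUrl("name:plane AND NOT category:Ships"));
        System.out.println(selectUrl("+name:plane -category:Ships"));
    }
}
```

When the same queries go through SolrJ's SolrQuery.setQuery, the raw string is passed and the client takes care of the encoding.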




Common Query Parameters in Solr

The following tutorial deals with Solr's common query parameters, which are supported by the search request handlers.

After the data setup, it is time to learn about querying in Solr. Solr comes with simple parameters for querying.

Parameter: Explanation

start: The offset from which Solr starts displaying results. The default value is 0. Setting start to another number, such as 4, causes Solr to skip the preceding records and start at the document identified by the offset.

rows: The number of documents returned in the result set.

fq (Filter Query): Restricts the results of the main query to the documents that also match the filter. The fq parameter can be given several times in one request, and several conditions can also be combined in a single fq.

fl (Field List): Tells Solr which fields to return for each document. Only the listed fields are returned, so it is good practice to set it when the indexed data has many fields.

debug: Returns debug information about the query.

explainOther: Takes a query such as id:S10_4698. Along with the debug information, Solr explains how the documents matching that query score against the main query, so they can be compared with the results.

wt: Selects the Response Writer that Solr should use to format the query's response.

omitHeader: May be set to either true or false. If set to true, the header is excluded from the returned results. The header contains information about the request, such as the time it took to complete. The default value is false.

logParamsList: By default, Solr logs all parameters of requests. From version 4.7, set this parameter to restrict which parameters of a request are logged. For example, with the list q,fq only the 'q' and 'fq' parameters are logged. This may help limit logging to the parameters considered important to your organization.

echoParams: Controls what information about the request parameters is included in the response header.

sort: Arranges the search results in either ascending (asc) or descending (desc) order.
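The start and rows parameters together implement paging: for a zero-based page number, start is the page number multiplied by rows. Here is a minimal sketch of that arithmetic; the class and method names are hypothetical, and the fl, sort and wt values are assumed examples (sorting by name presumes that field is sortable in your schema).

```java
// Hypothetical demo class: computes paging parameters, does not talk to Solr.
class PagingDemo {
    // start is a zero-based offset: page 0 starts at 0, page 1 at rows, and so on.
    static int startFor(int page, int rows) {
        return page * rows;
    }

    // Builds the parameter string for one page of results.
    static String pagingParams(int page, int rows) {
        return "start=" + startFor(page, rows)
                + "&rows=" + rows
                + "&fl=id,name"      // return only these two fields
                + "&sort=name+asc"   // '+' is the URL-encoded space between field and direction
                + "&wt=json";        // ask for a JSON response
    }

    public static void main(String[] args) {
        // Third page (page index 2) with ten documents per page
        System.out.println(pagingParams(2, 10));
    }
}
```

Appending the printed string to a select URL after the q parameter requests the third page of ten results.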

Sunday, April 2, 2017

Reference Application in Solr

Before learning any new technology, most of us want to see how it works and to visualize it, so that the learning is clear. That is why almost all vendors provide a reference application for their product. A reference application gives us strong fundamentals and a good understanding of the product.
When I started learning Solr I searched for such a reference application, but at that time I did not find it. As the days passed I came to know about it, so I wanted to share it with you as well, so that you can use it for your learning.

How to start this reference application?

Navigate to : <SOLR-Installed Dir>/solr-6.4.2/bin

Run the following command: solr start -e techproducts

Once you run the command, the following happens automatically, including indexing:

Starting up Solr on port 8983 using command:
C:\Users\syedghouse14\Downloads\solr-6.4.2\bin\solr.cmd start -p 8983 -s "C:\Users\syedghouse14\Downloads\solr-6.4.2\example\techproducts\solr"
Solr is already setup and running on port 8983 with status:
{
  "solr_home":"C:\\Users\\syedghouse14\\Downloads\\solr-6.4.2\\example\\techproducts\\solr",
  "version":"6.4.2 34a975ca3d4bd7fa121340e5bcbf165929e0542f - ishan - 2017-03-01 23:30:23",
  "startTime":"2017-04-02T06:23:19.196Z",
  "uptime":"0 days, 0 hours, 0 minutes, 38 seconds",
  "memory":"29.5 MB (%6) of 490.7 MB"}
If this is not the example node you are trying to start, please choose a different port.
WARNING: Core 'techproducts' already exists!
Checked core existence using Core API command:
http://localhost:8983/solr/admin/cores?action=STATUS&core=techproducts
Solr techproducts example launched successfully. Direct your Web browser to http://localhost:8983/solr to visit the Solr Admin UI

Once it has started, access the application through this URL:




This reference application has the following features.

  • Type-ahead Suggestions
  • Global Search
  • Facet Selection/Deselection
  • Instock Facet
  • Category Selection
  • Submit / Reset
  • Simple Search
  • Spatial Search
  • Group By


You can play with the features above. Refer to my blog for more interesting topics on Solr.

Happy Learning !!!!!