[2/25] Searchable: Full-text indexed search in Grails

Are you reading my blog’s feed? Be the first to know when I publish an interesting article by subscribing to my feed and following me on Twitter!

Introduction

Groovy Version: 1.6
Grails Version: 1.1
Plugin Version: 0.5.3
Plugin Docs: http://www.grails.org/plugin/searchable
Download resources: source code screencast

-

Overview

The Searchable Plugin integrates Grails with what is, IMO, one of the most powerful open source libraries we have: the Apache Lucene Project. I must admit that I’ve been a Lucene lover since my last project, where I led a technical team for the largest Brazilian e-commerce company (and the fourth largest worldwide). The project was totally Lucene-driven and kept everything you see there in the index (yes, no database, believe me!): products, prices, categories, everything. Of course, the integration processes running backstage took full responsibility for updating product prices and other data. On that project we also used other important frameworks such as Apache Solr. I recommend you all look into Apache Lucene. It is the base of the Compass Project, the framework that the Searchable Plugin integrates into our app.

All of this gives us an excellent tool for indexing our domain classes so that they become searchable across our application. Searching a Lucene index is usually much lighter and faster than doing a “LIKE” select in a relational database, and that’s why it is so awesome. So, let’s do it!

-

Download and Install

For this example, we’ll create an application that searches our posts archive! I won’t save a lot of fake news articles in our bootstrap (as everybody is used to doing). I’ll also use this tutorial to show how to read a remote RSS feed! So, I will ask Technorati what people are writing about Groovy, and we’ll search that data. I believe this is a more realistic example :)

We’ll have a simple Post class that has only the post title, link and text, and make it searchable.

Creating the application

grails create-app postsearch
cd postsearch
grails install-plugin searchable
grails create-domain-class Post

This is the Post class

class Post {
    String title
    String link
    String body

    static searchable = true
    static constraints = {
        //constraints...
    }
}

Note that doing this:

static searchable = true

we are telling the Searchable plugin that all instances of this domain class have to be indexed so we can search them later.

Take a look, now in action:

screencast

-

Technorati Integration

To get the Technorati feed that we’ll search, I built a simple class that fetches the search results feed, iterates over the results and saves one post for each entry. On Technorati, I’ll search for the following words: groovy, grails, java, griffon, springsource, g2one, acegi, groovyws, and codehaus. This will give us approximately 200 posts. I’ll create a simple controller that does just this.

grails create-controller technorati

and this is its content

class TechnoratiController {
    def index = {
        def totalPosts = 0
        // One Technorati search feed per keyword
        ['groovy', 'grails', 'java', 'griffon', 'springsource',
                'g2one', 'acegi', 'groovyws', 'codehaus'].each { word ->
            def rss = "http://feeds.technorati.com/search/${word}"
            // XmlSlurper.parse(String) fetches and parses the document at that URI
            def rssObj = new XmlSlurper().parse(rss)
            rssObj.channel.item.each { item ->
                def post = new Post(title: item.title.toString(),
                        link: item.link.toString(),
                        body: item.description.toString())
                // save() returns null when validation fails, so only stored posts are counted
                if (post.save())
                    totalPosts++
            }
        }
        render "${totalPosts} posts indexed"
    }
}

Maybe we can turn this into a plugin later! :) That’s it: there is no view for it, we just need to request it once to feed our database.
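If you want to see how the feed traversal behaves without hitting the network, here is a minimal sketch that parses a small inline RSS sample with the same XmlSlurper navigation (channel, then item). The sample XML and the `entries` variable are made up for illustration; the real Technorati feed may contain more elements than these.

```groovy
// On Groovy 4+ XmlSlurper lives in groovy.xml, so add: import groovy.xml.XmlSlurper
// Inline sample standing in for a remote feed (hypothetical content).
def sampleRss = '''<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item>
      <title>Grails 1.1 released</title>
      <link>http://example.org/grails-1-1</link>
      <description>A short note about the release.</description>
    </item>
    <item>
      <title>Groovy tips</title>
      <link>http://example.org/groovy-tips</link>
      <description>Closures and builders.</description>
    </item>
  </channel>
</rss>'''

// parseText reads from a String; parse(uri) would fetch a remote document instead
def rssObj = new XmlSlurper().parseText(sampleRss)

// Each <item> becomes one map with the same three fields the Post class has
def entries = rssObj.channel.item.collect { item ->
    [title: item.title.text(), link: item.link.text(), body: item.description.text()]
}

entries.each { println "${it.title} -> ${it.link}" }
```

Each map could be passed straight to the Post constructor, which is what the controller does with the live feed.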

-

Searching with SearchableController

After this you can go to the SearchableController that is installed in our application:

http://localhost:8080/postsearch/searchable

Try searching for “grails” or any other word that may have been in our technorati posts.

Note that this view uses the toString() method, so let’s beautify it.

String toString() {
    return "${title}: ${body}"
}

SearchableController screen

-

Changing the way fields are indexed

Our Post class is indexed using the default configuration of the Searchable plugin, and that’s not the best way, since the post URL is indexed as well and currently has the same relevance as the title (this is wrong, believe me). IMO, the link should not be indexed, just the title and the text of the post, and the title is much more important than the body.

To do this, we’ll use some plugin options. This plugin has A LOT of options (it deserves a book of its own, really), and all of them are described here. I strongly recommend reading this if you use the plugin in your production environment.

Here we’ll just stick to the basics: we’ll exclude one property from being indexed and boost one field (title) that is more important. This means that when you search for “grails”, posts with “grails” in the title will get a higher score than posts with “grails” only in the body.

Excluding link from being indexed

This is easy! We’ll replace static searchable = true with a mapping closure that uses the ‘except’ property.

static searchable = {
    except = ['link']
}

That’s it, the link will not be indexed anymore. It’s recommended to index ONLY the properties you’ll really need, otherwise your Lucene index can grow quite large.
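When a class has many properties, it can be shorter to list what should be indexed rather than what should not. The sketch below assumes the plugin’s ‘only’ option, which is the counterpart of ‘except’; please check it against the plugin docs for the version you have installed.

```groovy
class Post {
    String title
    String link
    String body

    // Whitelist approach (assumed 'only' option): index title and body, nothing else.
    // For this class it should have the same effect as except = ['link'].
    static searchable = {
        only = ['title', 'body']
    }
}
```

Use one or the other, not both: ‘only’ tends to be safer when new properties are added later, since they stay out of the index until you list them.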

Boosting the title

This is even easier than the last one (I don’t remember anything difficult using Grails). We’ll add the boost option to our title, and this is the final mapping closure:

static searchable = {
    except = ['link']
    title boost: 2.0
}

This will give our searches what we really want.
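To get a feeling for what a boost of 2.0 does, here is a deliberately simplified toy model in plain Groovy. It is NOT Lucene’s real scoring formula (which also takes things like term rarity and field length into account); it only illustrates the idea that a match in a boosted field weighs more. The names `boosts`, `countOccurrences` and `toyScore` are made up for this sketch.

```groovy
// Toy model only: each field contributes (occurrences of the term) * (field boost).
def boosts = [title: 2.0, body: 1.0]

// Count how many whole words in the text equal the term, ignoring case
def countOccurrences = { String text, String term ->
    text.toLowerCase().split(/\W+/).toList().findAll { it == term.toLowerCase() }.size()
}

// Sum the boosted contribution of every field
def toyScore = { Map post, String term ->
    boosts.collect { field, boost -> countOccurrences(post[field], term) * boost }.sum()
}

def inTitle = [title: 'Grails in action', body: 'A short review.']
def inBody  = [title: 'Weekly links', body: 'Some Grails news.']

println "title match: ${toyScore(inTitle, 'grails')}"
println "body match:  ${toyScore(inBody, 'grails')}"
```

With these two sample posts, the one that mentions the term in its title ends up with twice the toy score of the one that mentions it only in the body.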

-

Searching – Domain classes

Once installed, the plugin adds some methods to every domain class marked as searchable, and these methods search the index. Here I’ll explain some of the most important ones.

search

The main method of this plugin. It searches across all instances of this domain class for the requested query string (and options).

def postsListSearchResult = Post.search("grails")
def postsListOrderedSearchResult = Post.search("grails", [sort: 'title'])

Remember that sorting search results is usually not a good idea, since you lose the relevance-based ordering that Lucene gives to the hits.
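As a rough sketch of how this could be used from your own controller instead of the bundled one: the controller name below is hypothetical, and the code assumes that search accepts the offset and max options and returns a result object exposing results (the matching Post instances for the page) and total (the overall hit count), which is how I understand the plugin docs. Please verify this against your plugin version.

```groovy
class PostSearchController {   // hypothetical controller, not created by the plugin
    def index = {
        if (!params.q?.trim()) {
            render "Please provide a query, e.g. ?q=grails"
            return
        }
        // No sort option: hits stay in relevance order. offset/max select one page.
        def result = Post.search(params.q, [offset: 0, max: 10])

        // Assumed shape: result.total = all hits, result.results = Posts on this page
        def lines = result.results.collect { "${it.title} (${it.link})" }
        render "${result.total} hits\n" + lines.join("\n")
    }
}
```

Paging with offset and max keeps the relevance ordering intact, which is usually what you want for a search box.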

countHits

This method returns just the number of hits that your query matches in the index, which is useful to know how many entries would be returned if the search method were used instead. You call it the same way as the search method.

def postsListSearchResultCount = Post.countHits("grails")
def postsListOrderedSearchResultCount = Post.countHits("grails", [sort: 'title'])

moreLikeThis and suggestQuery

“moreLikeThis” and “suggestQuery” (aka spell checking) can be done easily with the Searchable Plugin: all you have to do is enable them in the mapping closure.

Take a look here and here for more information.

-

Conclusion

This plugin is one of my favorites. If you’re planning a Grails website for a production environment, this one will be your friend.

Oh, and remember that this plugin is much more powerful than shown here: most of the configuration options available for Compass and Lucene have not been demonstrated. This is just a small part of it!


Last Tutorial:  [1/25] Acegi: Secure your grails application with no pain

Next Tutorial: [3/25] Quartz: Easy job scheduling plugin.

And this is what Oracle did with my BEA certification

I might be unfair in saying that Oracle “threw away” the first part of my BEA SOA Architect certification. Unfair, perhaps, because I only “found out” now that I had until December 1 to take the second part of the exam. Well, I didn’t take it.

I didn’t because I only found time to take the first part on November 4 (I think BEA shouldn’t even have been offering this certification in November, since it would be discontinued less than a month later).

Here is what happened: BEA had a certification for SOA Architect and decided to split it into two parts:

  • Part 1 – SOA Foundations: An exam requiring knowledge of BEA’s “Six Domains” model, governance and SOA concepts, the applicability of the model, ROI from the customer’s point of view, and the obstacles and different scenarios likely to be found in a company starting its adoption.
  • Part 2 – SOA Adoption and Implementation: Everything not covered above, that is, the actual implementation of the models and examples.

Now Oracle, which bought BEA, did not want to keep two exams and decided to have a single one, obviously called Oracle SOA Foundations, Adoption and Implementation (a hyper-creative name, I agree :) ), and of course I will have to take the whole certification again. A pity.

But what if I get in touch and look for some alternative? What would they say? See below:

I have taken the Phase 1 SOA Architect exam. If I do not take the Phase 2 exam before December 1, when the exams are combined, do I have to take the full exam even though I’ve already tested on ½ of the material?


If you are pursuing the BEA Certified Architect: BEA SOA Enterprise Architecture credential, you should pass both exams before December 1, 2008. If you do not pass both exams before that date, you will need to take the full Oracle SOA Foundations, Adoption and Implementation exam in order to obtain the Oracle SOA Architect Certified Expert credential.

The way forward now is to focus on studying what was left for the second part and take the full certification!

For anyone interested, just visit this page on Oracle’s site.
