[3/25] Building a RSS Reader with Quartz Plugin – Grails Tutorial 4

Are you reading my blog’s feed? Be the first to know when I publish some interesting article signing up to my feed and following me on twitter!

 

Groovy Version: 1.6
Grails Version: 1.1
Plugin Version: 0.4.1-SNAPSHOT
Plugin Docs: http://grails.org/plugin/quartz
Download Resources: source code screencast 1 screencast 2

Hello,

In this tutorial, we’ll talk about the Quartz plugin used to schedule jobs executions in your application. The plugin is build on top of the Quartz Job Scheduler Library from OpenSymphony. OpenSympony is the company that built the WebWork framework, that is now called Struts2 after Apache “aquisition”.

“Scheduling jobs” is very useful in your application to cover background needs. Some tasks you’ll need to execute undercover your application some times (invalidating old users that have not logged for more than 1 month) or even async processes that you’ll have to do if you do not have a JMS infrastructure, for example, sending e-mails to a lot of people.

In our example, we’ll build a simple RSS Reader that will use the quartz plugin to schedule fetchs it will be done in the feeds and insert in the database. Our application will mainly have one domain class called Post (seen in the last tutorial), a Feed domain class to store our feeds and a similar RSS Parser from technorati.  (Yes, I love the RSS format).

Initially, we’ll create the app, install the quartz plugin, create the domain classes and the Feed scaffold structure

grails create-app feedreader
cd feedreader
grails install-plugin quartz
grails create-domain-class Post

We’ll insert the Post domain class code

class Post {
    String title
    String link
    String body
}

We have to create the Feed domain class and its scaffold structure.

grails create-domain-class Feed
class Feed {
    String word
    String url
}

Scaffolding…

grails generate-all Feed

screencast-1
screencast

After this, we’ll create our Technorati Feed Parser from this code above.

class TechnoratiService {
    boolean transactional = false
    def parseAndSave(rss) {
        def rssObj = new XmlSlurper().parse(rss)
        rssObj.channel.item.each {
            def post = new Post(title: it.title.toString(),
                    link: it.link.toString(),
                    body: it.description.toString())
            post.save()
            println "Post [${post}] saved."
        }
    }
}

We’ll run our application using the grails run-app command and insert some feeds. Note that we’ve configured our datasource to use hsqldb storing in the filesystem instead of regular memory setup. 

Note that we have one JobController that Quartz install for us, forget about it, ok? We’ll create our own job after the second screencast.

screencast-2
screencast

Now, we have to understant some quartz properties and commands. 

When we install the quatz plugin, it installs another command for us the grails create-job MyJob, with it we’ll create our FeedParserJob. Note that we use convention over configuration with all jobs having *Job names. 

grails create-job FeedParser

Job classes have to implement the execute() method. This method is the one that Quartz will trigger when it’s time to execute the job. To define when the job it will be executed and what’s the interval between executions, I suggest you read the plugin documentation witch shows N ways to do this. In our example we’ll use a cron expression similar to *N*X OS systems setting our job to execute once in five minutes.

Our cron expression will be like this:  “0 0/5 * * * ?”

Depending on your jobs requisites, it may run concurrently with another instance of it or not. In our case, we’ll not start other job execution if the last on is still running. To prevent this behavior, we can set the concurrent property to false

def concurrent = false

Our job will essentially look for the feeds we’ve inserted on the database, and for each one it will call the Technorati service asking for new Posts. The final source for our job is the one below:

class FeedParserJob {
    def concurrent = false
    def cronExpression = "0 0/5 * * * ?"
    
    def technoratiService

    def execute() {
        def feedList = Feed.findAll()
        for (Feed feed : feedList) {
            println "Reading feed ${feed.word} @ ${feed.url}"
            technoratiService.parseAndSave(feed.url)
        }
    }
}

As you can see in the example above, you can inject any spring bean in your job, just declare it as I did with my TechnoratiService! :) (this is really great!)

That’s it, if you run you application you’ll see that every 5 minutes (minutes 0,5,10,15…) the job will be called and every posts technorati returns will be inserted on your database. Note that in this simple example we did not check if the post had been already inserted in the database before inserting it, this will just grow our database with a lot of instances representing the same post. This can be avoided checking if the post already exists before inserting it  (just check if you have any Post with the same link), but I’ll left this for you!

Before finish this, let’s just improve a little bit our post list view.

 

Tela de posts

 

 

Now, try to enrich its interface, adding some ajax to get only the new posts since the last fech! Maybe you can start from this your new Google Reader killer! :P

Now, let me know, are you using this plugin in your production environment? What for? What kind of jobs you do with it? 

Thanks!

Are you reading my blog’s feed? Be the first to know when I publish some interesting article signing up to my feed and following me on twitter!

 

Next tutorial: [4of25] Jasper Plugin

Past tutorials:
        [2of25] Searchable Plugin
        [1of25] AcegiSecurity Plugin

[2/25] Searchable: Full text indexed search in grails 13

Are you reading my blog’s feed? Be the first to know when I publish some interesting article signing up to my feed and following me on twitter!

Introduction

Groovy Version: 1.6
Grails Version: 1.1
Plugin Version: 0.5.3
Plugin Docs: http://www.grails.org/plugin/searchable
Download resources: source code screencast

-

Overview

The Searchable Plugin provides integration between Grails and, IMO, one of the most powerful open source libraries that we have. The Apache Lucene Project. I must admit that I’m a Lucene Lover, since my last project where I was leading a technical team for the largest brazilian e-commerce company and fourth worldwide. The project was totally lucene-driven to store everything you see there (yes, no database, believe me!); products, prices, categories, everything. Of course, the integration processes running backstage took all responsibility for update product prices and other stuff. For this project, we also used other important frameworks such as Apache Solr. I recommend you all look into Apache Lucene. It’s the base of the Compass Project, that is the framework that the Searchable Plugin integrates into our app.

All of this will provide us an excellent indexing tool to index our domain classes that will be searchable across our application. Searching in the Lucene index is infinitely lighter and faster than doing a “LIKE” select in any kind of relational database, and that’s why it is so awesome. So, let’s do it!

-

Download and Install

To do this example, we’ll create an application that searches in our posts archive! I’ll not save a lot of fake news articles in our bootstrap (as everybody is used to). I’ll use this tutorial to also show how to read a remote feed/rss! So, I will ask technorati what people are writing about groovy, and we’ll search on this database, I believ that this is a more realistic example :)

We’ll have a simple Post class that has only the post title, link and text, and make it searchable.

Creating the application

grails create-app postsearch
cd postsearch
grails install-plugin searchable
grails create-domain-class Post

This is the Post class

class Post {
    String title
    String link
    String body

    static searchable = true
    static constraints = {
        //constraints...
    }
}

Note that doing this:

static searchable = true

we are telling the searchable plugin that all instances of this domain class have to be indexed so we can search it later.

Take a look, now in action:

screencast

-

Technorati Integration

To get the technorati feed we’ll use to search, I build a simple class that will get the search results feed and iterate over the results and save one post for each entry. On technorati, I’ll search the following words: groovy, grails, java, griffon, springsource, g2one, acegi, groovyws, and codehaus. This will give us approximately 200 posts. I’ll create a simple controller that will just do this.

grails create-controller technorati

and this is its content

class TechnoratiController {
    def index = {
        def totalPosts = 0
        def wordList = ['groovy', 'grails', 'java', 'griffon', 'springsource',
                'g2one', 'acegi', 'groovyws', 'codehaus'].each() { word ->
            def rss = "http://feeds.technorati.com/search/${word}"
            def rssObj = new XmlSlurper().parse(rss)
            rssObj.channel.item.each { item ->
                def post = new Post(title: item.title.toString(),
                        link: item.link.toString(),
                        body: item.description.toString())
                if (post.save())
                    totalPosts++
            }
        }
        render "${totalPosts} posts indexed"
    }
}

Maybe we can turn this into a plugin later! :) That’s it, no view for it, we just need to request it to feed our database.

-

Searching with SearchController

After this you can go to the SearchableController that is installed in our application:

http://localhost:8080/postsearch/searchable

Try searching for “grails” or any other word that may have been in our technorati posts.

Note that this view uses the toString() method, so lets beautify it.

String toString() {
    return "${title}: ${body}"
}

SearchController screen

-

Changing the way fields are indexed

Our Post class is indexed using the default configuration for the Searchable plugin and that’s not the best way since the post URL is indexed as well and currently has the same relevance as its title (this is wrong, believe me). IMO, the link should not be indexed, just the title and the text of the post, and the title is much more important that its description.

To do this, we’ll use some plugin options. This plugin has A LOT of options, (it deserves a book of it, really), and all the options are described here. I strongly recommend you to read this if you use this plugin in your production environment.

Here we’ll just stick to the basics, we’ll exclude some properties being indexed and boost one field (title) that is more important. This means that when you search for “grails”, posts with “grails” in the title will come with a higher score than posts with “grails” only in the body of it.

Excluding link from being indexed

This is easy! We’ll change the static searchable = true for this one with the ‘except’ property.

static searchable = {
    except = ['link']
}

That’s it, no link will be indexed anymore. It’s recommended to index ONLY properties you really ‘ll need, otherwise your lucene index can grow to be quite large.

Boosting the title

This is easier (I don’t remember anything difficult using grails) than the last one, we’ll add the property boost to our title, and this is the final mapping closure:

static searchable = {
    except = ['link']
    title boost: 2.0
}

This will give our searches what we really want.

-

Searching – Domain classes

After installed, the plugins offer us (for domain classes marked as searchable) some methods that will search on the index. Here I’ll explain some of the most important ones.

search

The main method of this plugin. Will search across all instances of this domain class for the requested string (and options)

def postsListSeachResult = Post.search("grails")
def postsListOrderedSearchResult = Post.search("grails", [sort: 'title'])

Remember that ordering searches is not a good idea since you will lose all effective relevance-based scoring that lucene gives to each hit entry.

countHits

This method returns just the number of hits that your query retrieved in the index, useful to know how many entries will be returned if the search method was used instead. You can use as search method.

[groovy]def postsListSeachResultCount = Post.countHits("grails")
def postsListOrderedSearchResultCount = Post.countHits("grails", [sort: 'title'])

moreLikeThis and suggestQuery

“moreLikeThis” and “suggestQuery” (aka spell checking) can be done easily with Seachable Plugin, all you have to do is set these properties to the mapping closure.

Take a look here and here for more information.

-

Conclusion

This plugin is one of my favorites. If you’re planning a grails website in a production environment, this one will be your friend.

Ohh remember that this plugin is much more powerful than shown here, most configuration options available for Compass and Lucene have not been demonstrated here. This is just a small part of it!

Are you reading my blog’s feed? Be the first to know when I publish some interesting article signing up my feed and following me on twitter!

Last Tutorial:  [1/25] Acegi: Secure your grails application with no pain

Next Tutorial: [3/25] Quartz: Easy job scheduling plugin.

Como adicionar seus “Itens Compartilhados” do Google Reader no blog 0

Eu já havia até citado que tinha feito isso neste post, mas eu estava tendo sérios problemas com os links que eram gerados pelo widget. Isto acontecia pelo fato do leitor de RSS do Wordpress usar o MagPie RSS Reader, que “trata” o feed de uma maneira no mínimo esquisita, fazendo com que na url fosse concatenado o nome do blog/post. Enfim, links quebrados o tempo todo.

Resolvi pesquisar sobre algum outro widget que fizesse a leitura de RSSs externos, quando enfim achei um ou 2 candidatos, fui até meu Google Reader procurar pela URL do meu feed dos “Shared Items”, e foi lá mesmo que encontrei a opção “You can insert this clip on your website”. Simplesmente fantástico, mais um daqueles “build-everything-right-here-right-now” do Google. Se resume ao fato de você digitar um título para o seu “clip”, o estilo de cor que você quer que ele tenha (nenhuma no meu caso, para assumir o layout do WP), e a quantidade de itens que serão mostrados.

Bingo! É gerado um javascript que basta inserir em seu HTML que automáticamente seus feeds serão lidos! Para testar, basta tentar clicar nos “links interessantes”, último bloco da barra lateral!

[]s,

Lucas

Compartilhamento de Feeds através do Google Reader 0

O Google Reader (google.com/reader) é um leitor de RSS bem bacana do Google, uso ele faz um tempinho já, acho que desde o ano passado, quando o Paulo (Bahiano) me mostrou aqui no trabalho. Nem muito costume de RSS eu tinha naquela época, e de lá pra cá, passo uma boa parte do dia fazendo isso :-)

Bom, um recurso que eu vi que tem, é a possiblidade de você (além de categorizar as entradas, marcar com estrela e etc, como no GMail), é a possibilidade de você “Compartilhar” um post alheio. Isso é fácil de se fazer, basta clicar no “Compartilhar” abaixo de cada entrada do blog.

Você também tem a opção de compartilhar uma entrada, ainda escrevendo alguma informação pequena e objetiva sobre o assunto, por exemplo: “Esse cara tá viajando.”, ou então, “Realmente, Java é muito legal para se fazer isto”, e etc. Com isso, o google gera automaticamente uma página para você, com os feeds que você marcou como “Compartilhados”.

Bom, seria mais ou menos como construir um blog muito rapidamente, sem se preocupar com criar, hospedar, arrumar o template e além de tudo, sem se preocupar com “Escrever” realmente né, uma vez que você irá apenas usar as entradas alheias. :]

Sinceramente, não tinha achado muita utilidade para isso, quando brinquei com a funcionalidade no começo, porém agora (em algum momento no tempo entre a última vez que entrei, ontem) eles adicionaram um recurso que senti falta no começo, que é gerar um RSS através das suas entradas compartilhadas.

(Respire fundo para ler este parágrafo) Isso mesmo,  você seleciona quais são as entradas interessantes, e o Google Reader automaticamente gera um RSS para você repassar para as pessoas que estão interessadas em ler no Google Reader, o que você achou interessante no seu Google Reader (se você tem ela nos seus contatos, você já ve diretamente).

Tá, mas se eu já tenho um blog, e se uma coisa é realmente muito interessante para ler, a pessoa acessando meu blog verá algo a respeito disso, certo? Certo sim! 

Então o que fazer com essa tal funcionalidade? Sei lá ué, faça o que quiser… Eu coloquei um widget no meu Blog fazendo ele ler o RSS das minhas entradas compartilhadas, ou seja, nesse menu aí do lado, tem um bloco chamado “Textos interessantes”, estes textos são entradas que eu compartilhei no meu Google Reader e apareceram aí sem dor nem trabalho nenhum! :]

Gostei muito da combinação! Não se desesperem se começar a aparecer posts de vocês no meu blog, quer dizer, não só que eu leio seu blog, mas sim, que eu gosto de seus textos.

[]s,

Lucas

Web Analytics