Webtask, Wikipedia and Slack.

While playing around with Webtask, I thought it would be fun to build a small project: given a keyword, the webtask posts a message to Slack containing the first paragraph of actual content from the matching Wikipedia article.

Webtask is quite a fun technology to play around with. In essence, it runs a remote Node.js server that executes your code whenever an HTTP call is made. We do not set up the server ourselves, so in that regard you can think of it as a 'serverless' application. We only write the code, deploy it using Webtask, and access it through our browser.

If you know how to program for Node.js, you know how to program for Webtask. To find out which Node modules are supported, there is a handy website. Browsing around it, most of the frequently used Node modules seem to be available, and with almost 1K modules at the time of writing, plenty are supported.

The goal of this project is rather simple, and the result should be seen as a proof of concept (PoC) rather than a finished implementation of the idea. I did not set up a Slack bot for this, but you can assume that it is made for such bots. When a user writes something akin to

@wikibot Database

a request would be made to our webtask, which posts the first paragraph of the Wikipedia article in the Slack channel, with a link to the full article.

Webtask has a great guide on getting started, so if you have not done so yet I'd recommend reading it and following the steps outlined there. Once you have your first Webtask project set up, you're good to go.

We are going to require two modules for this.

  • request
  • cheerio

Request will be used to make a GET request to Wikipedia and a POST request to Slack. Cheerio will be used to parse the HTML of the Wikipedia page and pick out the information we want to push to Slack.

With both modules in place, we can write the GET request we need for Wikipedia.

var request = require('request');
var cheerio = require('cheerio');

function getWikiInfo(keyword)
{
    console.log("getting wikipedia information");
    var wikiURL = 'https://en.wikipedia.org/wiki/' + keyword;

    console.log("fetching information from: " + wikiURL);

    request(wikiURL, function(error, response, body){
        if(!error && response.statusCode == 200){
            // parse the page and take the first paragraph of the article body.
            var $ = cheerio.load(body);
            var wikiInfo = $("#mw-content-text > p").first().text();

            // we post this to slack, but add a link to the origin as well.
            // the regex with the g flag escapes every space, not just the first.
            wikiInfo += " *... read more:* " + wikiURL.replace(/ /g, '%20');
            slackPost(wikiInfo);
        }
    });
}

Don't worry about the 'keyword' parameter at this point. It is passed in through the Webtask URL we will use, which will become clear at the end of the post. The keyword can be any title that Wikipedia resolves to an article, e.g. Donut, Homer Simpson or Dogs.

Next we use cheerio to look for the first paragraph in the HTML we get back. We then append some content of our own, using Slack's markdown formatting so that the '... read more: url' part stands out inside Slack.
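To make the URL and message handling described above concrete, here is a minimal sketch that runs without any network access. The helper names `buildWikiURL` and `formatWikiMessage` are hypothetical and not part of the original code, and using `encodeURIComponent` is an assumption about how one might escape the keyword. Note that `String.prototype.replace` with a plain string pattern only replaces the first match, which matters for keywords containing more than one space.

```javascript
// Hypothetical helpers for illustration; not taken from the original project.

// Build the article URL. encodeURIComponent escapes every space and
// special character in the keyword, not just the first one.
function buildWikiURL(keyword) {
    return 'https://en.wikipedia.org/wiki/' + encodeURIComponent(keyword);
}

// Append the "read more" suffix. Slack renders *text* as bold.
function formatWikiMessage(paragraph, wikiURL) {
    return paragraph.trim() + ' *... read more:* ' + wikiURL;
}

// A string pattern replaces only the first space:
console.log('Homer Jay Simpson'.replace(' ', '%20'));

// The helper escapes all of them:
var url = buildWikiURL('Homer Jay Simpson');
console.log(formatWikiMessage('Homer Jay Simpson is a fictional character.', url));
```

The first `console.log` shows why a plain-string `replace` is not enough for multi-word keywords; the second shows the full message as it would be handed to Slack.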

Once we have this information, we want to push it to Slack. For that we need two extra variables: the URL of our Slack webhook and a username for our bot. In my example these are stored in slackURL and username respectively. The method then looks like the following:

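// placeholders: fill in your own slack webhook URL and bot name.
var slackURL = '<your slack webhook URL>';
var username = 'wiki bot';

function slackPost(wikiInfo)
{
    var slack_data = {
        'text': wikiInfo,
        'username': username
    };

    var opts = {
        method: 'POST',
        url: slackURL,
        headers: {
            'Content-Type': 'application/json'
        },
        json: slack_data
    };

    request(opts, function(error, response, body){
        // only report success when slack actually accepted the message.
        if(!error && response.statusCode == 200)
        {
            console.log("posted successfully");
        }
        else
        {
            console.log("posting to slack failed: " + (error || response.statusCode));
        }
    });
}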
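Rather than hard-coding the webhook URL in the source, the two Slack settings could be read from the webtask context. The following is only a sketch under assumptions: it assumes the values were supplied as secrets when the webtask was created (for example with the `--secret` option of `wt create`) and that they show up on `context.secrets`. The secret names `SLACK_URL` and `SLACK_USERNAME`, the helper `readSlackConfig`, and the webhook URL shown are all made up for this example.

```javascript
// Hypothetical helper: read the Slack settings from the webtask context.
// SLACK_URL and SLACK_USERNAME are made-up secret names for this sketch.
function readSlackConfig(context) {
    var secrets = (context && context.secrets) || {};
    if (!secrets.SLACK_URL) {
        throw new Error('SLACK_URL secret is not set');
    }
    return {
        slackURL: secrets.SLACK_URL,
        // Fall back to a default bot name when none was provided.
        username: secrets.SLACK_USERNAME || 'wiki bot'
    };
}

// Usage with a fake context object standing in for the one Webtask passes in.
// The webhook URL is a placeholder, not a real endpoint.
var config = readSlackConfig({
    secrets: { SLACK_URL: 'https://hooks.slack.com/services/XXX/YYY/ZZZ' }
});
console.log(config.username);
```

This keeps the webhook URL out of the repository, at the cost of having to pass the context (or the resulting config) down to slackPost.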

We now have the two main methods and only need to put them together. Our webtask URL will contain a query parameter with the Wikipedia keyword we want to look up, so we need to read it from the request before calling our getWikiInfo method.

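module.exports = function (context,cb) {
    // the 'wiki' query parameter from the webtask URL ends up in context.data.
    var keyword = context.data.wiki;
    // get wiki info and post to slack.
    getWikiInfo(keyword);
    // this responds right away; the wikipedia and slack requests are still in flight.
    cb(null, 'Posted information about ' + keyword + ' to slack');
}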
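Because the handler above calls cb immediately, the response says "Posted" even if the Wikipedia lookup or the Slack post later fails. One possible variation, sketched here under assumptions, is to give both functions a Node-style callback and only call cb once they have finished. The versions of `getWikiInfo` and `slackPost` below are stand-ins that merely simulate the network calls, and `handler` is a hypothetical name for the function that would be assigned to module.exports.

```javascript
// Stand-ins for the real functions; each takes a Node-style callback.
// They only simulate the asynchronous network calls.
function getWikiInfo(keyword, done) {
    setImmediate(function () {
        done(null, 'First paragraph about ' + keyword);
    });
}

function slackPost(text, done) {
    setImmediate(function () {
        done(null);
    });
}

// Hypothetical handler; in the webtask file this would be module.exports.
function handler(context, cb) {
    var keyword = context.data.wiki;
    if (!keyword) {
        return cb(new Error('Missing "wiki" query parameter'));
    }
    getWikiInfo(keyword, function (err, text) {
        if (err) { return cb(err); }
        slackPost(text, function (err) {
            if (err) { return cb(err); }
            // Only report success after both steps have completed.
            cb(null, 'Posted information about ' + keyword + ' to slack');
        });
    });
}

// Usage with a fake context object.
handler({ data: { wiki: 'Donut' } }, function (err, message) {
    console.log(err ? err.message : message);
});
```

The trade-off is that the HTTP response now waits for both requests, so the caller sees a real error instead of an optimistic success message.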

There we go. We can now deploy our webtask with `wt create filename.js`. This gives us a URL which we can navigate to in order to run our code. We do need to add the 'wiki' query parameter to say which topic we want information about.
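If the URL is assembled in code rather than typed into the browser, the keyword needs to be URL-encoded. A small sketch, assuming a hypothetical helper `buildTaskURL` and a placeholder base URL (not a real webtask):

```javascript
// Hypothetical helper; the base URL used below is a placeholder.
function buildTaskURL(baseURL, keyword) {
    // Use '&' when the base URL already carries a query string, '?' otherwise.
    var separator = baseURL.indexOf('?') === -1 ? '?' : '&';
    return baseURL + separator + 'wiki=' + encodeURIComponent(keyword);
}

console.log(buildTaskURL('https://example.com/task?x=1', 'Homer Simpson'));
console.log(buildTaskURL('https://example.com/task', 'Donut'));
```

The separator check covers both cases: a base URL that already ends in a query string, and one that does not.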

For example, I have tried this out with 'Homer Simpson' and 'Donut'.

{webtask.baseurl}&wiki=Homer%20Simpson

[Screenshot of the resulting Slack message]

{webtask.baseurl}&wiki=Donut

[Screenshot of the resulting Slack message]

The full code can be found in my GitHub repository.