Another day, another gist. Today’s was prompted by a question on the eXist-db mailing list about how to access OAuth-based services like the Google API with XQuery. I happened to have just been working on accessing the OAuth-based Twitter v1.1 API for the new social media section of my office’s homepage, so I posted the code and some pointers. Like the gist I posted yesterday, I hope others can use these bits of XQuery code.

But there’s a back story and, dare I say, some illustrative lessons, to this latest addition to my series of posts and gists on XQuery.

Until recently, writing a program to retrieve one’s latest tweets was as simple as visiting the Twitter homepage: you just made a basic, unauthenticated HTTP request to Twitter’s servers to get the data you needed. But with version 1.1 of its API, Twitter announced a new requirement: all requests to the API must be signed and authenticated using the OAuth 1.0 protocol. This made the task of getting data from Twitter considerably more complicated. The OAuth protocol, while not rocket science, requires you to jump through a rather intricate sequence of steps to compose the parameters of your request, and then to cryptographically sign the request with a hashing function. (I’m not complaining about the protocol; it does a great job providing an authentication layer to the web. I’m just saying that requiring OAuth to retrieve tweets imposes a pretty heavy burden on users and developers.) If that weren’t enough, Twitter also ended support for the XML-based Atom format, leaving JSON as the only format in which it returns results. That left me with two problems.

First, XQuery’s rich function library does not include the HMAC-SHA1 cryptographic hashing algorithm needed to sign OAuth requests. So I turned to Claudius Teodorescu, who applied his considerable Java skills to the task of creating an HMAC-SHA1 function for eXist-db, the XQuery-based server that powers history.state.gov. We took it a step further, releasing Claudius’s work to the EXPath community in the form of a specification: the EXPath Crypto Module. The EXPath community builds up common standards for XPath and XQuery implementations. Claudius also released his module as an EXPath package for eXist-db, which is now available in the eXist-db Public Package Repository for anyone to download and install (to do so, go to eXist-db’s Dashboard, click on Package Manager, and find “EXPath Cryptographic Module Implementation” in the list of packages). Look at the prolog of the OAuth module I posted in today’s gist, and you’ll see that it imports Claudius’s module.

So I was able to check OAuth off my list of problems.
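
For readers curious what that “intricate sequence of steps” looks like, here is a minimal sketch of the OAuth 1.0 HMAC-SHA1 signing step as I understand it from RFC 5849. It is written in Python rather than XQuery and is not the code from the gist; the function names, URL, and credentials are all made up for illustration, and it assumes each parameter name appears only once.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def percent_encode(value: str) -> str:
    # Only RFC 3986 unreserved characters (letters, digits, "-", ".", "_", "~")
    # are left unescaped; everything else becomes %XX.
    return quote(value, safe="-._~")


def signature_base_string(method: str, url: str, params: dict) -> str:
    # Encode every name and value, sort the pairs, and join them as name=value&...
    # (a dict cannot hold repeated parameter names; that case is not handled here).
    encoded = sorted((percent_encode(k), percent_encode(v)) for k, v in params.items())
    normalized = "&".join(f"{k}={v}" for k, v in encoded)
    # The three parts are the HTTP method, the encoded URL, and the encoded parameters.
    return "&".join([method.upper(), percent_encode(url), percent_encode(normalized)])


def sign(base_string: str, consumer_secret: str, token_secret: str = "") -> str:
    # The signing key is the two encoded secrets joined by "&".
    key = f"{percent_encode(consumer_secret)}&{percent_encode(token_secret)}"
    digest = hmac.new(key.encode("utf-8"), base_string.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    # Hypothetical request and credentials, for illustration only.
    params = {
        "screen_name": "example",
        "oauth_consumer_key": "my-consumer-key",
        "oauth_nonce": "abc123",
        "oauth_signature_method": "HMAC-SHA1",
        "oauth_timestamp": "1365000000",
        "oauth_token": "my-token",
        "oauth_version": "1.0",
    }
    base = signature_base_string("GET", "https://api.example.com/1.1/timeline.json", params)
    print(sign(base, "my-consumer-secret", "my-token-secret"))
```

The resulting string goes into the request as the `oauth_signature` parameter. The HMAC-SHA1 call in `sign` is the one piece that XQuery’s built-in function library could not supply.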

But besides handling OAuth, I also needed a way to deal with JSON. JSON is an increasingly ubiquitous data format in the world of APIs, but its data model is subtly incompatible with XML, and XML-based software like eXist-db has a difficult time ingesting or searching JSON data. Luckily, there were a number of XQuery libraries for me to choose from, and I decided to use one that John Snelson wrote for XQilla. With his permission, I updated it a bit, using some new features in XQuery 3.0 to make his library implementation-independent, and released the updated library on GitHub. Thanks to GitHub’s mechanisms for code contributions (“pull requests”), the library has already received several improvements from the community. The package is also available in eXist-db’s public app repository and the CXAN package repository. (I’m also eagerly following the JSONiq project, which is working on extending XQuery to handle JSON natively, obviating the need to convert JSON to XML first.)

So I was able to check JSON off my list of problems.
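
To give a feel for why the conversion is needed, here is a minimal sketch of one way to map JSON onto XML. It is in Python rather than XQuery, and the element names (`json`, `pair`, `item`) and the `type` attribute are illustrative assumptions, not necessarily the output format of the library mentioned above. Object keys are kept in a `name` attribute because a JSON key can be any string, while an XML element name cannot.

```python
import json
import xml.etree.ElementTree as ET


def json_type(value) -> str:
    # bool must be checked before int, because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if value is None:
        return "null"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


def fill(element: ET.Element, value) -> None:
    # Record the JSON type on the element so the distinction between,
    # say, the number 1 and the string "1" survives the conversion.
    kind = json_type(value)
    element.set("type", kind)
    if kind == "object":
        for name, member in value.items():
            fill(ET.SubElement(element, "pair", {"name": name}), member)
    elif kind == "array":
        for member in value:
            fill(ET.SubElement(element, "item"), member)
    elif kind == "boolean":
        element.text = "true" if value else "false"
    elif kind != "null":
        element.text = str(value)


def json_to_xml(text: str) -> ET.Element:
    root = ET.Element("json")
    fill(root, json.loads(text))
    return root


if __name__ == "__main__":
    # A made-up fragment shaped loosely like an API response.
    sample = '{"text": "Hello", "ids": [1, 2], "retweeted": false, "place": null}'
    print(ET.tostring(json_to_xml(sample), encoding="unicode"))
```

Once the data is in this shape, ordinary XPath expressions can search it, which is what an XML database needs.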

This paved the way for yesterday’s addition of social media links to the homepage of history.state.gov and, coming soon, for a complete, searchable social media archive.

All in all, a story — albeit not unique — of open source communities working together to build solutions to common challenges.

For more on social media archives in government — the ultimate objective beyond the immediate goal of displaying our latest tweets on our homepage — see NextGov’s aptly titled article, Saving Government Tweets is Tougher Than You Think.