Fast-backwards
Ever since my daughter learned the word “fast-forward,” meaning to skip ahead in a song or TV show, she has been saying “fast-backwards” to mean “rewind.” Despite the additional syllables, her phrase instantly took hold in our household. So, given the passage of time since my previous post from December 2016, I think some fast-backwarding is in order!
tl;dr XQuery for Humanists is out! And a roundup of my activities and other developments in the world of XQuery, TEI, and digital humanities.
The book!
At the end of my last post, I mentioned that I was “wrapping up” work on a book. In retrospect, that characterization was a tad optimistic. Three more years would pass. Last December, my co-author, Clifford Anderson, and I met at Circle Bistro in Foggy Bottom to review our final set of changes for the manuscript of XQuery for Humanists:
We just sent in the final proofs and index for XQuery for Humanists, due out from @TAMUPress—@andersoncliffb, Dec 10, 2019
These meetings with Cliff—in Nashville or DC or at conferences elsewhere, twice a year or so—were the highlights of working on the book. I couldn’t have hoped for a better co-author.
At long last, at the end of February 2020, a satisfyingly dense package arrived from Texas A&M University Press. My first book, hot off the presses!
That box arrived just shy of one year ago—just as COVID-19 began to ravage the world. While the pandemic and other events of 2020 absorbed much of the energy I had planned to put into the book’s launch, I am so pleased with the reception the book has received, and especially that it has reached distant shores and entered classrooms. Here are some selected tweets from readers:
Y’all. This excellent book by @andersoncliffb and @joewiz arrived in mid March. I intended to tweet this photo on the last Friday I was in my office. And then everything happened. So I’m posting it now and saying I’m really glad I brought it home with me—@KathrynTomasek, May 22, 2020
I liked the book very much it’s extremely well written—@ZornGerhard, Sep 19, 2020
Your book was one of five introductions to XQuery that I especially recommended to my class on Wednesday. ;-)—@MonikaBarget, Nov 19, 2020
The moment when #XQuery does not work and you remember the magic of *: from #XqueryForHumanists :). Great book!—@CatoMinor3, Dec 4, 2020
The companion website
Whether you’re a reader of the book, just curious about it, or a seasoned user of XQuery, I hope you have explored the book’s companion website. [Update (2024-04-04): The URL for the companion website has changed from xquery.forhumanists.org to coding4humanists.github.io/xquery4humanists.] To build it, I used several technologies familiar to readers of the book and this blog. Naturally, I used XQuery to prepare the site’s Code section by transforming the plain text source files into GitHub-flavored Markdown. I also used XQuery to validate that all of the sample code was free of syntax and well-formedness errors and would run in the three XQuery processors we highlighted in the book. We addressed processor-specific issues in our Implementations page. We hope the site is a useful resource for all learners of XQuery.
Under the hood, the book’s companion website is a static site powered by Jekyll and hosted by GitHub. My main criterion was good syntax highlighting support for XQuery, XML, XSLT, and JSON. Rouge, which Jekyll has used as its default syntax highlighting library since version 3, does a great job on this front; see the sample XQuery page. As a result, both the companion site and this blog now run on Jekyll, version 4. GitHub Pages’ support for HTTPS on custom domains such as xquery.forhumanists.org and joewiz.org made it a great platform to host the site’s content.
Other projects and news
Between the book and other developments, there’s now a lot to catch up on in the sphere of XQuery, XML, and digital humanities. So let’s fast-backwards, with a sampling of papers, packages, and projects, in roughly reverse chronological order:
- I co-authored and co-presented a paper at Balisage 2019 with David J. Birnbaum, Hugh Cayless, Emmanuelle Morlock, and Leif-Jöran Olsson: The integration of XML databases and content management systems in digital editions: Understanding eXist-db through Reese’s Peanut Butter Cups. This is one of the first papers to survey and evaluate different approaches to application development with XQuery; it’s focused on developing digital editions but is widely applicable to other domains.
- I co-authored a paper with Christiane Sibille based on our presentation at ICEDD Berlin 2019: Best practices for digital diplomatic documentary editions. It is a brief handbook of lessons from Dodis and FRUS for ICEDD editions to consider as they develop a digital publication strategy. We encouraged adoption of open standards like TEI and listed options for training and tools (including TEI Publisher, discussed more below).
- Cliff and I gave two half-day workshops on XQuery at DH2017 in Montreal: XQuery for Digital Humanists and XQuery for Data Integration. These presentations gave an introduction to XQuery and covered advanced uses for digital research. See the lecture notes and code.
- XQuery 3.1, which enabled so much of our book and the DH2017 workshops, became a W3C Recommendation on March 21, 2017. It was great to have the finalized spec in place as we prepared the manuscript and the workshops. Our thanks again to the editors and the working groups for all of their efforts on this fantastic family of specifications.
- Outside of conferences and this blog, I posted articles and useful code snippets on my GitHub Gist page. Articles include:
  - Converting an eXist application from old-style fields to new, Lucene-based facets and fields
  - How variables in XQuery FLWOR expressions change when using the group by clause
  - Basic dynamic web pages, with XQuery and eXist
  - An introduction to recursion in XQuery
  - Full text search of Chinese text, with eXist-db and Lucene
(With better XQuery syntax highlighting in Jekyll now, perhaps it’s time to move these articles over here and to develop some of the other code snippets into proper posts. ICYMI, Gist recently gained a long-requested feature: you’ll now receive notifications when a reader leaves a comment on your Gist. Before this, readers could post comments… but the authors would never know.)
- One of my gists graduated to a full-fledged library package: semver.xq. You can use it to validate, compare, sort, parse, and serialize Semantic Versioning version strings. This package is now part of the eXist distribution, where it is used for EXPath package libraries. But you can use it for anything you like.
- Speaking of XQuery packages, I published two new ones last week: airtable.xq, a library for accessing the Airtable REST API; and Airlock, an application for taking snapshots of Airtable bases for offline browsing and transformation. If you’re not familiar with it, Airtable is a web-based relational database and a great way for teams to collaborate on data with interlinked records. It makes for a nice pairing with the world of XML documents and databases. My colleagues have been using it to manage metadata associated with our documents and model relationships between entities like people, places, and organizations. I chose the name “Airlock” since, like its counterpart in the physical world, Airlock permits the passage of data between relational and document models, minimizing mismatches and maximizing the advantages of the environments on both sides for managing and manipulating data.
- I also updated the venerable Punch app package, which I originally created to accompany the eXist workshops I gave at TEI@Oxford 2010 and Digital.Humanities@Oxford Summer School 2011. It demonstrates how to create a dynamic website for a TEI edition using XQuery and eXist, from a world before TEI Publisher. It’s still a good illustration of what you’d need to create an edition from scratch, as discussed in the Balisage paper above.
- I happily abandoned xqjson once eXist added support for XQuery 3.1’s JSON parsing and serialization capabilities.
- I’ve received contributions and posted updates for two lists of XQuery resources that I maintain:
- I’ve kept the @XQuery Twitter account kicking, mostly with retweets of posts from other users but occasionally with posts of my own. (: Let’s make it to 400 followers in 2021! :)
- Some great XQuery questions have come up on Stack Overflow; here’s the list of questions I’ve answered. Most of the time, though, Martin Honnen beats me to the punch, and puts his XQuery Fiddle tool (see below) to great use.
Outside of my own projects, numerous tools and projects have entered the scene and become invaluable, including:
- TEI Publisher, created by Wolfgang Meier and now developed under e-editiones, lowers barriers and empowers TEI projects by ingeniously leveraging the TEI Processing Model and establishing a common framework for digital editions. I discuss TEI Publisher in the Balisage and ICEDD papers linked above. TEI Publisher continues the work of the late Sebastian Rahtz, and its community is active and rapidly growing.
- existdb-vscode, by Wolfgang Meier, brings great XQuery support to Visual Studio Code.
- generator-exist, by Duncan Paterson, simplifies the task of generating scaffolding for library and application EXPath Packages for eXist. I used this when I prepared airtable.xq and Airlock for publishing.
- XQuery Fiddle, by Martin Honnen, excels at sharing XQuery snippets and is used widely in Stack Overflow answers.
- Dash docsets, by Norm Walsh, make searching the W3C XML and XQuery specifications a snap.
- exist-stanford-nlp, by Loren Cahlander, brings the Stanford CoreNLP library to XQuery and eXist, with multilingual named entity recognition and other natural language processing capabilities like part-of-speech tagging and sentence tokenization. Here’s some sample code I wrote.
- github-xq, by Winona Salesky, lets you interact with the GitHub API from XQuery—making commits, submitting pull requests, creating branches, and responding to webhooks.
XQuery implementations and the community
In addition, the XQuery implementation that I’ve used on a daily basis, eXist, has leaped from version 3 to 4 and now 5, bringing extensive XQuery 3.1 support, improvements in performance and stability, and transformative features like facets and fields. eXist has modernized greatly, replacing its legacy build process with a standard Maven build process; you can check eXist’s conformance with the W3C’s XQuery test suite using the exist-xqts-runner; and you can use it with Docker and continuous integration pipelines, including GitHub Actions.
Agreed, the facets & fields facility in eXist 5 has made me rethink search, sort, and other aspects of my XQuery design. The ability to pre-generate facet & field contents means you can skip a lot of XQuery processing at request time. It’s like a customized, persistent cache.—@joewiz, May 1, 2020
While eXist 5.2 has been out for nearly a year, don’t mistake this for a lack of activity; the core developers and community contributors have put considerable energy into new features and bugfixes, and the next version—more likely version 6 than 5.3—is shaping up to be the best yet.
In these past few years, the eXist community has undergone a renaissance, with a very active Slack channel and weekly community calls (see these calendar and invite links). It’s exciting to see talented new users with great DH projects. This community also has a good deal of overlap with the e-editiones and TEI Publisher community, which also has a great Slack channel (see this invite link).
Like eXist, BaseX, Saxon, and oXygen XML Editor have issued a steady stream of stellar releases with great support for XQuery 3.1. These products are all indispensable tools in my toolbox.
Two new XQuery implementations, xqerl, by Zack Dean, and FusionDB, by Adam Retter & co., have entered the scene and look to have a great future. For eXist users, FusionDB is compatible with eXist, and Fusion Studio is a promising IDE and administrative tool that could consolidate many of the tools that eXist users rely on: eXide, Monex, the Java admin client, and possibly more.
Looking ahead, there are some exciting ideas in the community draft proposal of XQuery 4, led by Michael Kay and developed through the XML.com Slack workspace’s #xpath-ng channel and in several repositories on GitHub. If you haven’t joined the XML.com Slack workspace, please do! If you’ve attended Balisage, you’ll know what I mean when I say it’s like Balisage, but all year long; if you haven’t, then just know it’s a great, vendor-neutral forum for asking questions and talking about XML and markup technologies.
One final note: Thanks to the Center of Digital Humanities Research at Texas A&M University for supporting the book!
CoDHR is pleased to announce the publication of XQuery for Humanists, the newest release in its Coding for Humanists book series published by the Texas A&M University Press!—@TAMU_CoDHR, May 27, 2020