17 May 2016

Trying out Blazegraph

Especially inferencing.

I've been hearing more about the Blazegraph triplestore (well, "graph database with RDF support"), especially its support for running on GPUs, and because they also advertise some degree of RDFS and OWL support, I wanted to see how quickly I could try that after downloading the community edition. It was pretty quick.

Downloading from the main download page with my Ubuntu machine got me an rpm file, but I found it simpler to download the jar file version, which I could start as a server from the command line as described on the Nano SPARQL Server page. I found the jar file (and several other download options) on the SourceForge page for release 2.1.

The jar file's startup message tells you the URL for the web-based interface to the Nano SPARQL Server, shown here:

At this point, uploading some RDF on the UPDATE tab and issuing SPARQL queries on the QUERY tab was easy. I was more interested in sending it SPARQL queries that could take advantage of RDFS and OWL inferencing. After a little help from Blazegraph Chief Scientist Bryan Thompson via their mailing list (with a quick answer on a Saturday), I learned how: I had to first create a namespace on the NAMESPACES tab with the Inference checkbox checked. The same form also offers checkboxes for Isolatable indexes, Full text index, and Enable geospatial when configuring a new namespace. I found this typical of how Blazegraph lets you configure it to take advantage of more powerful features while leaving the out-of-box configuration simple and easy to use.

For finer-grained namespace configuration, after you select checkboxes and click the Create namespace button, a dialog box lets you edit the configuration details, with each of these lines explained in the Blazegraph documentation:

I wanted to check Blazegraph's support for owl:TransitiveProperty, because this is such a basic, useful OWL class, as well as its ability to do subclass inferencing. I created some data about chairs, desks, rooms, and buildings, specifying which chairs and desks were in which rooms and which rooms were in which buildings, and also made dm:locatedIn a transitive property:

@prefix d: <http://learningsparql.com/ns/data#> .
@prefix dm: <http://learningsparql.com/ns/demo#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

dm:Room rdfs:subClassOf owl:Thing .
dm:Building rdfs:subClassOf owl:Thing .
dm:Furniture rdfs:subClassOf owl:Thing .
dm:Chair rdfs:subClassOf dm:Furniture .
dm:Desk rdfs:subClassOf dm:Furniture .

dm:locatedIn a owl:TransitiveProperty. 

d:building100 rdf:type dm:Building .
d:building200 rdf:type dm:Building .
d:room101 rdf:type dm:Room ; dm:locatedIn d:building100 . 
d:room102 rdf:type dm:Room ; dm:locatedIn d:building100 . 
d:room201 rdf:type dm:Room ; dm:locatedIn d:building200 . 
d:room202 rdf:type dm:Room ; dm:locatedIn d:building200 . 

d:chair15 rdf:type dm:Chair ; dm:locatedIn d:room101 . 
d:chair23 rdf:type dm:Chair ; dm:locatedIn d:room101 . 
d:chair35 rdf:type dm:Chair ; dm:locatedIn d:room202 . 
d:desk22 rdf:type dm:Desk ; dm:locatedIn d:room101 . 
d:desk59 rdf:type dm:Desk ; dm:locatedIn d:room202 . 

The following query asks for furniture in building 100. No resource in the data above explicitly matches both of the query's triple patterns (nothing is directly typed as dm:Furniture, and no furniture is directly located in a building), so a SPARQL engine that can't do inferencing won't return anything. I wanted the query engine to infer that if chair 15 is a Chair, and Chair is a subclass of Furniture, then chair 15 is Furniture; also, if that furniture is in room 101 and room 101 is in building 100, then that furniture is in building 100.

PREFIX dm: <http://learningsparql.com/ns/demo#> 
PREFIX d: <http://learningsparql.com/ns/data#> 
SELECT ?furniture
WHERE {
  ?furniture a dm:Furniture .
  ?furniture dm:locatedIn d:building100 .
}

We need the first triple pattern because the data above includes triples saying that rooms 101 and 102 are located in building 100, so those would have bound to ?furniture in the second triple pattern if the first triple pattern wasn't there. This is a nice example of why declaring resources as instances of specific classes, while not necessary in RDF, does a favor to anyone who will query that data—it makes it easier for them to specify more detail about exactly what data they want.
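To make the two inference steps concrete, here is a minimal JavaScript sketch that applies them to a subset of the sample data. It is only an illustration of the reasoning, not how Blazegraph implements inferencing; the function names infer and furnitureIn are made up for this example, and prefixed names are treated as plain strings.

```javascript
// Subset of the sample triples above, with prefixed names kept as plain strings.
const triples = [
  ['dm:Chair', 'rdfs:subClassOf', 'dm:Furniture'],
  ['dm:Desk', 'rdfs:subClassOf', 'dm:Furniture'],
  ['dm:locatedIn', 'rdf:type', 'owl:TransitiveProperty'],
  ['d:room101', 'rdf:type', 'dm:Room'], ['d:room101', 'dm:locatedIn', 'd:building100'],
  ['d:room102', 'rdf:type', 'dm:Room'], ['d:room102', 'dm:locatedIn', 'd:building100'],
  ['d:room202', 'rdf:type', 'dm:Room'], ['d:room202', 'dm:locatedIn', 'd:building200'],
  ['d:chair15', 'rdf:type', 'dm:Chair'], ['d:chair15', 'dm:locatedIn', 'd:room101'],
  ['d:chair23', 'rdf:type', 'dm:Chair'], ['d:chair23', 'dm:locatedIn', 'd:room101'],
  ['d:chair35', 'rdf:type', 'dm:Chair'], ['d:chair35', 'dm:locatedIn', 'd:room202'],
  ['d:desk22', 'rdf:type', 'dm:Desk'], ['d:desk22', 'dm:locatedIn', 'd:room101'],
  ['d:desk59', 'rdf:type', 'dm:Desk'], ['d:desk59', 'dm:locatedIn', 'd:room202']
];

// Repeatedly apply two rules until no new triples appear:
//   1. x rdf:type C, C rdfs:subClassOf D  =>  x rdf:type D
//   2. p is transitive, x p y, y p z      =>  x p z
function infer(input) {
  const seen = new Set(input.map(t => t.join(' ')));
  const all = input.slice();
  const add = (s, p, o) => {
    const key = [s, p, o].join(' ');
    if (seen.has(key)) return false;
    seen.add(key);
    all.push([s, p, o]);
    return true;
  };
  let changed = true;
  while (changed) {
    changed = false;
    const snapshot = all.slice();
    const transitive = new Set(snapshot
      .filter(([, p, o]) => p === 'rdf:type' && o === 'owl:TransitiveProperty')
      .map(([s]) => s));
    for (const [s, p, o] of snapshot) {
      for (const [s2, p2, o2] of snapshot) {
        if (o !== s2) continue;
        if (p === 'rdf:type' && p2 === 'rdfs:subClassOf') changed = add(s, 'rdf:type', o2) || changed;
        if (p === p2 && transitive.has(p)) changed = add(s, p, o2) || changed;
      }
    }
  }
  return all;
}

// Equivalent of the query's two triple patterns.
function furnitureIn(building, data) {
  const has = (s, p, o) => data.some(t => t[0] === s && t[1] === p && t[2] === o);
  return data
    .filter(([s, p, o]) => p === 'rdf:type' && o === 'dm:Furniture' && has(s, 'dm:locatedIn', building))
    .map(([s]) => s)
    .sort();
}

console.log(furnitureIn('d:building100', triples));        // [] without inferencing
console.log(furnitureIn('d:building100', infer(triples))); // [ 'd:chair15', 'd:chair23', 'd:desk22' ]
```

Note that the rooms never show up in the result, even though they are located in building 100, because nothing makes them instances of dm:Furniture.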

When using this query and data in a namespace (in the Blazegraph sense of the term) configured to do inferencing, Blazegraph executed the query against the original triples plus the inferred triples and listed the furniture in building 100:

Several years ago I backed off from discussions of the "semantic web" as a buzzphrase tying together technology around RDF-related standards because I felt that the phrase was not aging well and that the technology could be sold on its own without the buzzphrase, but the example above really does show semantics at work. Saying that dm:locatedIn is a transitive property stores some semantics about that property, and these extra semantics let me get more out of the data set: they let me query for which furniture is in which building, even though the data has no explicit facts about furniture being in buildings. (Saying that Desk and Chair are subclasses of Furniture also stores semantics about all three terms, but that won't be as interesting to a typical developer with object-oriented experience.)

Blazegraph calls their subset of OWL RDFS+, which was inspired by Jim Hendler and Dean Allemang's RDFS+ superset of RDFS that added in OWL's most useful bits. (It's similar but not identical to AllegroGraph's RDFS++ profile, which has the same goal.) Blazegraph's Product description page describes which parts of OWL it supports, and their Inference And Truth Maintenance page describes more.

A few other interesting things about Blazegraph as a triplestore and query engine:

  • The REST interface offers access to a wide range of features.

  • Queries can include Query Hints to optimize how the SPARQL engine executes them, which will be handy if you plan on scaling way up.

  • I saw no direct references to GeoSPARQL in the Blazegraph documentation, but they recently announced support for geospatial SPARQL queries. (I've been learning a lot about working with geospatial data at Hadoop scale with GeoMesa.)
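As a small illustration of the REST interface mentioned in the list above, the following JavaScript sketch builds a standard SPARQL 1.1 Protocol POST request. The endpoint URL is an assumption (a local server on port 9999 and a hypothetical namespace called "inferencing"); use whatever URL the server's startup message and the NAMESPACES tab report. The helper name buildSparqlRequest is also made up for this example.

```javascript
// Build a SPARQL 1.1 Protocol request: POST with a URL-encoded "query" parameter.
function buildSparqlRequest(endpoint, query) {
  return {
    url: endpoint,
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/x-www-form-urlencoded',
        'Accept': 'application/sparql-results+json'
      },
      body: 'query=' + encodeURIComponent(query)
    }
  };
}

// Assumed endpoint: adjust host, port, and namespace name to match your setup.
const endpoint = 'http://localhost:9999/blazegraph/namespace/inferencing/sparql';

const furnitureQuery = [
  'PREFIX dm: <http://learningsparql.com/ns/demo#>',
  'PREFIX d: <http://learningsparql.com/ns/data#>',
  'SELECT ?furniture WHERE {',
  '  ?furniture a dm:Furniture .',
  '  ?furniture dm:locatedIn d:building100 .',
  '}'
].join('\n');

const request = buildSparqlRequest(endpoint, furnitureQuery);
console.log(request.options.body.slice(0, 40) + '...');

// With a running server, the request could be sent like this:
// fetch(request.url, request.options)
//   .then(response => response.json())
//   .then(json => console.log(json.results.bindings));
```

Keeping the request construction separate from the network call makes it easy to inspect exactly what would be sent before pointing it at a real server.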

Blazegraph's main selling points seem to be speed and scalability (for example, see its Scaleout Cluster mode), and I didn't play with those at all, but I liked seeing that SPARQL querying with inferencing support can take advantage of such new hotness technology as GPUs. It will be interesting to see where Blazegraph takes it.

Please add any comments to this Google+ post.

23 April 2016

Playing with a proximity beacon

Nine-dollar devices send URLs to your phone over Bluetooth.

I've been hearing about proximity beacons lately and thought it would be fun to try one of these inexpensive devices that broadcast a URL for a range of just a few meters via Bluetooth Low Energy (a.k.a. BLE, which I assume is pronounced "bleh"). Advocates often cite the use case of how a beacon device located near a work of art in a museum might broadcast a URL pointing to a web page about it—for example, one near Robert Rauschenberg's Bed in New York's Museum of Modern Art could broadcast the URL http://moma.org/collection/works/78712, their web site's page with information about the work. When the appropriate app on your phone (or perhaps your phone's operating system) saw this, it would alert you to the availability of this localized information.

beacon in phone charger

You can find these beacons for as little as $14, and even cheaper on eBay, where colorful bracelet versions can cost less than $10. Most need batteries, typically the kind you put in a watch, so to avoid this I got a RadBeacon USB from Radius Networks that draws its power from any USB port where you plug it in. At the right you can see mine plugged into a conference swag phone recharger.

I also chose this one because it supports Google's Eddystone open beacon format, Apple's iBeacon format, and Radius Networks' AltBeacon. I haven't dug into the pros and cons of these different formats yet; I just wanted something that was likely to work out of the box with both my Samsung S6 Android phone and my wife's iPhone. The RadBeacon USB did fine.

You configure it with a phone app built for that particular beacon product line. The Android RadBeacon app generally worked, although I often had to press "Apply" several times and restart Bluetooth before new settings would actually take hold. Its documentation shows the kinds of properties it lets you set, such as the URL to broadcast and the Transmit Power (which affects the battery life and the distance that the URL is broadcast—in a museum, you want people receiving the URL of the painting in front of them, not the one twenty feet to the left of it).

I had set mine to the URL of a sample web page that I created for this purpose. While waiting for my RadBeacon to arrive in the mail, after Dan Brickley tweeted the mobiForge article Eddystone beacon technology and the Physical Web, I learned a lot from it about which components of my web page would be picked up by an app that received the broadcast URL.

After I configured the beacon, the open source physical web app found it and displayed the following on my Samsung S6:

screenshot of physical web app

Tapping the blue title took the phone to the web page. This all worked the same, with the same app, on my wife's iPhone.

I don't want to have to bring such an app to the foreground every time I want to check for nearby beacons, so I was glad to see that the app also added something to my phone's notifications list:

screenshot of Android notifications

Touching the notification sent the phone to the referenced web page.

Both notifications above show what the app pulled from my sample web page: the content of the head element's title element and the value of the content attribute from the meta element that had a name attribute value of "description". They also displayed the hastily-drawn favicon image I created for the web page.

A beacon won't broadcast just any URI that you want, because the allowable length is somewhat limited. (This could vary by beacon product.) The article mentioned above describes the role of URL shorteners in the architecture. Still, the idea of such inexpensive hardware using URIs to identify things brings a nice semantic web touch to an Internet of Things architecture.
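To make the length limit concrete, here is a minimal JavaScript sketch of how a URL might be packed into an Eddystone-URL frame. It assumes the encoding in Google's published Eddystone-URL specification: a one-byte scheme prefix, one-byte codes for common endings such as .com/ and .org/, and at most 17 bytes for everything after the scheme. Treat the tables as assumptions to check against the spec, and note that the function name encodeEddystoneUrl is made up for this example.

```javascript
// Assumed Eddystone-URL tables: the index in each array is the one-byte code.
const SCHEMES = ['http://www.', 'https://www.', 'http://', 'https://'];
const EXPANSIONS = ['.com/', '.org/', '.edu/', '.net/', '.info/', '.biz/', '.gov/',
                    '.com', '.org', '.edu', '.net', '.info', '.biz', '.gov'];
const MAX_ENCODED_BYTES = 17;

function encodeEddystoneUrl(url) {
  const schemeCode = SCHEMES.findIndex(s => url.startsWith(s));
  if (schemeCode === -1) throw new Error('Unsupported URL scheme: ' + url);
  const bytes = [];
  let rest = url.slice(SCHEMES[schemeCode].length);
  while (rest.length > 0) {
    // Endings with a trailing slash come first in the table, so the longer match wins.
    const code = EXPANSIONS.findIndex(e => rest.startsWith(e));
    if (code !== -1) {
      bytes.push(code);
      rest = rest.slice(EXPANSIONS[code].length);
    } else {
      bytes.push(rest.charCodeAt(0));
      rest = rest.slice(1);
    }
  }
  return { schemeCode: schemeCode, bytes: bytes, fits: bytes.length <= MAX_ENCODED_BYTES };
}

const moma = encodeEddystoneUrl('http://moma.org/collection/works/78712');
console.log(moma.bytes.length, moma.fits); // 27 false
```

Under these assumptions the museum URL needs 27 bytes after the scheme, well over the limit, which is why a URL shortener ends up in the architecture.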

One experiment I tried was the use of Audio Tag Tool to add every metadata field available to an MP3. I then configured my beacon to broadcast that MP3's URL, but none of the metadata showed up on my phone's display. I thought that the idea of location-specific audio might be interesting. (You could also implement location-specific audio with much older technology—for example, Victrolas—but the ability to control the audio from a central server could lead to interesting possibilities.)

The museum use case for beacons is nice and cultured, but I wonder about the attraction of a technology whose real main use case for now is to pump ads at people. (When was the last time you scanned a QR code with your phone?) I say "for now" because I remain hopeful that creative people will come up with more interesting things to do with these, especially if they dig into the Eddystone, iBeacon, and AltBeacon APIs. For example, you could add features to your own apps to check for or even act as beacons, communicating with other beacons and apps around your phone whether these devices had Internet connections or not. The Opera browser's use of schema.org metadata stored in web pages referenced by beacons is also promising, and I know that Dan is putting more thought into what role schema.org can play.

The idea of the broadcast URL showing up as a notification on your phone that you can follow or ignore is much simpler than starting up a special app on your phone and then pointing the phone at one corner of a poster, which the QR enthusiasts thought we'd be happier to do. The short article 5 Common Misconceptions About Beacons and Proximity Marketing gives a good perspective on where beacons can fit into the communications ecosystem in general and the world of marketing in particular. The article is from one of several companies building a business model around advertising via beacons, but like I said above, I hope that the APIs inspire other uses for them as well.


20 March 2016

Adding custom menus to Google Docs

Using Google Apps Script, but unfortunately not in Google apps.

Google apps menu

I've been using Google Docs more because at work it's great for collaboration, and also, for shopping lists and notes to myself, I can easily edit the same documents from my phone, tablet, and laptop. I found out that it's pretty easy to add menus that perform custom functions, so I created a few menu choices... and then found out that they weren't available on my phone or tablet. Still, it's good to know how easy it is to automate a few things.

Extending Google Docs is a good introduction to getting started. Picking Script Editor from the Tools menu puts you into this editor with an empty function waiting for you to fill it in or, more likely, to replace it with code you copied from web pages such as "Extending Google Docs." Google Apps Script is basically JavaScript, and I had an easy time searching for any code that I wanted to plug in.

For example, when writing a note about something, I sometimes want to add a date-time stamp to show exactly when I made a particular note, because if it's ongoing research it's easier to see my progress leading up to where I left off. (I've had my .emacs file set up to let me add this with Alt+D for years.) To add a timestamp menu choice to Google Docs, I replaced the blank function in the script editor with menu code based on what you see in Custom Menus in Google Apps, and then I added a line to insert the current date and time at the cursor using the format "Sun Mar 13 2016 10:40:33 GMT-0400 (EDT)." I'd prefer the terser ISO 8601 format, and I found a function to convert it, but the function wants to know what time zone you're in, and the simpler Date() function that creates the more verbose form already knows.
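For the ISO 8601 point, one way around having to name a time zone is to let the Date object supply its own offset. The following is a minimal sketch in plain modern JavaScript, not the author's script; the function name isoWithOffset and its optional offsetMinutes parameter are made up for this example, and the arrow function and padStart would need a runtime that supports them.

```javascript
// Format a Date as ISO 8601 with a numeric UTC offset, e.g. 2016-03-13T10:40:33-04:00.
// offsetMinutes is minutes east of UTC; by default it comes from the Date itself.
function isoWithOffset(date, offsetMinutes) {
  if (offsetMinutes === undefined) {
    // getTimezoneOffset() returns minutes west of UTC, so flip the sign.
    offsetMinutes = -date.getTimezoneOffset();
  }
  const pad = n => String(n).padStart(2, '0');
  // Shift the instant by the offset, then reuse the UTC formatter for the fields.
  const shifted = new Date(date.getTime() + offsetMinutes * 60000);
  const abs = Math.abs(offsetMinutes);
  const sign = offsetMinutes < 0 ? '-' : '+';
  return shifted.toISOString().slice(0, 19) + sign + pad(Math.floor(abs / 60)) + ':' + pad(abs % 60);
}

// The example timestamp from the text, expressed as a UTC instant with a -4 hour offset.
const example = new Date(Date.UTC(2016, 2, 13, 14, 40, 33));
console.log(isoWithOffset(example, -240)); // 2016-03-13T10:40:33-04:00
```

Inside Apps Script itself, another option that may be worth trying is passing Session.getScriptTimeZone() as the time zone argument to Utilities.formatDate instead of a hard-coded zone name.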

When I read something on my tablet and I'm taking notes, I often paste blocks of text into a Google Docs document. To remember which parts are large verbatim blocks of someone else's writing, I enclose them in <blockquote></blockquote> tags. My second new menu item inserts this string and then moves the cursor between those tags so that if I have something in my copy-paste buffer I can just paste it right there. The "utilities" menu that I added also demonstrates how to add a menu separator and a submenu that pops up a message box.

The code is all shown below. If I want to share these features across multiple documents, to be honest, the simplest way I've found is to paste this code into the script editor for each of the other documents. This is not, if I may string together some buzzwords, a scalable code maintenance solution.

These are known as "bound" scripts because they're bound to specific documents. You can also create standalone scripts, which I hoped would be a way to store shared code that could be referenced from multiple documents, but you actually run them independently of the documents to perform tasks that are not tied to any specific document such as, in the example on that page, searching Google Drive for documents meeting certain conditions.

If you have a script that adds choices to a document and you want to use it from multiple documents, you must publish it. As the Publishing an Add-on web page says,

Publishing add-ons allows them to be used by other users in their own documents. Public add-ons require a review before publication, although if you are a member of a private Google Apps domain, you can publish just for users within your domain without a review. You can also publish an add-on for domain-wide installation, which lets a domain admins find [sic], authorize and install your add-on on behalf of all users within their domain.

There's even an add-on store with offerings available from some recognizable brand names.

I never did find a way to create a single script that I could share among my own documents without going through some approval process. In an even greater disappointment, I found that the menu I created was not available when editing that same document on my phone or tablet, which was much of the point of creating it. In other words, this part of Google Apps Script doesn't work with Google apps.

Still, skimming the Apps Script Reference for available methods to call when customizing for Google Docs, spreadsheets, calendars, and more shows that there's a lot to play with, and I didn't even try a standalone script. If this ever works on phones and tablets, I will definitely be digging back into the reference material.

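function onOpen() {
  var ui = DocumentApp.getUi();
  // Or SpreadsheetApp or FormApp.
  ui.createMenu('utilities')
      .addItem('timestamp', 'insertTimestamp')
      .addItem('blockquote', 'insertBqTags')
      .addSeparator()
      .addSubMenu(ui.createMenu('Sub-menu')  // submenu label is assumed
          .addItem('Second item', 'menuItem2'))
      .addToUi();
}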

function insertTimestamp() {
  var doc = DocumentApp.getActiveDocument();
  // The following gives me ISO format, which I prefer, but unlike Date(),
  // needs to be told the time zone
  // var timestamp = Utilities.formatDate(new Date(), "EDT", "yyyy-MM-dd'T'HH:mm:ss");
  var timestamp = String(new Date());  // insertText() expects a string
  // https://developers.google.com/apps-script/reference/document/document#getcursor
  // has error-checking code for the following that would make it more robust.
  var cursor = doc.getCursor();
  cursor.insertText(timestamp);
}

function insertBqTags() {
  var doc = DocumentApp.getActiveDocument();
  var cursor = doc.getCursor();
  var insertedText = cursor.insertText("<blockquote></blockquote>");
  // Offset 12 is the length of "<blockquote>", so the cursor lands between the tags.
  var position = doc.newPosition(insertedText, 12);
  doc.setCursor(position);
}

function menuItem2() {
  DocumentApp.getUi() // Or SpreadsheetApp or FormApp.
     .alert('You clicked the second menu item!');
}
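The comment in insertTimestamp mentions error checking that would make the cursor handling more robust. A minimal sketch of that idea follows; the helper name insertAtCursor is made up, and it takes the document and UI objects as parameters (rather than calling DocumentApp directly) so that the logic can be exercised outside of Apps Script with stand-in objects. It relies on getCursor() returning null when there is no cursor and insertText() returning null when text can't be inserted at that spot.

```javascript
// Insert text at the cursor, alerting the user instead of failing when that isn't possible.
// Returns the inserted Text element, or null if nothing was inserted.
function insertAtCursor(doc, ui, text) {
  var cursor = doc.getCursor();
  if (!cursor) {
    ui.alert('Cannot find a cursor in the document.');
    return null;
  }
  var inserted = cursor.insertText(text);
  if (!inserted) {
    ui.alert('Cannot insert text at this cursor location.');
    return null;
  }
  return inserted;
}

// In a bound script this could be called as:
// insertAtCursor(DocumentApp.getActiveDocument(), DocumentApp.getUi(), String(new Date()));
```

Both menu functions could then share this helper instead of repeating the cursor lookup.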

