From ADL Team Member… Nikolaus Hruska: How The Semantic Web Benefits Education

Nikolaus Hruska

Nikolaus Hruska is a software engineer and researcher with over ten years of experience. He is currently working on the Next Generation SCORM project investigating new approaches to learning. He produces prototypes using emerging technologies to enable new learning scenarios such as mobile learning, simulations, and serious games.

As a contractor with Problem Solutions, Nikolaus provides support to the Advanced Distributed Learning (ADL) Initiative. The views expressed are those of the author and do not necessarily represent the views or policies of ADL.

Who likes chocolate chip cookies?  I sure do. I never have my mom around to remind me of her recipe, so I’m often searching for something that approximates her cooking – because everyone knows “you can’t beat mom’s cooking.”

A Quick Example

If you search for ‘chocolate chip cookies’ on Google, you’ll see some filtering options in your results – Ingredients, Cook Time and Calories. I’m allergic to walnuts, so I’m going to deselect ‘Walnuts’ from the ingredients. I’ll select a cook time of ‘Less than 15 minutes’ since I’m hungry right now, and since I’m on a diet, I’ll select ‘Less than 300 calories’. My search has now been refined to display only those recipes that fit my dietary needs and preferences.

Some magic occurred behind the scenes when we found our perfect cookie recipe.  Not only did Google know ‘chocolate chip cookies’ was a recipe, it also knew that all recipes have ingredients, cook times, and calories. But how does Google appear to know this information? The answer is – the Semantic Web.

What is the Semantic Web?

The Semantic Web allows web pages to contain the information (metadata) necessary to describe themselves, which enables search engines to offer filters that make our recipe searches more interactive. This is quite different from the approach used in SCORM, which puts the web page’s metadata in a separate file. All of the recipes returned by my search used the same method (semantic web standards) to list their ingredients, cook time and calories – hidden within the HTML code of the web page. There are several web standards working behind the scenes to make these self-describing recipes work.

How does it work?

Semantic data is hidden from the end user since it doesn’t show up on the visible page. If we use Google’s ‘Rich Snippets Testing Tool’, we can extract and examine the semantic data from the document. When we look at our recipe using the tool, we can see exactly how the page describes itself for search engines, as well as which specific semantic web standards are in use.  My chocolate chip cookie recipe makes use of the following methods to describe the content of the document:

This is how Google appears to know that the document is a recipe with embedded ingredients, cook time, and calories for me to filter my search. In fact, it also knows that the cookies contain 3 g of saturated fat, 8 g of sugar and 2 g of protein. I bet we could mine that detailed information in a dietary management app (more on this in a minute)…
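To make the idea of a self-describing page concrete, here is a minimal sketch of how a program might pull embedded properties out of a recipe page. The markup in RECIPE_HTML is hypothetical (it is not the actual recipe from my search), the property names are assumed for illustration, and nested items such as nutrition data are left out to keep it short. It uses only Python's standard html.parser module and is in no way how Google does it.

```python
from html.parser import HTMLParser

# Hypothetical recipe markup; the itemprop names are assumptions for illustration.
RECIPE_HTML = """
<div itemscope itemtype="http://schema.org/Recipe">
  <h1 itemprop="name">Chocolate Chip Cookies</h1>
  <meta itemprop="cookTime" content="PT12M">
  <span itemprop="recipeIngredient">flour</span>
  <span itemprop="recipeIngredient">chocolate chips</span>
</div>
"""


class MicrodataParser(HTMLParser):
    """Collects flat itemprop values from a single item (no nested items)."""

    def __init__(self):
        super().__init__()
        self.item_type = None      # value of the first itemtype attribute seen
        self.properties = {}       # property name -> list of values
        self._current_prop = None  # property waiting for its text content

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs and self.item_type is None:
            self.item_type = attrs.get("itemtype")
        prop = attrs.get("itemprop")
        if prop is None:
            return
        if "content" in attrs:
            # The value is carried in the attribute, e.g. on a <meta> tag.
            self.properties.setdefault(prop, []).append(attrs["content"])
        else:
            # The value is the text that follows the opening tag.
            self._current_prop = prop

    def handle_data(self, data):
        text = data.strip()
        if self._current_prop and text:
            self.properties.setdefault(self._current_prop, []).append(text)
            self._current_prop = None


def extract_microdata(html):
    """Return (item type, properties) found in the given HTML string."""
    parser = MicrodataParser()
    parser.feed(html)
    return parser.item_type, parser.properties


item_type, props = extract_microdata(RECIPE_HTML)
print(item_type)
print(props)
```

Once the ingredients and cook time are available as plain data like this, filtering on them is an ordinary lookup rather than guesswork about the visible text.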

Another important point – when I share this recipe with my friends on Facebook or Google+, the Open Graph metadata contained within the recipe dictates the title, description and image that are displayed with the link in my status update. This is quite telling of the widespread adoption of Open Graph across the web today. Ever tried to share a link that doesn’t show a preview image when posted? That’s a site that isn’t identifying itself with Open Graph. The moral of the story… You may indeed have THE best cookie recipe in the world, but if you don’t use semantic web standards, some other recipe will go viral on Facebook or Pinterest because its page is described using Open Graph and it has a great picture and link title. For more info on Open Graph, see my colleague Tom Creighton’s article Using Open Graph to Describe Learning Resources.
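As a rough illustration of the link preview behavior described above, the sketch below reads the og:title, og:description and og:image meta tags from a page and builds the preview a sharing site might show. The two sample pages, the example.com URL and the link_preview helper are made up for this example. Real sites have their own fallback rules when the tags are missing, which this sketch does not attempt to copy.

```python
from html.parser import HTMLParser

# Hypothetical pages: one describes itself with Open Graph, one does not.
PAGE_WITH_OG = """
<html><head>
<meta property="og:title" content="Mom's Chocolate Chip Cookies">
<meta property="og:description" content="Soft cookies in under 15 minutes.">
<meta property="og:image" content="http://example.com/cookies.jpg">
</head><body></body></html>
"""
PAGE_WITHOUT_OG = "<html><head><title>Cookies</title></head><body></body></html>"


class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..." content="..."> pairs."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        prop = attrs.get("property") or ""
        if prop.startswith("og:") and "content" in attrs:
            self.tags[prop] = attrs["content"]


def link_preview(html):
    """Return the title, description and image a shared link could display."""
    parser = OpenGraphParser()
    parser.feed(html)
    return {
        "title": parser.tags.get("og:title"),
        "description": parser.tags.get("og:description"),
        "image": parser.tags.get("og:image"),
    }


print(link_preview(PAGE_WITH_OG))
print(link_preview(PAGE_WITHOUT_OG))  # every field is None: nothing to preview
```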

What are we waiting for?

Why is it easier to meet my dietary needs and preferences after almost 20 years of having the Internet than it is to meet my learning needs and preferences? Schema.org (Google, Bing, and Yahoo’s collaboration to take search to the next level) has already started working on semantic web standards across industries (eCommerce, social media, news outlets). Let’s take a look at the recipe specification at schema.org. Wow! They are allowing for MUCH more detailed semantic information – in addition to the ingredients, preparation time and calories. How can we leverage semantic web technologies to feed our brains the right diet?

One of the groups working on the eLearning industry’s piece of schema.org is The Learning Resource Metadata Initiative (LRMI). The LRMI is a joint effort between The Association of Educational Publishers and Creative Commons. According to the LRMI website, they aim to “make it easier to publish and discover quality educational content and products online.” Essentially, they are working to create a common eLearning vocabulary for schema.org so that we can start embedding rich semantic data into our learning resources.

How might search be improved with the work being done by the LRMI? While searching for ‘Math Word Problems’, you could be presented with filters for Grade, Level of Difficulty, Prerequisites, Subject, and whether the content was authored for use by teachers or students. Selecting grade 8, medium difficulty, algebra prerequisite, and ‘student use’ will display only those math resources which meet your learning needs and preferences.
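The filtering scenario above can be sketched in a few lines once each resource carries its own description. The field names (grade, difficulty, prerequisites, audience) and the three sample resources are invented for this illustration. They are not the actual LRMI or schema.org property names, and a real search engine would work very differently.

```python
# Hypothetical learning resources, each described by a small set of facets.
RESOURCES = [
    {"title": "Train Speed Word Problems", "grade": 8, "difficulty": "medium",
     "prerequisites": ["algebra"], "audience": "student"},
    {"title": "Fraction Word Problems", "grade": 5, "difficulty": "easy",
     "prerequisites": [], "audience": "student"},
    {"title": "Grading Guide for Word Problems", "grade": 8, "difficulty": "medium",
     "prerequisites": ["algebra"], "audience": "teacher"},
]


def filter_resources(resources, **facets):
    """Keep only the resources that match every selected facet."""

    def matches(resource):
        for key, wanted in facets.items():
            value = resource.get(key)
            if isinstance(value, list):
                # List-valued facets match when they contain the wanted value.
                if wanted not in value:
                    return False
            elif value != wanted:
                return False
        return True

    return [r for r in resources if matches(r)]


results = filter_resources(
    RESOURCES, grade=8, difficulty="medium",
    prerequisites="algebra", audience="student",
)
for resource in results:
    print(resource["title"])
```

The point is that the filter never has to guess: every facet it checks was stated by the resource itself.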

How does this fit into Next Generation SCORM?

In my previous article Using Activity Streams in Next Generation SCORM, I described how the Activity Stream specification fits into the picture. Semantic web standards can help systems reporting via activity streams determine the type of object in the <actor><verb><object> statement. Based on the semantic information contained in the <object> web page, the activity stream can be reported with a contextually appropriate verb to adequately describe my interaction(s) with the recipe (e.g., Nikolaus baked cookies… and then… Nikolaus ate cookies). More importantly, since semantic data can be understood by computers, content can be harvested, filtered and acted upon by artificial intelligence and recommendation engines.
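Here is one way the verb selection described above might look in code. The mapping from object type to verb, the default verb and the dictionary layout are all assumptions made for this sketch. They do not follow the Activity Streams specification or any Next Generation SCORM format.

```python
# Hypothetical mapping from an object's declared type to a fitting verb.
VERBS_BY_TYPE = {
    "http://schema.org/Recipe": "baked",
    "http://schema.org/Article": "read",
}


def build_statement(actor, object_type, object_name, default_verb="experienced"):
    """Build an <actor><verb><object> statement, choosing the verb by object type."""
    verb = VERBS_BY_TYPE.get(object_type, default_verb)
    return {
        "actor": actor,
        "verb": verb,
        "object": {"type": object_type, "name": object_name},
    }


def describe(statement):
    """Render a statement as a short sentence."""
    return f'{statement["actor"]} {statement["verb"]} {statement["object"]["name"]}'


statement = build_statement("Nikolaus", "http://schema.org/Recipe", "cookies")
print(describe(statement))
```

An object whose type is not in the mapping falls back to the generic default verb, which is roughly what a reporting system is stuck with when a page does not describe itself.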

To stick with the food analogy, imagine a dietary management app in which I enter my diet and exercise plan. It takes into account the meals I’ve previously eaten and my personal preferences (such as my dislike of Mexican food) to offer recipes based on my tastes (think Amazon recommendations). Of course the mobile version of the app uses my real-time exercise data so that I get the correct balance of calories and exercise (FYI – I just finished a 5-mile run according to the sensors in my running shoes). Pair that app with my smart refrigerator (a ‘Personal Assistant for Eating’, if you will) which determines my nutritional (learning) gaps and suggests recipes based on a composite view of my caloric intake, exercise and diet plan. My fridge would take into account the ingredients I have on hand (or fetch ingredients online) to precisely prepare each meal in my well-balanced diet. Do you smell what I’m cooking?

Whether it’s cookie recipes or educational material, the web is flush with content carrying semantic information that our search engines can increasingly make sense of, helping us filter the mass of available content down to the morsels we need.

Tags: experience api, Next Generation SCORM, Tin Can, tin can api, tla | April 24th, 2012 | Posted in Blog Post