
Visualizing the Creative Commons licenses of webpages you’ve visited: data to visual in 10 minutes

Over the last few months, I’ve been drawn towards looking for ways people can create visualizations of data they own that can help them better understand their participation and habits on the Web. In my own research, I’m using a free web-based data visualization tool to understand how grantees of the TACT program are reporting their experience working with OER.

The app I’m referring to is called RAW, a free browser-based app that uses JavaScript (d3.js) to let just about anyone visualize datasets in a variety of graph types. It’s being developed by a group called Density Design, based in Milano. And if you’re using the OpenAttribute browser plugin, you already have a dataset worth taking a look at inside their app.

OpenAttribute

While checking a website for a proper machine-readable license with OpenAttribute earlier this week (by clicking on the attribution man in the address bar), I noticed a button labeled “More Data” in the dialogue box that dropped down.


When I clicked it, the attribution and license metadata history from CC-licensed pages I’ve visited was displayed as text, in the following format:




To be honest, I wasn’t aware that this history was kept by the plugin, and it made me wonder if that was mentioned in the documentation. But what the heck, it might turn out to be an interesting dataset, and it’s ready to be run through the RAW visualization app as-is.


Select all the text from the URL,LICENSE,NAME,ATTRIBUTION_URL,AUTHOR line downward (including that line), copy it, and navigate to the RAW app:


Once there, paste in the chunk of attribution metadata text that you just copied from OpenAttribute. The first row will be read as the variables (column headers), and the lines beneath it as data points/values. It’s simpler than it sounds.


A notice beneath my data says I have 42 records that have been parsed. Cool.
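If you’d like to sanity-check the pasted text yourself, the header-then-rows reading described above can be sketched in a few lines of Python. This is only an illustration, not how RAW does it internally. The header line is the one OpenAttribute shows; the two rows beneath it, including the form of the license values, are invented for the example.

```python
import csv
import io

# Only the header line comes from OpenAttribute; the rows are hypothetical
# placeholders, and the real LICENSE values may look different.
SAMPLE = """URL,LICENSE,NAME,ATTRIBUTION_URL,AUTHOR
http://example.org/a,http://creativecommons.org/licenses/by/3.0/,Page A,http://example.org/a,Alice
http://example.org/b,http://creativecommons.org/publicdomain/zero/1.0/,Page B,http://example.org/b,Bob
"""


def parse_records(text):
    """Treat the first row as column headers and each later row as one record."""
    reader = csv.DictReader(io.StringIO(text))
    return list(reader)


records = parse_records(SAMPLE)
print(len(records), "records parsed")
print(sorted(records[0].keys()))
```

Swap your own copied text in for SAMPLE and the count printed should match the number RAW reports beneath the paste box, assuming none of your fields contain stray commas or line breaks.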

Next, RAW will suggest chart and graph types that might be more useful for visualizing your data. For this tutorial, I’m using “Circle Packing”:



As a final step, you will need to tell RAW which variables should represent which dimensions on the graph.

For my first visualization of these data, I told RAW to map the dimensions as follows (which was done by dragging-and-dropping values on the left into the dimension boxes):

hierarchy: license

color: license
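Mapping both dimensions to license amounts to grouping the records by license. As a rough sketch, and assuming each circle’s size reflects how many records share a license (my reading of the result, not something I’ve confirmed in RAW’s code), the tally looks like this in Python. The license labels here are hypothetical stand-ins for whatever is in your LICENSE column.

```python
from collections import Counter

# Hypothetical license values, one per visited page; in practice these
# would be read from the LICENSE column of the OpenAttribute text.
licenses = ["CC BY", "CC BY-SA", "CC BY", "CC0", "CC BY"]


def license_counts(values):
    """Count how many records carry each license."""
    return Counter(values)


counts = license_counts(licenses)
for name, n in counts.most_common():
    print(name, n)
```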


And straight away, we have a visualization of the license types OpenAttribute has detected in my browsing history, grouped, sized, and colored in a way that speaks to us:


Looking at this visual (versus looking at a table of words and numbers), it’s possible to get an immediate sense of how the websites you’ve visited are licensed. RAW allows some color customization, and exports the graphic and data as PNG, SVG, and JSON so you can take them with you in whatever way is useful. If you’re feeling ambitious, RAW’s code is up on GitHub, and they’re taking pull requests for visual design and functional contributions to the project. I imagine it’s also possible to use the RAW API to hook together multiple, dynamic datasets like this, but I don’t know that it’s been done with data relating to licenses. Would having and visualizing this data be useful to everyone? Probably not. But the method at work here might be.
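If you want to carry the grouped data into your own d3.js experiments, one option is to write it out as nested JSON. The sketch below uses the name/children/value shape that d3’s hierarchy layouts commonly consume; I’m not claiming it matches the JSON that RAW itself exports. The counts are hypothetical, and to_hierarchy is a helper name I made up for this example.

```python
import json

# Hypothetical per-license tallies; replace with counts from your own data.
counts = {"CC BY": 3, "CC BY-SA": 1, "CC0": 1}


def to_hierarchy(counts, root_name="licenses"):
    """Build a one-level tree: a root node with one child per license."""
    children = [
        {"name": name, "value": n}
        # Largest groups first; ties broken alphabetically for stable output.
        for name, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    ]
    return {"name": root_name, "children": children}


tree = to_hierarchy(counts)
print(json.dumps(tree, indent=2))
```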

To recap, you’re taking a dataset that you already have (assuming you’re using OpenAttribute, which you should be!). After a handful of clicks, copying and pasting some text, and choosing the type of graph you’d like to use, you get a visualization that can tell stories about where you’ve traveled on the Web, across the CC-flavored sites anyway. If I wrote this clearly enough, you should be able to make your own visualization using this app in no more than 10 minutes.

I’m always a fan of things that put control back in the hands of users, or at the very least offer users a chance to build more context around their interaction with technology. Using simple methods like this can put data to work for you, which is something most people wouldn’t mind happening. (Side note: I’m particularly proud of the bubble representing sites marked CC0, which is as open as it gets for content.)

Lastly, here’s the initial tweet and graph I made earlier from a larger (125 entries) dataset from OpenAttribute, which prompted this writeup:


How many records do you see in your OpenAttribute plugin? If you have entries in the hundreds, or even thousands, I’d like to see similar graphs of your data, and graphs that map the data dimensions differently. Feel free to leave a comment with a link to your graph, or tweet at me.

