ContextMiner is a framework for collecting, analyzing, and presenting contextual information along with data. It is based on the idea that, when describing or archiving an object, contextual information helps to make sense of that object or to preserve it better. This website provides tools to collect data, metadata, and contextual information from the Web through automated crawls. At present, ContextMiner supports automated crawls of blogs, YouTube, Flickr, Twitter, and the open Web. It also collects inlink information for YouTube videos from the Web. Additional sources will continue to be added. The initial development of ContextMiner was supported by NSF grant #IIS 0455970.
ContextMiner helps you (1) run automated crawls on various social media sources on the Web and collect data as well as contextual information, (2) analyze and add value to collected data and context, and (3) monitor digital objects of interest. The following is a typical workflow for using ContextMiner:
  1. Start a new campaign based on some story, concept, or an object.
  2. Choose the sources (Web, blogs, YouTube, Twitter, Flickr) on which you want ContextMiner to run your searches and crawls.
  3. Once you provide all the required parameters, ContextMiner can immediately start running your campaign. You can access all your campaigns and collected data as well as contextual information through this website.
  4. You can manipulate individual items as well as related items that are collected by the above processes to add your interpretation and meaning to the campaign.
Creating a campaign
You can create a new campaign in ContextMiner with just two easy steps.
  1. Enter a name for your campaign.
  2. Select the sources to collect data and contextual information from and enter your queries or URLs. At present, ContextMiner supports YouTube, blogs, Flickr, Twitter, and collecting inlinks from the Web.
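The two steps above amount to a small piece of configuration: a campaign name plus, for each source, a list of queries or URLs. The following is a minimal sketch of that idea in Python. It is not ContextMiner's actual data model or API; the `Campaign` class, the `add_query` method, and the lowercase source names are hypothetical and used only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Source names are taken from the list above; the identifiers are hypothetical.
SUPPORTED_SOURCES = {"web", "blogs", "youtube", "flickr", "twitter"}


@dataclass
class Campaign:
    """Hypothetical stand-in for a campaign: a name plus queries per source."""
    name: str
    # Maps a source name to the queries or URLs to run on that source.
    queries: Dict[str, List[str]] = field(default_factory=dict)

    def add_query(self, source: str, query: str) -> None:
        """Attach a query or URL to one of the supported sources."""
        source = source.lower()
        if source not in SUPPORTED_SOURCES:
            raise ValueError(f"unsupported source: {source}")
        self.queries.setdefault(source, []).append(query)


# Step 1: name the campaign. Step 2: pick sources and enter queries.
campaign = Campaign(name="Election coverage")
campaign.add_query("YouTube", "election debate")
campaign.add_query("Twitter", "election debate")
```

Keeping queries grouped by source mirrors the way the site lets you manage each query separately for each source.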
Working with collected data and contextual information
Once a campaign is created, ContextMiner can start collecting data and contextual information almost immediately. The owner of the campaign can then start working with the collected items. You can access your running campaigns through a web-based interface. The following actions are available for a campaign created with ContextMiner.
  1. View the up-to-date information about each of your campaigns.
  2. Pause, resume, or delete a campaign.
  3. View the objects collected from a given source (Web, blogs, YouTube, Flickr, Twitter) for a given campaign.
  4. Pause, resume, or delete a query.
  5. Add a new query for any of the sources.
  6. Provide judgments (relevant, non-relevant, neutral) to collected data items or contextual information items.
  7. Delete a collected data item or contextual information item.
  8. Export the campaign with all of its data in XML or CSV format.
New in Beta 3.3
  • Added support for collecting videos without searching for YouTube channels.
  • Fixed a bug that could have allowed unauthorized users to access your campaign data.
New in Beta 3.2
  • More standardized export functionalities.
  • More options for Twitter search.
  • Bug fixes, including XML formatting while exporting campaign data.
New in Beta 3.1
  • Increased compatibility with Internet Explorer.
  • Improved export function. XML output is now more standards-based and easier to use.
  • Ability to get top 10/25/100 most commented or viewed videos from the YouTube collection.
  • Ability to extract metadata as well as user-generated content (e.g., comments) for a YouTube video using InfoExtractor.
  • Extracting additional details about a YouTube author/channel using InfoExtractor.
  • Resolved several issues, including the one that prevented data collection from Flickr.
New in Beta 3
  • Support for collecting inlinks for any URL from the Web.
  • Support for collecting data from Flickr by running queries.
  • Ability to export campaign data in CSV, in addition to XML format.
  • Completely redesigned user interface.
  • Re-engineered back-end, making the whole system much more efficient and scalable.
  • Ability to change any of the options for a campaign (excluding the name) at any time.
  • Inlinks information for YouTube can now be accessed right from the 'View Campaigns' page.
  • Improved display for lists and items for each source.
  • Bug fixes and other enhancements.
New in Beta 2
  • Support for Twitter as a source. What if you already have a campaign running? Go to your 'View Campaigns' page, select 'Description' of that campaign, and select 'Twitter' also as a source. Then click on the 'Parameters' for that campaign, where you should now have options for 'Twitter' too. Finally, go to your 'Queries' page for that campaign and select which queries to run on Twitter. FYI, these queries on Twitter are run at the same time they are run on YouTube (possibly, every day).
  • Support for exporting your campaign data. To export your data in XML format, go to your campaigns page, select the campaign that you want to export, and choose the 'Export' option from the actions list.
  • Support for downloading YouTube videos on your client.
  • Many other bug fixes and improvements.
  • New policy: starting March 1, 2009, any campaign that has been inactive for one month will be paused, and any campaign that has been inactive for three months will be deleted along with all its data. To keep your campaigns active, you need to log in at least once every month. You will be emailed a warning two weeks before your campaign is paused or deleted.
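The inactivity policy in the last item above can be worked through with a little date arithmetic. The sketch below is only an illustration, not ContextMiner's implementation: it assumes that "one month" and "three months" mean 30 and 90 days, that inactivity is measured from your last login, and that a warning goes out 14 days before each deadline. The function names are hypothetical.

```python
from datetime import date, timedelta

# Assumptions: one month = 30 days, three months = 90 days.
PAUSE_AFTER = timedelta(days=30)
DELETE_AFTER = timedelta(days=90)
WARNING_LEAD = timedelta(days=14)  # "two weeks before"


def campaign_status(last_login: date, today: date) -> str:
    """Return 'active', 'paused', or 'deleted' under the assumed policy."""
    inactive = today - last_login
    if inactive >= DELETE_AFTER:
        return "deleted"
    if inactive >= PAUSE_AFTER:
        return "paused"
    return "active"


def warning_dates(last_login: date) -> tuple:
    """Dates on which the pause warning and the deletion warning would be sent."""
    pause_warning = last_login + PAUSE_AFTER - WARNING_LEAD
    delete_warning = last_login + DELETE_AFTER - WARNING_LEAD
    return pause_warning, delete_warning


last_login = date(2009, 3, 1)
print(campaign_status(last_login, date(2009, 3, 20)))  # active (19 days)
print(campaign_status(last_login, date(2009, 4, 5)))   # paused (35 days)
print(campaign_status(last_login, date(2009, 6, 1)))   # deleted (92 days)
print(warning_dates(last_login))
```

Under these assumptions, someone who last logged in on March 1, 2009 would be warned about pausing on March 17 and about deletion on May 16.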
New in Beta 1.5
  • Now when you create a new campaign with ContextMiner, you can select "blogs" also as a source to run your queries on. To actually make your queries run on blogs (or YouTube), you need to go to your 'Queries' page for a given campaign, and select those sources there. This also allows you to manipulate individual queries.
  • What if you already have a campaign running? Go to your 'View Campaigns' page, select 'Description' of that campaign, and select 'Blogs' also as a source. Then click on the 'Parameters' for that campaign where you should now have options for 'Blogs' too. Finally, go to your 'Queries' page for that campaign and select which queries to run on blogs. FYI, (1) blog search is run on Google Blog Search, and (2) these queries on blogs are run at the same time they are run on YouTube (possibly, every day).