Main Page

The Clair library (Clairlib) is a suite of open-source Perl modules intended to simplify a number of generic tasks in natural language processing (NLP), information retrieval (IR), and network analysis (NA). Its architecture also allows external software to be plugged in with very little effort.

  • Code - Clairlib comprises over 100 modules covering functionality for a wide range of tasks
  • Documentation - full API documentation in PDF and HTML format

Getting Started

Latest Version: Clairlib-Core 1.08 [Sept 2009]

  • Updated Clair::SyntheticCollection to generate synthetic documents based on (1 to 4)-grams.
  • Modified N-gram extraction to optionally use CMU-LM.
  • Updated make_synth_collection.pl to fully utilize Clair::SyntheticCollection.
  • Fixed some Tokenizer issues.
  • Added summarize_document.pl to the utilities.
  • Added summarize_collection to the utilities.
  • Added learn.pl to the utilities.
  • Added classify.pl to the utilities.
  • Added extract_features.pl to the utilities.
  • Added bigrams_to_rand_doc.pl to the utilities.
  • Added make_synth_collection_Menczer.pl to the utilities.
  • Added Clair::RandomWalk for random walk on graphs.
  • Added Clair::Harmonic for computing harmonic functions using the relaxation and Monte Carlo methods.
  • Added random_walk.pl to the utilities.
  • Added harmonic.pl to the utilities.
  • Added directory_to_URL_network.pl to the utilities.
  • Fixed a bug in the crawling code.
  • Added new tutorials.
  • Added new sections to the documentation.
  • Added Clair::Bio::GIN for gene interaction extraction.
  • Added an interface to the Stanford parser in Clair::Utils::Parse.
  • Added tag_genes.pl to the utilities.
  • Added extract_interactions.pl to the utilities.
  • Added Clair::Network::Modularity for computing the partitioning modularity.
  • Added modularity.pl to the utilities.
  • Added Clair::Network::Mincut for Mincut Partitioning.


For full details see the Development page.

More about Clairlib

  • Contribute - ways to contribute to Clairlib
  • Development - learn about Clairlib development
  • FAQ - answers to frequently asked questions
  • Clairlib-dev - mailing list for discussion among Clairlib developers (and users)
  • People - Clairlib developers and contributors
  • Presentation - an introduction to Clairlib (from October 2006)
  • Projects - ideas for student projects using Clairlib
  • NLP - Wikipedia entry on natural language processing
  • If you publish work that uses Clairlib, please acknowledge its creators by citing the following BibTeX entry:
 @techreport{Radev&al.07a,
 author =	 "Radev, Dragomir R. and Hodges, Mark and Fader,
                 Anthony and Joseph, Mark and Gerrish, Joshua and
                 Schaller, Mark and dePeri, Jonathan and Gibson,
                 Bryan",
 title =	 "CLAIRLIB Documentation v1.07",
 institution =	 "University of Michigan. Department of Electrical
                 Engineering and Computer Science",
 pdf =
                 "http://clair.si.umich.edu/~radev/papers/csetr536-07.pdf",
 postscript =
                 "http://clair.si.umich.edu/~radev/papers/csetr536-07.ps",
 year =	 "2007",
 number =	 "CSE-TR-536-07",
 }