Development History

From Clairlib
Revision as of 11:32, 11 September 2012 by Clairlibwikiadmin


1.08 August 2009

  • Updated Clair::SyntheticCollection to generate synthetic documents based on 1- to 4-grams.
  • Modified extract N-grams to optionally use CMU-LM.
  • Updated to fully utilize Clair::SyntheticCollection.
  • Fixed some Tokenizer issues.
  • Added to the utilities.
  • Added summarize_collection to the utilities.
  • Added to the utilities.
  • Added to the utilities.
  • Added
  • Added to the utilities.
  • Added to the utilities.
  • Added Clair::RandomWalk for random walk on graphs.
  • Added Clair::Harmonic for computing harmonic functions based on the relaxation and Monte Carlo methods.
  • Added to the utilities.
  • Added to the utilities.
  • Added to the utilities.
  • Fixed a bug in the crawling code.
  • Added new tutorials.
  • Added new sections to the documentation.
  • Added Clair::Bio::GIN for gene interaction extraction.
  • Added an interface to the Stanford parser in Clair::Utils::Parse.
  • Added to the utilities.
  • Added to the utilities.

1.07 June 2009

  • Added Clair::Network::Spectral for spectral partitioning using the Fiedler vector.
  • Made Clairlib independent of MEAD (MEAD is no longer required for Clairlib).
  • Added Naive Bayes learning and classification.
  • Added tests for feature extraction, learning, classification.
  • Fixed a bug in Clair::Cluster::create_lexical_network().
  • Added sampling options to Clair::Cluster.
  • Added "No IDF" option and sampling capabilities to utility.
  • Fixed documentation typos.
  • Added new tutorials to the documentation.
  • Fixed a bug in Clair::Utils::CorpusDownload.
  • Added 'manual weights' option to the make_synth_collection utility.
  • Fixed a bug in extract_ngrams.

1.06 March 2009

  • Added Clair::Network::FordFulkerson
  • Added to the utilities.
  • Added new scripts to interface with the ACL Anthology Network.
  • Fixed a bug in split_into_sentences() of Clair::Document

1.05 July 2008

  • Fixed formatting bugs in
  • Added get_predecessor_matrix() function in
  • Added get_shortest_path() function in
  • Added script
  • Added script
  • Added --ignore-isolated-nodes in
  • Added several options to a. --self-loop,
  • Completed the descriptions of added the note of --force into usage.
  • Added , under util folder

1.04B June 2008

  • Added -no-duplicated-edges in
  • Added largest connected component in
  • Added full average shortest path in
  • Fixed divide-by-zero error in

1.04A April 2008

  • Added Clair::Network::GirvanNewman algorithm to do hierarchical clustering
  • Added Clair::Network::KernighanLin algorithm to do graph partition

1.04 February 2008

  • Added Clair::Network::AdamicAdar to compute the Adamic/Adar value for a given network corpus
  • Added Clair::ChisqIndependent to compute the p-value and degrees of freedom for the chi-square test

1.03 August 2007

  • Added functionality to perform community finding within weighted, undirected networks
  • Added util/chunk\ to break documents into smaller files by word number
  • Added option to retain punctuation for idf and tf queries
  • Added option to print out full lists of idf and tf values for a corpus
  • LexRank moved from Clair::Network to Clair::Network::Centrality::LexRank
  • LexRank use now follows the same use pattern as the other centrality modules

1.02 July 2007

  • Distribution reorganized in standard format
  • Improved and expanded installation documentation (INSTALL)
  • Improved POD (inline) documentation
  • Additional examples
  • Updated PDF documentation

1.01 May 2007

  • Added Phrase-based Retrieval and Fuzzy OR Queries
  • Extended Clairlib-ext with interfaces for the Cluster class and the Document class to the Weka machine learning toolkit
  • Added LSI functionality
  • Extended parsing of strings / files into Documents
  • Added perceptron learning and classification for documents

1.0 RC1 April 2007

  • Moved all Clair modules beneath the Clair::* namespace, updated documentation
  • Improved Network Analysis, added Clustering Coefficients code
  • Added Network Generation and Statistics modules

0.955 March 2007

  • Made it possible to distribute Clairlib in two distributions, one containing core code and another containing code that may depend on other resources
  • Cleaned up unit tests

0.953 February 2007

  • Fixed bugs in Clair::Cluster, Clair::Document involving stemming
  • Cleaned up t/ and test/ directories
  • Created util/ directory
  • Added scripts to util/ directory to:
    • Run a Google query and save the returned URLs to a file
    • Download files from a URL and build a corpus
    • Segment a document into sentences and build a corpus of the sentences
    • Take all documents in a directory and create a corpus
    • Index the corpus (compute TF*IDF, etc.)
    • Compute cosine similarity measures between all documents in a corpus
    • Generate networks corresponding to various cosine thresholds
    • Print network statistics about a network file
    • Generate plots of degree distribution and cosine transitions
  • New methods in Clair::Network:

