November 17, 2006
Site and feed changes coming soon
We’re going to upgrade access to our research in various cool ways in the near future.
Right now, please bear with me in what is essentially a test post. In theory, I’ve switched the feeds here over to Feedburner. Now I’m going to test if that really has happened.
EDIT: That didn’t work. I’m going to put things back the way they were.
Categories: About this blog
Comments
One Response to “Site and feed changes coming soon”
Hi everybody!
TermExtractor, my master's thesis, is online at the
address http://lcl2.di.uniroma1.it.
TermExtractor is a FREE and high-performing software package for Terminology
Extraction. The software helps a web community
extract and validate the relevant terms of its
domain of interest, by submitting an archive of
domain-related documents in any format
(txt, pdf, ps, dvi, tex, doc, rtf, ppt, xls, xml,
html/htm, chm, wpd, and also zip archives).
TermExtractor extracts the terminology that is consensually
referred to in a specific application domain. The
software takes as input a corpus of domain documents,
parses the documents, and extracts a list of
“syntactically plausible” terms (e.g. compounds,
adjective-nouns, etc.).
Document parsing assigns greater importance to terms
that appear with distinctive text layout (title, bold, italic,
underlined, etc.). Two entropy-based measures, called
Domain Relevance and Domain Consensus, are then used.
Domain Consensus selects only the terms that are
referred to consistently throughout the corpus
documents. Domain Relevance selects only the terms
that are relevant to the domain of interest; it is
computed with reference to a set of
contrastive terminologies from different domains.
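
To make the two measures more concrete, here is a minimal Python
sketch. It is not TermExtractor's actual code, and the exact formulas
are assumptions: Domain Consensus is modelled as the entropy of a
term's frequency distribution across the corpus documents, and Domain
Relevance as the term's relative frequency in the target domain
divided by its highest relative frequency in any domain (target or
contrastive). The function names and the representation of a document
as a list of candidate terms are hypothetical.

```python
import math


def domain_consensus(term, documents):
    """Entropy of the term's distribution across documents (assumed formula).

    Each document is a list of candidate terms. A term spread evenly over
    many documents scores high; a term confined to one document scores 0.
    """
    counts = [doc.count(term) for doc in documents]
    total = sum(counts)
    if total == 0:
        return 0.0
    entropy = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            entropy -= p * math.log2(p)
    return entropy


def relative_frequency(term, documents):
    """Share of all candidate-term occurrences in a domain that are `term`."""
    total = sum(len(doc) for doc in documents)
    if total == 0:
        return 0.0
    return sum(doc.count(term) for doc in documents) / total


def domain_relevance(term, target_docs, contrastive_domains):
    """Target-domain frequency relative to the best-scoring domain (assumed).

    `contrastive_domains` is a list of corpora, one per contrastive domain.
    The result lies in [0, 1]; 1 means no other domain uses the term more.
    """
    target = relative_frequency(term, target_docs)
    best = max([target] + [relative_frequency(term, d) for d in contrastive_domains])
    if best == 0:
        return 0.0
    return target / best


# Toy data: three documents from a target domain, one contrastive domain.
docs = [
    ["neural network", "training set"],
    ["neural network", "loss"],
    ["neural network", "neural network"],
]
cooking = [["loss", "recipe"]]

print(domain_consensus("neural network", docs))
print(domain_relevance("loss", docs, [cooking]))
```

In this toy corpus "neural network" occurs in every document and so
gets a high consensus score, while "loss" is proportionally more
frequent in the contrastive domain and gets a low relevance score.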
Finally, extracted terms are further filtered using
Lexical Cohesion, which measures the degree of
association among all the words in a terminological
string.
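
As an illustration of what such a filter might look like, the sketch
below scores a multi-word term by comparing how often the whole
string occurs with how often its individual words occur. The formula
(number of words times term frequency times log10 of term frequency,
divided by the sum of the word frequencies) is an assumption chosen
for illustration, not necessarily the one TermExtractor implements;
the function name and the frequency dictionaries are hypothetical.

```python
import math


def lexical_cohesion(term, term_freq, word_freq):
    """Cohesion of a multi-word term (assumed formula, for illustration).

    term_freq maps whole terms to corpus frequencies; word_freq maps single
    words to corpus frequencies. Words that mostly occur together inside
    the term give a high score; words that occur mostly elsewhere give a
    low one.
    """
    words = term.split()
    tf = term_freq.get(term, 0)
    denom = sum(word_freq.get(w, 0) for w in words)
    if tf == 0 or denom == 0:
        return 0.0
    return len(words) * tf * math.log10(tf) / denom


# Toy frequencies (invented for the example).
term_freq = {"neural network": 10, "good network": 10}
word_freq = {"neural": 12, "network": 28, "good": 372}

print(lexical_cohesion("neural network", term_freq, word_freq))
print(lexical_cohesion("good network", term_freq, word_freq))
```

With these invented counts, "neural network" scores higher than
"good network" because "good" appears far more often outside the
term than inside it.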
—
Francesco Sclano
home page: http://lcl2.di.uniroma1.it/~sclano
msn: francesco_sclano@yahoo.it
skype: francesco978