Related Work

This page collects papers on topics related to my work. It is composed of two main parts, Information Access and Machine Learning.


Active Learning

Abe N., Mamitsuka H.
Query learning strategies using boosting and bagging
ICML 1998
PDF

Campbell C., Cristianini N., Smola A.
Query Learning with Large Margin Classifiers.
ICML 2000
PDF

Cohn D.A., Atlas L., Ladner R.
Improving Generalization with Active Learning.
ML 92
PDF

Cohn D.A., Ghahramani Z., Jordan M.I.
Active Learning with Statistical Models.
NIPS 96
Postscript

Dagan I., Engelson S.P.
Committee-Based Sampling for Training Probabilistic Classifiers.
ICML 1995
PDF

Dasgupta S.
Analysis of a greedy active learning strategy.
NIPS 2004
PDF

Dasgupta S.
Coarse sample complexity bounds for active learning.
NIPS 2005
PDF

Freund Y., Seung H.-S., Shamir E., Tishby N.
Selective Sampling Using the Query by Committee Algorithm.
ML 1997
PDF

Lewis D., Gale W.
A Sequential Algorithm for Training Text Classifiers
SIGIR 1994
PDF

Long P.M.
Minimum Majority Classification and Boosting
AAAI 2002
PDF

Muslea I., Minton S., Knoblock C.A.
Active+Semi-Supervised Learning = Robust Multi-View Learning.
ICML 2002
PDF

Roy N., McCallum A. K.
Toward optimal Active Learning through Sampling Estimation of Error Reduction.
IJCAI'99
PDF

Schohn G., Cohn D.
Less is more: Active learning with support vector machines.
ICML 2000
Postscript

Park S.-B., Zhang B.-T.
Document Filtering Boosted by Unlabeled Data.
IEEE International Symposium on Industrial Electronics 2001
PDF

Seung H.S., Opper M., Sompolinsky H.
Query by Committee.
Proceedings of the Fifth Workshop on Computational Learning 1992
PDF

Sung K.K., Niyogi P.
Active Learning for Function Approximation.
NIPS'95
PDF

Tong S., Koller D.
Support Vector Machine Active Learning with Applications to Text Classification.
ICML 2K
PDF

Tong S.
Active Learning: Theory and Applications.
Ph.D. 2001
PDF

Vlachos A.
Active Learning with Support Vector Machines.
Master's Thesis, 2004
PDF


Additive models, Bagging and Boosting

Bauer E., Kohavi R.
An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting and Variants.
Machine Learning
Postscript

Breiman L.
Bagging Predictors.
Machine Learning
Postscript

Friedman J., Hastie T., Tibshirani R.
Additive Logistic Regression: a Statistical View of Boosting
Technical Report 1998
PDF Postscript

Grove A.J., Schuurmans D.
Boosting in the limit: Maximizing the margin of the learned ensembles.
AAAI 98
PDF

Iyer R.D.
An Efficient Boosting Algorithm for Combining Preferences.
Master's Thesis 1999
PDF

Lafferty J.
Additive Models, Boosting and Inference for Generalized Divergences
COLT 99
Postscript

Lebanon G., Lafferty J.
Boosting and Maximum Likelihood for Exponential Models
Technical Report 2001
PDF

Mason L., Baxter J., Bartlett P.L., Frean M.
Functional Gradient Techniques for Combining Hypotheses
In Advances in Large Margin Classifiers, Eds. Smola, Bartlett, Schölkopf and Schuurmans 1999
Postscript

Schapire R.E., Freund Y., Bartlett P., Lee W.S.
Boosting the Margin: A New Explanation for the Effectiveness of Voting Methods.
The Annals of Statistics 1998
Postscript

Schapire R.E.
The Strength of Weak Learnability.
ML 1999
PDF

Schapire R.E.
Theoretical views of Boosting.
EuroColt 1999
Postscript

Schapire R.E., Singer Y.
Improved Boosting Algorithms Using Confidence-rated Predictions.
Machine Learning 1999
PDF


Clustering Techniques

Bilmes J.A.
A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models.
Tutorial 1998
PDF

Cadez I., Gaffney S., Smyth P.
A General Probabilistic Framework for Clustering Individuals and Objects.
KDD 2000
PDF

Ding C., He X.
Cluster merging and splitting in hierarchical clustering algorithms.
ICDL'01
PS

El-Yaniv R., Souroujon O.
Iterative Double Clustering for Unsupervised and Semi-supervised Learning
ECML 2001
PS

Fraley C., Raftery A.E.
How many clusters? Which Clustering Method? Answers via Model-Based Cluster Analysis
Technical Report
PDF

Govaert G., Nadif M.
Clustering with block mixture models.
Pattern Recognition 2003
PS

Gondek D., Hofmann T.
Conditional Information Bottleneck Clustering.
IEEE Data Mining 2003
PDF

Haralick R., Harpaz R.
Linear Manifold Clustering
MLDM 2005
PDF

Pelleg D., Moore A.
X-means: Extending K-means with Efficient Estimation of the Number of Clusters
ICML'2K
PDF

Slonim N., Tishby N.
Document Clustering using Word Clusters via the Information Bottleneck Method
Research and Development in Information Retrieval 2000
PS

Xing E.P., Ng A.Y., Jordan M.I., Russell S.
Distance Metric Learning, with Application to Clustering with Side-Information
NIPS 15 - 2002
PDF


Dimensionality reduction

Collins M., Dasgupta S., Schapire R.E.
A Generalization of Principal Component Analysis to the Exponential Family
NIPS 2001
PS

Dasgupta S.
Experiments with Random Projection.
UAI 2000
PDF

Miasnikov A.D., Rome J.E., Haralick R.M.
A Hierarchical Projection Pursuit Clustering Algorithm
ICPR 2004
PDF


Discriminant Analysis

Hastie T., Tibshirani R., Buja A.
Flexible Discriminant Analysis by Optimal Scoring.
Journal of the American Statistical Association 1993
PDF

Hastie T., Tibshirani R., Buja A.
Flexible Discriminant and Mixture Models.
Neural Networks and Statistics 1995
PDF

Hastie T., Buja A., Tibshirani R.
Penalized Discriminant Analysis.
Annals of Statistics, 1995
Postscript


Learning Theory

Agarwal S., Roth D.
Learnability of Bipartite Ranking Functions
COLT, 2005
PDF

Bartlett P.L., Bousquet O., Mendelson S.
Local Rademacher Complexities
Annals of Statistics, 2005
PDF

Blanchard G., Bousquet O., Massart P.
Statistical Performance of Support Vector Machines
Annals of Statistics, 2004
Postscript

Bottou L.
Une Approche théorique de l'Apprentissage Connexionniste: Applications à la Reconnaissance de la Parole
PhD Thesis, 1991
PDF

Bousquet O., Boucheron S., Lugosi G.
Theory of Classification: A Survey of Recent Advances
ESAIM 2005
PDF

Bousquet O., Boucheron S., Lugosi G.
Introduction to Statistical Learning Theory
Advanced Lectures on Machine Learning 2004
PDF

Cortes C.
Prediction of Generalization Ability in Learning Machines
PhD Thesis, 1995
PDF

Clémençon S., Lugosi G., Vayatis N.
Ranking and Scoring Using Empirical Risk Minimization
COLT 2005
PDF

Crammer K., Singer Y.
Loss Bounds for Online Category Ranking
COLT 2005
PDF

Kääriäinen M.
Generalization Error Bounds Using Unlabeled Data
COLT 2005
PDF

Kääriäinen M., Langford J.
A comparison of Tight Generalization Error Bounds
ICML 2005
PDF

Langford J.
Tutorial on Practical Prediction Theory for Classification
JMLR 2005
PDF

Langford J. and Seeger M.
Bounds for Averaging Classifiers
Technical Report 2001
PDF

Petra P.
Data-Dependent Analysis for Learning Algorithms
PDF

Rennie J.D.M.
Bounded Loss Classification
PDF

Rudin C., Cortes C., Mohri M., Schapire R.E.
Margin-Based Ranking Meets Boosting in the Middle
COLT 2005
PDF


Maximum Entropy

Chen S., Rosenfeld R.
A Survey of Smoothing Techniques for ME Models
IEEE Transactions on Speech and Audio Processing, 8(1)
PDF

Jaakkola T., Meila M., Jebara T.
Maximum Entropy Discrimination.
MIT AITR-1668 1999
Postscript


On-line Learning

Freund Y., Schapire R.E.
Large margin classification using the perceptron algorithm
Machine Learning Journal, 37(3):277-296, 1999
PDF


Probabilistic Latent Semantic Analysis

Blei D.M., Jordan M.I.
Modeling Annotated Data
SIGIR 2003
PDF

Hofmann T.
Probabilistic Latent Semantic Indexing.
SIGIR 99
PDF


Ranking

Agarwal S.
Ranking on Graph Data
ICML 2006
PDF

Agarwal S., Graepel T., Herbrich R., Har-Peled S., Roth D.
Generalization Bounds for the Area Under the ROC Curve
JMLR 2005
PDF

Brinker K., Fürnkranz J., Hüllermeier E.
Label Ranking by Learning Pairwise Preferences
JMLR 2006
PDF

Brinker K.
Active Learning of Label Ranking Functions
ICML 2004
PDF

Chu W., Ghahramani Z.
Preference Learning with Gaussian Processes
ICML 2005
PDF

Cohen W.W., Schapire R.E., Singer Y.
Learning to Order Things
NIPS 1998
PDF

Cortes C., Mohri M.
Confidence Intervals for the Area Under the ROC Curve
NIPS 2004
Postscript

Collins M.
Ranking Algorithms for Named-Entity Extraction: Boosting and Voted Perceptron
ACL 2002
PDF

Dekel O., Manning C.D., Singer Y.
Log-Linear Models for Label Ranking
NIPS 2003
PDF

Freund Y., Iyer R., Schapire R.E., Singer Y.
An Efficient Boosting Algorithm for Combining Preferences
JMLR 2003
PDF

Fürnkranz J., Hüllermeier E.
Pairwise Preference Learning and Ranking
ECML 2003
PDF

Fürnkranz J., Hüllermeier E.
Preference Learning
KIJ 2005
PDF

He J., Li M., Zhang H.-J., Tong H., Zhang C.
Manifold-Ranking Based Image Retrieval
MM 2004
PDF

Joachims T.
Optimizing Search Engines using ClickThrough Data
KDD 2002
PDF

Lovasz L.
Random Walk on Graphs: A Survey
Combinatorics
PS

Rudin C.
Ranking with a P-Norm Push
COLT 2006
PDF

Saar-Tsechansky M., Provost F.
Active Learning for Class Probability Estimation and Ranking
IJCAI 2001
PDF

Rudin C., Joshi A.K.
Ranking and Reranking with Perceptron
RNLP 2004
PDF

Yu H.
SVM Selective Sampling for Ranking with Application to Data Retrieval
KDD 2005
PDF

Zhou D., Weston J., Gretton A., Bousquet O., Schölkopf B.
Ranking on Data Manifolds.
ICML 2004
PDF


Semi-supervised Learning

Altun Y., McAllester D., Belkin M.
Maximum Margin Semi-supervised Learning for Structured Variables
NIPS 2005
Postscript

Ando R.K., Zhang T.
A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data
JMLR 2005
PDF

Balcan M.-F., Blum A.
A PAC-style Model for Learning from Labeled and Unlabeled Data
COLT 2005
PDF

Baluja S.
Probabilistic Modeling for Face Orientation Discrimination: Learning from Labeled and Unlabeled Data.
NIPS 93
PDF

Basu S., Banerjee A., Mooney R.
Semi-Supervised Clustering by Seeding.
ICML'02
PDF

Belkin M., Niyogi P.
Semi-Supervised Learning on Riemannian Manifolds.
Machine Learning 2004
PDF

Bennett K.P., Demiriz A., Maclin R.
Exploiting Unlabeled Data in Ensemble Methods.
KDD'02
PDF

Blum A., Mitchell T.
Combining Labeled and Unlabeled Data with Co-Training.
COLT'98
Postscript

Chen K., Wang S.
Regularized Boost for Semi-supervised Learning.
NIPS'07
PDF

Chapelle O., Zien A.
Semi-supervised Classification by Low Density Separation.
AI & Statistics 2005
PDF

Chapelle O., Weston J., Schölkopf B.
Cluster Kernels for Semi-Supervised Learning.
NIPS 2003
Postscript

Collins M., Singer Y.
Unsupervised Models for Named Entity Classification.
EMNLP'99
Postscript

De Comité F., Denis F., Gilleron R., Letouzey F.
Positive and unlabeled data help learning.
COLT'99
Postscript

Cozman F.G., Cohen I.
Unlabeled Data Can Degrade Classification Performance of Generative Classifiers.
Report 2002
PDF

Cozman F.G., Cohen I., Cirelo M.C.
Semi-supervised learning of mixture models
ICML 2003
PDF

Goldman S.A., Kwek S.S., Scott S.D.
Learning From Examples With Unspecified Attribute Values.
CL'97
Postscript

Goldman S.A., Zhou Y.
Enhancing Supervised Learning with Unlabeled Data.
ICML'2K
Postscript

Grandvalet Y., Bengio Y.
Semi-supervised Learning by Entropy Minimization
NIPS 2004
PDF

Jaakkola T., Meila M., Jebara T.
Maximum entropy discrimination.
NIPS 2000
PDF

Joachims T.
Transductive Inference for Text Classification using Support Vector Machines
ICML 1999
Postscript

Lewis D., Catlett J.
Heterogeneous uncertainty sampling for supervised learning
ICML 1994
PDF

Leslie C.S., Eskin E., Cohen A., Weston J., Noble W.S.
Mismatch string kernels for discriminative protein classification.
Bioinformatics 2004
PDF

Mitchell T.M.
The role of Unlabeled Data in Supervised learning.
Cognitive Science'99
Postscript

Muslea I., Minton S., Knoblock C.A.
Active+Semi-Supervised Learning = Robust Multi-View Learning.
ICML 2002
PDF

Nigam K.
Using Unlabeled Data to Improve Text Classification
Ph.D. 2001
PDF

Nigam K., Ghani R.
Analyzing the effectiveness and applicability of co-training
CIKM'2K
PDF

Nigam K., McCallum A. K., Thrun S., Mitchell T.
Text Classification from Labeled and Unlabeled Documents using EM.
Machine Learning'2K
Postscript

Ratsaby J., Venkatesh S.S.
Learning from a Mixture of Labeled and Unlabeled Examples with Parametric Side Information
ICML 1995
PDF

Ratsaby J., Maiorov V.
On the Value of Partial Information for Learning from Examples
Journal of Complexity 1998
PDF

Seeger M.
Learning with Labeled and Unlabeled Data.
Report 2000
Postscript

Shahshahani B.M., Landgrebe D.A.
The effect of unlabeled samples in reducing the small sample size problem and mitigating the Hughes phenomenon.
IEEE Geoscience & Remote Sensing'94
PDF

Szummer M., Jaakkola T.
Kernel expansions with unlabeled examples.
NIPS 2000
PDF

Szummer M., Jaakkola T.
Partially labeled classification with Markov random walks.
NIPS 2001
PDF

Tur G., Hakkani-Tür D., Schapire R.E.
Combining Active And Semi-Supervised Learning for Spoken Language Understanding.
Speech Communication 2005
PDF

Zhang T.
The Value of Unlabeled Data for Classification Problems.
ICML 2000
PDF

Zhu X., Ghahramani Z., Lafferty J.
Semi-supervised learning using Gaussian fields and harmonic functions.
ICML 2003
PDF


Support Vector Machines

Farquhar J.D.R., Hardoon D.R., Meng H., Shawe-Taylor J., Szedmak S.
Two view learning: SVM-2K, Theory and practice
NIPS 2003
PDF

Platt J.C.
Fast Training of Support Vector Machines using Sequential Minimal Optimization
Book Chapter
PDF

Schölkopf B., Smola A.
A Tutorial Introduction
Learning with Kernels
PDF

Shawe-Taylor J., Szedmak S.
Synthesis of Maximum Margin and Multiview Learning using Unlabeled Data
ESANN 2006
PDF


Unsupervised learning

De Sa V.
Learning Classification with Unlabeled Data
NIPS'93
Postscript

De Sa V.
Unsupervised Classification Learning from Cross-Modal Environmental Structure
Thesis 94
Postscript

Hinton G. E., Dayan P., Frey B. J., Neal R. M.
The wake-sleep algorithm for unsupervised neural networks.
Science 95
PDF

Jordan M.
The wake-sleep algorithm for unsupervised neural networks.
Neural Computation 94
PDF

Jordan M., Xu L.
On Convergence Properties of the EM Algorithm for Gaussian Mixtures
Neural Computation 96
PDF


Machine Learning (miscellaneous)

Amari S.-I.
Information Geometry of the EM and em algorithms for Neural Networks.
Neural Networks'95
Postscript

Berger A.
Information Retrieval and Information Theory.
Research and Development in Information Retrieval, 1999
PDF

Berger A.
The Improved Iterative Scaling Algorithm: A Gentle Introduction
Technical Report, 1997
PDF

Boyd S.
Convex Optimisation.
Book 2004
PDF

Breiman L.
Models and Selection Criteria for Regression and Classification
Technical Report MSR-TR-97-08, Microsoft Research
Postscript

Chapelle O.
Support Vector Machines: principe d'induction, règlage automatique et connaissances a priori.
Thesis
Postscript

Neal R.M., Hinton G.E.
A view of the EM algorithm that justifies incremental, sparse, and other variants.
Learning in Graphical Models, 355-368
PDF

Ng A., Jordan M.
On Discriminative vs. Generative classifiers: A comparison of logistic regression and Naive-Bayes.
NIPS 2001
PDF

Jaakkola T., Haussler D.
Exploiting generative models in discriminative classifiers.
NIPS 11
Postscript

Ng S.K., McLachlan G.J.
On the choice of the number of blocks with the incremental EM algorithm for the fitting of normal mixtures
Statistics and computing 13, 2003, 45-55
PDF

McLachlan G.J., Peel D.
Mixtures of Factor Analyzers
ICML 2000, 599-606
PDF

Meir R., El-Yaniv R., Ben-David S.
Localized Boosting.
CL'2K
Postscript

Rabiner L.
A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition.
IEEE 1989
PDF


Information Extraction

Cardie C.
Empirical Methods in Information Extraction.
AI Magazine 2005
PDF

Choi Y., Cardie C., Riloff E., Patwardhan S.
Identifying Sources of Opinions with Conditional Random Fields and Extraction Patterns.
EMNLP 2005
PDF

Freitag D.
Information Extraction in HTML: Application of a General Machine Learning Approach.
AAAI 1998
Postscript

Kristjansson T., Culotta A., Viola P., McCallum A.
Interactive Information Extraction with Constrained Random Fields.
AAAI'04
PDF

Pierce D., Cardie C.
User-Oriented Machine Learning Strategies for Information Extraction: Putting the Human Back in the Loop.
IJCAI'01 Workshop on Adaptive Text Extraction and Mining
PDF

Riloff E.
Automatically Generating Extraction Patterns from Untagged Text
AAAI 96
PDF


Text Classification

Buckley C., Allan J., Salton G.
Automatic Routing and Ad-hoc Retrieval Using SMART: TREC 2.
TREC-2 1993
Postscript

Cai L., Hofmann T.
Text Categorization by Boosting Automatically Extracted Concepts.
SIGIR 2003
PDF

Cline M.
Utilizing HTML Structure and Linked Pages to Improve Learning for Text Categorization.
Thesis 1999
Postscript

Dhillon I. S., Fan J. and Guan Y.
Efficient Clustering of Very Large Document Collections.
Book Chapter
PDF

Douglas Baker L.
Distributional Clustering of Words for Text Classification.
SIGIR 1998
Postscript

Fabio C.
Is This document relevant? ... probably.
ACM Computing surveys 30, 4 (Dec. 1998), 528--552
PDF

Joachims T.
A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categorization
ICML 1997
PDF

Joachims T.
Text Categorization with Support Vector Machines: Learning with Many Relevant Features
ECML 98
PDF

Lewis David D.
An evaluation of phrasal and clustered representations on a text categorization task.
SIGIR 1992
PDF

Lewis David D.
Feature Selection and Feature Extraction for Text Categorization.
Proceedings of Speech and Natural Language Workshop
DARPA 1992
Postscript

McCallum A.K.
Multi-Label Text Classification with a Mixture Model Trained by EM.
AAAI'99
Postscript

McCallum A., Rosenfeld R., Mitchell T., Ng A.
Improving Text Classification by Shrinkage in a Hierarchy of Classes.
ICML 1998
Postscript

McCallum A., Nigam K.
Employing EM and Pool-based Active Learning for Text Classification.
ICML 1998
PDF

Mladenic D.
Feature subset selection in text-learning.
ECML 1998
PDF

Nigam K., Lafferty J., McCallum A.
Using Maximum Entropy for Text Classification.
1999
PDF

Robertson S. E., Walker S., Jones S., Hancock-Beaulieu M.-M., Gatford M.
Okapi at TREC 3.
1996
PDF

Yang Y., Pedersen J. O.
A Comparative Study on Feature Selection in Text Classification
1997
PDF

Yang Y.
An Evaluation of Statistical Approaches to Text Categorization
IR 1999
PDF

Zelikovitz S., Hirsh H.
Improving Short-Text Classification Using Unlabeled Background Knowledge to Assess Document Similarity
ICML'2K
Postscript


Language Modeling

Kraaij W., Spitters M., Van Der Heijden M.
Combining a Mixture Language Model and Naive Bayes multi-document Summarization.
DUC 2001
PDF

Westerveld T., Kraaij W., Hiemstra D.
Retrieving Web Pages using Content, Links, URLs and Anchors
TREC 10
PDF

Zhai C., Lafferty J.
A Study of Smoothing Methods for Language Models Applied to Ad Hoc Information Retrieval
SIGIR 2001
PDF


Passage Retrieval

Callan J. P.
Passage-Level Evidence in Document Retrieval.
SIGIR 1994
Postscript

Knaus D., Mittendorf E., Schäuble P.
Improving a Basic Retrieval Method by Links and Passage Level Evidence.
TREC-3, 1994
Postscript

Li H., Yamanishi K.
Topic Analysis using a Finite Mixture Model.
SIGADT 2000
PDF

Mittendorf E., Schäuble P.
Document and Passage Retrieval Based on Hidden Markov Model.
SIGIR 1994
Postscript

Moffat A., Sacks-Davis R., Wilkinson R. and Zobel J.
Retrieval of Partial Documents.
TREC-2 1993
Postscript

Salton G.J., Buckley C.
Automatic Text Structuring and Retrieval - Experiments in Automatic Encyclopedia searching.
SIGIR 91
PDF

Salton G.J., Allan J., Buckley C.
Approaches to Passage Retrieval in Full Text Information Systems.
SIGIR 93
PDF

Wilkinson R.
Effective Retrieval of Structured Documents.
ACM SIGIR, 1994.
PDF


Text Segmentation

Barzilay R., Elhadad M.
Using Lexical Chains for Text Segmentation.
EMNLP'97
PDF

Beeferman D., Berger A., Lafferty J.
Text Segmentation Using Exponential Models.
EMNLP'97
Postscript

Beeferman D., Berger A., Lafferty J.
Statistical Models for Text Segmentation.
ML'99
Postscript

Choi F. Y. Y.
Advances in domain independent linear text segmentation.
NAACL'2K
PDF

Hearst M.A., Plaunt C.
Subtopic Structuring for full-length Document Access.
SIGIR 1993
Postscript

Hearst M.A.
Cases as structured indexes for full-length documents.
AAAI 1993
Postscript

Hearst M.A.
TextTiling: A Quantitative Approach to Discourse Segmentation.
Technical Report
Postscript

Hearst M.A.
Multi-Paragraph Segmentation of Expository Texts.
ACL 1994
Postscript

Huang X., Peng F., Schuurmans D., Cercone N., Robertson S.E.
Applying Machine Learning to Text Segmentation for Information Retrieval.
Information Retrieval 2003
PDF

Ji X., Zha H.
Domain-independent Text Segmentation using Anisotropic Diffusion and Dynamic Programming.
SIGIR 2003
PDF

Kozima H.
Text Segmentation Based on Similarity.
ACL 1993
Postscript

Litman D.J., Passonneau R.J.
Combining Multiple Knowledge Sources for Discourse Segmentation.
ACL 1995
Postscript

Van Mulbregt P., Carp I., Gillick L., Lowe S., Yamron J.
Text Segmentation and Topic Tracking on Broadcast News Via a Hidden Markov Model Approach.
ICSLP 1998
Postscript

Ponte J.M., Croft W.B.
Text Segmentation by Topic.
DL 1997
Postscript

Reynar J. C.
An Automatic Method of Finding Topic Boundaries.
DL 1997
PDF

Salton G., Singhal A., Buckley C., Mitra M.
Automatic Text Decomposition Using Text Segments and Text Themes.
HyperText 1996
Postscript


Structured Documents

Cline M.
Utilizing HTML Structure and Linked Pages to Improve Learning for Text Categorization.
Thesis, 1999
PDF

Dumais S., Chen H.
Hierarchical Classification of Web Content.
SIGIR 2000
PDF

Kamps J., De Rijke M., Sigurbjörnsson B.
The Importance of Length Normalization for XML Retrieval.
IR Journal 2004
PDF

Yang Y., Slattery S., Ghani R.
A Study of Approaches to Hypertext Categorization.
Journal of Intelligent Information Systems 2002
PDF


Text Summarization

Banko M., Mittal V., Kantrowitz M., Goldstein J.
Generating Extraction-Based Summaries from Hand-Written Summaries by Aligning Text Spans.
PacLing 1999
Postscript

Barzilay R., Lee L.
Catching the Drift: Probabilistic Content Models, with Applications to Generation and Summarization
HLT-NAACL 2004
PDF

Berger A.L., Mittal V.
Ocelot: A system for summarizing web pages
Research and Development in Information Retrieval
PDF

Chuang W.T., Yang J.
Extracting sentence segments for text summarization: a machine learning approach.
SIGIR 2000
PDF

Farzindar A., Lapalme G.
Legal Text Summarization by Exploration of the Thematic Structures and Argumentative Roles.
ACL 2004
PDF

Hahn U., Mani I.
The Challenges of Automatic Summarization
IEEE 2000
PDF

Jing H., Barzilay R., McKeown K., Elhadad M.
Summarization Evaluation Methods: Experiments and Analysis.
AAAI 1998
Postscript

Jing H., McKeown K.
The Decomposition of Human-Written Summary Sentences.
SIGIR 1999
Postscript

Goldstein J.
Automatic Text Summarization of Multiple Documents.
Technical Report
Postscript

Goldstein J., Kantrowitz M., Mittal V., Carbonell J.
Summarizing Text Documents: Sentence Selection and Evaluation Metrics.
SIGIR 1999
Postscript

Jing H., Barzilay R., McKeown K., Elhadad M.
Summarization Evaluation Methods: Experiments and Analysis.
AAAI 1999
Postscript

Kruengkrai C., Jaruskulchai C.
Using One-Class SVMs for Relevant Sentence Extraction
International Symposium on Communications and Information Technologies 2003
PDF

Kupiec J., Pedersen J., Chen F.
A Trainable Document Summarizer.
SIGIR 1995
Postscript

Marcu D.
The Automatic Construction of Large-scale corpora for Summarization Research.
SIGIR 1999
Postscript

Marcu D.
The Rhetorical Parsing of Unrestricted Texts: A Surface-based Approach.
Report
PDF

Marcu D.
From discourse structure to text summaries.
ACL/EACL 1997
Postscript

Mitra M., Singhal A., Buckley C.
Automatic Text Summarization by Paragraph Extraction.
ACL 1997
Postscript

Mittal V., Kantrowitz M., Goldstein J., Carbonell J.
Selecting Text Spans for Document Summaries: Heuristics and Metrics.
Technical Report
Postscript

Moens M.-F., Angheluta R., Dumortier J.
Generic Technologies for Single- and Multi-document Summarization
Information Processing and Management 2005
PDF

Nagao K., Hasida K.
Automatic Text Summarization based on the Global Document Annotation.
COLING 1998
Postscript

Nomoto T., Matsumoto Y.
A new Approach to Unsupervised Text Summarization.
SIGIR 2001
PDF

Radev D.R., Jing H., Budzikowska M.
Centroid-based summarization of multiple documents: sentence extraction, utility-based evaluation, and user studies.
NLP 2000
Postscript

Radev D.R., Hovy E., McKeown K.
Introduction to the Special Issue on Summarization.
ACL 2002
PDF

Sakai T., Sparck-Jones K.
Generic summaries for indexing in information retrieval
SIGIR 2001
PDF

Summac Report
Summac Tipster Evaluation Program.
1998
Postscript

Taghva K., Gilbreth J.
Recognizing Acronyms and their definitions.
IJDAR 1999 Vol.1
PDF

Teufel S., Moens M.
Sentence Extraction as Classification Task.
Mani and Maybury ed.
Postscript

Teufel S., Moens M.
Sentence extraction and rhetorical classification for flexible abstracts.
AAAI'98
Postscript

Witbrock M.J., Mittal V.O.
Ultra-Summarization: A statistical Approach to Generating Highly Condensed Non-Extractive Summaries
Research and Development in Information Retrieval 1999
Postscript

Zechner K.
Fast Generation of Abstracts from General Domain Text Corpora by Extracting Relevant Sentences.
COLING 1996
Postscript

Zechner K.
Automatic Text Abstracting by Selecting Relevant Passages.
MSc Dissertation
Postscript


Question Answering

Blair-Goldensohn S., McKeown K.R., Hazen A.
A Hybrid Approach for QA Track Definitional Questions
TREC 2003
PDF

Chu-Carroll J., Czuba K., Prager J., Ittycheriah A.
In Question Answering, Two Heads are Better Than One.
NAACL 2003
PDF

Ferret O., Grau B., Illouz G., Jacquemin C., Masson N.
QALC- The Question-Answering program of the Language and Cognition Group at LIMSI-CNRS.
TREC-8 2000
PDF

Hovy E., Gerber L., Hermjakob U., Junk M., Lin C.-Y.
Question answering in Webclopedia.
TREC 9
PDF

Ravichandran D., Hovy E.
Learning Surface Text Patterns for a Question Answering System.
ACL 2002
PDF

Ramakrishnan G., Chakrabarti S., Paranjpe D., Bhattacharyya P.
Is Question Answering an Acquired Skill?
WWW 2004
PDF


Miscellaneous

Charniak E., Hendrickson C., Jacobson N., Perkowitz M.
Equations for Part-of-Speech Tagging.
AAAI 1993
Postscript

Cranor L.F., LaMacchia B.A.
Spam!
ACM 1998
PDF

Fuhr N.
Probabilistic Models in Information Retrieval.
The Computer Journal 1992.
Postscript

Fuhr N., Pfeifer U., Bremkamp C., Pollmann M., and Buckley C.
Probabilistic Learning Approaches for Indexing and Retrieval with the TREC-2 collection.
TREC-2, 1993
Postscript

Gale W., Church K.
A program for aligning sentences in bilingual corpora.
ACL, 1991
Postscript

Haines D., Croft W.B.
Relevance Feedback and Inference Networks.
SIGIR 1993
Postscript

McMahon J.
A Review of Statistical Language Processing Techniques.
The Queen's University of Belfast, 1995.
Postscript

McMahon J., Smith F.J.
Structural Tags, Annealing and Automatic Word Classification.
Artificial Intelligence and the Simulation of Behaviour Quarterly, 1994.
Postscript

Miller D.R.H., Leek T., Schwartz R.M.
BBN at TREC7: Using Hidden Markov Models for Information Retrieval.
TREC-7, 1999.
PDF

Pereira F., Tishby N., Lee L.
Distributional Clustering of English Words.
CL 1993.
Postscript

Salton G., Allan J., Buckley C.
Approaches to Passage Retrieval in Full Text Information Systems.
SIGIR, 1993.
PDF

Tang Y.Y., Cheriet M., Liu J., Said J.N., Suen C.Y
Document Analysis and Recognition by Computers
Book
Postscript

Yarowsky D.
Word sense disambiguation using statistical models of Roget's categories trained on large corpora.
COLING 1992
Postscript