| Publications 
Notes
Andrew I. Schein, Johnnie F. Caver, Randale J. Honaker, and Craig H. Martell.  
Author Attribution Evaluation with Novel Topic Cross-Validation. Appeared in
 The 2010 International Conference on Knowledge Discovery and Information Retrieval. October 25-28.  Valencia, Spain.
[.pdf][data]
	Grant Gehrke, Craig Martell, Andrew Schein and Pranav Anand. Projecting Away The Class Imbalance Problem in Author Attribution.
International Journal of Semantic Computing. Volume: 3, Issue: 3 (September 2009).
Andrew I. Schein and
Lyle H. Ungar. Active Learning
for Logistic Regression: An Evaluation. Machine Learning. 68:3 2007. Publisher DOI: 10.1007/s10994-007-5017-7. [.ps.gz] [.pdf]
 
Jinying Chen, Andrew I. Schein,
Martha S. Palmer, and
Lyle H. Ungar. An Empirical Study of
the Behavior of Active Learning for Word Sense Disambiguation. Appeared in the
Proceedings of HLT-NAACL
2006. June 5-7, 2006. New York, NY. [.pdf]
Andrew I. Schein.  Active Learning for Logistic Regression.  Ph.D. Dissertation in
Computer and Information Science. The University of Pennsylvania. Defended: April 21, 2005. Supervisor: Lyle H. Ungar. Committee: Andreas Buja, Mark Liberman, Andrew McCallum (external), and Fernando C. N. Pereira (chair).
Talk: [.ps.gz] [.pdf] Doc: [.ps.gz] [.pdf] 
Automatic term list generation for entity tagging.
Ted Sandler, Andrew I. Schein
and Lyle H. Ungar.
Bioinformatics Advance Access published on October 25, 2005, DOI 10.1093/bioinformatics/bti733.
	Seth Kulick, Ann Bies, Mark Liberman, Mark Mandel, Ryan McDonald, Martha
Palmer, Andrew Schein, Lyle Ungar, Scott Winters, and Pete White.  Integrating
Annotation for Biomedical Information Extraction. BioLINK 2004: Linking
Biological Literature, Ontologies and Databases, pp. 61-68. May 6,
2004. Boston, MA. [.pdf]
Andrew I. Schein and  Lyle
H. Ungar. A-Optimality for Active Learning of Logistic Regression
Classifiers. The University of Pennsylvania Department of Computer and
Information Science Technical Report No. MS-CIS-04-07: [.ps.gz][.pdf].
Andrew I. Schein, S. Ted
Sandler and Lyle H. Ungar. 
Bayesian Example Selection using BaBiES. The University of Pennsylvania Department of Computer and
Information Science Technical Report No.  MS-CIS-04-08  [.ps.gz][.pdf].
Seth Kulick, Mark Liberman, Martha Palmer, and Andrew  Schein. Shallow Semantic Annotation of Biomedical Corpora for Information Extraction.  Proceedings of the 2003 ISMB Special Interest Group Meeting on Text Mining (a.k.a. BioLink). June 27, 2003. Brisbane, Australia. [.ps.gz] [.pdf]  [.ppt talk]
Andrew I. Schein, Alexandrin Popescul,
				   Lyle H. Ungar, and David M. Pennock. CROC: A New Evaluation Criterion for Recommender Systems. Electronic Commerce Research,  Special Issue: World Wide Web Electronic Commerce, Security and Privacy, Editors: Mary Ellen Zurko and Amy Greenwald , Volume 5, Issue 1, January 2005 (pp. 51-74). Draft: [.ps.gz] [.pdf]
Andrew I. Schein,
Lawrence K. Saul, and
Lyle H. Ungar. A Generalized Linear Model for Principal Component Analysis of Binary Data. Appeared in Proceedings of the 9th International Workshop on Artificial Intelligence and Statistics. January 3-6, 2003. Key West, FL. This is the paper that gives "Logistic PCA" its name.  [talk] [.ps.gz] [.pdf]
Andrew I. Schein, Alexandrin Popescul,
Lyle H. Ungar, and David M. Pennock. Methods and Metrics for Cold-Start Recommendations. Appeared in Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2002), pp. 253-260. August 11-15, 2002. Tampere, Finland. [.ps.gz] [.pdf]
Andrew I. Schein, Alexandrin Popescul,
				   Lyle H. Ungar, and David M. Pennock. Generative Models for Cold-Start Recommendations. Workshop on Recommender Systems at SIGIR, 2001. [.ps.gz] [.pdf]
Andrew I. Schein, Alexandrin Popescul, and
				   Lyle H. Ungar. PennAspect: A Two-Way Aspect Model Implementation. University of Pennsylvania Department of Computer and Information Science, Technical Report MS-CIS-01-25. [.ps]
Andrew I. Schein, Jessica
C. Kissinger, and  Lyle
H. Ungar. Chloroplast Transit Peptide Prediction: a Peek Behind the Black
Box. Nucleic Acids Research, 2001, Vol 29, No. 16 e82. 
View Online.  A webserver with the software is available. Also, see the software
section of my web page for a copy of the code. 
 
Posters
Andrew I. Schein. Notes on the CROC Curve.  Unpublished.  [.ps.gz] [.pdf]
Andrew Schein. Computation of log(\Phi(z)) For Large Negative z. In 2012, I discovered that the scipy routine scipy.stats.norm.logcdf(z) (normal distribution) returned negative infinity for moderately negative values of z.  I submitted a patch, and now a lot of software relies on the improvement. The linked PDF explains the mechanics of the patch and provides some supporting analysis. The associated issue tracker is here. [.pdf]
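The underflow described in this note can be illustrated with a minimal sketch. It is not the scipy patch itself and does not reproduce the analysis in the linked PDF; it only assumes the standard asymptotic expansion of the normal tail, Phi(z) ~ phi(z)/|z| * (1 - 1/z^2 + 3/z^4 - 15/z^6 + ...) as z goes to negative infinity. The function names, the switch point of -20, and the number of series terms are hypothetical choices made for illustration.

```python
import math

_LOG_SQRT_2PI = 0.5 * math.log(2.0 * math.pi)


def log_phi_direct(z):
    """log(Phi(z)) computed by taking the log of the CDF itself.

    For very negative z the CDF underflows to 0.0 and math.log raises
    ValueError (array libraries would typically return -inf instead).
    """
    return math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))


def log_phi_asymptotic(z, terms=5):
    """log(Phi(z)) from the asymptotic tail series; only meaningful for z << 0."""
    inv_z2 = 1.0 / (z * z)
    series = 0.0
    term = 1.0
    for k in range(terms):
        series += term
        # Next coefficient: (-1)^(k+1) * (2k+1)!! / z^(2k+2)
        term *= -(2 * k + 1) * inv_z2
    # log(phi(z)) - log(|z|) + log(series), all evaluated in log space
    return -0.5 * z * z - math.log(-z) - _LOG_SQRT_2PI + math.log(series)


def log_phi(z, switch=-20.0):
    """Hypothetical combined routine: series in the far left tail, direct otherwise."""
    if z < switch:
        return log_phi_asymptotic(z)
    return log_phi_direct(z)


if __name__ == "__main__":
    for z in (0.0, -10.0, -20.0, -40.0, -100.0):
        print(z, log_phi(z))
```

The point of the design is that the far-tail branch never forms Phi(z) as a floating-point number, so the result stays finite even where the CDF itself is too small to represent.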
 
Invited Talks
Andrew I. Schein and Lyle
H. Ungar.  Active Learning for Multi-Class Logistic Regression. LEARNING 2005. Snowbird, Utah. April
5-8, 2005. [.ps.gz] [.pdf]. Abstract: [.ps.gz] [.pdf].
 
 Andrew I. Schein.  Active Learning for Logistic Regression. Alberta Ingenuity Centre for
Machine Learning, The University of
Alberta. January 6, 2005.  Talk Overheads [.ps].
 |