I am a researcher (Principal Staff Scientist) in the AI Group at LinkedIn. I work on distributed training of machine learning and AI systems, huge-scale linear programming, and information extraction projects.

Previously (December 2017-December 2019) I was a Distinguished Researcher at Criteo Research, a great team of researchers (spread across Paris, Grenoble and Palo Alto) working on fundamental and applied research problems in computational advertising. Before that, I was at Microsoft (June 2012-December 2017) in Mountain View, CA, first with the CISL team in Big Data and later with the FAST division of Microsoft Office. From January 2004 to April 2012 I was with the Machine Learning Group of Yahoo! Research in Santa Clara, CA. My recent research has mainly focused on the design of distributed training algorithms for developing various types of linear and nonlinear models on Big Data, and the application of machine learning to textual problems.

Prior to joining Yahoo! Research, I worked for 11 years at the Indian Institute of Science, Bangalore, and for 5 years at the National University of Singapore. During those sixteen years my research focused on the development of practical algorithms for a variety of areas, such as machine learning, robotics, computer graphics and optimal control. (Many of the publications from that period are not listed on this page.) My work on support vector machines (e.g., improved SMO algorithm), polytope distance computation (e.g., GJK algorithm) and model predictive control (e.g., stability theory) is highly cited. Overall, I have published more than 100 papers in leading journals and conferences. I have been an Action Editor of JMLR (Journal of Machine Learning Research) since 2008. Previously I was an Associate Editor for the IEEE Transactions on Automation Science and Engineering.

**Contact:** keselvaraj at linkedin dot com

**Slide deck of my talk on Interplay between Optimization and Generalization in Deep Neural Networks, given at the 3rd annual Machine Learning in the Real World Workshop organized by Criteo Research, Paris, on 8th November, 2017: Optimization_and_Generalization_Keerthi_Criteo_November_08_2017.pptx**. This is a review and critique of recent work on this topic. The talk itself ran 45 minutes, so I covered the main ideas quickly; the slide deck has more detailed material. I intend to update the slide deck as new work is published on this and related topics.

**Slide deck of my talks on Optimization for machine learning given at UC Santa Cruz in February, 2017:
Keerthi_Optimization_For_ML_UCSC_2017.pdf**

**In 2010 I attended and gave a talk at GilbertFest, a symposium in honor of my Ph.D. thesis advisor, Elmer G. Gilbert. Check out the symposium page, which also has PDFs of his classic papers in Control and Optimization. I am honored to have some of my joint papers with him in that list. Also, check out his A Life in Control talk, given at the University of Michigan, Ann Arbor, covering his marvelous career in control systems.**

- LIBLINEAR, a *Library for Large Linear Classification* written by Chih-Jen Lin and his students. It has code for the methods covered in the ICML'08, KDD'08, ICML'07/JMLR'08 papers given below. **Check out LIBLINEAR's most recent Distributed version. It gives cool speedups in multicore settings. For the cluster case where communication is a bottleneck, the algorithms in our JMLR-2017 papers below are very good.**

Citations of my papers in Google Scholar

2018 2017 2015 2014 2013 2012 2011 2010 2009 2008 2007 2006 2005 2004 2003 2002 2001 2000 1999 and Earlier

**Batch-Expansion Training: An Efficient Optimization Framework**. With Michal Derezinski, Dhruv Mahajan, Vishy Vishwanathan, and Markus Weimer. To be presented at AISTATS, 2018. [abstract]

**An efficient distributed learning algorithm based on effective local functional approximations**. With Dhruv Mahajan, Nikunj Agrawal, S. Sundararajan and Leon Bottou, To appear in JMLR, 2017. [abstract]
**A distributed block coordinate descent method for training l_1 regularized linear classifiers**. With Dhruv Mahajan and S. Sundararajan, To appear in JMLR, 2017. [abstract]
**Gradient Boosted Decision Trees for High Dimensional Sparse Output**. With Si Si, Cho-Jui Hsieh, Huan Zhang, Dhruv Mahajan and Inderjit Dhillon, Accepted in ICML, 2017. [abstract]

**Towards a Better Understanding of Predict and Count Models**. With Tobias Schnabel and Rajiv Khanna, arXiv:1511.02024v1, 2015. [abstract]
**Learning a Hierarchical Monitoring System for Detecting and Diagnosing Service Issues**. With Vinod Nair, Ameya Raul, Shwetabh Khanduja, Vikas Bahirwani, S. Sundararajan, Steve Herbert, Sudheer Dhulipalla, and Qihong Shao, KDD, 2015. [abstract]
**Near Real-time Service Monitoring Using High-dimensional Time Series**. With Vinod Nair, Sundararajan Sellamanickam, Shwetabh Khanduja, Ameya Raul, and Ajesh Shaj, ICDM, 2015. [abstract]

**A distributed block coordinate descent method for training L1 regularized linear classifiers**. With Dhruv Mahajan and S. Sundararajan, submitted to JMLR, 2014. [abstract]
**An efficient distributed learning algorithm based on effective local functional approximations**. With Dhruv Mahajan, Nikunj Agrawal, S. Sundararajan and Leon Bottou, submitted to JMLR, 2014. [abstract]

**Tractable semi-supervised learning of complex structured prediction models**. With Kai-Wei Chang and S. Sundararajan, ECML, 2013. [abstract]

**Extension of TSVM to multi-class and hierarchical text classification problems with general losses (Long version)**. With S. Sundararajan and S.K. Shevade, COLING, 2012. [abstract]
**Iterative Viterbi A* algorithm for K-best sequential coding**. With Z. Huang, Y. Chang, B. Long, J.F. Crespo, A. Dong and S.L. Wu, ACL, 2012. [abstract]
**A simple approach to the design of site-level extractors using domain-centric principles**. With C. Long, X. Geng and C. Xu, CIKM, 2012. [abstract]
**Deterministic annealing for semi-supervised structured output learning**. With P. Dhillon, K. Bellare, O. Chapelle and S. Sundararajan, AISTATS, 2012. [abstract]
**Regularized structured output learning with partial labels**. With S. Sundararajan and C. Tiwari. SDM, 2012. [abstract]

**A sequential dual method for structural SVMs**. With P. Balamurugan, S.K. Shevade and S. Sundararajan, SDM, 2011. [abstract]
**Semi-supervised multi-task learning of structured prediction models for web information extraction**. With P. Dhillon and S. Sundararajan. CIKM, 2011. [abstract]
**A pairwise ranking based approach to learning with positive and unlabeled examples**. With S. Sundararajan and P. Garg. CIKM, 2011. [abstract]
**Semi-supervised SVMs for classification with unknown class proportions and a small labeled dataset**. With B. Bhar, S. Sundararajan and S.K. Shevade. CIKM, 2011. [abstract]
**Transductive classification methods for mixed graphs**. With S. Sundararajan. MLG-2011 (KDD Workshop), 2011. [abstract]
**Large scale information extraction from the web**. Industrial Problems Seminar, IMA, Univ. of Minnesota, Feb 11, 2011. [abstract]

**Efficient algorithms for ranking with SVMs**. With O. Chapelle. Information Retrieval Journal, 13(3):201-215, 2010. [abstract]

**Graph based classification methods using inaccurate external classifier information**. With S. Sundararajan, Tech Report, 2009. [abstract]

**A dual coordinate descent method for large-scale linear SVM**. With C.-J. Hsieh, K.-W. Chang, C.-J. Lin, and S. Sundararajan. ICML 2008. [abstract]
**A sequential dual coordinate method for large-scale multi-class linear SVMs**. With S. Sundararajan, K.-W. Chang, C.-J. Hsieh, and C.-J. Lin. KDD 2008. [abstract]
**Optimization techniques for semi-supervised SVMs**. With Olivier Chapelle and Vikas Sindhwani. JMLR, 2008. [abstract]
**Multi-Class Feature Selection with Support Vector Machines**. With Olivier Chapelle. JSM, 2008. [abstract]
**Ad delivery with budgeted advertisers: A comprehensive LP approach**. With Zoe Abrams, Ofer Mendelevitch and John Tomlin. JECR, Vol. 9, No.1, 2008. [abstract]
**Trust-region Newton method for large scale logistic regression**. With Chih-Jen Lin and Ruby Weng. JMLR 2008, Vol. 9, pp. 627-650. [abstract]
**Participation (two winning entries in the wild track) in PASCAL Large Scale Learning Challenge**. With Olivier Chapelle, 2008. [abstract]
**Predictive approaches for Gaussian process classifier model selection**. With S. Sundararajan, Tech Report, 2008. [abstract]

**CRF versus SVM-struct for sequence labeling**. With S. Sundararajan. Yahoo Research Technical Report, 2007. [abstract]
**Predictive approaches for sparse Gaussian process regression**. With S. Sundararajan and Shirish Shevade. Tech Report, April 2007. [abstract]
**Constructing a maximum utility slate of on-line advertisements**. With John Tomlin. Submitted for publication in ORSA Journal of Computing. [abstract]
**Trust-region Newton method for large scale logistic regression**. With Chih-Jen Lin and Ruby Weng. ICML 2007. [abstract]
**Semi-supervised Gaussian processes**. With Wei Chu and Vikas Sindhwani. IJCAI 2007. [abstract]

**An efficient method for gradient-based adaptation of hyperparameters in SVM models**. With Vikas Sindhwani and Olivier Chapelle. NIPS 2006. [Longer tech report.] [abstract]
**Relational learning with Gaussian processes**. With Wei Chu, Vikas Sindhwani and Zoubin Ghahramani. NIPS 2006. [abstract]
**Branch and bound for semi-supervised support vector machines**. With Olivier Chapelle and Vikas Sindhwani. NIPS 2006. [abstract]
**Building Support Vector Machines with Reduced Classifier Complexity**. With Olivier Chapelle and Dennis DeCoste. JMLR 2006. [abstract]
**Support vector ordinal regression**. With Wei Chu. Neural Computation 2006. [abstract]
**Fast generalized cross validation algorithm for sparse model learning**. With S. Sundararajan and Shirish Shevade. Neural Computation 2006. [abstract]
**Large-scale semi-supervised linear SVMs**. With Vikas Sindhwani. SIGIR 2006. [Longer tech report with pseudocodes.] [abstract]
**Newton methods for fast solution of semi-supervised linear SVMs**. With Vikas Sindhwani. Book Chapter, *Large Scale Kernel Machines*, MIT Press, 2006. [abstract]
**Deterministic annealing for semi-supervised kernel machines**. With Vikas Sindhwani and Olivier Chapelle. ICML 2006. [abstract]
**Parallel sequential minimal optimization for the training of support vector machines**. With LJ Cao, CJ Ong, JQ Zhang, U Periyathamby, XJ Fu, HP Lee. IEEE Transactions on Neural Networks, 17(4):1039-49. [abstract]
**A fast tracking algorithm for generalized LARS/LASSO**. With Shirish Shevade. To appear in IEEE Transactions on Neural Networks. [abstract]

**A modified finite Newton method for fast solution of large scale linear SVMs**. With Dennis DeCoste. JMLR 2005. [abstract]
**A matching pursuit approach to sparse Gaussian process regression**. With Wei Chu. NIPS 2005. [abstract]
**A fast dual algorithm for kernel logistic regression**. With Kaibo Duan, Shirish Shevade and Aun Neow Poo. Machine Learning Journal 2005. [abstract]
**An improved conjugate gradient scheme to the solution of least squares SVM**. With Wei Chu and Chong Jin Ong. IEEE Transactions on Neural Networks 2005. [abstract]
**New approaches to support vector ordinal regression**. With Wei Chu. ICML 2005. [abstract]
**Generalized LARS as an effective feature selection tool for text classification with SVMs**. ICML 2005. [abstract]
**Which is the best multiclass SVM method? An empirical study**. With Kaibo Duan. MCS 2005. [abstract]

**Bayesian support vector regression using a unified loss function**. With Wei Chu and Chong Jin Ong. IEEE Transactions on Neural Networks 2004. [abstract]
**Stability regions for constrained nonlinear systems and their functional characterization via support-vector-machine learning**. With Chong Jin Ong, Elmer Gilbert and Z.H. Zhang. Automatica 2004. [abstract]
**An efficient method for computing leave-one-out error in support vector machines with Gaussian kernels**. With M.M.S. Lee, Chong Jin Ong and Dennis DeCoste. IEEE Transactions on Neural Networks 2004. [abstract]
**The needs and benefits of applying textual data mining within the product development process**. With Rakesh Menon, Han Tong Loh, A. Brombacher and C. Leong. Quality and Reliability Engineering International 2004. [abstract]

**A simple and efficient algorithm for gene selection using sparse logistic regression**. With Shirish Shevade. Bioinformatics Journal 2003. [abstract]
**Asymptotic behaviours of support vector machines with gaussian kernel**. With Chih-Jen Lin. Neural Computation 2003. [abstract]
**Multi-category classification by soft-max combination of binary classifiers**. With Kaibo Duan, Wei Chu and Aun Neow Poo. MCS 2003. [abstract]
**Bayesian Trigonometric support vector classifier**. With Wei Chu and Chong Jin Ong. Neural Computation 2003. [abstract]
**SMO Algorithm for Least Squares SVM Formulations**. With Shirish Shevade. Neural Computation 2003. [abstract]
**Evaluation of simple performance measures for tuning SVM hyperparameters**. With Kaibo Duan and Aun Neow Poo. Neurocomputing 2003. [abstract]
**Thermal error measurement and modelling in machine tools. Part II. Hybrid Bayesian Network - support vector machine model**. With R. Ramesh, M.A. Mannan and Aun Neow Poo. International Journal of Machine Tools and Manufacture 2003. [abstract]

**A Machine Learning Approach for the Curation of Biomedical Literature -- KDD Cup 2002 (Task 1)**. [abstract]
**Efficient tuning of SVM hyperparameters using radius/margin bound and iterative algorithms**. IEEE Transactions on Neural Networks 2002. [abstract]
**Convergence of a generalized SMO algorithm for SVM classifier design**. Machine Learning Journal 2002. [abstract]
**A Fast Dual Algorithm for Kernel Logistic Regression**. With Kaibo Duan, Shirish Shevade and Aun Neow Poo. ICML 2002. [abstract]

**A unified loss function in Bayesian framework for support vector regression**. With Wei Chu and Chong Jin Ong. ICML 2001. [abstract]
**Evaluation of simple performance measures for tuning SVM hyperparameters**. With Kaibo Duan and Aun Neow Poo. ICONIP 2001. [abstract]
**Mean-field methods for a special class of belief networks**. With Chiranjib Bhattacharyya. JAIR 2001. [abstract]
**Computation of a penetration measure between 3D convex polyhedral objects for collision detection**. With K. Sridharan. Journal of Robotic Systems 2001. [abstract]
**Improvements to Platt's SMO algorithm for SVM classifier design**. With Shirish Shevade, Chiranjib Bhattacharyya and K.R. Krishna Murthy. Neural Computation 2001. [abstract]
**Predictive approaches for choosing hyperparameters in Gaussian processes**. With S. Sundararajan. Neural Computation 2001. [abstract]
**Rule prepending and post-pruning approach to incremental learning of decision lists**. With K.R. Krishna Murthy and M. Narasimha Murty. Pattern Recognition 2001. [abstract]

**Predictive approaches for choosing hyperparameters in Gaussian processes**. With S. Sundararajan. NIPS 2000. [abstract]
**A variational mean-field theory for sigmoidal belief networks**. With Chiranjib Bhattacharyya. NIPS 2000. [abstract]
**A fast iterative nearest point algorithm for support vector machine classifier design**. With Shirish Shevade, Chiranjib Bhattacharyya and K.R. Krishna Murthy. IEEE Transactions on Neural Networks 2000. [abstract]
**A stochastic connectionist approach for global optimization with application to pattern clustering**. With Phanendra Babu and M. Narasimha Murty. IEEE Transactions on Systems, Man and Cybernetics (Part B) 2000. [abstract]
**Improvements to the SMO algorithm for SVM regression**. With Shirish Shevade, Chiranjib Bhattacharyya and K.R. Krishna Murthy. IEEE Transactions on Neural Networks 2000. [abstract]
**Information geometry and Plefka's mean-field theory**. With Chiranjib Bhattacharyya. Journal of Physics A: Math. Gen. 2000. [abstract]

**Optimal control of a somersaulting platform diver: a numerical approach**. With Murthy Nukala, IEEE Conference on Robotics and Automation, 1993. [abstract]
**A fast procedure for computing the distance between complex objects in 3-dimensional space**. With Elmer Gilbert and Dan Johnson. IEEE Journal of Robotics and Automation, 1988. *The algorithm in this paper, the GJK algorithm, is widely used in video games, computer graphics, robotics motion planning, etc. See this video; the presenter seems to have great difficulty pronouncing my name!* [abstract]
**Optimal infinite-horizon feedback laws for a general class of constrained discrete-time systems: stability and moving-horizon approximations**. With Elmer Gilbert. Journal of Optimization Theory and Its Applications, 1988. This is a highly cited paper in the area of Model Predictive Control (MPC). See this survey paper. [abstract]

*Last updated: May, 2020*