AGARWAL, A., CHAPELLE, O., DUD´IK, M., AND LANGFORD, J. 2011. A reliable effective terascale linear

learning system. CoRR abs/1110.4198.

AGARWAL, D., AGRAWAL, R., KHANNA, R., AND KOTA, N. 2010. Estimating rates of rare events with multiple

hierarchies through scalable log-linear models. In Proceedings of the 16th ACM SIGKDD international

conference on Knowledge discovery and data mining. 213–222.

ASHKAN, A., CLARKE, C. L. A., AGICHTEIN, E., AND GUO, Q. 2009. Estimating ad clickthrough rate

through query intent analysis. In Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference

on Web Intelligence and Intelligent Agent Technology.

AUER, P., CESA-BIANCHI, N., AND FISCHER, P. 2002. Finite-time analysis of the multiarmed bandit problem.

Machine learning 47, 2, 235–256.

BACH, F., JENATTON, R., MAIRAL, J., AND OBOZINSKI, G. 2011. Optimization with sparsity-inducing

penalties. Foundations and Trends in Machine Learning 4, 1, 1–106.

BISHOP, C. M. 2006. Pattern Recognition and Machine Learning. Springer-Verlag New York, Inc.

BLOOM, B. 1970. Space/time trade-offs in hash coding with allowable errors. Communications of the

ACM 13, 7, 422–426.

CANINI, K., CHANDRA, T., IE, E., MCFADDEN, J., GOLDMAN, K., GUNTER, M., HARMSEN, J., LEFEVRE,

K., LEPIKHIN, D., LLINARES, T. L., MUKHERJEE, I., PEREIRA, F., REDSTONE, J., SHAKED, T., AND

SINGER, Y. 2012. Sibyl: A system for large scale supervised machine learning. Presentation at MLSS

Santa Cruz, http://users.soe.ucsc.edu/~niejiazhong/slides/chandra.pdf.

CHAKRABARTI, D., AGARWAL, D., AND JOSIFOVSKI, V. 2008. Contextual advertising by combining relevance

with click feedback. In Proceedings of the 17th international conference on World Wide Web.

417–426.

CHANG, Y.-W., HSIEH, C.-J., CHANG, K.-W., RINGGAARD, M., AND LIN, C.-J. 2010. Training and testing

low-degree polynomial data mappings via linear SVM. The Journal of Machine Learning Research 11,

1471–1490.

CHAPELLE, O. AND LI, L. 2011. An empirical evaluation of thompson sampling. In Advances in Neural Information

Processing Systems 24, J. Shawe-Taylor, R. Zemel, P. Bartlett, F. Pereira, and K. Weinberger,

Eds. 2249–2257.

CHEN, S. AND GOODMAN, J. 1999. An empirical study of smoothing techniques for language modeling.

Computer Speech & Language 13, 4, 359–393.

CHENG, H. AND CANT´U-PAZ, E. 2010. Personalized click prediction in sponsored search. In Proceedings of

the third ACM international conference on Web search and data mining.

CHENG, H., ZWOL, R. V., AZIMI, J., MANAVOGLU, E., ZHANG, R., ZHOU, Y., AND NAVALPAKKAM, V. 2012.

Multimedia features for click prediction of new ads in display advertising. In Proceedings of the 18th

ACM SIGKDD international conference on Knowledge discovery and data mining.

CHU, C., KIM, S., LIN, Y., YU, Y., BRADSKI, G., NG, A., AND OLUKOTUN, K. 2007. Map-reduce for machine

learning on multicore. In Advances in Neural Information Processing Systems 19: Proceedings of the

2006 Conference. Vol. 19.

CIARAMITA, M., MURDOCK, V., AND PLACHOURAS, V. 2008. Online learning from click data for sponsored

search. In Proceedings of the 17th international conference on World Wide Web. 227–236.

CORTES, C., MANSOUR, Y., AND MOHRI, M. 2010. Learning bounds for importance weighting. In Advances

in Neural Information Processing Systems. Vol. 23. 442–450.

DEAN, J. AND GHEMAWAT, S. 2008. Mapreduce: simplified data processing on large clusters. Communications

of the ACM 51, 1, 107–113.

DUCHI, J., HAZAN, E., AND SINGER, Y. 2010. Adaptive subgradient methods for online learning and

stochastic optimization. Journal of Machine Learning Research 12, 2121–2159.

EVGENIOU, T. AND PONTIL, M. 2004. Regularized multi-task learning. In Proceedings of the tenth ACM

SIGKDD international conference on Knowledge discovery and data mining. ACM, 109–117.

GELMAN, A. AND HILL, J. 2006. Data analysis using regression and multilevel/hierarchical models. Cambridge

University Press.

GITTINS, J. C. 1989. Multi-armed Bandit Allocation Indices. Wiley Interscience Series in Systems and Optimization.

John Wiley & Sons Inc.

GRAEPEL, T., CANDELA, J. Q., BORCHERT, T., AND HERBRICH, R. 2010. Web-scale bayesian click-through

rate prediction for sponsored search advertising in microsoft’s bing search engine. In Proceedings of the

27th International Conference on Machine Learning.

GUYON, I. AND ELISSEEFF, A. 2003. An introduction to variable and feature selection. The Journal of

Machine Learning Research 3, 1157–1182.

HILLARD, D., MANAVOGLU, E., RAGHAVAN, H., LEGGETTER, C., CANT´U -PAZ, E., AND IYER, R. 2011. The

sum of its parts: reducing sparsity in click estimation with query segments. Information Retrieval, 1–22.

HILLARD, D., SCHROEDL, S., MANAVOGLU, E., RAGHAVAN, H., AND LEGGETTER, C. 2010. Improving ad

relevance in sponsored search. In Proceedings of the third ACM international conference on Web search

and data mining. 361–370.

KEARNS, M. 1993. Efficient noise-tolerant learning from statistical queries. In Proceedings of the Twenty-

Fifth Annual ACM Symposium on the Theory of Computing. 392–401.

KING, G. AND ZENG, L. 2001. Logistic regression in rare events data. Political analysis 9, 2, 137–163.

KOEPKE, H. AND BILENKO, M. 2012. Fast prediction of new feature utility. In Proceedings of the 29th

International Conference on Machine Learning.

KOTA, N. AND AGARWAL, D. 2011. Temporal multi-hierarchy smoothing for estimating rates of rare events.

In Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data

mining.

LAI, T. AND ROBBINS, H. 1985. Asymptotically efficient adaptive allocation rules. Advances in applied

mathematics 6, 4–22.

LANGFORD, J., LI, L., AND STREHL, A. 2007. Vowpal wabbit open source project. https://github.com/

JohnLangford/vowpal_wabbit/wiki.

LI, L., CHU, W., LANGFORD, J., AND SCHAPIRE, R. E. 2010. A contextual-bandit approach to personalized

news article recommendation. In Proceedings of the 19th international conference on World wide web.

661–670.

LI, L., CHU, W., LANGFORD, J., AND WANG, X. 2011. Unbiased offline evaluation of contextual-bandit-based

news article recommendation algorithms. In Proceedings of the fourth ACM international conference on

Web search and data mining. 297–306.

LIU, Y., PANDEY, S., AGARWAL, D., AND JOSIFOVSKI, V. 2012. Finding the right consumer: optimizing for

conversion in display advertising campaigns. In Proceedings of the fifth ACM international conference

on Web search and data mining.

LOW, Y., GONZALEZ, J., KYROLA, A., BICKSON, D., GUESTRIN, C., AND HELLERSTEIN, J. M. 2010.

Graphlab: A new framework for parallel machine learning. In The 26th Conference on Uncertainty in

Artificial Intelligence.

MACQUEEN, J. 1967. Some methods for classification and analysis of multivariate observations. In Proceedings

of 5th Berkeley Symposium on Mathematical Statistics and Probability. University of California

Press, Berkeley, CA, 281–297.

MCAFEE, R. 2011. The design of advertising exchanges. Review of Industrial Organization, 1–17.

MCMAHAN, H. B. AND STREETER, M. 2010. Adaptive bound optimization for online convex optimization.

In Proceedings of the 23rd Annual Conference on Learning Theory.

MEEK, C., CHICKERING, D. M., AND WILSON, D. 2005. Stochastic and contingent payment auctions. In

Workshop on Sponsored Search Auctions, ACM Electronic Commerce.

MEIER, L., VAN DE GEER, S., AND B¨UHLMANN, P. 2008. The group lasso for logistic regression. Journal of

the Royal Statistical Society: Series B (Statistical Methodology) 70, 1, 53–71.

MENARD, S. 2001. Applied logistic regression analysis. Vol. 106. Sage Publications, Inc.

MENON, A. K., CHITRAPURA, K.-P., GARG, S., AGARWAL, D., AND KOTA, N. 2011. Response prediction

using collaborative filtering with hierarchies and side-information. In Proceedings of the 17th ACM

SIGKDD international conference on Knowledge discovery and data mining.

MINKA, T. 2003. A comparison of numerical optimizers for logistic regression. Tech. rep., Microsoft Research.

MUTHUKRISHNAN, S. 2009. Ad exchanges: Research issues. In Proceedings of the 5th International Workshop

on Internet and Network Economics.

NIGAM, K., LAFFERTY, J., AND MCCALLUM, A. 1999. Using maximum entropy for text classification. In

IJCAI-99 workshop on machine learning for information filtering. Vol. 1. 61–67.

NOCEDAL, J. 1980. Updating quasi-newton matrices with limited storage. Mathematics of computation

35, 151, 773–782.

OWEN, A. 2007. Infinitely imbalanced logistic regression. The Journal of Machine Learning Research 8,

761–773

REGELSON, M. AND FAIN, D. C. 2006. Predicting click-through rate using keyword clusters. In Proceedings

of the Second Workshop on Sponsored Search Auctions.

RICHARDSON, M., DOMINOWSKA, E., AND RAGNO, R. 2007. Predicting clicks: estimating the click-through

rate for new ads. In Proceedings of the 16th International conference on World Wide Web. New York, NY,

521–530.

ROSALES, R. AND CHAPELLE, O. 2011. Attribute selection by measuring information on reference distributions.

In Tech Pulse Conference, Yahoo!

ROSALES, R., CHENG, H., AND MANAVOGLU, E. 2012. Post-click conversion modeling and analysis for nonguaranteed

delivery display advertising. In Proceedings of the fifth ACM international conference on

Web search and data mining. ACM, 293–302.

SARKAR, J. 1991. One-armed bandit problems with covariates. The Annals of Statistics, 1978–2002.

SCH¨OLKOPF, B. AND SMOLA, A. 2001. Learning with kernels: Support vector machines, regularization, optimization,

and beyond. MIT press.

SHI, Q., PETTERSON, J., DROR, G., LANGFORD, J., SMOLA, A., AND VISHWANATHAN, S. 2009. Hash kernels

for structured data. The Journal of Machine Learning Research 10, 2615–2637.

TEO, C., LE, Q., SMOLA, A., AND VISHWANATHAN, S. 2007. A scalable modular convex solver for regularized

risk minimization. In Proceedings of the 13th ACM SIGKDD international conference on Knowledge

discovery and data mining.

THOMPSON, W. R. 1933. On the likelihood that one unknown probability exceeds another in view of the

evidence of two samples. Biometrika 25, 3–4, 285–294.

WEINBERGER, K., DASGUPTA, A., LANGFORD, J., SMOLA, A., AND ATTENBERG, J. 2009. Feature hashing

for large scale multitask learning. In Proceedings of the 26th Annual International Conference on

Machine Learning. 1113–1120.

YE, J., CHOW, J.-H., CHEN, J., AND ZHENG, Z. 2009. Stochastic gradient boosted distributed decision trees.

In Proceeding of the 18th ACM conference on Information and knowledge management. 2061–2064.

ZAHARIA, M., CHOWDHURY, M., DAS, T., DAVE, A., MA, J., MCCAULEY, M., FRANKLIN, M., SHENKER,

S., AND STOICA, I. 2012. Resilient distributed datasets: A fault-tolerant abstraction for in-memory

cluster computing. In Proceedings of the 9th USENIX conference on Networked Systems Design and

Implementation.

Reference

results matching ""

No results matching ""