Presentations & Talks

We are thrilled to share our keynotes and academic presentations with the wider data science community, in the hope of making this world just a little smarter and better.

The 100-Billion Dollar Equation: Big Data Science of Digital Advertising

What is digital advertising? What skills do you need to be successful in this space? How do some of the biggest advertising platforms process events and decide which ad to show? We will answer these and many more questions critical to the digital advertising industry. The lecture is for a mixed audience of executives who want to understand big data and engineers looking for the next big challenge.

UC Berkeley Data Science and Machine Learning at Scale Lecture Series

This course builds on and goes beyond the collect-and-analyze phase of big data by focusing on how machine learning algorithms can be rewritten and extended to scale to petabytes of data, both structured and unstructured, to generate sophisticated models used for real-time predictions. It covers the fundamentals of MapReduce, Hadoop, Spark, Deep Learning, MLlib, and more.

Computational Advertising: Business Models, Technologies, and Issues (Part 1)

This lecture will review the main business models of online advertising, including the pay-per-impression model (CPM); the pay-per-click model (CPC); a relative newcomer, the pay-per-action model (CPA), where an action could be a product purchase, a site visit, a customer lead, or an email signup; and dynamic CPM (dCPM), which optimizes a campaign towards the sites that perform best.

Computational Advertising: Business Models, Technologies, and Issues (Part 2.1)

This lecture will discuss in detail the technology used to target ads automatically, which largely derives from the fields of machine learning (e.g., logistic regression, online learning), statistics (e.g., binomial maximum likelihood), information retrieval (e.g., vector space model, BM25), optimization theory (e.g., linear and quadratic programming), and economics (e.g., auction mechanisms, game theory).

Computational Advertising: Business Models, Technologies, and Issues (Part 2.2)

This tutorial will discuss in detail the technology used to target ads automatically, which largely derives from the fields of machine learning (e.g., logistic regression, online learning), statistics (e.g., binomial maximum likelihood), information retrieval (e.g., vector space model, BM25), optimization theory (e.g., linear and quadratic programming), and economics (e.g., auction mechanisms, game theory).

Computational Advertising: Business Models, Technologies, and Issues (Part 3)

This lecture will discuss challenges such as click fraud (the spam of online advertising), deception, user privacy, and other open and controversial issues. We will also cover Web 2.0 applications, such as social networks and video/photo-sharing, and see what new niche problems they pose to digital advertising practitioners and researchers.

Modeling Rare Events: Online Ad Targeting using Machine Learning

This lecture describes the digital advertising ecosystem, the systems and models currently in use, and open issues waiting for innovation. We will talk about traditional advertising problems, such as look-alike modeling and audience extension, and focus on rare events and data sparsity issues, which are specific to the digital world with its long-tail effects.

Optimizing Search User Interactions within Professional Social Networks

Most search engines today still provide only keyword queries, basic faceted search, and uninformative query-biased snippets, overlooking the structured and interlinked nature of entities. This results in siloed information and a suboptimal search user experience. We reconsider the input, control, and presentation elements of the search UI to enable more effective and efficient search.

Language Models for Information Retrieval and Applications in Job Search

This presentation covers traditional information retrieval models, such as the Vector Space Model, BM25, the Robertson-Spärck Jones model, and pivoted length normalization. An introduction to language modeling and natural language processing is provided. Finally, we present a language modeling approach to information retrieval and consider subtle aspects, such as smoothing and relevance feedback.

Introduction to Data Science and Large-scale Machine Learning

This lecture moves from motivating applications of big data to machine learning theory: logical algorithms (decision trees), graph processing algorithms (PageRank, shortest path), gradient-based algorithms (SVM), and matrix factorization. The theoretical material will be followed by hands-on algorithm development in parallel computing environments (Spark). As a bonus, we share predictions about the future of AI.

Panel Discussion with J. Shanahan, P. Quigley, H. Mortman, H. Linehan, H. Waldram, and G. Sheridan at ICWSM 2012

The International Conference on Weblogs and Social Media (ICWSM) is intended to bring together researchers in the broad field of social media analysis and foster discussions about ongoing research and fundamental insights about weblogs and social media. It is sponsored by the Association for the Advancement of Artificial Intelligence.

Bonus for scrolling this far! Social Event at CIKM 2008, Napa Valley, CA :)

The ACM International Conference on Information and Knowledge Management (CIKM) provides a forum for presentation and discussion of research on information and knowledge management, as well as recent advances in data and knowledge bases. We identify challenging problems facing the development of future knowledge and information systems, and design theory and solutions to address them.