Text Analytics at Scale

Get your team up to speed with our hands-on, in-depth training on text mining, based on materials presented at top academic conferences.

What is it about?

In this training, we provide the knowledge needed to build an end-to-end text information management system from scratch and enable a new set of analytical capabilities. The training consists of three parts.

In the first part, we cover content representation, storage, and indexing; basic information retrieval models, which are also relevant for text analytics; and machine learning approaches for text classification.

The second part of the training is devoted to learning low-dimensional representations for text, because such representations are known to dramatically increase model accuracy in end-to-end applications and pipelines. In particular, we will consider approaches based on topic modeling and on neural word embeddings such as word2vec. We will also touch upon the concept of active learning and explain how it can be used to significantly reduce the cost and delivery time of an end-to-end pipeline.

Finally, we will switch to the hands-on part of the training and help participants design their own text information management and text mining systems using Elasticsearch, TensorFlow, scikit-learn, and Spark, drawing on all the material covered in the lectures. We will use publicly accessible text collections, such as Wikipedia dumps, news articles, web pages (e.g. ClueWeb), and academic publications, as well as data relevant to your organization.

Training Content