Keynote Speakers

Structured Output Learning with a Margin

Speaker: John Shawe-Taylor


John Shawe-Taylor is a professor at UCL, where he directs the Centre for Computational Statistics and Machine Learning and heads the Department of Computer Science. His research has contributed to a number of fields ranging from graph theory through cryptography to statistical learning theory and its applications. However, his main contributions have been in the development of the analysis and subsequent algorithmic definition of principled machine learning algorithms founded in statistical learning theory. He has co-authored two influential textbooks on kernel methods and support vector machines. He has also been instrumental in coordinating a series of influential European Networks of Excellence culminating in the PASCAL networks.


Structured output learning has been developed to borrow strength across multidimensional classifications. There have been approaches to bounding the performance of these classifiers based on different measures such as microlabel errors with a fixed simple output structure. We present a different approach and analysis starting from the assumption that there is a margin attainable in some unknown or fully connected output structure.

The analysis and algorithms flow from this assumption but in a way that the associated inference becomes tractable while the bounds match those attained were we to use the full structure. There are two variants depending on how the margin is estimated. Experimental results show the relative strengths of these variants, both algorithmically and statistically.

Big Data Learning for Interdisciplinary Applications: In-Depth View of Some Key Challenges

Speaker: Vincent S. Tseng


Dr. Vincent S. Tseng is currently a Distinguished Professor in the Department of Computer Science and Director of the Center for Big Data Technologies and Applications at National Chiao Tung University, Taiwan, R.O.C. He received his PhD in Computer Science from National Chiao Tung University in 1997 and then joined the EECS Computer Science Division of UC Berkeley as a research fellow during 1998-1999. He was the Chair of the IEEE Computational Intelligence Society Tainan Chapter during 2013-2015 and the President of the Taiwanese Association for Artificial Intelligence during 2011-2012. Dr. Tseng has a wide variety of research interests covering data mining, machine learning, biomedical informatics, and mobile and Web technologies. He has published more than 300 research papers in refereed journals and conferences as well as 15 patents (held and filed). He has been on the editorial board of a number of top-tier journals including IEEE Transactions on Knowledge and Data Engineering, ACM Transactions on Knowledge Discovery from Data, and IEEE Journal of Biomedical and Health Informatics. He has also been overseeing the directions and architecture of big data technical platforms and interdisciplinary applications for governmental and industrial units in Taiwan. He is also the recipient of the 2014 K. T. Li Breakthrough Award and the 2015 Outstanding Research Award from the Ministry of Science and Technology, Taiwan.


Nowadays, large volumes of data are being collected at an unprecedented and explosive scale in a broad range of application areas. Analytics on such big data deliver substantial value and can drive interdisciplinary applications in various aspects of our lives, including healthcare, retail, financial services, mobile services, and life sciences. Decisions that previously were based on hypothetical models or just unreliable guesswork can now be made effectively and efficiently by learning from the big data itself. A new wave of revolutions in various domains has arrived with this Big Data era, bringing new opportunities and challenges. In this talk, I will investigate some key challenges in Big Data Learning for interdisciplinary applications through in-depth observations from various aspects covering data preprocessing, key feature discovery, learning and modeling, and post-processing. Experiences from practical projects in different domains including biomedicine, social media, e-commerce, and mobile sensing will be shared. Finally, some emerging research topics and potential opportunities underlying this topic will also be addressed.

Invited Speakers

Massive Online Analytics for the Internet of Things (IoT)

Speaker: Albert Bifet


Albert Bifet is an Associate Professor at Telecom ParisTech and an Honorary Research Associate at the WEKA Machine Learning Group at the University of Waikato. Previously he worked at Huawei Noah's Ark Lab in Hong Kong, Yahoo Labs in Barcelona, the University of Waikato and UPC BarcelonaTech. He is the author of the book Adaptive Stream Mining: Pattern Learning and Mining from Evolving Data Streams. He is one of the leaders of the MOA and Apache SAMOA software environments for implementing algorithms and running experiments for online learning from evolving data streams. He serves as Co-Chair of the Industrial track of IEEE MDM 2016 and ECML PKDD 2015, and as Co-Chair of BigMine (2015, 2014, 2013, 2012) and the ACM SAC Data Streams Track (2016, 2015, 2014, 2013, 2012).


Big Data and the Internet of Things (IoT) have the potential to fundamentally shift the way we interact with our surroundings. The challenge of deriving insights from the IoT has been recognized as one of the most exciting and important opportunities for both academia and industry. Advanced analysis of big data streams from sensors and devices is bound to become a key area of data mining research as the number of applications requiring such processing increases. Dealing with the evolution over time of such data streams, i.e., with concepts that drift or change completely, is one of the core issues in stream mining. In this talk, I will present an overview of data stream mining, and I will introduce some popular open source tools for data stream mining.

Tackling Big-Data and Big-Model Challenges of Deep Learning

Speaker: Tie-Yan Liu


Tie-Yan Liu is a principal researcher at Microsoft Research Asia. He is very well known for his pioneering work on learning to rank and computational advertising, and his recent research interests include deep learning and distributed machine learning. As a researcher in an industrial lab, Tie-Yan contributes to both industry and academia. On one hand, many of his technologies have been transferred to Microsoft’s products and online services. On the other hand, he has been actively contributing to academic communities. He is an adjunct/honorary professor at Carnegie Mellon University (CMU) and the University of Nottingham. His papers have been cited tens of thousands of times in refereed conferences and journals. He has won the best student paper award at SIGIR (2008), the most cited paper award at the Journal of Visual Communication and Image Representation (2004-2006), the research breakthrough award at Microsoft Research (2012), and recognition among the Top-10 Springer Computer Science books by Chinese authors (2015). He has been invited to serve as general chair, program committee chair, or area chair for a dozen top conferences including SIGIR, WWW, KDD, NIPS, IJCAI, and AAAI, as well as associate editor/editorial board member of ACM Transactions on Information Systems, ACM Transactions on the Web, Information Retrieval Journal, and Foundations and Trends in Information Retrieval. Tie-Yan Liu is a senior member of IEEE and ACM.


The success of deep learning can be attributed to the availability of very large training data, the expressiveness of big deep models, and the computational power of GPU clusters. However, these are double-edged swords: it is costly or sometimes impossible to acquire sufficient labeled data for training; big models are usually hard to train and might exceed the capacity of GPU devices; and it is non-trivial to distribute training onto multiple nodes with linear speedup and without accuracy loss. In this talk, I will introduce our recent research to address these challenges. First, I will introduce a technology called “dual learning”, which leverages the fact that many AI tasks have dual forms to create a closed feedback loop that enables effective learning from unlabeled data. Second, we study the case where a deep learning model is large due to its fat output layer (i.e., with many categories to predict), and propose to map the outputs onto a 2-dimensional table to effectively compress the model. Taking recurrent neural networks (RNNs) as an example, we show that our technology can lead to better accuracy and a model that is several orders of magnitude smaller. Third, we discuss the dilemma of parallel computation: synchronous parallelization is slow due to the synchronization barrier, while asynchronous parallelization hurts accuracy due to communication delay. We then introduce a novel technology that leverages the Taylor expansion of the gradient function to compensate for the delay in asynchronous parallelization. It can achieve linear speedup and an accuracy comparable to that of sequential algorithms. All the technologies introduced in this talk will soon be open-sourced through Microsoft CNTK.
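
To give a rough sense of why mapping outputs onto a 2-dimensional table shrinks a fat output layer, the following minimal sketch places each output category in a cell of a square table, so that a category is identified by a (row, column) pair rather than by its own output unit. The function names and the fixed row-major layout are assumptions made for illustration only; the abstract does not specify how categories are assigned to cells, and this is not the speaker's implementation.

```python
import math


def table_side(num_outputs: int) -> int:
    """Smallest side s such that an s x s table can hold all outputs."""
    if num_outputs < 1:
        raise ValueError("num_outputs must be at least 1")
    # Integer ceiling of the square root, avoiding floating-point error.
    return math.isqrt(num_outputs - 1) + 1


def to_cell(index: int, side: int) -> tuple[int, int]:
    """Map a flat category index to (row, column).

    Assumption: a fixed row-major layout, chosen only for illustration.
    """
    return divmod(index, side)


def from_cell(row: int, col: int, side: int) -> int:
    """Inverse of to_cell: recover the flat category index."""
    return row * side + col


def units_needed(num_outputs: int) -> int:
    """Output units needed when predicting a row and a column separately."""
    return 2 * table_side(num_outputs)


if __name__ == "__main__":
    n = 10_000  # hypothetical number of output categories
    side = table_side(n)
    print("table side:", side)
    print("cell of category 1234:", to_cell(1234, side))
    print("units: flat =", n, "table =", units_needed(n))
```

Under this layout, predicting one row and one column needs on the order of twice the square root of the number of categories, instead of one unit per category, which is where the compression comes from.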