Keynote 1: Big Data, Deep Learning, and other Allegories: Scalability and Fault-tolerance of Parallel and Distributed Infrastructures
Speaker:
Director of Research, Data Analytics, Qatar Computing Research Institute
Professor of Computer Science, University of California at Santa Barbara
Big data has emerged as one of the most promising technology paradigms of the past few years. Large data sets arise in numerous application contexts: trillions of words in English and other languages, hundreds of billions of text documents, a large number of translations of documents from one language to others, billions of images and videos along with textual annotations and summaries, thousands of hours of speech recordings, trillions of log records capturing human activity, and the list goes on. During the past decade, careful processing and analysis of different types of data has had a transformative effect. Many applications that were buried in the pages of science fiction have become a reality, e.g., driverless cars, language-agnostic conversation, automated image understanding, and most recently deep learning to simulate a human brain. In the technology context, Big Data has resulted in significant research and development challenges. From a systems perspective, scalable storage, retrieval, processing, analysis, and management of data poses the biggest challenge. From an application perspective, leveraging large amounts of data to develop models of physical reality becomes a complex problem. The interesting dichotomy is that the bigness of data in the system context moves some of the known data-processing solutions from "acceptable" to "not acceptable." For example, standard algorithms for carrying out join processing may have to be revisited in the Big Data context. In contrast, the bigness of data allows many applications to move from "not possible" to "possible." For example, real-time, automated, high-quality, and robust language translation seems entirely feasible leveraging large amounts of translation data. Thus, new approaches are warranted to develop scalable technologies for processing and managing Big Data.
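To illustrate why join algorithms must be revisited at scale, as the abstract notes, consider the hash-partitioned ("shuffle") join commonly used in distributed engines: rows are first partitioned by key so that matching keys land on the same worker, and each worker then runs an ordinary in-memory hash join on its partition alone. The sketch below is a minimal single-process simulation of this idea, not code from the talk; the function names and the four-worker setup are illustrative assumptions.

```python
from collections import defaultdict

def partition(rows, key, n_workers):
    """Hash-partition rows so that rows with equal keys land on the same worker."""
    parts = [[] for _ in range(n_workers)]
    for row in rows:
        parts[hash(row[key]) % n_workers].append(row)
    return parts

def local_hash_join(left, right, key):
    """Classic in-memory hash join, run independently on one worker's partition."""
    index = defaultdict(list)
    for l in left:
        index[l[key]].append(l)
    return [{**l, **r} for r in right for l in index[r[key]]]

def shuffle_join(left, right, key, n_workers=4):
    """Partition both inputs by key, then join each partition pair locally."""
    lp = partition(left, key, n_workers)
    rp = partition(right, key, n_workers)
    out = []
    for i in range(n_workers):  # in a real system, each iteration runs on a separate worker
        out += local_hash_join(lp[i], rp[i], key)
    return out
```

The point of the partitioning step is that no single machine ever needs to hold the full build side in memory, which is exactly where the classic single-node algorithm breaks down as data grows.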
In the same vein, designing and developing robust models for learning from big data remains a significant research challenge. We explore the Big Data problem from the system perspective as well as from the application perspective. In the popular press, "bigness," or the size of data, is touted as a desirable property. Our goal is to clearly comprehend the underlying complexity that must be overcome as the size of data increases, and to clearly delineate the challenges and opportunities in this fast-developing space.
Speaker Bio:
Dr. Divyakant Agrawal is Research Director in Data Analytics at the Qatar Computing Research Institute in Doha, Qatar, and is also a Professor of Computer Science at the University of California at Santa Barbara (currently on leave). His research expertise is in the areas of database systems, distributed computing, data warehousing, and large-scale information systems. From January 2006 through December 2007, Dr. Agrawal served as VP of Data Solutions and Advertising Systems at the Internet search company Ask.com. Dr. Agrawal has also served as a Visiting Senior Research Scientist at NEC Laboratories of America in Cupertino, CA, from 1997 to 2009, and as a Visiting Scientist in the Advertising Infrastructure Group at Google, Inc. in Mountain View, CA, in 2013-2014. During his professional career, Dr. Agrawal has served on numerous program committees of international conferences, symposia, and workshops, and has served as an editor of the journal Distributed and Parallel Databases (1993-2008), the VLDB Journal (2003-2008), and IEEE Transactions on Knowledge and Data Engineering (2012-2014). He currently serves as the Editor-in-Chief of Distributed and Parallel Databases and is on the editorial boards of ACM Transactions on Database Systems and ACM Transactions on Spatial Algorithms and Systems. He serves on the Board of Trustees of the VLDB Endowment and has served on the Executive Committee of the ACM Special Interest Group SIGSPATIAL.
Keynote 2: Fast Data and the Imperative for Human Understanding and Control
Speaker:
CTO at StreamBase Systems
Fast Data, the timely processing of big data in order to make decisions in real time with the best available information, is receiving a great deal of attention from both media and business stakeholders. Data-rich real-time automation has the potential to improve efficiency and human experience across an enormous range of activities: from manufacturing processes to public transit, from logistics to healthcare, from financial markets to retail. For all of these scenarios, we will look at opportunities to combine streaming analytics and machine learning with automated decision-making and integrated business processes. The research community has focused on software and communications architectures to support these applications, and many open problems remain, but equally important is helping stakeholders — consumers, operators, or business owners — understand and control these systems. Keeping users aware and in control requires tradeoffs in architecture and analytics, and an understanding of human factors and safety-critical systems; otherwise, we risk automated systems running wild, or a reactive regulatory regime hampering adoption of our innovations.
Speaker Bio:
Richard Tibbetts is Chief Technology Officer for Event Processing at TIBCO, where he is responsible for the StreamBase, Business Events, Live Datamart, API management, In-Memory Transaction Processing, and BRMS product lines. TIBCO is the leading vendor of Streaming Analytics and Complex Event Processing (CEP) software, with flagship customers in financial services, energy, retail, logistics, and transportation. Richard joined TIBCO at the 2013 acquisition of StreamBase Systems, where he was CTO and cofounder. StreamBase was a commercialization of the Aurora Project at MIT, Brown, and Brandeis universities. The StreamBase software enables developers to create real-time applications quickly using its development tools and the EventFlow visual programming language. Prior to StreamBase, Richard received his SB and MEng in Computer Science from MIT. His thesis work was on the Linear Road Stream Data Management Benchmark, part of the Aurora Project. Richard also advises startups and research groups in event processing, analytics, and big data infrastructure.
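One concrete, deliberately simplified way to keep a human "aware and in control," as the Keynote 2 abstract argues, is to gate automated decisions on model confidence: high-confidence decisions execute automatically, while the rest are escalated to a human operator together with an explanation. The sketch below is a hypothetical illustration, not StreamBase or TIBCO code; the Decision fields and the 0.9 threshold are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Decision:
    action: str        # what the automated system wants to do
    confidence: float  # model confidence in [0, 1]
    explanation: str   # human-readable rationale, kept for the operator

def route(decision: Decision, threshold: float = 0.9):
    """Gate automation on confidence: act automatically only when the model
    is confident; otherwise escalate to a human with the rationale attached."""
    if decision.confidence >= threshold:
        return ("auto", decision.action)
    return ("human_review", decision.explanation)
```

In a real deployment the threshold itself would be an operator-controlled knob, one instance of the architecture-and-analytics tradeoffs the talk highlights.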
Keynote 3: Workplace 2.0 - Applying Machine Learning to Event Streams across Cloud-Based Collaboration Services
Speaker:
Corporate Vice President at Microsoft, Norway
The talk will present the core architectural principles and challenges associated with building the Office Graph service. Office Graph consumes event streams across email, chat, social, authoring, and third-party services. These event streams are processed with machine learning to form a derived data model in Office Graph representing all digital entities and how they relate, based on analysis of the aggregated activity streams. Events modify the Office Graph both directly, in real time, and indirectly, by triggering analysis across event streams that finally writes derived insights into the Office Graph data model. Privacy is a key topic that has been carefully modelled. As a result, Office Graph forms a real-time data model capturing the insights across the event streams and offers this as a resource to first-party applications such as Delve and Outlook, as well as to third-party app development. The talk will present some of these insight-driven experiences and how they leverage specific analytics and machine learning applied to the lower-level activity streams.
Speaker Bio:
Dr. Bjørn Olstad is a Corporate Vice President at Microsoft, leading the Norwegian research and development center, which drives enterprise search, machine learning, and distributed cloud infrastructures for communication systems. Olstad was previously a professor at NTNU in Trondheim, Norway, and served as CTO of Vingmed Sound (acquired by GE Healthcare) and of FAST (acquired by Microsoft). As CTO and inventor, Olstad has three times won the EU's ICT Prize for the most innovative European IT product. He has also led many international research projects.
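The event-stream-to-derived-graph pattern described in the Keynote 3 abstract can be sketched, in highly simplified form, as an aggregator that turns individual activity events into weighted actor-item edges and then answers insight queries over that derived model. This is a hypothetical illustration of the general technique, not Office Graph's actual design; the action weights and class names are assumptions, and a real system would add time decay, privacy trimming, and learned ranking models.

```python
from collections import defaultdict

class ActivityGraph:
    """Toy derived data model: weighted edges from actors to items,
    built incrementally from an activity event stream."""

    # Assumed per-action weights; a real system would learn these.
    WEIGHTS = {"edit": 3.0, "share": 2.0, "view": 1.0}

    def __init__(self):
        self.edges = defaultdict(float)  # (actor, item) -> accumulated weight

    def ingest(self, event):
        """Each event directly updates the derived graph in (near) real time."""
        w = self.WEIGHTS.get(event["action"], 0.5)
        self.edges[(event["actor"], event["item"])] += w

    def top_items(self, actor, k=3):
        """Insight query: the items an actor has interacted with most."""
        scored = [(item, w) for (a, item), w in self.edges.items() if a == actor]
        return [item for item, _ in sorted(scored, key=lambda p: -p[1])[:k]]
```

The indirect path the abstract mentions — batch analysis across event streams writing derived insights back — would correspond to periodic jobs that read `edges` in bulk and add higher-level edges of their own.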