Programme

  • Monday 5th September 2016
    08:15 – 09:00
    Registration
    09:00 – 09:30
    Introduction (Summer school directors)
    09:30 – 10:15
    Fundamentals of data science (Claudia Wagner, GESIS and University of Koblenz-Landau, Germany)
    10:15 – 10:45
    Break
    10:45 – 12:30
    Probability and statistics (Blaz Fortuna & Jan Rupnik, JSI, Slovenia)

    Random variables & distributions, statistical studies, descriptive statistics, dependent & independent events, regression and inferential statistics.

    12:30 – 14:00
    Break
    14:00 – 14:30
    Introduction to student projects (Allan Third, Open University, UK)
    14:30 – 15:30
    Machine learning (Blaz Fortuna & Jan Rupnik, JSI, Slovenia)
    15:30 – 16:00
    Break
    16:00 – 17:30
    Hands-on: Fundamentals of data analysis (Blaz Fortuna & Jan Rupnik, JSI, Slovenia)
    18:00 – 19:00
    Poster session (Coordinated by Allan Third, Open University, UK)
  • Tuesday 6th September 2016
    09:00 – 09:15
    Administrative announcements
    09:15 – 10:15
    Keynote: Marko Tadic (University of Zagreb, Croatia)

    Language processing pipelines for knowledge technologies
    Natural Language Processing is usually considered a (pre)processing step in text-based knowledge technologies. This lecture presents, for an audience of PhD students, the tasks, methods and techniques used in composing full language processing pipelines. These pipelines cover not only language processing at the basic levels (sentence splitting, tokenization, POS/MSD-tagging), but also higher levels (NERC, syntactic parsing, semantic parsing, semantic role labelling, etc.). The lecture will cover not only the theoretical concepts needed to understand these methods and tools, but also a practical demonstration of pipelines developed in some EU-funded projects.

    10:15 – 10:45
    Break
    10:45 – 12:30
    Information extraction (Elena Demidova, University of Southampton, UK)
    12:30 – 14:00
    Break
    14:00 – 15:30
    High-performance computing (Carlos Pedrinaci, Open University, UK)
    15:30 – 16:00
    Break
    16:00 – 17:30
    Hands-on: Information extraction (Elena Demidova, University of Southampton, UK)
    18:00 – 19:00
    Poster session (Coordinated by Allan Third, Open University, UK)
  • Wednesday 7th September 2016
    09:00 – 09:15
    Administrative announcements
    09:15 – 10:15
    Keynote: Stefan Decker (RWTH Aachen and Fraunhofer, Germany)

    Knowledge Representation on the Web using Prototypes: Syntax, Semantics and Pragmatics
    Knowledge Representation (KR) on the Web has been a topic for Semantic Web research for a while and is increasingly relevant for practitioners – e.g., in the Open Data Movements or for Research Data Management. OWL has been the standard for KR on the Web for 10 years, during which considerable experience has been gained. This experience has prompted us to propose an approach, based on prototypical objects, that aims to augment and complement OWL. Prototypes were explored in early Frame Representation Systems, but have been largely neglected in recent decades. In my talk I present a syntax and a formal semantics for prototype representation systems, proving that Prototype Systems can also provide a formal underpinning for Knowledge Representation. Initial performance results, which are encouraging, will also be presented.
    Finally, I will conclude with prospects and open research challenges.

    10:15 – 10:45
    Break
    10:45 – 12:30
    Understanding and communicating with data (Chris Phethean, University of Southampton, UK)
    12:30 – 14:00
    Break
    14:00 – 15:30
    Hands-on: Exploratory data analysis and data visualisation (Chris Phethean, University of Southampton, UK)
    15:30
    Social Networking
  • Thursday 8th September 2016
    09:00 – 09:15
    Administrative announcements
    09:15 – 10:15
    Keynote: Ricardo Baeza Yates (former Yahoo Labs, USA)

    Data and Algorithmic Bias in the Web
    The Web is the largest public big data repository that humankind has created. In this overwhelming data ocean, we need to be aware of the quality and, in particular, of the biases that exist in this data. On the Web, biases also come from redundancy and spam, as well as from algorithms that we design to improve the user experience. This problem is further exacerbated by biases that are added by these algorithms, especially in the context of search and recommendation systems. They include selection and presentation bias in many forms, interaction bias, social bias, etc. We give several examples and their relation to sparsity and privacy, stressing the importance of the user context to avoid these biases.

    10:15 – 10:45
    Break
    10:45 – 11:30
    Q&A panel
     
    Project work
  • Friday 9th September 2016
    09:00 – 09:15
    Administrative announcements
    09:15 – 10:15
    Keynote: Rayid Ghani (University of Chicago, USA)

    Data Science for Social Impact: Case Studies, Challenges, and Opportunities
    Can Data Science help reduce police violence and misconduct? Can it help prevent children from getting lead poisoning? Can it help cities better target limited resources to improve the lives of citizens? We’re all aware of the data science hype right now, but turning this hype into any social impact takes effort. In this talk, I’ll discuss lessons learned while working on dozens of projects over the past few years with non-profits and governments on high-impact social challenges. These lessons range from the challenges these organizations face when trying to use data science, to how to effectively train and build cross-disciplinary teams to do practical data science, to which machine learning and social science research challenges need to be tackled and which tools and techniques need to be developed in order to have a social and policy impact with machine learning.

    10:15 – 10:45
    Break
     
    Project work
  • Saturday 10th September 2016
    09:00 – 10:00
    Project presentations
    10:00 – 10:30
    Break
    10:30 – 11:30
    Project presentations
    11:30 – 12:30
    Panel
    12:30 – 13:00
    Awards and closing