Kernelization (CS6100)
Important Links
- Class: meet.google.com/iyf-mkpq-eqx
- Live class quiz/poll: http://etc.ch/ZwKk
Teaching Group
- Instructor: Akanksha Agrawal
- Teaching Assistant: Hindanjali Harwanshi
An Introductory Note
A Starting Point (via Parameterized Complexity): Most of you have seen many different algorithms during your journey in computer science, and more often than not, we aim to design better and better algorithms (or refute their existence), where the "betterness" is quantified as a function of the input size. For example, Merge Sort runs in time O(n log n) and Insertion Sort runs in time O(n^2), where n is the number of numbers given as input.
You may recall that Radix Sort takes time O(n · w), where n is the number of numbers given as input and w is the key length. Quoting the Wikipedia page on Radix Sort:
"In the modern era, radix sorts are most commonly applied to collections of binary strings and integers. It has been shown in some benchmarks to be faster than other more general purpose sorting algorithms, sometimes 50% to three times faster."
Notice that the analysis of Radix Sort uses two quantities: the number of numbers to be sorted and the key length. In practice, as the quote above suggests, Radix Sort performs very well; it is an algorithm that exploits a structural property of the input, namely that the key length is often small.
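To make the O(n · w) analysis concrete, here is a minimal sketch of LSD radix sort for non-negative integers; the function name and the base-10 digit representation are our own illustrative choices, not from the text. Each of the w passes is a stable bucketing pass (a counting sort in spirit), which is where the n · w product comes from.

```python
def radix_sort(nums, w):
    """Sort non-negative integers with at most w base-10 digits."""
    for d in range(w):  # one stable pass per digit, least significant first
        buckets = [[] for _ in range(10)]
        for x in nums:
            buckets[(x // 10**d) % 10].append(x)
        # concatenating buckets in order keeps the pass stable
        nums = [x for b in buckets for x in b]
    return nums

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66], w=3))
# [2, 24, 45, 66, 75, 90, 170, 802]
```

Each pass touches all n numbers once, and there are w passes, so the running time is O(n · w) as discussed above.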
Pushing this idea further, and exploring how the specialness of the input can be exploited to obtain better algorithms, leads to the framework of parameterized complexity, where we design and analyze algorithms with respect to multiple variables: one of them is the input size, and the others are structural properties of the input/output.
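As an illustration of this multivariate viewpoint, here is a hedged sketch of a classic parameterized algorithm (our example, not from the text): deciding whether a graph has a vertex cover of size at most k by branching on an edge. Any cover must contain one of the edge's two endpoints, so the search tree has depth at most k and the running time is roughly O(2^k · poly(input size)), a function of the parameter k times a polynomial in the input size.

```python
def has_vertex_cover(edges, k):
    """Decide if the graph given by an edge list has a vertex cover of size <= k."""
    if not edges:
        return True   # nothing left to cover
    if k == 0:
        return False  # edges remain but no budget
    u, v = next(iter(edges))
    # branch 1: put u in the cover; branch 2: put v in the cover
    rest_u = [(a, b) for (a, b) in edges if u not in (a, b)]
    rest_v = [(a, b) for (a, b) in edges if v not in (a, b)]
    return has_vertex_cover(rest_u, k - 1) or has_vertex_cover(rest_v, k - 1)

# A triangle needs 2 vertices to cover all edges:
print(has_vertex_cover([(1, 2), (2, 3), (1, 3)], k=1))  # False
print(has_vertex_cover([(1, 2), (2, 3), (1, 3)], k=2))  # True
```

When the parameter k is small, such an algorithm can be fast even on large inputs, which is exactly the kind of trade-off parameterized complexity studies.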
The field of parameterized complexity has grown considerably in the last few decades, and we now have a rich collection of tools and techniques to design better algorithms, refute algorithms that perform better than a fixed threshold, design efficient preprocessing algorithms, refute the existence of good preprocessing algorithms, and design/refute good (multivariate) approximation algorithms, to name a few.
Preprocessing and Kernelization: The objective of a preprocessing algorithm for a given instance I of a problem Q is to output another instance I' of the problem Q, where the encoding length of the instance I' is substantially smaller than that of the instance I. The hope is that, with such an algorithm at hand, any algorithm, even one with a rather bad running time, may suffice for our purpose, as the input size of the resulting instance is small. One way to mathematically formulate the efficiency of a preprocessing algorithm is via the route of parameterization, which constitutes the field of Kernelization. This course focuses on kernelization, and we will learn several combinatorial tools and techniques used in designing such algorithms. We will further look at how one refutes certain types of preprocessing algorithms, conditionally.
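To give a flavor of such preprocessing, here is a minimal sketch of the classic reduction rules (often attributed to Buss) for Vertex Cover, one of the problems covered in the course; the code below is our own illustrative rendering. A vertex of degree greater than k must be in every cover of size at most k, so we may take it and decrement k; since the graph is given as an edge list, isolated vertices disappear automatically. Afterwards, a yes-instance has at most k^2 edges (each of the at most k cover vertices covers at most k edges), so larger remaining instances can be rejected outright.

```python
def buss_kernel(edges, k):
    """Shrink a Vertex Cover instance (edge list, budget k) to an
    equivalent one with at most k^2 edges, or return None for a no-instance."""
    changed = True
    while changed:
        changed = False
        deg = {}
        for u, v in edges:
            deg[u] = deg.get(u, 0) + 1
            deg[v] = deg.get(v, 0) + 1
        for v, d in deg.items():
            if d > k:  # rule: v is forced into every cover of size <= k
                edges = [(a, b) for (a, b) in edges if v not in (a, b)]
                k -= 1
                changed = True
                break
    if k < 0 or len(edges) > k * k:
        return None  # provably a no-instance
    return edges, k  # the kernel: an equivalent, smaller instance

# A star with 5 leaves and k = 1: the center is forced, leaving a trivial instance.
print(buss_kernel([(0, 1), (0, 2), (0, 3), (0, 4), (0, 5)], k=1))  # ([], 0)
```

The point is exactly the one made above: the output is another instance of the same problem whose size is bounded by a function of the parameter alone, so even a slow exact algorithm applied afterwards is cheap.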
Contents of the Course
The first part of the course will focus on the basic topics/tools used in designing kernelization algorithms:
- Formal definitions of kernels and Turing kernels
- Designing basic kernels for problems like Vertex Cover, Feedback Arc Set, Edge Clique Cover, etc.
- Inductive Priorities
- Crown Decomposition
- Expansion Lemma
- Borrowing some tricks from Linear Algebra for designing kernels
- Hypertrees
- Sunflower Lemma
- Matroids
- Representative Families
- Greedy techniques
- Kernelization lower bounds via cross-compositions
We will also cover a few of the following advanced topics in kernelization:
- Kernelization lower bounds via weak cross-composition
- Modular partitions and their applications in obtaining kernels
- Treewidth
- Monadic Second Order Logic and Courcelle's Theorem
- Bidimensionality
- Protrusion replacement
- Lossy kernelization
- More on Turing kernelization
- Advanced topics in kernel lower bounds
- An advanced example of linear algebra based kernel
Tune in for fun stuff!
Some Practical Information
Tentative Timings: Monday 10:00-10:50, Tuesday 09:00-09:50, Wednesday 08:00-08:50, and Friday 12:00-12:50. To facilitate learning in the online setting, we will attempt to dedicate a certain amount of time (almost) every week to solving short questions and building short proofs ourselves.
Evaluation (tentative): Evaluation will be based on the following:
- Final exam: 35%
- Mid-term exam: 25%
- Two assignments: 20%
- Scribing a few notes: 10%
- Class participation, solving quizzes, and building short proofs: 10%
Unless all students are allowed to return to campus, all exams will be conducted online, and some of them may be oral, depending on the number of students.