New Algorithms for Conditional Linear Regression
Speaker
Brendan Juba
Hosts
Akshay Degwekar, Pritish Kamath and Govind Ramnarayan
MIT CSAIL
Abstract: The kinds of rules that we know how to fit to data, such as linear rules, are rather limited. Hence we desire algorithms that can find rules that partially
fit the data. We are particularly interested in cases where the predictor is only valid on a small subset of the data, less than 50%. Recent work on such
tasks often cannot achieve meaningful guarantees for rich problems such as linear prediction or classification without using some extra structure. In the conditional linear regression task, we seek a k-DNF rule describing such a subset of the data distribution on which a low-error linear predictor exists, together with a description of this linear predictor function. I will discuss new algorithms for this problem, for the usual squared-error loss, that find a k-DNF that selects nearly the same fraction of the data as some optimal k-DNF.
On the k-DNF we find, we can guarantee, for example, that there is a linear rule achieving an O~(r log log n) approximation to the loss of the optimal linear rule
on an optimal r-term k-DNF on n attributes, provided that the covariance of the distribution on the individual terms does not vary too much.
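To make the task concrete, here is a minimal sketch (not the speaker's algorithm) of how one candidate solution to conditional linear regression is evaluated: a k-DNF over Boolean attributes selects a subset of the data, an ordinary least-squares predictor is fit on that subset, and we report the coverage of the k-DNF together with the conditional squared error. The variable names (terms, X_bool, X_real, y) and the term encoding are assumptions made for this illustration.

```python
import numpy as np

def dnf_selects(X_bool, terms):
    """Boolean mask of rows satisfying the k-DNF.

    terms: list of terms; each term is a list of (attribute_index, required_value)
    literals, all of which must hold for that term to be satisfied.
    """
    n = X_bool.shape[0]
    mask = np.zeros(n, dtype=bool)
    for term in terms:
        term_mask = np.ones(n, dtype=bool)
        for j, val in term:
            term_mask &= (X_bool[:, j] == val)
        mask |= term_mask
    return mask

def conditional_loss(X_bool, X_real, y, terms):
    """Coverage of the k-DNF and squared error of the best linear rule on it."""
    mask = dnf_selects(X_bool, terms)
    coverage = mask.mean()
    if mask.sum() == 0:
        return coverage, np.inf
    # Fit an ordinary least-squares predictor on the selected subset only.
    A = np.column_stack([X_real[mask], np.ones(mask.sum())])  # append intercept column
    w, *_ = np.linalg.lstsq(A, y[mask], rcond=None)
    mse = np.mean((A @ w - y[mask]) ** 2)
    return coverage, mse

# Toy usage: 2 Boolean attributes, 1 real-valued feature; y is linear in the
# real feature only on the subset where attribute 0 equals 1.
rng = np.random.default_rng(0)
X_bool = rng.integers(0, 2, size=(1000, 2))
X_real = rng.normal(size=(1000, 1))
y = np.where(X_bool[:, 0] == 1,
             3.0 * X_real[:, 0] + 0.1 * rng.normal(size=1000),
             rng.normal(size=1000))
print(conditional_loss(X_bool, X_real, y, terms=[[(0, 1)]]))
```

The algorithmic difficulty the talk addresses is, of course, searching over the exponentially many candidate k-DNFs rather than scoring a single given one as above.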
This talk is based on joint works with Calderon, Li, Li, and Ruan; and with
Hainline, Le, and Woodruff.