Organizations face a data discovery problem when their analysts spend more time finding relevant data than analyzing it. This problem has become common as: i) data is stored across multiple storage systems, from databases to data lakes; ii) data scientists do not operate within the limits of well-defined schemas, instead they want to find data across their organization to answer increasingly complex business questions. We have built Aurum as part of the Data Civilizer project. Aurum is a system to tackle data discovery problems at large. It introduces a new discovery language, SRQL, that permits users to declare their intuition of what is relevant through a set of data primitives that expose the relations of the underlying data. The algebra relies on an enterprise knowledge graph (EKG) to answer queries in human-scale latencies. Aurum is scalable: it builds the EKG in linear time, despite the complexity of extracting complex relationships among thousands of data sources.