SIGKDD is the premier Data Science conference. We invite original technical research contributions in all aspects of the data science lifecycle including but not limited to: data cleaning and preparation, data transformation, mining, inference, learning, explainability, data privacy and dissemination of results. Technical data science contributions which advance United Nations Sustainable Development Goals (SDGs) are encouraged.
Data Cleaning and Preparation: A significant part of the data science lifecycle is spent on data cleaning and preparation. In several domains, data cleaning tasks continue to be rule-based and are often brittle, i.e., they break down in the face of a constantly changing environment. Learning-based approaches for data cleaning and preparation which are generalizable and adaptive across domains are highly sought after.
Data Transformation and Integration: The process of mapping data from one representation into another is at the heart of data science. The mapping can be query-driven or based on a statistical task, or it might involve integrating data from myriad sources. We seek original contributions which address the trade-off between the complexity of the transformation and algorithmic efficiency.
Mining, Inference and Learning: These topics are the kernel of the knowledge discovery from databases (KDD) paradigm and continue to witness massive growth. While classical aspects of supervised learning have been mainstreamed into the development cycle, new variations on unsupervised learning like self-supervision, few-shot learning, prescriptive learning (reinforcement learning), transfer learning, meta-learning and representation learning are pushing the research boundary in a world where the proportion of labeled and annotated data is becoming minuscule. In each of these topics we seek submissions which highlight the trade-off between accuracy, stability, robustness and efficiency. Submissions which propose “new” inference tasks are strongly encouraged.
Explainability: As data science models become part of daily human activity, there is a need, often expressed in law, for the models to be fair and interpretable and to provide mechanisms that explain how a prediction or decision was arrived at. Interpretable models will gain wider acceptance in society at large and increase the value of Data Science as a discipline in its own right.
Data Privacy and Ethics: Data privacy, or the lack thereof, continues to be the Achilles' heel of the whole data science enterprise. We seek technical contributions that advance the state of data science methods while guaranteeing individual privacy, respect for societal norms and ethical integrity.
Model Dissemination: Migrating a data science model from a research lab to a real-world deployment is non-trivial and potentially a continuous process. We seek research submissions which highlight and address technical and behavioral challenges during model deployment, feedback and upgrading.