Data4Policy.org

Data Science and Big Data in the Public Sector

About

Data4policy.org is an evolving web collection of articles and research papers on the use of data science and big data in the public sector. It is based at the University of Oxford, sponsored by the Cyber Studies Programme, and currently maintained by Innar Liiv (Associate Professor, Tallinn University of Technology, and Visiting Research Fellow, University of Oxford).

Data for Policy

Article | Year
Kim, Gang-Hoon; Trimi, Silvana; Chung, Ji-Hyong, "Big-data applications in the government sector", pages 78-85, 2014 | 2014
Poel, Martijn; Schroeder, Ralph; Blackman, Colin, "Data for Policy: A study of big data and other innovative data-driven approaches for evidence-informed policymaking", 2015 | 2015
Zarsky, Tal Z, "Governmental data mining and its alternatives", HeinOnline, pages 285, 2011 | 2011
Rajagopalan, MR; Vellaipandiyan, Solaimurugan, "Big data framework for national E-governance plan", ICT and Knowledge Engineering (ICT&KE), 2013 11th International Conference on, pages 1-5, 2013 | 2013
Attard, Judie; Orlandi, Fabrizio; Scerri, Simon; Auer, Soren, "A systematic review of open government data initiatives", Elsevier, pages 399-418, 2015 | 2015
Yiu, Chris, "The big data opportunity: Making government faster, smarter and more personal", Policy Exchange, 2012 | 2012
Williamson, Andy, "Big Data and the Implications for Government", Cambridge Univ Press, pages 253-257, 2014 | 2014
Cate, Fred H, "Government data mining: The need for a legal framework", 2008 | 2008
Slobogin, Christopher, "Government data mining and the fourth amendment", JSTOR, pages 317-341, 2008 | 2008
Morabito, Vincenzo, "Big Data and Analytics for Government Innovation", Big Data and Analytics, Springer, pages 23-45, 2015 | 2015
Munne, Ricard, "Big Data in the Public Sector", New Horizons for a Data-Driven Economy, Springer, pages 195-208, 2016 | 2016
Romijn, JH, "Using Big Data in the Public Sector. Uncertainties and Readiness in the Dutch Public Executive Sector", 2014 | 2014
Bertot, John Carlo; Choi, Heeyoon, "Big data and e-government: issues, policies, and recommendations", Proceedings of the 14th Annual International Conference on Digital Government Research, pages 1-10, 2013 | 2013
Minow, Newton; Cate, Fred H, "Government Data Mining", McGraw-Hill Handbook of Homeland Security, 2008 | 2008

Novel Data Sources

Article | Year
Chunara, Rumi; Andrews, Jason R; Brownstein, John S, "Social and news media enable estimation of epidemiological patterns early in the 2010 Haitian cholera outbreak", ASTMH, pages 39-45, 2012 | 2012
Yuan, Qingyu; Nsoesie, Elaine O; Lv, Benfu; Peng, Geng; Chunara, Rumi; Brownstein, John S, "Monitoring influenza epidemics in China with search query from Baidu", Public Library of Science, pages e64323, 2013 | 2013
Dawes, Sharon S.; Cresswell, Anthony M.; Pardo, Theresa A., "From 'Need to Know' to 'Need to Share': Tangled Problems, Information Boundaries, and the Building of Public Sector Knowledge Networks", pages 392-402, 2009 | 2009

Policy for Data

Article | Year
Janssen, Marijn; van den Hoven, Jeroen, "Big and Open Linked Data (BOLD) in government: A challenge to transparency and privacy?", pages 363-368, 2015 | 2015
Hu, Margaret, "Small Data Surveillance v. Big Data Cybersurveillance", pages 773-844, 2015 | 2015
Taylor, Nick, "To find the needle do you need the whole haystack? Global surveillance and principled regulation", Taylor & Francis, pages 45-67, 2014 | 2014
Joh, Elizabeth E, "The New Surveillance Discretion: Automated Suspicion, Big Data, and Policing", 2015 | 2015
Morabito, Vincenzo, "Big Data and Analytics for Government Innovation", pages 23-45, 2015 | 2015
Dawes, Sharon S.; Cresswell, Anthony M.; Pardo, Theresa A., "From 'Need to Know' to 'Need to Share': Tangled Problems, Information Boundaries, and the Building of Public Sector Knowledge Networks", pages 392-402, 2009 | 2009

Agriculture, Forestry and Rural Development

Article | Year
Mucherino, Antonio; Papajorgji, Petraq; Pardalos, Panos M, "A survey of data mining techniques applied to agriculture", Springer, pages 121-140, 2009 | 2009

Commerce and Industry

Article | Year
Muller, Michel; Aoki, Takayuki, "Hybrid Fortran: High Productivity GPU Porting Framework Applied to Japanese Weather Prediction Model", pages 22, 2017 | 2017

Defence

Article | Year
Wang, Xiao-bin; Yang, Guang-yuan; Li, Yi-chao; Liu, Dan, "Review on the application of Artificial Intelligence in Antivirus Detection System", pages 4, 2008 | 2008
Moskovitch, Robert; Elovici, Yuval; Rokach, Lior, "Detection of Unknown Computer Worms based on Behavioral Classification of the Host", pages 34, 2008 | 2008
Fatima, H.; Pradhan, S. K.; Khan, M. A., "Applying Data Mining Techniques in Cyber Crimes", 2017 2nd International Conference on Anti-Cyber Crimes (ICACC), pages 213-216, 2017 | 2017
Gengo, Gary; Clifton, Chris, "Developing custom intrusion detection filters using data mining", MILCOM 2000. 21st Century Military Communications Conference Proceedings, pages 440-443, 2000 | 2000

Education and Culture

Article | Year
Lakkaraju, Himabindu; Aguiar, Everaldo; Shan, Carl; Miller, David; Bhanpuri, Nasir; Ghani, Rayid; Addison, Kecia L, "A Machine Learning Framework to Identify Students at Risk of Adverse Academic Outcomes", Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1909-1918, 2015 | 2015
Maqsood Ali, Mohod, "Role of Data Mining in Education Sector", pages 374-383, 2013 | 2013
Deng, Hepu; Duan, Xiaoxia; Corbit, Brian, "The Impacts of Government Policies on the Efficiency of Australian Universities: A Multi-Period Data Envelopment Analysis", 2008 International Conference on Computational Intelligence and Security, pages 522-527, 2008 | 2008

Employment and Social Affairs

Article | Year
Rivas, T.; Paz, M.; Martin, J.E.; Matias, J.M.; Garcia, J.F.; Taboada, J., "Explaining and predicting workplace accidents using data-mining techniques", pages 739-747, 2011 | 2011

Energy

Article | Year
Mitra, Rajendu; Kota, Ramachandra; Bandyopadhyay, Sambaran; Arya, Vijay; Sullivan, Brian; Mueller, Richard; Storey, Heather; Labut, Gerard, "Voltage Correlations in Smart Meter Data", Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1999-2008, 2015 | 2015
Fei, Hongliang; Kim, Younghun; Sahu, Sambit; Naphade, Milind; Mamidipalli, Sanjay K; Hutchinson, John, "Heat pump detection from coarse grained smart meter data with positive and unlabeled learning", Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1330-1338, 2013 | 2013
Tso, Geoffrey K. F.; Yau, Kelvin K. W., "Predicting electricity energy consumption: A comparison of regression analysis, decision tree and neural networks", pages 1761-1768, 2007 | 2007
Ali, Usman; Buccella, Concettina; Cecati, Carlo, "Households electricity consumption analysis with data mining techniques", IEEE, pages 3966-3971, 2016 | 2016
Liu, Gengyuan; Yang, Jin; Hao, Yan; Zhang, Yan, "Big data-informed energy efficiency assessment of China industry sectors based on K-means clustering", pages 304-314, 2018 | 2018

Environment

Article | Year
Zhang, Y.; Yi, Xiuwen; Li, Ming; Li, Ruiyuan; Shan, Zhangqing; Chang, Eric; Li, Tianrui, "Forecasting fine-grained air quality based on big data", Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 2267-2276, 2015 | 2015
Wang, Dawei; Ding, Wei; Yu, Kui; Wu, Xindong; Chen, Ping; Small, David L; Islam, Shafiqul, "Towards long-lead forecasting of extreme flood events: a data mining framework for precipitation cluster precursors identification", Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1285-1293, 2013 | 2013
Bellinger, Colin; Jabbar, Mohomed Shazan Mohomed; Zaiane, Osmar; Osornio-Vargas, Alvaro, "A systematic review of data mining and machine learning for air pollution epidemiology", pages 907, 2017 | 2017
Wilson, Lu, "Big data analytics to identify illegal construction waste dumping: A Hong Kong study", pages 264-272, 2019 | 2019

Foreign Affairs

Article | Year
Strauss, Nadine; Kruikemeier, Sanne; van der Meulen, Heleen; van Noort, Guda, "Digital diplomacy in GCC countries: Strategic communication of Western embassies on Twitter", Elsevier, pages 369-379, 2015 | 2015

Health and Food Safety

Article | Year
Paez, Diego Gachet; Aparicio, Fernando; Buenaga, Manuel; Ascanio, Juan R.; Hervas, Ramon; Lee, Sungyoung; Nugent, Chris; Bravo, Jose, "Big Data and IoT for Chronic Patients Monitoring", Ubiquitous Computing and Ambient Intelligence. Personalisation and User Adapted Services, Springer, pages 416-423, 2014 | 2014
Zhang, Y.; Qiu, M.; Tsai, C. W.; Hassan, M. M.; Alamri, A., "Health-CPS: Healthcare Cyber-Physical System Assisted by Cloud and Big Data", pages 1-8, 2015 | 2015
Groves, Peter; Kayyali, Basel; Knott, David; Kuiken, Steve Van, "The 'big data' revolution in healthcare: Accelerating value and innovation", 2013 | 2013
Perttula, Arttu; Koivisto, Antti; Makela, Riikka; Suominen, Marko; Multisilta, Jari, "Social Navigation with the Collective Mobile Mood Monitoring System", Proceedings of the 15th International Academic MindTrek Conference: Envisioning Future Media Environments, ACM, pages 117-124, 2011 | 2011
Beckman, Richard; Bisset, Keith R; Chen, Jiangzhuo; Lewis, Bryan; Marathe, Madhav; Stretz, Paula, "Isis: A networked-epidemiology based pervasive web app for infectious disease pandemic planning and response", Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1847-1856, 2014 | 2014
Park, Yubin; Ghosh, Joydeep, "Ludia: An aggregate-constrained low-rank reconstruction algorithm to leverage publicly released health data", Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 55-64, 2014 | 2014
Tran, Truyen; Phung, Dinh; Luo, Wei; Harvey, Richard; Berk, Michael; Venkatesh, Svetha, "An integrated framework for suicide risk prediction", Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1410-1418, 2013 | 2013
Somanchi, Sriram; Adhikari, Samrachana; Lin, Allen; Eneva, Elena; Ghani, Rayid, "Early Prediction of Cardiac Arrest (Code Blue) using Electronic Medical Records", Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 2119-2126, 2015 | 2015
Feldman, Ronen; Netzer, Oded; Peretz, Aviv; Rosenfeld, Binyamin, "Utilizing Text Mining on Online Medical Forums to Predict Label Change due to Adverse Drug Reactions", Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1779-1788, 2015 | 2015
Kate, Kiran; Chaudhari, Sneha; Prapanca, Andy; Kalagnanam, Jayant, "FoodSIS: a text mining system to improve the state of food safety in Singapore", Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1709-1718, 2014 | 2014
Harpaz, Rave; DuMouchel, William; LePendu, Paea; Shah, Nigam H, "Empirical Bayes model to combine signals of adverse drug reactions", Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1339-1347, 2013 | 2013
Potash, Eric; Brew, Joe; Loewi, Alexander; Majumdar, Subhabrata; Reece, Andrew; Walsh, Joe; Rozier, Eric; Jorgenson, Emile; Mansour, Raed; Ghani, Rayid, "Predictive modeling for public health: Preventing childhood lead poisoning", Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 2039-2047, 2015 | 2015
Shah, Nigam H, "Medicine in the age of electronic health records", Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1518-1518, 2014 | 2014
Jee, Kyoungyoung; Kim, Gang-Hoon, "Potentiality of big data in the medical sector: focus on how to reshape the healthcare system", pages 79-85, 2013 | 2013
Chunara, Rumi; Andrews, Jason R; Brownstein, John S, "Social and news media enable estimation of epidemiological patterns early in the 2010 Haitian cholera outbreak", ASTMH, pages 39-45, 2012 | 2012
Yuan, Qingyu; Nsoesie, Elaine O; Lv, Benfu; Peng, Geng; Chunara, Rumi; Brownstein, John S, "Monitoring influenza epidemics in China with search query from Baidu", Public Library of Science, pages e64323, 2013 | 2013
Ropodi, A. I.; Panagou, E. Z.; Nychas, G.-J. E., "Data mining derived from food analyses using non-invasive/non-destructive analytical techniques; determination of food authenticity, quality & safety in tandem with computer science disciplines", pages 11-25, 2016 | 2016
Brossette, Stephen E.; Sprague, Alan P.; Hardin, J. Michael; Waites, Ken B.; Warren, T. Jones; Moser, Stephen A., "Association Rules and Data Mining in Hospital Infection Control and Public Health Surveillance", 1998 | 1998
Pereira, Sonia; Portela, Filipe; Santos, Manuel F.; Machado, Jose; Abelha, Antonio, "Clustering-based Approach for Categorizing Pregnant Women in Obstetrics and Maternity Care", Proceedings of the Eighth International C* Conference on Computer Science & Software Engineering, pages 98-101, 2015 | 2015
Sivagowry, S.; Durairaj, M.; Persia, A., "An empirical study on applying data mining techniques for the analysis and prediction of heart disease", pages 6, 2013 | 2013
Koh, Hian Chye; Tan, Gerald, "Data Mining Applications in Healthcare", pages 64-72, 2005 | 2005
Durairaj, M.; Ranjani, V., "Data Mining Applications In Healthcare Sector: A Study", pages 29-35, 2013 | 2013
Rahaman, Sophia; Biju, Seena, "Data Mining Facilitates e-Patients through e-Healthcare: An Empirical Study", pages 1158-1165, 2009 | 2009
Lavrač, Nada; Bohanec, Marko; Pur, Aleksander; Cestnik, Bojan; Debeljak, Marko; Kobler, Andrej, "Data mining and visualization for decision support and modeling of public health-care resources", pages 438-447, 2007 | 2007
Tesfaye, Brooke; Atique, Sulemas; Elias, Noah; Dibaba, Legesse; Shabbir, Syed-Abdul; Kebede, Mihiretu, "Determinants and development of a web-based child mortality prediction model in resource-limited settings: A data mining approach", pages 45-51, 2017 | 2017
Benyoussef, El Mehdi; Elbyed, Abdeltif; Hadiri, Hind El, "Data Mining Approaches for Alzheimer's Disease Diagnosis", International Symposium on Ubiquitous Networking, Springer, Cham, pages 619-631, 2017 | 2017
Murdoch, Travis B; Detsky, Allan S, "The Inevitable Application of Big Data to Health Care", pages 1351-1352, 2013 | 2013

Housing and Urban Development

Article | Year
Kermany, Einat; Mazzawi, Hanna; Baras, Dorit; Naveh, Yehuda; Michaelis, Hagai, "Analysis of Advanced Meter Infrastructure Data of Water Consumption in Apartment Buildings", Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM Press, pages 1159-1167, 2013 | 2013
Emerson, Daniel; Weligamage, Justin; Nayak, Richi, "A data mining driven risk profiling method for road asset management", Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM Press, pages 1267-1275, 2013 | 2013
Green, Ben; Caro, Alejandra; Conway, Matthew; Manduca, Robert; Plagge, Tom; Miller, Abby, "Mining Administrative Data to Spur Urban Revitalization", Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1829-1838, 2015 | 2015
Zoeter, Onno; Dance, Christopher; Clinchant, Stephane; Andreoli, Jean-Marc, "New algorithms for parking demand management and a city-scale deployment", Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1819-1828, 2014 | 2014
Wu, Huayu; Ng, Wee Siong; Tan, Kian-Lee; Wu, Wei; Xiang, Shili; Xue, Mingqiang, "A privacy preserving framework for managing vehicle data in road pricing systems", Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1427-1435, 2013 | 2013
Nuaimi, Eiman Al; Neyadi, Hind Al; Mohamed, Nader; Al-Jaroodi, Jameela, "Applications of big data to smart cities", pages 15, 2015 | 2015
Hashem, Ibrahim Abaker Targio; Chang, Victor; Anuar, Nor Badrul; Adewole, Kayode; Yaqoob, Ibrar; Gani, Abdullah; Ahmed, Ejaz; Chiroma, Haruna, "The role of big data in smart city", pages 748-758, 2016 | 2016
Knorr, Edwin M.; Ng, Raymond T., "Finding Aggregate Proximity Relationships and Commonalities in Spatial Data Mining", pages 884-897, 1996 | 1996
Rauch, Nadia; Nesi, Paolo; Bellini, Pierfrancesco, "Knowledge Base Construction Process for Smart-city Services", 2014 19th International Conference on Engineering of Complex Computer Systems, pages 186-189, 2014 | 2014
Khan, Murad; Iqbal, Javed; Talha, Muhammad; Arshad, Muhammad; Diyan, Muhammad; Han, Kijun, "Big Data Processing using Internet of Software Defined Things in Smart Cities", pages 1-14, 2018 | 2018
Uppoor, Sandesh; Trullols-Cruces, Oscar; Fiore, Marco; Barcelo-Ordinas, Jose M, "Generation and Analysis of a Large-Scale Urban Vehicular Mobility Dataset", pages 1061-1075, 2014 | 2014
Mohammadi, Mehdi; Al-Fuqaha, Ala, "Enabling Cognitive Smart Cities Using Big Data and Machine Learning: Approaches and Challenges", pages 94-101, 2018 | 2018

Interior

Article | Year
van den Hoven, Jeroen; Cocx, Tim; Kosters, Walter A., "Onto clustering criminal careers", pages 92-95, 2006 | 2006
Adderley, Richard; Townsley, Michael; Bond, John, "Use of data mining techniques to model crime scene investigator performance", pages 170-176, 2007 | 2007
Xiang, Yang; Chau, Michael; Atabakhsh, Homa; Chen, Hsinchun, "Visualizing criminal relationships: comparison of a hyperbolic tree and a hierarchical list", pages 69-83, 2005 | 2005
Hale, Scott A; Margetts, Helen; Yasseri, Taha, "Petition growth and success rates on the UK No. 10 Downing Street website", Proceedings of the 5th annual ACM web science conference, pages 132-138, 2013 | 2013

Mobility and Transportation

Article | Year
Xue, Mingqiang; Wu, Huayu; Chen, Wei; Ng, Wee Siong; Goh, Gin Howe, "Identifying tourists from public transport commuters", Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1779-1788, 2014 | 2014
Holleczek, Thomas; Yin, Shanyang; Jin, Yunye; Antonatos, Spiros; Goh, Han Leong; Low, Samantha; Shi-Nash, Amy; et al., "Traffic Measurement and Route Recommendation System for Mass Rapid Transit (MRT)", Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1859-1868, 2015 | 2015
Dowling, Chase; Fiez, Tanner; Ratlif, Lillian; Zhang, Baosen, "How Much Urban Traffic is Searching for Parking?", pages 19, 2017 | 2017
Lin, Canhong; Choy, King-lun; Pang, Grantham; Ng, Michelle T. W., "A Data Mining and Optimization-based Real-time Mobile Intelligent Routing System for City Logistics", pages 156-161, 2013 | 2013
Zhou, Meng; Wang, Donggen; Li, Qingquan; Yue, Yang; Tu, Wei; Cao, Rui, "Impacts of weather on public transport ridership: Results from mining data from different sources", pages 17-29, 2017 | 2017
Capra, Licia; Lathia, Neal, "Mining mobility data to minimise travellers' spending on public transport", Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1181-1189, 2011 | 2011
Prati, Gabriele; Pietrantoni, Luca; Fraboni, Federico, "Using data mining techniques to predict the severity of bicycle crashes", pages 44-54, 2017 | 2017
Pinelli, Fabio; Nair, Rahul; Calabrese, Francesco; Berlingerio, Michele; Di Lorenzo, Giusy; Sbodio, Marco Luca, "Data-driven transit network design from mobile phone trajectories", pages 1724-1733, 2016 | 2016
Agard, Bruno; Trepanier, Martin; Morency, Catherine, "Mining Public Transport User Behaviour from Smart Card Data", pages 399-404, 2006 | 2006
Ma, Xiaolei; Wu, Yao-Jan; Wang, Yinhai; Chen, Feng; Liu, Jianfeng, "Mining smart card data for transit riders' travel patterns", pages 1-12, 2013 | 2013
Kusakabe, Takahiko; Asakura, Yasuo, "Behavioural data mining of transit smart card data: A data fusion approach", pages 179-191, 2014 | 2014

Research and Innovation

Article | Year
Spangler, Scott; Wilkins, Angela D; Bachman, Benjamin J; Nagarajan, Meena; Dayaram, Tajhal; Haas, Peter; Regenbogen, Sam; Pickering, Curtis R; Comer, Austin; Myers, Jeffrey N; et al., "Automated hypothesis generation based on mining scientific literature", Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1877-1886, 2014 | 2014
Duggan, Jennie; Brodie, Michael L, "Hephaestus: Data Reuse for Accelerating Scientific Discovery", CIDR, 2015 | 2015
Nagarajan, Meenakshi; Wilkins, Angela D; Bachman, Benjamin J; Novikov, Ilya B; Bao, Shenghua; Haas, Peter; Terron-Diaz, Maria E; Bhatia, Sumit; Adikesavan, Anbu K; Labrie, Jacques J; et al., "Predicting Future Scientific Discoveries Based on a Networked Analysis of the Past Literature", Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 2019-2028, 2015 | 2015
Kitano, Hiroaki, "Artificial Intelligence to Win the Nobel Prize and Beyond: Creating the Engine for Scientific Discovery", 2016 | 2016
Lakomaa, Erik; Kallberg, Jan, "Open Data as a Foundation for Innovation: The Enabling Effect of Free Public Sector Information for Entrepreneurs", pages 558-563, 2013 | 2013
Angeli, Charoula; Howard, Sarah K.; Ma, Jun; Yang, Jie; Kirschner, Paul A., "Data mining in educational technology classroom research: Can it make a contribution?", pages 226-242, 2017 | 2017

Treasury

Article | Year
Dhurandhar, Amit; Graves, Bruce; Ravi, Rajesh; Maniachari, Gopikrishanan; Ettl, Markus, "Big Data System for Analyzing Risky Procurement Entities", Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1741-1750, 2015 | 2015
Junque de Fortuny, Enric; Stankova, Marija; Moeyersoms, Julie; Minnaert, Bart; Provost, Foster; Martens, David, "Corporate residence fraud detection", Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1650-1659, 2014 | 2014
Septanti, Citra Putri; Siregar, Hermanto; Sasongko, Hendro, "Analysis the Impact of Macroeconomic on Financial Performance and Stock Returns of State-Owned Banks in Indonesia Stock Exchange 2006-2016", pages 212-217, 2016 | 2016
Golmohammadi, Koosha; Zaiane, Osmar R., "Data Mining Applications for Fraud Detection in Securities Market", 2012 | 2012