Publications

2021

Gauch, M., J. Mai, and J. Lin, "The proper care and feeding of CAMELS: How limited training data affects streamflow prediction.", Environ. Model. Softw., vol. 135, pp. 104926, 2021.

2020

Kassaie, B., and F. Tompa, "A Framework for Extracted View Maintenance.", DocEng, pp. 16:1-16:4, 2020.
Yilmaz, Z., C. Clarke, and J. Lin, "A Lightweight Environment for Learning Experimental IR Research Practices.", SIGIR, pp. 2113–2116, 2020.
Vtyurina, A., C. Clarke, E. Law, J. Trippas, and H. Bota, "A Mixed-Method Analysis of Text and Audio Search Interfaces with Varying Task Complexity.", ICTIR, pp. 61-68, 2020.
Ghenai, A., M. Smucker, and C. Clarke, "A Think-Aloud Study to Understand Factors Affecting Online Health Search.", CHIIR, pp. 273–282, 2020.
Gauch, M., J. Bai, J. Mai, and J. Lin, "An Open-Source Interface to the Canadian Surface Prediction Archive.", JCDL, pp. 529–530, 2020.
Tu, Z., W. Yang, Z. Fu, Y. Xie, L. Tan, K. Xiong, M. Li, and J. Lin, "Approximate Nearest Neighbor Search and Lightweight Dense Vector Reranking in Multi-Stage Retrieval Architectures.", ICTIR, pp. 97–100, 2020.
Wu, R., A. Zhang, I. Ilyas, and T. Rekatsinas, "Attention-based Learning for Missing Data Imputation in HoloClean.", MLSys, 2020.
Fritz, S., I. Milligan, N. Ruest, and J. Lin, "Building community at distance: a datathon during COVID-19.", Digit. Libr. Perspect., vol. 36, pp. 415-428, 2020.
Yates, A., S. Arora, X. Zhang, W. Yang, K. Jose, and J. Lin, "Capreolus: A Toolkit for End-to-End Neural Ad Hoc Retrieval.", WSDM, pp. 861–864, 2020.
Glasbergen, B., K. Langendoen, M. Abebe, and K. Daudjee, "ChronoCache: Predictive and Adaptive Mid-Tier Query Result Caching.", SIGMOD Conference, pp. 2391–2406, 2020.
Agarwal, R., D. Kumar, L. Golab, and S. Keshav, "Consentio: Managing Consent to Data Access using Permissioned Blockchains.", IEEE ICBC, pp. 1-9, 2020.
Adewoye, T., X. Han, N. Ruest, I. Milligan, S. Fritz, and J. Lin, "Content-Based Exploration of Archival Images Using Neural Networks.", JCDL, pp. 489–490, 2020.
Zhang, E., N. Gupta, R. Tang, X. Han, R. Pradeep, K. Lu, Y. Zhang, R. Nogueira, K. Cho, and H. Fang, "Covidex: Neural Ranking Models and Keyword Search Infrastructure for the COVID-19 Open Research Dataset.", SDP@EMNLP, pp. 31-41, 2020.
Shi, P., H. Bai, and J. Lin, "Cross-Lingual Training of Neural Models for Document Ranking.", EMNLP (Findings), pp. 2768–2773, 2020.
Xin, J., R. Tang, J. Lee, Y. Yu, and J. Lin, "DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference.", ACL, pp. 2246–2251, 2020.
Yang, J-H., S-C. Lin, R. Nogueira, M-F. Tsai, C-J. Wang, and J. Lin, "Designing Templates for Eliciting Commonsense Knowledge from Pretrained Sequence-to-Sequence Models.", COLING, pp. 3449–3453, 2020.
Xie, Y., W. Yang, L. Tan, K. Xiong, N. Yuan, B. Huai, M. Li, and J. Lin, "Distant Supervision for Multi-Stage Fine-Tuning in Retrieval-Based Question Answering.", WWW, pp. 2934-2940, 2020.
Nogueira, R., Z. Jiang, R. Pradeep, and J. Lin, "Document Ranking with a Pretrained Sequence-to-Sequence Model.", EMNLP (Findings), pp. 708–718, 2020.
Ng, Y., D. Fraser, B. Kassaie, G. Labahn, M. Marzouk, F. Tompa, and K. Wang, "Dowsing for Math Answers with Tangent-L.", CLEF (Working Notes), 2020.
Abebe, M., B. Glasbergen, and K. Daudjee, "DynaMast: Adaptive Dynamic Mastering for Replicated Systems.", ICDE, pp. 1381–1392, 2020.
Zhang, X., M. Özsu, and L. Chen, "ELite: Cost-effective Approximation of Exploration-based Graph Analysis.", GRADES-NDA@SIGMOD, pp. 6:1-6:10, 2020.
Szlichta, J., P. Godfrey, L. Golab, M. Kargar, and D. Srivastava, "Erratum for Discovering Order Dependencies through Order Compatibility (EDBT 2019).", EDBT, pp. 659–663, 2020.
Nogueira, R., Z. Jiang, K. Cho, and J. Lin, "Evaluating Pretrained Transformer Models for Citation Recommendation.", BIR@ECIR, pp. 89–100, 2020.
Adhikari, A., A. Ram, R. Tang, W. Hamilton, and J. Lin, "Exploring the Limits of Simple Learners in Knowledge Distillation for Document Classification with DocBERT.", RepL4NLP@ACL, pp. 72–77, 2020.
Toman, D., and G. Weddell, "First Order Rewritability for Ontology Mediated Querying in Horn-DLFD.", Description Logics, 2020.
Yates, A., K. Jose, X. Zhang, and J. Lin, "Flexible IR Pipelines with Capreolus.", CIKM, pp. 3181–3188, 2020.
Yan, D., G. Guo, M. Chowdhury, M. Özsu, W-S. Ku, and J. Lui, "G-thinker: A Distributed Framework for Mining Subgraphs in a Big Graph.", ICDE, pp. 1369–1380, 2020.
Lin, J., C. Zhong, D. Hu, C. Rudin, and M. Seltzer, "Generalized and Scalable Optimal Sparse Decision Trees.", ICML, pp. 6150–6160, 2020.
Zeng, L., L. Zou, M. Özsu, L. Hu, and F. Zhang, "GSI: GPU-friendly Subgraph Isomorphism.", ICDE, pp. 1249–1260, 2020.
Tang, R., J. Lee, A. Razi, J. Cambre, I. Bicking, J. Kaye, and J. Lin, "Howl: A Deployed, Open-Source Wake Word Detection System.", CoRR, vol. abs/2008.09606, 2020.
Jiang, Z., R. Tang, J. Xin, and J. Lin, "Inserting Information Bottleneck for Attribution in Transformers.", EMNLP (Findings), pp. 3850–3857, 2020.
Kumar, D., L. Mou, L. Golab, and O. Vechtomova, "Iterative Edit-Based Unsupervised Sentence Simplification.", ACL, pp. 7918–7928, 2020.
Farhat, O., H. Bindra, and K. Daudjee, "Leaving stragglers at the window: low-latency stream sampling with accuracy guarantees.", DEBS, pp. 15-26, 2020.
Buchanan, G., D. McKay, C. Clarke, L. Azzopardi, and J. Trippas, "Made to Measure: A Workshop on Human-centred metrics for information seeking.", CHIIR, pp. 484–487, 2020.
Li, Q., M. Özsu, and H. Xiong, "Message from the General Chairs of DSC 2020.", DSC, pp. 1, 2020.
Clarke, C., M. Smucker, and A. Vtyurina, "Offline Evaluation by Maximum Similarity to an Ideal Ranking.", CIKM, pp. 225–234, 2020.
Clarke, C., A. Vtyurina, and M. Smucker, "Offline Evaluation without Gain.", ICTIR, pp. 185–192, 2020.
Meng, X., and L. Golab, "Parallel Scheduling of Data-Intensive Tasks.", Euro-Par, pp. 117–133, 2020.
Khan, A., and L. Golab, "Reddit Mining to Understand Gendered Movements.", EDBT/ICDT Workshops, 2020.
Jacobs, A., S. Chopra, and L. Golab, "Reddit Mining to Understand Women’s Issues in STEM.", EDBT/ICDT Workshops, 2020.
Pacaci, A., A. Bonifati, and M. Özsu, "Regular Path Query Evaluation on Streaming Graphs.", SIGMOD Conference, pp. 1415–1430, 2020.
Guo, R., and K. Daudjee, "Research challenges in deep reinforcement learning-based join query optimization.", aiDM@SIGMOD, pp. 3:1-3:6, 2020.
Glasbergen, B., M. Abebe, K. Daudjee, D. Vogel, and J. Zhao, "Sentinel: Understanding Data Systems.", SIGMOD Conference, pp. 2729–2732, 2020.
Tang, R., J. Lee, J. Xin, X. Liu, Y. Yu, and J. Lin, "Showing Your Work Doesn’t Always Work.", ACL, pp. 2766–2772, 2020.
Satuluri, V., Y. Wu, X. Zheng, Y. Qian, B. Wichers, Q. Dai, G. Tang, J. Jiang, and J. Lin, "SimClusters: Community-Based Representations for Heterogeneous Recommendations at Twitter.", KDD, pp. 3183-3193, 2020.
Özsu, M., "Streaming graph processing and analytics.", DEBS, pp. 1, 2020.
Lin, J., J. Mackenzie, C. Kamphuis, C. MacDonald, A. Mallia, M. Siedlaczek, A. Trotman, and A. de Vries, "Supporting Interoperability Between Open-Source Search Engines with the Common Index File Format.", SIGIR, pp. 2149–2152, 2020.
Sequiera, R., L. Tan, Y. Zhang, and J. Lin, "Update Delivery Mechanisms for Prospective Information Needs: A Reproducibility Study.", CHIIR, pp. 308–312, 2020.
Lin, J., I. Milligan, D. Oard, N. Ruest, and K. Shilton, "We Could, but Should We?: Ethical Considerations for Providing Access to GeoCities and Other Historical Digital Collections.", CHIIR, pp. 135–144, 2020.
Kamphuis, C., A. de Vries, L. Boytsov, and J. Lin, "Which BM25 Do You Mean? A Large-Scale Reproducibility Study of Scoring Variants.", ECIR (2), pp. 28–34, 2020.
Gorenflo, C., L. Golab, and S. Keshav, "XOX Fabric: A hybrid approach to blockchain transaction execution.", IEEE ICBC, pp. 1-9, 2020.
Gauch, M., and J. Lin, "A Data Scientist’s Guide to Streamflow Prediction.", CoRR, vol. abs/2006.12975, 2020.
Lin, J., "A Prototype of Serverless Lucene.", CoRR, vol. abs/2002.01447, 2020.
Mhedhbi, A., P. Gupta, S. Khaliq, and S. Salihoglu, "A+ Indexes: Lightweight and Highly Flexible Adjacency Lists for Graph Database Management Systems.", CoRR, vol. abs/2004.00130, 2020.
Chen, Y., G. Xiao, M. Özsu, C. Liu, A. Zomaya, and T. Li, "aeSpTV: An Adaptive and Efficient Framework for Sparse Tensor-Vector Product Kernel on a High-Performance Computing Platform.", IEEE Trans. Parallel Distributed Syst., vol. 31, pp. 2329–2345, 2020.
Livshits, E., A. Heidari, I. Ilyas, and B. Kimelfeld, "Approximate Denial Constraints.", Proc. VLDB Endow., vol. 13, pp. 1682–1695, 2020.
Livshits, E., A. Heidari, I. Ilyas, and B. Kimelfeld, "Approximate Denial Constraints.", CoRR, vol. abs/2005.08540, 2020.
Clarke, C., A. Vtyurina, and M. Smucker, "Assessing top-k preferences.", CoRR, vol. abs/2007.11682, 2020.
Oliveira, P., D. Kaster, C. Traina, Jr., and I. Ilyas, "Batchwise Probabilistic Incremental Data Cleaning.", CoRR, vol. abs/2011.04730, 2020.
Khan, A., L. Golab, M. Kargar, J. Szlichta, and M. Zihayat, "Compact group discovery in attributed graphs and social networks.", Inf. Process. Manag., vol. 57, pp. 102054, 2020.
Lin, S-C., J-H. Yang, R. Nogueira, M-F. Tsai, C-J. Wang, and J. Lin, "Conversational Question Reformulation via Sequence-to-Sequence Architectures and Pretrained Language Models.", CoRR, vol. abs/2004.01909, 2020.
Zhang, E., N. Gupta, R. Tang, X. Han, R. Pradeep, K. Lu, Y. Zhang, R. Nogueira, K. Cho, H. Fang, et al., "Covidex: Neural Ranking Models and Keyword Search Infrastructure for the COVID-19 Open Research Dataset.", CoRR, vol. abs/2007.07846, 2020.
Ding, S., E. Zhang, and J. Lin, "Cydex: Neural Search Infrastructure for the Scholarly Literature.", SDP@EMNLP, pp. 168–173, 2020.
Xin, J., R. Tang, J. Lee, Y. Yu, and J. Lin, "DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference.", CoRR, vol. abs/2004.12993, 2020.
Kassaie, B., and F. Tompa, "Detecting Opportunities for Differential Maintenance of Extracted Views.", CoRR, vol. abs/2007.01973, 2020.
Karegar, R., M. Mirsafian, P. Godfrey, L. Golab, M. Kargar, D. Srivastava, and J. Szlichta, "Discovering Domain Orders through Order Dependencies.", CoRR, vol. abs/2005.14068, 2020.
Lin, S-C., J-H. Yang, and J. Lin, "Distilling Dense Representations for Ranking using Tightly-Coupled Teachers.", CoRR, vol. abs/2010.11386, 2020.
Nogueira, R., Z. Jiang, and J. Lin, "Document Ranking with a Pretrained Sequence-to-Sequence Model.", CoRR, vol. abs/2003.06713, 2020.
Zhang, H., G. Cormack, M. Grossman, and M. Smucker, "Evaluating sentence-level relevance feedback for high-recall information retrieval.", Inf. Retr. J., vol. 23, pp. 1-26, 2020.
Gorenflo, C., S. Lee, L. Golab, and S. Keshav, "FastFabric: Scaling hyperledger fabric to 20 000 transactions per second.", Int. J. Netw. Manag., vol. 30, 2020.
Lin, J., C. Zhong, D. Hu, C. Rudin, and M. Seltzer, "Generalized Optimal Sparse Decision Trees.", CoRR, vol. abs/2006.08690, 2020.
Sahu, S., and S. Salihoglu, "Graphsurge: Graph Analytics on View Collections Using Differential Computation.", CoRR, vol. abs/2004.05297, 2020.
Jiang, Z., R. Tang, J. Xin, and J. Lin, "Inserting Information Bottlenecks for Attribution in Transformers.", CoRR, vol. abs/2012.13838, 2020.
Chen, S., P. Chrysanthis, K. Daudjee, M. Hsu, and M. Sadoghi, "Introduction to the special issue on Self-managing and Hardware-Optimized Database Systems 2019.", Distributed Parallel Databases, vol. 38, pp. 767–769, 2020.
Kumar, D., L. Mou, L. Golab, and O. Vechtomova, "Iterative Edit-Based Unsupervised Sentence Simplification.", CoRR, vol. abs/2006.09639, 2020.
Ge, C., S. Mohapatra, X. He, and I. Ilyas, "Kamino: Constraint-Aware Differentially Private Data Synthesis.", CoRR, vol. abs/2012.15713, 2020.
Li, M., H. Bai, L. Tan, K. Xiong, M. Li, and J. Lin, "Latte-Mix: Measuring Sentence Semantic Similarity with Latent Categorical Mixtures.", CoRR, vol. abs/2010.11351, 2020.
Agarwal, R., R. Cohen, L. Golab, and A. Tsang, "Locating Influential Agents in Social Networks: Budget-Constrained Seed Set Selection.", Canadian Conference on AI, pp. 15–28, 2020.
Chen, L., and L. Golab, "Micro-journal mining to understand mood triggers.", Computing, vol. 102, pp. 1227–1244, 2020.
Abebe, M., B. Glasbergen, and K. Daudjee, "MorphoSys: Automatic Physical Design Metamorphosis for Distributed Database Systems.", Proc. VLDB Endow., vol. 13, pp. 3573–3587, 2020.
Nogueira, R., Z. Jiang, K. Cho, and J. Lin, "Navigation-based candidate expansion and pretrained language models for citation recommendation.", Scientometrics, vol. 125, pp. 3001–3016, 2020.
Nogueira, R., Z. Jiang, K. Cho, and J. Lin, "Navigation-Based Candidate Expansion and Pretrained Language Models for Citation Recommendation.", CoRR, vol. abs/2001.08687, 2020.
Heidari, A., S. Kushagra, and I. Ilyas, "On sampling from data with duplicate records.", CoRR, vol. abs/2008.10549, 2020.
Wang, X-J., M. Grossman, and S. Hyun, "Participation in TREC 2020 COVID Track Using Continuous Active Learning.", CoRR, vol. abs/2011.01453, 2020.
Lin, J., R. Nogueira, and A. Yates, "Pretrained Transformers for Text Ranking: BERT and Beyond.", CoRR, vol. abs/2010.06467, 2020.
Gauch, M., F. Kratzert, D. Klotz, G. Nearing, J. Lin, and S. Hochreiter, "Rainfall-Runoff Prediction at Multiple Timescales with a Single Long Short-Term Memory Network.", CoRR, vol. abs/2010.07921, 2020.
Zhang, R., W. Yang, L. Lin, Z. Tu, Y. Xie, Z. Fu, Y. Xie, L. Tan, K. Xiong, and J. Lin, "Rapid Adaptation of BERT for Information Extraction on Domain-Specific Business Documents.", CoRR, vol. abs/2002.01861, 2020.
Tang, R., R. Nogueira, E. Zhang, N. Gupta, P. Cam, K. Cho, and J. Lin, "Rapidly Bootstrapping a Question Answering Dataset for COVID-19.", CoRR, vol. abs/2004.11339, 2020.
Zhang, E., N. Gupta, R. Nogueira, K. Cho, and J. Lin, "Rapidly Deploying a Neural Search Engine for the COVID-19 Open Research Dataset: Preliminary Thoughts and Lessons Learned.", CoRR, vol. abs/2004.05125, 2020.
Heidari, A., G. Michalopoulos, S. Kushagra, I. Ilyas, and T. Rekatsinas, "Record fusion: A learning approach.", CoRR, vol. abs/2006.10208, 2020.
Pacaci, A., A. Bonifati, and M. Özsu, "Regular Path Query Evaluation on Streaming Graphs.", CoRR, vol. abs/2004.02012, 2020.
Bryson, S., H. Davoudi, L. Golab, M. Kargar, Y. Lytvyn, P. Mierzejewski, J. Szlichta, and M. Zihayat, "Robust keyword search in large attributed graphs.", vol. 23, pp. 502-524, 2020.
Guo, G., D. Yan, M. Özsu, and Z. Jiang, "Scalable Mining of Maximal Quasi-Cliques: An Algorithm-System Codesign Approach.", CoRR, vol. abs/2005.00081, 2020.
Pradeep, R., X. Ma, R. Nogueira, and J. Lin, "Scientific Claim Verification with VERT5ERINI.", CoRR, vol. abs/2010.11930, 2020.
Bai, H., P. Shi, J. Lin, L. Tan, K. Xiong, W. Gao, and M. Li, "SegaBERT: Pre-training of Segment-aware BERT for Language Understanding.", CoRR, vol. abs/2004.14996, 2020.
Bai, H., P. Shi, J. Lin, L. Tan, K. Xiong, W. Gao, J. Liu, and M. Li, "Semantics of the Unwritten.", CoRR, vol. abs/2004.02251, 2020.
Glasbergen, B., M. Abebe, K. Daudjee, and A. Levi, "Sentinel: Universal Analysis and Insight for Data Systems.", Proc. VLDB Endow., vol. 13, pp. 2720–2733, 2020.
Tang, R., J. Lee, J. Xin, X. Liu, Y. Yu, and J. Lin, "Showing Your Work Doesn’t Always Work.", CoRR, vol. abs/2004.13705, 2020.
Salem, K., "Special issue on best papers of DaMoN 2018.", VLDB J., vol. 29, pp. 755, 2020.
Boncz, P., and K. Salem, "Special issue on best papers of VLDB 2017.", VLDB J., vol. 29, 2020.
Lin, J., J. Mackenzie, C. Kamphuis, C. MacDonald, A. Mallia, M. Siedlaczek, A. Trotman, and A. de Vries, "Supporting Interoperability Between Open-Source Search Engines with the Common Index File Format.", CoRR, vol. abs/2003.08276, 2020.
Ruest, N., J. Lin, I. Milligan, and S. Fritz, "The Archives Unleashed Project: Technology, Process, and Community to Improve Scholarly Access to Web Archives.", CoRR, vol. abs/2001.05399, 2020.
Sakr, S., A. Bonifati, H. Voigt, A. Iosup, K. Ammar, R. Angles, W. Aref, M. Arenas, M. Besta, P. Boncz, et al., "The Future is Big Graphs! A Community View on Graph Processing Systems.", CoRR, vol. abs/2012.06171, 2020.
Sahu, S., A. Mhedhbi, S. Salihoglu, J. Lin, and M. Özsu, "The ubiquity of large graphs and surprising challenges of graph processing: extended survey.", VLDB J., vol. 29, pp. 595–618, 2020.
Zhang, M., L. Tan, Z. Tu, Z. Fu, K. Xiong, M. Li, and J. Lin, "To Paraphrase or Not To Paraphrase: User-Controllable Selective Paraphrase Generation.", CoRR, vol. abs/2008.09290, 2020.
Lin, S-C., J-H. Yang, R. Nogueira, M-F. Tsai, C-J. Wang, and J. Lin, "TTTTTackling WinoGrande Schemas.", CoRR, vol. abs/2003.08380, 2020.
Toman, D., and G. Weddell, "Using Feature-Based Description Logics to avoid Duplicate Elimination in Object-Relational Query Languages.", Künstliche Intell., vol. 34, pp. 355–363, 2020.
Özsu, M., and P. Valduriez, Principles of Distributed Database Systems, 4th Edition, 2020.

2019

Ilyas, I., and X. Chu, Data Cleaning, pp. 285, 2019.
De Sa, C., I. Ilyas, B. Kimelfeld, C. Ré, and T. Rekatsinas, "A Formal Framework for Probabilistic Unclean Databases.", ICDT, 2019.
Kushagra, S., H. Saxena, I. Ilyas, and S. Ben-David, "A Semi-Supervised Framework of Clustering Selection for De-Duplication.", ICDE, pp. 208–219, 2019.
Yang, H-W., Y. Zou, P. Shi, W. Lu, J. Lin, and X. Sun, "Aligning Cross-Lingual Entities with Multi-Aspect Information.", EMNLP/IJCNLP (1), pp. 4430–4440, 2019.
Ge, C., X. He, I. Ilyas, and A. Machanavajjhala, "APEx: Accuracy-Aware Differentially Private Data Exploration.", SIGMOD Conference, pp. 177–194, 2019.
Yilmaz, Z., S. Wang, W. Yang, H. Zhang, and J. Lin, "Applying BERT to Document Retrieval with Birch.", EMNLP/IJCNLP (3), pp. 19–24, 2019.
Heidari, A., I. Ilyas, and T. Rekatsinas, "Approximate Inference in Structured Instances with Noisy Categorical Observations.", UAI, pp. 152, 2019.
Rao, J., L. Liu, Y. Tay, H-W. Yang, P. Shi, and J. Lin, "Bridging the Gap between Relevance Matching and Semantic Matching for Short Text Similarity Modeling.", EMNLP/IJCNLP (1), pp. 5369–5380, 2019.
Davoudi, H., P. Godfrey, L. Golab, M. Kargar, D. Srivastava, and J. Szlichta, "Bring Order to Data.", AMW, 2019.
Milligan, I., N. Casemajor, S. Fritz, J. Lin, N. Ruest, M. Weber, and N. Worby, "Building Community and Tools for Analyzing Web Archives Through Datathons.", JCDL, pp. 265–268, 2019.
Türe, F., J. Rao, R. Tang, and J. Lin, "Challenges and Opportunities in Understanding Spoken Queries Directed at Modern Entertainment Platforms.", SIGIR, pp. 1375–1376, 2019.
Yilmaz, Z., W. Yang, H. Zhang, and J. Lin, "Cross-Domain Modeling of Sentence-Level Evidence for Document Retrieval.", EMNLP/IJCNLP (1), pp. 3488–3494, 2019.
Neumann, T., and K. Salem, "DaMoN 19: The 15th International Workshop on Data Management on New Hardware.", SIGMOD Conference, pp. 2070–2071, 2019.
Yang, W., L. Tan, C. Lu, A. Cui, H. Li, X. Chen, K. Xiong, M. Wang, M. Li, J. Pei, et al., "Detecting Customer Complaint Escalation with Recurrent Neural Networks and Manually-Engineered Features.", NAACL-HLT (2), pp. 56–63, 2019.
Saxena, H., L. Golab, and I. Ilyas, "Distributed Discovery of Functional Dependencies.", ICDE, pp. 1590–1593, 2019.
Alonso, G., C. Binnig, I. Pandis, K. Salem, J. Skrzypczak, R. Stutsman, L. Thostrup, T. Wang, Z. Wang, and T. Ziegler, "DPI: The Data Processing Interface for Modern Networks.", CIDR, 2019.
Cormack, G., H. Zhang, N. Ghelani, M. Abualsaud, M. Smucker, M. Grossman, S. Rahbariasl, and A. Ghenai, "Dynamic Sampling Meets Pooling.", SIGIR, pp. 1217–1220, 2019.
Yang, W., Y. Xie, A. Lin, X. Li, L. Tan, K. Xiong, M. Li, and J. Lin, "End-to-End Open-Domain Question Answering with BERTserini.", NAACL-HLT (Demonstrations), pp. 72–77, 2019.
Toman, D., and G. Weddell, "Exhaustive Query Answering via Referring Expressions.", Description Logics, 2019.
Pacaci, A., and M. Özsu, "Experimental Analysis of Streaming Algorithms for Graph Partitioning.", SIGMOD Conference, pp. 1375–1392, 2019.
Le Guilly, M., J-M. Petit, V-M. Scuturici, and I. Ilyas, "ExplIQuE: Interactive Databases Exploration with SQL.", CIKM, pp. 2877–2880, 2019.
Gorenflo, C., S. Lee, L. Golab, and S. Keshav, "FastFabric: Scaling Hyperledger Fabric to 20,000 Transactions per Second.", IEEE ICBC, pp. 455–463, 2019.
Toman, D., and G. Weddell, "Finding ALL Answers to OBDA Queries Using Referring Expressions.", Australasian Conference on Artificial Intelligence, pp. 117–129, 2019.
Grand, A., R. Muir, J. Ferenczi, and J. Lin, "From MAXSCORE to Block-Max Wand: The Story of How Lucene Significantly Improved Query Evaluation Performance.", ECIR (2), pp. 20-27, 2019.
McIntyre, S., D. Toman, and G. Weddell, "FunDL - A Family of Feature-Based Description Logics, with Applications in Querying Structured Data Sources.", Description Logic, Theory Combination, and All That, pp. 404–430, 2019.
Chopra, S., A. Khan, M. Mirsafian, and L. Golab, "Gender Differences in Science and Engineering: A Data Mining Approach.", EDBT/ICDT Workshops, 2019.
Chopra, S., A. Khan, M. Mirsafian, and L. Golab, "Gender Differences in Work-Integrated Learning Assessments.", EDM, 2019.
Anzum, N., S. Salihoglu, and D. Vogel, "GraphWrangler: An Interactive Graph View on Relational Data.", SIGMOD Conference, pp. 1865–1868, 2019.
Heidari, A., J. McGrath, I. Ilyas, and T. Rekatsinas, "HoloDetect: Few-Shot Learning for Error Detection.", SIGMOD Conference, pp. 829–846, 2019.
Lee, J., R. Tang, and J. Lin, "Honkling: In-Browser Personalization for Ubiquitous Keyword Spotting.", EMNLP/IJCNLP (3), pp. 91–96, 2019.
Toman, D., and G. Weddell, "Identity Resolution in Ontology Based Data Access to Structured Data Sources.", PRICAI (1), pp. 473–485, 2019.
Liu, L., W. Yang, J. Rao, R. Tang, and J. Lin, "Incorporating Contextual and Syntactic Structures Improves Semantic Similarity Modeling.", EMNLP/IJCNLP (1), pp. 1204–1209, 2019.
Clancy, R., J. Lee, Z. Yilmaz, and J. Lin, "Information Retrieval Meets Scalable Text Analytics: Solr Integration with Spark.", SIGIR, pp. 1313–1316, 2019.
Vollmer, M., L. Golab, K. Böhm, and D. Srivastava, "Informative Summarization of Numeric Data.", SSDBM, pp. 97–108, 2019.
Gorenflo, C., L. Golab, and S. Keshav, "Mitigating Trust Issues in Electric Vehicle Charging using a Blockchain.", e-Energy, pp. 160–164, 2019.
Rao, J., W. Yang, Y. Zhang, F. Türe, and J. Lin, "Multi-Perspective Relevance Matching with Hierarchical ConvNets for Social Media Search.", AAAI, pp. 232–240, 2019.
Tang, R., Y. Lu, and J. Lin, "Natural Language Generation for Effective Knowledge Distillation.", DeepLo@EMNLP-IJCNLP, pp. 202–208, 2019.
McIntyre, S., A. Borgida, D. Toman, and G. Weddell, "On Limited Conjunctions and Partial Features in Parameter-Tractable Feature Logics.", AAAI, pp. 2995–3002, 2019.
Borgida, A., D. Toman, and G. Weddell, "On Special Description Logics for Processes and Plans.", Description Logics, 2019.
Kumar, D., R. Cohen, and L. Golab, "Online abuse detection: the value of preprocessing and neural attention models.", WASSA@NAACL-HLT, pp. 16–24, 2019.
Clancy, R., N. Ferro, C. Hauff, J. Lin, T. Sakai, and Z. Wu, "Overview of the 2019 Open-Source IR Replicability Challenge (OSIRRC 2019).", OSIRRC@SIGIR, pp. 1–7, 2019.
Lin, J., A. Roegiest, L. Tan, R. McCreadie, E. Voorhees, and F. Diaz, "Overview of the TREC 2016 Real-Time Summarization Track.", TREC, 2019.
Abualsaud, M., and M. Smucker, "Patterns of Search Result Examination: Query to First Action.", CIKM, pp. 1833–1842, 2019.
Kassaie, B., and F. Tompa, "Predictable and Consistent Information Extraction.", DocEng, pp. 14:1-14:10, 2019.
Cormack, G., and M. Grossman, "Quantifying Bias and Variance of System Rankings.", SIGIR, pp. 1089–1092, 2019.
Yang, J-H., S-C. Lin, C-J. Wang, J. Lin, and M-F. Tsai, "Query and Answer Expansion from Conversation History.", TREC, 2019.
Adhikari, A., A. Ram, R. Tang, and J. Lin, "Rethinking Complex Neural Network Architectures for Document Classification.", NAACL-HLT (1), pp. 4046–4051, 2019.
Yang, H-W., L. Liu, I. Milligan, N. Ruest, and J. Lin, "Scalable Content-Based Analysis of Images in Web Archives with TensorFlow and the Archives Unleashed Toolkit.", JCDL, pp. 436–437, 2019.
Kushagra, S., S. Ben-David, and I. Ilyas, "Semi-supervised clustering for de-duplication.", AISTATS, pp. 1659–1667, 2019.
Kazhamiaka, M., B. Memon, C. Kankanamge, S. Sahu, S. Rizvi, B. Wong, and K. Daudjee, "Sift: resource-efficient consensus with RDMA.", CoNEXT, pp. 260–271, 2019.
Shi, P., J. Rao, and J. Lin, "Simple Attention-Based Representation Learning for Ranking Short Social Media Posts.", NAACL-HLT (1), pp. 2212–2217, 2019.
Yu, R., Y. Xie, and J. Lin, "Simple Techniques for Cross-Collection Relevance Feedback.", ECIR (1), 2019.
Clancy, R., T. Eskildsen, N. Ruest, and J. Lin, "Solr Integration in the Anserini Information Retrieval Toolkit.", SIGIR, pp. 1285–1288, 2019.
Yan, D., G. Guo, M. Chowdhury, M. Özsu, J. Lui, and W. Tan, "T-thinker: a task-centric distributed framework for compute-intensive divide-and-conquer algorithms.", PPoPP, pp. 411-412, 2019.
Deschamps, R., N. Ruest, J. Lin, S. Fritz, and I. Milligan, "The Archives Unleashed Notebook: Madlibs for Jumpstarting Scholarly Exploration of Web Archives.", JCDL, pp. 337–338, 2019.
Deschamps, R., S. Fritz, J. Lin, I. Milligan, and N. Ruest, "The Cost of a WARC: Analyzing Web Archives in the Cloud.", JCDL, pp. 261–264, 2019.
Lin, J., and P. Yang, "The Impact of Score Ties on Repeatability in Document Ranking.", SIGIR, pp. 1125–1128, 2019.
Clancy, R., N. Ferro, C. Hauff, J. Lin, T. Sakai, and Z. Wu, "The SIGIR 2019 Open-Source IR Replicability Challenge (OSIRRC 2019).", SIGIR, pp. 1432–1434, 2019.
Li, Y., L. Zou, M. Özsu, and D. Zhao, "Time Constrained Continuous Subgraph Search Over Streaming Graphs.", ICDE, pp. 1082–1093, 2019.
Rahbariasl, S., and M. Smucker, "Time-Limits and Summaries for Faster Relevance Assessing.", SIGIR, pp. 901–904, 2019.
Lee, J., R. Tang, and J. Lin, "Universal voice-enabled user interfaces using JavaScript.", IUI Companion, pp. 81-82, 2019.
Clancy, R., Z. Yilmaz, Z. Wu, and J. Lin, "University of Waterloo Docker Images for OSIRRC at SIGIR 2019.", OSIRRC@SIGIR, pp. 36, 2019.
Deng, D., W. Tao, Z. Abedjan, A. Elmagarmid, I. Ilyas, G. Li, S. Madden, M. Ouzzani, M. Stonebraker, and N. Tang, "Unsupervised String Transformation Learning for Entity Consolidation.", ICDE, pp. 196–207, 2019.
Abualsaud, M., F. Beylunioglu, M. Smucker, and P. Duimering, "UWaterlooMDS at the TREC 2019 Decision Track.", TREC, 2019.
Ruest, N., I. Milligan, and J. Lin, "Warclight: A Rails Engine for Web Archive Discovery.", JCDL, pp. 442–443, 2019.
Abebe, M., B. Glasbergen, and K. Daudjee, "WatDFS: A Project for Understanding Distributed Systems in the Undergraduate Curriculum.", SIGCSE, pp. 920-926, 2019.
Xin, J., J. Lin, and Y. Yu, "What Part of the Neural Network Does This? Understanding LSTMs by Measuring and Dissecting Neurons.", EMNLP/IJCNLP (1), pp. 5822–5829, 2019.
Yang, H-W., Y. Zou, P. Shi, W. Lu, J. Lin, and X. Sun, "Aligning Cross-Lingual Entities with Multi-Aspect Information.", CoRR, vol. abs/1910.06575, 2019.
Heidari, A., I. Ilyas, and T. Rekatsinas, "Approximate Inference in Structured Instances with Noisy Categorical Observations.", CoRR, vol. abs/1907.00141, 2019.
Liu, L., H. Wang, J. Lin, R. Socher, and C. Xiong, "Attentive Student Meets Multi-Task Teacher: Improved Knowledge Distillation for Pretrained Models.", CoRR, vol. abs/1911.03588, 2019.
Alway, K., E. Blais, and S. Salihoglu, "Box Covers and Domain Orderings for Beyond Worst-Case Join Processing.", CoRR, vol. abs/1909.12102, 2019.
Aluç, Günes., M. Özsu, and K. Daudjee, "Building self-clustering RDF databases using Tunable-LSH.", VLDB J., vol. 28, issue 2, 2019.
Agarwal, R., D. Kumar, L. Golab, and S. Keshav, "Consentio: Managing Consent to Data Access using Permissioned Blockchains.", CoRR, vol. abs/1910.07110, 2019.
Zhang, X., and M. Özsu, "Correlation Constraint Shortest Path over Large Multi-Relation Graphs.", Proc. VLDB Endow., vol. 12, pp. 488–501, 2019.
Shi, P., and J. Lin, "Cross-Lingual Relevance Transfer for Document Retrieval.", CoRR, vol. abs/1911.02989, 2019.
Ehsan, N., A. Shakery, and F. Tompa, "Cross-lingual text alignment for fine-grained plagiarism detection.", J. Inf. Sci., vol. 45, issue 4, 2019.
Yang, W., Y. Xie, L. Tan, K. Xiong, M. Li, and J. Lin, "Data Augmentation for BERT Fine-Tuning in Open-Domain Question Answering.", CoRR, vol. abs/1904.06652, 2019.
Ilyas, I., "Data unification at scale: data tamer.", Making Databases Work, 2019.
Tang, R., Y. Lu, L. Liu, L. Mou, O. Vechtomova, and J. Lin, "Distilling Task-Specific Knowledge from BERT into Simple Neural Networks.", CoRR, vol. abs/1903.12136, 2019.
Saxena, H., L. Golab, and I. Ilyas, "Distributed Dependency Discovery.", CoRR, vol. abs/1903.05228, 2019.
Saxena, H., L. Golab, and I. Ilyas, "Distributed Implementations of Dependency Discovery Algorithms.", PVLDB, vol. 12, pp. 1624–1636, 2019.
Adhikari, A., A. Ram, R. Tang, and J. Lin, "DocBERT: BERT for Document Classification.", CoRR, vol. abs/1904.08398, 2019.
Nogueira, R., W. Yang, J. Lin, and K. Cho, "Document Expansion by Query Prediction.", CoRR, vol. abs/1904.08375, 2019.
Yang, W., Y. Xie, A. Lin, X. Li, L. Tan, K. Xiong, M. Li, and J. Lin, "End-to-End Open-Domain Question Answering with BERTserini.", CoRR, vol. abs/1902.01718, 2019.
Godfrey, P., L. Golab, M. Kargar, D. Srivastava, and J. Szlichta, "Errata Note: Discovering Order Dependencies through Order Compatibility.", CoRR, vol. abs/1905.02010, 2019.
Ram, A., J. Xin, M. Nagappan, Y. Yu, Río. Lozoya, A. Sabetta, and J. Lin, "Exploiting Token and Path-based Representations of Code for Identifying Security-Relevant Commits.", CoRR, vol. abs/1911.07620, 2019.
Gorenflo, C., S. Lee, L. Golab, and S. Keshav, "FastFabric: Scaling Hyperledger Fabric to 20,000 Transactions per Second.", CoRR, vol. abs/1901.00910, 2019.
Salihoglu, S., and N. Yakovets, "Graph Query Processing.", Encyclopedia of Big Data Technologies, 2019.
Zeng, L., L. Zou, M. Özsu, L. Hu, and F. Zhang, "GSI: GPU-friendly Subgraph Isomorphism.", CoRR, vol. abs/1906.03420, 2019.
Heidari, A., J. McGrath, I. Ilyas, and T. Rekatsinas, "HoloDetect: Few-Shot Learning for Error Detection.", CoRR, vol. abs/1904.02285, 2019.
Teofili, T., and J. Lin, "Lucene for Approximate Nearest-Neighbors Search on Arbitrary Dense Vectors.", CoRR, vol. abs/1910.10208, 2019.
Azmy, M., P. Shi, J. Lin, and I. Ilyas, "Matching Entities Across Different Knowledge Graphs with Graph Embeddings.", CoRR, vol. abs/1903.06607, 2019.
Nogueira, R., W. Yang, K. Cho, and J. Lin, "Multi-Stage Document Ranking with BERT.", CoRR, vol. abs/1910.14424, 2019.
Mhedhbi, A., and S. Salihoglu, "Optimizing Subgraph Queries by Combining Binary and Worst-Case Optimal Joins.", PVLDB, vol. 12, issue 11, 2019.
Mhedhbi, A., and S. Salihoglu, "Optimizing Subgraph Queries by Combining Binary and Worst-Case Optimal Joins.", CoRR, vol. abs/1903.02076, 2019.
Livshits, E., I. Ilyas, B. Kimelfeld, and S. Roy, "Principles of Progress Indicators for Database Repairing.", CoRR, vol. abs/1904.06492, 2019.
Lin, S-C., J-H. Yang, R. Nogueira, M-F. Tsai, C-J. Wang, and J. Lin, "Query Reformulation using Query History for Passage Retrieval in Conversational Search.", CoRR, vol. abs/2005.02230, 2019.
Ge, C., I. Ilyas, and F. Kerschbaum, "Secure Multi-Party Functional Dependency Discovery.", Proc. VLDB Endow., vol. 13, pp. 184–196, 2019.
Yang, W., H. Zhang, and J. Lin, "Simple Applications of BERT for Ad Hoc Document Retrieval.", CoRR, vol. abs/1903.10972, 2019.
Shi, P., and J. Lin, "Simple BERT Models for Relation Extraction and Semantic Role Labeling.", CoRR, vol. abs/1904.05255, 2019.
Sun, J., D. Deng, I. Ilyas, G. Li, S. Madden, M. Ouzzani, M. Stonebraker, and N. Tang, "Technical Report: Optimizing Human Involvement for Entity Matching and Consolidation.", CoRR, vol. abs/1906.06574, 2019.
Lin, J., L. Paniak, and G. Boerke, "The Performance Envelope of Inverted Indexing on Modern Hardware.", CoRR, vol. abs/1910.11028, 2019.
Gauch, M., J. Mai, and J. Lin, "The Proper Care and Feeding of CAMELS: How Limited Training Data Affects Streamflow Prediction.", CoRR, vol. abs/1911.07249, 2019.
Golab, L., "Types of Stream Processing Algorithms.", Encyclopedia of Big Data Technologies, 2019.
Lee, J., R. Tang, and J. Lin, "What Would Elsa Do? Freezing Layers During Transformer Fine-Tuning.", CoRR, vol. abs/1911.03090, 2019.
Gorenflo, C., L. Golab, and S. Keshav, "XOX Fabric: A hybrid approach to transaction execution.", CoRR, vol. abs/1906.11229, 2019.
"Foreword.", Making Databases Work, 2019.

2018

Zhang, H., M. Abualsaud, and M. Smucker, "A Study of Immediate Requery Behavior in Search.", CHIIR, pp. 181–190, 2018.
Abualsaud, M., N. Ghelani, H. Zhang, M. Smucker, G. Cormack, and M. Grossman, "A System for Efficient High-Recall Retrieval.", SIGIR, pp. 1317–1320, 2018.
Koutris, P., S. Salihoglu, and D. Suciu, "Algorithmic Aspects of Parallel Query Processing.", SIGMOD Conference, pp. 1659–1664, 2018.
Tang, R., W. Wang, Z. Tu, and J. Lin, "An Experimental Analysis of the Power Consumption of Convolutional Neural Networks for Keyword Spotting.", ICASSP, pp. 5479–5483, 2018.
Glasbergen, B., M. Abebe, K. Daudjee, S. Foggo, and A. Pacaci, "Apollo: Learning Query Correlations for Predictive Caching in Geo-Distributed Systems.", EDBT, pp. 253–264, 2018.
Cormack, G., and M. Grossman, "Beyond Pooling.", SIGIR, pp. 1169–1172, 2018.
Mansour, E., D. Deng, R. Fernandez, A. Qahtan, W. Tao, Z. Abedjan, A. Elmagarmid, I. Ilyas, S. Madden, M. Ouzzani, et al., "Building Data Civilizer Pipelines with an Advanced Workflow Engine.", ICDE, pp. 1593–1596, 2018.
Yan, X., L. Yang, H. Zhang, X. Lin, B. Wong, K. Salem, and T. Brecht, "Carousel: Low-Latency Transaction Processing for Globally-Distributed Data.", SIGMOD Conference, pp. 231–243, 2018.
Fraser, D., A. Kane, and F. Tompa, "Choosing Math Features for BM25 Ranking with Tangent-L.", DocEng, pp. 17:1-17:10, 2018.
Langouri, M., Z. Zheng, F. Chiang, L. Golab, and J. Szlichta, "Contextual Data Cleaning.", ICDE Workshops, pp. 21–24, 2018.
Chopra, S., Y. Jiang, A. Toulis, and L. Golab, "Data Analytics to Improve Co-Operative Education.", EDBT/ICDT Workshops, pp. 16–21, 2018.
Tang, R., and J. Lin, "Deep Residual Learning for Small-Footprint Keyword Spotting.", ICASSP, pp. 5484–5488, 2018.
Pacaci, A., and M. Özsu, "Distribution-Aware Stream Partitioning for Distributed Stream Processing Systems.", BeyondMR@SIGMOD, pp. 6:1-6:10, 2018.
Abebe, M., K. Daudjee, B. Glasbergen, and Y. Tian, "EC-Store: Bridging the Gap between Storage and Latency in Distributed Erasure Coded Systems.", ICDCS, pp. 255–266, 2018.
Zihayat, M., A. An, L. Golab, M. Kargar, and J. Szlichta, "Effective Team Formation in Expert Networks.", AMW, 2018.
Zhang, H., M. Abualsaud, N. Ghelani, M. Smucker, G. Cormack, and M. Grossman, "Effective User Interaction for High-Recall Retrieval: Less is More.", CIKM, pp. 187–196, 2018.
Azmy, M., P. Shi, J. Lin, and I. Ilyas, "Farewell Freebase: Migrating the SimpleQuestions Dataset to DBpedia.", COLING, pp. 2093–2103, 2018.
Mihaylov, A., P. Godfrey, L. Golab, M. Kargar, D. Srivastava, and J. Szlichta, "FASTOD: Bringing Order to Data.", ICDE, pp. 1561–1564, 2018.
Zheng, Z., M. Alipour, Z. Qu, I. Currie, F. Chiang, L. Golab, and J. Szlichta, "FastOFD: Contextual Data Cleaning with Ontology Functional Dependencies.", EDBT, pp. 694–697, 2018.
Chopra, S., H. Gautreau, A. Khan, M. Mirsafian, and L. Golab, "Gender Differences in Undergraduate Engineering Applicants: A Text Mining Approach.", EDM, 2018.
Toman, D., and G. Weddell, "Identity Resolution in Conjunctive Querying over DL-Based Knowledge Bases.", Description Logics, 2018.
Peng, P., L. Zou, M. Özsu, and D. Zhao, "Multi-query Optimization in Federated RDF Systems.", DASFAA (1), pp. 745–765, 2018.
McIntyre, S., A. Borgida, D. Toman, and G. Weddell, "On Limited Conjunctions in Polynomial Feature Logics, with Applications in OBDA.", KR, pp. 655–656, 2018.
Mackenzie, J., J. Culpepper, R. Blanco, M. Crane, C. Clarke, and J. Lin, "Query Driven Algorithm Selection in Early Stage Retrieval.", WSDM, pp. 396–404, 2018.
Memon, B., X. Lin, A. Mufti, A. Wesley, T. Brecht, K. Salem, B. Wong, and B. Cassell, "RaMP: A Lightweight RDMA Abstraction for Loosely Coupled Applications.", HotCloud, 2018.
Grewal, A., J. Jiang, G. Lam, T. Jung, L. Vuddemarri, Q. Li, A. Landge, and J. Lin, "RecService: Distributed Real-Time Graph Processing at Twitter.", HotCloud, 2018.
Ghelani, N., G. Cormack, and M. Smucker, "Refresh Strategies in Continuous Active Learning.", ProfS/KG4IR/Data:Search@SIGIR, pp. 18–23, 2018.
Mior, M., and K. Salem, "Renormalization of NoSQL Database Schemas.", ER, pp. 479–487, 2018.
Yang, P., S. Thiagarajan, and J. Lin, "Robust, Scalable, Real-Time Event Time Series Aggregation at Twitter.", SIGMOD Conference, pp. 595–599, 2018.
Fernandez, R., E. Mansour, A. Qahtan, A. Elmagarmid, I. Ilyas, S. Madden, M. Ouzzani, M. Stonebraker, and N. Tang, "Seeping Semantics: Linking Datasets Using Word Embeddings for Data Discovery.", ICDE, pp. 989–1000, 2018.
Kim, Y., and J. Lin, "Serverless Data Analytics with Flint.", IEEE CLOUD, pp. 451–455, 2018.
Aleardi, L., S. Salihoglu, G. Singh, and M. Ovsjanikov, "Spectral Measures of Distortion for Change Detection in Dynamic Graphs.", COMPLEX NETWORKS (2), pp. 54–66, 2018.
Kane, A., and F. Tompa, "Split-Lists and Initial Thresholds for WAND-based Search.", SIGIR, pp. 877–880, 2018.
Gao, L., L. Golab, M. Özsu, and G. Aluç, "Stream WatDiv: A Streaming RDF Benchmark.", SBD@SIGMOD, pp. 3:1-3:6, 2018.
Mohammed, S., P. Shi, and J. Lin, "Strong Baselines for Simple Question Answering over Knowledge Graphs with and without Neural Networks.", NAACL-HLT (2), pp. 291–296, 2018.
Grewal, A., and J. Lin, "The Evolution of Content Analysis for Personalized Recommendations at Twitter.", SIGIR, pp. 1355–1356, 2018.
Cormack, G., and M. Grossman, "The Quest for Total Recall.", DocEng, pp. 6:1-6:2, 2018.
Ma, W., C. Keet, W. Oldford, D. Toman, and G. Weddell, "The Utility of the Abstract Relational Model and Attribute Paths in SQL.", EKAW, pp. 195–211, 2018.
Glasbergen, B., M. Abebe, and K. Daudjee, "Tutorial: Adaptive Replication and Partitioning in Data Systems.", Middleware (Tutorials), pp. 1:1-1:5, 2018.
Lin, J., S. Mohammed, R. Sequiera, and L. Tan, "Update Delivery Mechanisms for Prospective Information Needs: An Analysis of Attention in Mobile Users.", SIGIR, pp. 785–794, 2018.
Rao, J., F. Türe, and J. Lin, "What Do Viewers Say to Their TVs?: An Analysis of Voice Queries to Entertainment Systems.", SIGIR, pp. 1213–1216, 2018.
Korkmaz, M., M. Karsten, K. Salem, and S. Salihoglu, "Workload-Aware CPU Performance Scaling for Transactional Database Systems.", SIGMOD Conference, pp. 291–306, 2018.
Liang, Y., Z. Tu, L. Huang, and J. Lin, "CNNs for NLP in the Browser: Client-Side Deployment and Visualization Opportunities.", NAACL-HLT (Demonstrations), pp. 61-65, 2018.
Tompa, F., "Fashioning a Search Engine to Support Humanities Research.", DocEng, pp. 32:1-32:10, 2018.
Tompa, F., "Hypertexts.", Encyclopedia of Database Systems (2nd ed.), 2018.
Grossman, M., and G. Cormack, "MRG_UWaterloo Participation in the TREC 2018 Common Core Track.", TREC, 2018.
Sequiera, R., L. Tan, and J. Lin, "Overview of the TREC 2018 Real-Time Summarization Track.", TREC, 2018.
Tu, Z., M. Li, and J. Lin, "Pay-Per-Request Deployment of Neural Network Models Using Serverless Architectures.", NAACL-HLT (Demonstrations), pp. 6-10, 2018.
Abualsaud, M., G. Cormack, N. Ghelani, A. Ghenai, M. Grossman, S. Rahbariasl, H. Zhang, and M. Smucker, "UWaterlooMDS at the TREC 2018 Common Core Track.", TREC, 2018.
De Sa, C., I. Ilyas, B. Kimelfeld, C. Ré, and T. Rekatsinas, "A Formal Framework For Probabilistic Unclean Databases.", CoRR, vol. abs/1801.06750, 2018.
Ren, Y., M. Tomko, F. Salim, J. Chan, C. Clarke, and M. Sanderson, "A Location-Query-Browse Graph for Contextual Recommendation.", IEEE Trans. Knowl. Data Eng., vol. 30, no. 2, pp. 204–218, 2018.
Chomicki, J., and D. Toman, "Abstract Versus Concrete Temporal Query Languages.", Encyclopedia of Database Systems (2nd ed.), 2018.
Tang, R., and J. Lin, "Adaptive Pruning of Neural Language Models for Mobile Devices.", CoRR, vol. abs/1809.10282, 2018.
Koutris, P., S. Salihoglu, and D. Suciu, "Algorithmic Aspects of Parallel Data Processing.", Found. Trends Databases, vol. 8, no. 4, pp. 239–370, 2018.
Yang, P., H. Fang, and J. Lin, "Anserini: Reproducible Ranking Baselines Using Lucene.", ACM J. Data Inf. Qual., vol. 10, no. 4, pp. 16:1-16:20, 2018.
Tang, G., S. Keshav, L. Golab, and K. Wu, "Bikeshare Pool Sizing for Bike-and-Ride Multimodal Transit.", IEEE Trans. Intelligent Transportation Systems, vol. 19, no. 7, pp. 2279–2289, 2018.
Özsu, M., "Client-Server Architecture.", Encyclopedia of Database Systems (2nd ed.), 2018.
Stonebraker, M., and I. Ilyas, "Data Integration: The Current Status and the Way Forward.", IEEE Data Eng. Bull., vol. 41, no. 2, pp. 3–9, 2018.
Özsu, M., "Data Manipulation Language (DML).", Encyclopedia of Database Systems (2nd ed.), 2018.
Golab, L., "Data Stream.", Encyclopedia of Database Systems (2nd ed.), 2018.
Özsu, M., "Database Administrator (DBA).", Encyclopedia of Database Systems (2nd ed.), 2018.
Özsu, M., "Database.", Encyclopedia of Database Systems (2nd ed.), 2018.
Ammar, K., F. McSherry, S. Salihoglu, and M. Joglekar, "Distributed Evaluation of Subgraph Queries Using Worst-case Optimal and Low-Memory Dataflows.", PVLDB, vol. 11, no. 6, pp. 691–704, 2018.
Ammar, K., F. McSherry, S. Salihoglu, and M. Joglekar, "Distributed Evaluation of Subgraph Queries Using Worst-case Optimal Low-Memory Dataflows.", CoRR, vol. abs/1802.03760, 2018.
Tompa, F., "Document Databases.", Encyclopedia of Database Systems (2nd ed.), 2018.
Szlichta, J., P. Godfrey, L. Golab, M. Kargar, and D. Srivastava, "Effective and complete discovery of bidirectional order dependencies via set-based axioms.", VLDB J., vol. 27, no. 4, pp. 573–591, 2018.
Tompa, F., "Enterprise Content Management.", Encyclopedia of Database Systems (2nd ed.), 2018.
Lamb, C., D. Brown, and C. Clarke, "Evaluating Computational Creativity: An Interdisciplinary Tutorial.", ACM Comput. Surv., vol. 51, no. 2, pp. 28:1-28:34, 2018.
Zhang, H., G. Cormack, M. Grossman, and M. Smucker, "Evaluating Sentence-Level Relevance Feedback for High-Recall Information Retrieval.", CoRR, vol. abs/1803.08988, 2018.
Hopfgartner, F., A. Hanbury, H. Müller, I. Eggel, K. Balog, T. Brodt, G. Cormack, J. Lin, J. Kalpathy-Cramer, N. Kando, et al., "Evaluation-as-a-Service for the Computational Sciences: Overview and Outlook.", ACM J. Data Inf. Qual., vol. 10, no. 4, pp. 15:1-15:32, 2018.
Ammar, K., and M. Özsu, "Experimental Analysis of Distributed Graph Systems.", PVLDB, vol. 11, no. 10, pp. 1151–1164, 2018.
Ammar, K., and M. Özsu, "Experimental Analysis of Distributed Graph Systems.", CoRR, vol. abs/1806.08082, 2018.
Gebaly, K., G. Feng, L. Golab, F. Korn, and D. Srivastava, "Explanation Tables.", IEEE Data Eng. Bull., vol. 41, no. 3, pp. 43–51, 2018.
Tang, R., A. Adhikari, and J. Lin, "FLOPs as a Direct Optimization Objective for Learning Sparse Neural Networks.", CoRR, vol. abs/1811.03060, 2018.
Gebaly, K., and J. Lin, "In-Browser Split-Execution Support for Interactive Analytics in the Cloud.", CoRR, vol. abs/1804.08822, 2018.
Rao, J., W. Yang, Y. Zhang, F. Türe, and J. Lin, "Multi-Perspective Relevance Matching with Hierarchical ConvNets for Social Media Search.", CoRR, vol. abs/1805.08159, 2018.
Toman, D., "Point-Stamped Temporal Models.", Encyclopedia of Database Systems (2nd ed.), 2018.
Tang, R., and J. Lin, "Progress and Tradeoffs in Neural Language Models.", CoRR, vol. abs/1811.00942, 2018.
Ilyas, I., "Rank-Aware Query Processing.", Encyclopedia of Database Systems (2nd ed.), 2018.
Ilyas, I., "Rank-Join.", Encyclopedia of Database Systems (2nd ed.), 2018.
Lin, J., and P. Yang, "Repeatability Corner Cases in Document Ranking: The Impact of Score Ties.", CoRR, vol. abs/1807.05798, 2018.
Liu, Y., M. Kato, C. Clarke, N. Kando, and T. Sakai, "Report on NTCIR-13: The Thirteenth Round of NII Testbeds and Community for Information Access Research.", SIGIR Forum, vol. 52, no. 1, pp. 102–110, 2018.
Culpepper, J., F. Diaz, and M. Smucker, "Research Frontiers in Information Retrieval: Report from the Third Strategic Workshop on Information Retrieval in Lorne (SWIRL 2018).", SIGIR Forum, vol. 52, no. 1, pp. 34–90, 2018.
Salihoglu, S., and M. Özsu, "Response to “Scale Up or Scale Out for Graph Processing”.", IEEE Internet Computing, vol. 22, no. 5, pp. 18–24, 2018.
Salem, K., "Sagas.", Encyclopedia of Database Systems (2nd ed.), 2018.
El-Roby, A., K. Ammar, A. Aboulnaga, and J. Lin, "Sapphire: Querying RDF Data Made Simple.", CoRR, vol. abs/1805.11728, 2018.
Lin, J., "Scale Up or Scale Out for Graph Processing?", IEEE Internet Computing, vol. 22, no. 3, pp. 72–78, 2018.
Kushagra, S., S. Ben-David, and I. Ilyas, "Semi-supervised clustering for de-duplication.", CoRR, vol. abs/1810.04361, 2018.
Kim, Y., and J. Lin, "Serverless Data Analytics with Flint.", CoRR, vol. abs/1803.06354, 2018.
Shi, P., J. Rao, and J. Lin, "Simple Attention-Based Representation Learning for Ranking Short Social Media Posts.", CoRR, vol. abs/1811.01013, 2018.
Golab, L., "Stream Models.", Encyclopedia of Database Systems (2nd ed.), 2018.
Tang, R., G. Yang, H. Wei, Y. Mao, F. Türe, and J. Lin, "Streaming Voice Query Recognition using Causal Convolutional Recurrent Neural Networks.", CoRR, vol. abs/1812.07754, 2018.
Chomicki, J., and D. Toman, "Temporal Logic in Database Query Languages.", Encyclopedia of Database Systems (2nd ed.), 2018.
Chomicki, J., and D. Toman, "Temporal Relational Calculus.", Encyclopedia of Database Systems (2nd ed.), 2018.
Roddick, J., and D. Toman, "Temporal Vacuuming.", Encyclopedia of Database Systems (2nd ed.), 2018.
Lin, J., "The Neural Hype and Comparisons Against Weak Baselines.", SIGIR Forum, vol. 52, issue 2, pp. 40–51, 2018.
Li, Y., L. Zou, M. Özsu, and D. Zhao, "Time Constrained Continuous Subgraph Search over Streaming Graphs.", CoRR, vol. abs/1801.09240, 2018.
Ilyas, I., "Top-k Queries.", Encyclopedia of Database Systems (2nd ed.), 2018.
Clarke, C., "Web Question Answering.", Encyclopedia of Database Systems (2nd ed.), 2018.
Abedjan, Z., L. Golab, F. Naumann, and T. Papenbrock, Data Profiling, 2018.
Lin, J., "Summarization.", Encyclopedia of Database Systems (2nd ed.), 2018.

2017

Crane, M., J. Culpepper, J. Lin, J. Mackenzie, and A. Trotman, "A Comparison of Document-at-a-Time and Score-at-a-Time Query Evaluation.", WSDM, pp. 201–210, 2017.
Baruah, G., R. McCreadie, and J. Lin, "A Comparison of Nuggets and Clusters for Evaluating Timeline Summaries.", CIKM, pp. 67–76, 2017.
Fernandez, R., D. Deng, E. Mansour, A. Qahtan, W. Tao, Z. Abedjan, A. Elmagarmid, I. Ilyas, S. Madden, M. Ouzzani, et al., "A Demo of the Data Civilizer System.", SIGMOD Conference, pp. 1639–1642, 2017.
Karyakin, A., and K. Salem, "An analysis of memory power consumption in database systems.", DaMoN, pp. 2:1-2:9, 2017.
Crane, M., and J. Lin, "An Exploration of Serverless Architectures for Information Retrieval.", ICTIR, pp. 241–244, 2017.
Yang, P., H. Fang, and J. Lin, "Anserini: Enabling the Use of Lucene for Information Retrieval Research.", SIGIR, pp. 1253–1256, 2017.
Zihayat, M., A. An, L. Golab, M. Kargar, and J. Szlichta, "Authority-based Team Discovery in Social Networks.", EDBT, pp. 498–501, 2017.
Grossman, M., G. Cormack, and A. Roegiest, "Automatic and Semi-Automatic Document Selection for Technology-Assisted Review.", SIGIR, pp. 905–908, 2017.
Zhang, H., J. Rao, J. Lin, and M. Smucker, "Automatically Extracting High-Quality Negative Examples for Answer Selection in Question Answering.", SIGIR, pp. 797–800, 2017.
Borgida, A., D. Toman, and G. Weddell, "Concerning Referring Expressions in Query Answers.", IJCAI, pp. 4791–4795, 2017.
Abedjan, Z., L. Golab, and F. Naumann, "Data Profiling: A Tutorial.", SIGMOD Conference, pp. 1747–1751, 2017.
Pacaci, A., A. Zhou, J. Lin, and M. Özsu, "Do We Need Specialized Graph Databases?: Benchmarking Real-Time Social Networking Applications.", GRADES@SIGMOD/PODS, pp. 12:1-12:7, 2017.
Baskaran, S., A. Keller, F. Chiang, L. Golab, and J. Szlichta, "Efficient Discovery of Ontology Functional Dependencies.", CIKM, pp. 1847–1856, 2017.
Ghelani, N., S. Mohammed, S. Wang, and J. Lin, "Event Detection on Curated Tweet Streams.", SIGIR, pp. 1325–1328, 2017.
Rao, J., H. He, and J. Lin, "Experiments with Convolutional Neural Network Models for Answer Selection.", SIGIR, pp. 1217–1220, 2017.
Vtyurina, A., D. Savenkov, E. Agichtein, and C. Clarke, "Exploring Conversational Search With Humans, Assistants, and Wizards.", CHI Extended Abstracts, pp. 2187–2193, 2017.
Sequiera, R., and J. Lin, "Finally, a Downloadable Test Collection of Tweets.", SIGIR, pp. 1225–1228, 2017.
Toulis, A., and L. Golab, "Graph Mining to Characterize Competition for Employment.", NDA@SIGMOD, pp. 3:1-3:7, 2017.
Kankanamge, C., S. Sahu, A. Mhedhbi, J. Chen, and S. Salihoglu, "Graphflow: An Active Graph Database.", SIGMOD Conference, pp. 1695–1698, 2017.
Afrati, F., M. Joglekar, C. Ré, S. Salihoglu, and J. Ullman, "GYM: A Multiround Distributed Join Algorithm.", ICDT, pp. 4:1-4:18, 2017.
Fink, S., L. Golab, S. Keshav, and H. de Meer, "How Similar is the Usage of Electric Cars and Electric Bicycles?", e-Energy, pp. 334–340, 2017.
Gebaly, K., and J. Lin, "In-Browser Interactive SQL Analytics with Afterburner.", SIGMOD Conference, pp. 1623–1626, 2017.
Gorenflo, C., L. Golab, and S. Keshav, "Managing Sensor Data Streams: Lessons Learned from the WeBike Project.", SSDBM, pp. 1:1-1:11, 2017.
Rao, J., F. Türe, X. Niu, and J. Lin, "Mining the Temporal Statistics of Query Terms for Searching Social Media Posts.", ICTIR, pp. 133–140, 2017.
Cui, X., M. Mior, B. Wong, K. Daudjee, and S. Rizvi, "Netstore: leveraging network optimizations to improve distributed transaction processing performance.", ACTIVE@Middleware, pp. 1–10, 2017.
Roegiest, A., L. Tan, and J. Lin, "Online In-Situ Interleaved Evaluation of Real-Time Push Notification Systems.", SIGIR, pp. 415–424, 2017.
Meng, X., and L. Golab, "Optimal reducer placement to minimize data transfer in MapReduce-style processing.", BigData, pp. 339–346, 2017.
Lin, J., S. Mohammed, R. Sequiera, L. Tan, N. Ghelani, M. Abualsaud, R. McCreadie, D. Milajevs, and E. Voorhees, "Overview of the TREC 2017 Real-Time Summarization Track.", TREC, 2017.
Mohammed, S., M. Crane, and J. Lin, "Quantization in Append-Only Collections.", ICTIR, pp. 265–268, 2017.
Mate, J., K. Daudjee, and S. Kamali, "Robust Multi-tenant Server Consolidation in the Cloud for Data Analytics Workloads.", ICDCS, pp. 2111–2118, 2017.
Feng, G., L. Golab, and D. Srivastava, "Scalable Informative Rule Mining.", ICDE, pp. 437–448, 2017.
Kane, A., and F. Tompa, "Small-Term Distribution for Disk-Based Search.", DocEng, pp. 49–58, 2017.
Toulis, A., and L. Golab, "Social Media Mining to Understand Public Mental Health.", DMAH@VLDB, pp. 55–70, 2017.
Rao, J., F. Türe, H. He, O. Jojic, and J. Lin, "Talking to Your TV: Context-Aware Voice Search with Hierarchical Recurrent Neural Networks.", CIKM, pp. 557–566, 2017.
Clarke, C., G. Cormack, J. Lin, and A. Roegiest, "Ten Blue Links on Mars.", WWW, pp. 273–281, 2017.
Deng, D., R. Fernandez, Z. Abedjan, S. Wang, M. Stonebraker, A. Elmagarmid, I. Ilyas, S. Madden, M. Ouzzani, and N. Tang, "The Data Civilizer System.", CIDR, 2017.
Azzopardi, L., M. Crane, H. Fang, G. Ingersoll, J. Lin, Y. Moshfeghi, H. Scells, P. Yang, and G. Zuccon, "The Lucene for Information Access and Retrieval Research (LIARR) Workshop at SIGIR 2017.", SIGIR, pp. 1429–1430, 2017.
Pogacar, F., A. Ghenai, M. Smucker, and C. Clarke, "The Positive and Negative Influence of Search Results on People’s Decisions about the Efficacy of Medical Treatments.", ICTIR, pp. 209–216, 2017.
Zhang, H., M. Abualsaud, N. Ghelani, A. Ghosh, M. Smucker, G. Cormack, and M. Grossman, "UWaterlooMDS at the TREC 2017 Common Core Track.", TREC, 2017.
He, H., K. Ganjam, N. Jain, J. Lundin, R. White, and J. Lin, "An Insight Extraction System on BioMedical Literature with Deep Neural Networks.", EMNLP, pp. 2691–2701, 2017.
Tang, R., W. Wang, Z. Tu, and J. Lin, "An Experimental Analysis of the Power Consumption of Convolutional Neural Networks for Keyword Spotting.", CoRR, vol. abs/1711.00333, 2017.
Tu, Z., M. Crane, R. Sequiera, J. Zhang, and J. Lin, "An Exploration of Approaches to Integrating Neural Reranking Models in Multi-Stage Ranking Architectures.", CoRR, vol. abs/1707.08275, 2017.
Abdelaziz, I., R. Harbi, S. Salihoglu, and P. Kalnis, "Combining Vertex-Centric Graph Processing with SPARQL for Large-Scale RDF Data Analytics.", IEEE Trans. Parallel Distrib. Syst., vol. 28, no. 12, pp. 3374–3388, 2017.
Sadiq, S., T. Dasu, X. Dong, J. Freire, I. Ilyas, S. Link, R. Miller, F. Naumann, X. Zhou, and D. Srivastava, "Data Quality: The Role of Empiricism.", SIGMOD Rec., vol. 46, no. 4, pp. 35–43, 2017.
Tang, R., and J. Lin, "Deep Residual Learning for Small-Footprint Keyword Spotting.", CoRR, vol. abs/1710.10361, 2017.
Mohammed, S., N. Ghelani, and J. Lin, "Distant Supervision for Topic Classification of Tweets in Curated Streams.", CoRR, vol. abs/1704.06726, 2017.
Szlichta, J., P. Godfrey, L. Golab, M. Kargar, and D. Srivastava, "Effective and Complete Discovery of Order Dependencies via Set-based Axiomatization.", PVLDB, vol. 10, no. 7, pp. 721–732, 2017.
Mackenzie, J., J. Culpepper, R. Blanco, M. Crane, C. Clarke, and J. Lin, "Efficient and Effective Tail Latency Minimization in Multi-Stage Retrieval Systems.", CoRR, vol. abs/1704.03970, 2017.
Deng, D., W. Tao, Z. Abedjan, A. Elmagarmid, I. Ilyas, S. Madden, M. Ouzzani, M. Stonebraker, and N. Tang, "Entity Consolidation: The Golden Record Problem.", CoRR, vol. abs/1709.10436, 2017.
Sequiera, R., G. Baruah, Z. Tu, S. Mohammed, J. Rao, H. Zhang, and J. Lin, "Exploring the Effectiveness of Convolutional Neural Networks for Answer Selection in End-to-End Question Answering.", CoRR, vol. abs/1707.07804, 2017.
Yan, D., H. Chen, J. Cheng, M. Özsu, Q. Zhang, and J. Lui, "G-thinker: Big Graph Mining Made Easier and Faster.", CoRR, vol. abs/1709.03110, 2017.
Zou, L., and M. Özsu, "Graph-Based RDF Data Management.", Data Science and Engineering, vol. 2, no. 1, pp. 56–70, 2017.
Rekatsinas, T., X. Chu, I. Ilyas, and C. Ré, "HoloClean: Holistic Data Repairs with Probabilistic Inference.", PVLDB, vol. 10, no. 11, pp. 1190–1201, 2017.
Rekatsinas, T., X. Chu, I. Ilyas, and C. Ré, "HoloClean: Holistic Data Repairs with Probabilistic Inference.", CoRR, vol. abs/1702.00820, 2017.
Vadehra, A., M. Grossman, and G. Cormack, "Impact of Feature Selection on Micro-Text Classification.", CoRR, vol. abs/1708.08123, 2017.
Lin, J., "In Defense of MapReduce.", IEEE Internet Computing, vol. 21, no. 3, pp. 94–98, 2017.
Rao, J., H. He, H. Zhang, F. Türe, R. Sequiera, S. Mohammed, and J. Lin, "Integrating Lexical and Temporal Signals in Neural Ranking Models for Searching Social Media Streams.", CoRR, vol. abs/1707.07792, 2017.
Konow, R., G. Navarro, C. Clarke, and A. López-Ortiz, "Inverted Treaps.", ACM Trans. Inf. Syst., vol. 35, no. 3, pp. 22:1-22:45, 2017.
"Logic programming approach to automata-based decision procedures.", J. Log. Algebraic Methods Program., vol. 86, no. 1, pp. 391–407, 2017.
Mior, M., K. Salem, A. Aboulnaga, and R. Liu, "NoSE: Schema Design for NoSQL Applications.", IEEE Trans. Knowl. Data Eng., vol. 29, no. 10, pp. 2275–2289, 2017.
Allan, J., N. Belkin, P. Bennett, J. Callan, C. Clarke, F. Diaz, S. Dumais, N. Ferro, D. Harman, D. Hiemstra, et al., "Overview of Special Issue.", SIGIR Forum, vol. 51, no. 2, pp. 1–25, 2017.
Ge, C., I. Ilyas, X. He, and A. Machanavajjhala, "Private Exploration Primitives for Data Cleaning.", CoRR, vol. abs/1712.10266, 2017.
Liu, X., L. Golab, W. Golab, I. Ilyas, and S. Jin, "Smart Meter Data Analytics: Systems, Algorithms, and Benchmarking.", ACM Trans. Database Syst., vol. 42, no. 1, pp. 2:1-2:39, 2017.
Mohammed, S., P. Shi, and J. Lin, "Strong Baselines for Simple Question Answering over Knowledge Graphs with and without Neural Networks.", CoRR, vol. abs/1712.01969, 2017.
Rao, J., F. Türe, H. He, O. Jojic, and J. Lin, "Talking to Your TV: Context-Aware Voice Search with Hierarchical Recurrent Neural Networks.", CoRR, vol. abs/1705.04892, 2017.
Lin, J., "The Lambda and the Kappa.", IEEE Internet Computing, vol. 21, no. 5, pp. 60–66, 2017.
Lin, J., and A. Trotman, "The role of index compression in score-at-a-time query evaluation.", Inf. Retr. Journal, vol. 20, no. 3, pp. 199–220, 2017.
Sahu, S., A. Mhedhbi, S. Salihoglu, J. Lin, and M. Özsu, "The Ubiquity of Large Graphs and Surprising Challenges of Graph Processing.", PVLDB, vol. 11, no. 4, pp. 420–431, 2017.
Sahu, S., A. Mhedhbi, S. Salihoglu, J. Lin, and M. Özsu, "The Ubiquity of Large Graphs and Surprising Challenges of Graph Processing: A User Survey.", CoRR, vol. abs/1709.03188, 2017.
Yang, Y., L. Golab, and M. Özsu, "ViewDF: Declarative incremental view maintenance for streaming data.", Inf. Syst., vol. 71, pp. 55–67, 2017.
Lin, J., I. Milligan, J. Wiebe, and A. Zhou, "Warcbase: Scalable Analytics Infrastructure for Exploring Web Archives.", JOCCH, vol. 10, no. 4, pp. 22:1-22:30, 2017.
Shen, C., T. Shen, and J. Lin, "Comparative Assessment of Alignment Algorithms for NGS Data: Features, Considerations, Implementations, and Future.", Algorithms for Next-Generation Sequencing Data, pp. 187–202, 2017.

2016

Agrawal, S., and K. Daudjee, "A Performance Comparison of Algorithms for Byzantine Agreement in Distributed Systems.", EDCC, pp. 249–260, 2016.
Roegiest, A., L. Tan, J. Lin, and C. Clarke, "A Platform for Streaming Push Notifications to Mobile Assessors.", SIGIR, pp. 1077–1080, 2016.
Wu, G., and F. Tompa, "A Space-Efficient Data Structure for Fast Access Control in ECM Systems.", SACMAT, pp. 191–201, 2016.
Roegiest, A., and G. Cormack, "An Architecture for Privacy-Preserving and Replicable High-Recall Retrieval Experiments.", SIGIR, pp. 1085–1088, 2016.
Hashemi, S., C. Clarke, A. Dean-Hall, J. Kamps, and J. Kiseleva, "An Easter Egg Hunting Approach to Test Collection Building in Dynamic Domains.", EVIA@NTCIR, 2016.
Tan, L., A. Roegiest, J. Lin, and C. Clarke, "An Exploration of Evaluation Metrics for Mobile Push Notifications.", SIGIR, pp. 741–744, 2016.
Al-Harbi, A., and M. Smucker, "Are Secondary Assessors Uncertain When They Disagree About Relevance Judgements?", CHIIR, pp. 233–236, 2016.
Farid, M., A. Roatis, I. Ilyas, H-F. Hoffmann, and X. Chu, "CLAMS: Bringing Quality to Data Lakes.", SIGMOD Conference, pp. 2089–2092, 2016.
Rao, J., X. Niu, and J. Lin, "Compressing and Decoding Term Statistics Time Series.", ECIR, pp. 675–681, 2016.
Milligan, I., N. Ruest, and J. Lin, "Content Selection and Curation for Web Archiving: The Gatekeepers vs. the Masses.", JCDL, pp. 107–110, 2016.
Cafarella, M., I. Ilyas, M. Kornacker, T. Kraska, and C. Ré, "Dark Data: Are we solving the right problems?", ICDE, pp. 1444–1445, 2016.
Chu, X., I. Ilyas, S. Krishnan, and J. Wang, "Data Cleaning: Overview and Emerging Challenges.", SIGMOD Conference, pp. 2201–2206, 2016.
Abedjan, Z., L. Golab, and F. Naumann, "Data profiling.", ICDE, pp. 1432–1435, 2016.
Abedjan, Z., J. Morcos, I. Ilyas, M. Ouzzani, P. Papotti, and M. Stonebraker, "DataXFormer: A robust transformation discovery system.", ICDE, pp. 1134–1145, 2016.
Jackson, A., J. Lin, I. Milligan, and N. Ruest, "Desiderata for Exploratory Search Interfaces to Web Archives in Support of Scholarly Activities.", JCDL, pp. 103–106, 2016.
Buntain, C., J. Lin, and J. Golbeck, "Discovering key moments in social media streams.", CCNC, pp. 366–374, 2016.
Culpepper, J., C. Clarke, and J. Lin, "Dynamic Cutoff Prediction in Multi-Stage Retrieval Systems.", ADCS, pp. 17–24, 2016.
Kargar, M., L. Golab, and J. Szlichta, "eGraphSearch: Effective Keyword Search in Graphs.", CIKM, pp. 2461–2464, 2016.
Cormack, G., and M. Grossman, "Engineering Quality and Reliability in Technology-Assisted Review.", SIGIR, pp. 75–84, 2016.
Bommannavar, P., J. Lin, and A. Rajaraman, "Estimating topical volume in social media streams.", SAC, pp. 1096–1101, 2016.
Lamb, C., D. Brown, and C. Clarke, "Evaluating digital poetry: Insights from the CAT.", ICCC, pp. 60–67, 2016.
Oard, D., K. Shilton, and J. Lin, "Evaluating Search Among Secrets.", EVIA@NTCIR, 2016.
Milligan, I., J. Lin, J. Wiebe, and A. Zhou, "Exploring and Discovering Archive-It Collections with Warcbase.", DH, pp. 285–288, 2016.
Roegiest, A., and G. Cormack, "Impact of Review-Set Selection on Human Assessment for Text Classification.", SIGIR, pp. 861–864, 2016.
Trotman, A., and J. Lin, "In Vacuo and In Situ Evaluation of SIMD Codecs.", ADCS, pp. 1–8, 2016.
Farid, M., I. Ilyas, S. Whang, and C. Yu, "LONLIES: Estimating Property Values for Long Tail Entities.", SIGIR, pp. 1125–1128, 2016.
Smucker, M., and C. Clarke, "Modeling Optimal Switching Behavior.", CHIIR, pp. 317–320, 2016.
Zanibbi, R., K. Davila, A. Kane, and F. Tompa, "Multi-Stage Math Formula Search: Using Appearance-Based Similarity Metrics at Scale.", SIGIR, pp. 145–154, 2016.
Rao, J., H. He, and J. Lin, "Noise-Contrastive Estimation for Answer Selection with Deep Neural Networks.", CIKM, pp. 1913–1916, 2016.
Mior, M., K. Salem, A. Aboulnaga, and R. Liu, "NoSE: Schema design for NoSQL applications.", ICDE, pp. 181–192, 2016.
St. Jacques, J., D. Toman, and G. Weddell, "Object-Relational Queries over CFDI_nc Knowledge Bases: OBDA for the SQL-Literate (extended abstract).", Description Logics, 2016.
Jiang, Y., and L. Golab, "On Competition for Undergraduate Co-op Placements: A Graph Mining Approach.", EDM, pp. 394–399, 2016.
Toman, D., and G. Weddell, "On Partial Features in the DLF Family of Description Logics.", PRICAI, pp. 529–542, 2016.
Borgida, A., D. Toman, and G. Weddell, "On Referring Expressions in Information Systems Derived from Conceptual Modelling.", ER, pp. 183–197, 2016.
Borgida, A., D. Toman, and G. Weddell, "On Referring Expressions in Query Answering over First Order Knowledge Bases.", KR, pp. 319–328, 2016.
Toman, D., and G. Weddell, "Ontology Based Data Access with Referring Expressions for Logics with the Tree Model Property - (Extended Abstract).", Australasian Conference on Artificial Intelligence, pp. 353–361, 2016.
Baruah, G., H. Zhang, R. Guttikonda, J. Lin, M. Smucker, and O. Vechtomova, "Optimizing Nugget Annotations with Active Learning.", CIKM, pp. 2359–2364, 2016.
Bonenfant, M., B. Desai, D. Desai, B. Fung, M. Özsu, and J. Ullman, "Panel: The State of Data: Invited Paper from panelists.", IDEAS, pp. 2–11, 2016.
Yilmaz, E., and C. Clarke, "Preface.", EVIA@NTCIR, 2016.
Yang, G., I. Soboroff, L. Xiong, C. Clarke, and S. Garfinkel, "Privacy-Preserving IR 2016: Differential Privacy, Search, and Social Media.", SIGIR, pp. 1247–1248, 2016.
Lin, J., Z. Tu, M. Rose, and P. White, "Prizm: A Wireless Access Point for Proxy-Based Web Lifelogging.", LTA@MM, pp. 19–25, 2016.
Han, M., and K. Daudjee, "Providing Serializability for Pregel-like Graph Processing Systems.", EDBT, pp. 77–88, 2016.
Gebhard, L., L. Golab, S. Keshav, and H. de Meer, "Range prediction for electric bicycles.", e-Energy, pp. 21:1-21:11, 2016.
Elbagoury, A., M. Crane, and J. Lin, "Rank-at-a-Time Query Processing.", ICTIR, pp. 229–232, 2016.
Paik, J., and J. Lin, "Retrievability in API-Based “Evaluation as a Service”.", ICTIR, pp. 91–94, 2016.
Zhang, H., J. Lin, G. Cormack, and M. Smucker, "Sampling Strategies and Active Learning for Volume Estimation.", SIGIR, pp. 981–984, 2016.
Cormack, G., and M. Grossman, "Scalability of Continuous Active Learning for Reliable High-Recall Text Classification.", CIKM, pp. 1039–1048, 2016.
Murdock, V., C. Clarke, J. Kamps, and J. Karlgren, "Second Workshop on Search and Exploration of X-Rated Information (SEXI’16): WSDM Workshop Summary.", WSDM, pp. 697–698, 2016.
Moschitti, A., L. Màrquez, P. Nakov, E. Agichtein, C. Clarke, and I. Szpektor, "SIGIR 2016 Workshop WebQA II: Web Question Answering Beyond Factoids.", SIGIR, pp. 1251–1252, 2016.
Tan, L., A. Roegiest, C. Clarke, and J. Lin, "Simple Dynamic Emission Strategies for Microblog Filtering.", SIGIR, pp. 1009–1012, 2016.
Davila, K., R. Zanibbi, A. Kane, and F. Tompa, "Tangent-3 at the NTCIR-12 MathIR Task.", NTCIR, 2016.
Rao, J., and J. Lin, "Temporal Query Expansion Using a Continuous Hidden Markov Model.", ICTIR, pp. 295–298, 2016.
Clarke, C., G. Cormack, J. Lin, and A. Roegiest, "Total Recall: Blue Sky on Mars.", ICTIR, pp. 45–48, 2016.
Lin, J., M. Crane, A. Trotman, J. Callan, I. Chattopadhyaya, J. Foley, G. Ingersoll, C. MacDonald, and S. Vigna, "Toward Reproducible Baselines: The Open-Source IR Reproducibility Challenge.", ECIR, pp. 408–420, 2016.
Grossman, M., G. Cormack, and A. Roegiest, "TREC 2016 Total Recall Track Overview.", TREC, 2016.
Radhakrishnan, S., B. Muscedere, and K. Daudjee, "V-Hadoop: Virtualized Hadoop using containers.", NCA, pp. 237–241, 2016.
Hartig, O., and M. Özsu, "Walking Without a Map: Ranking-Based Traversal for Querying Linked Data.", International Semantic Web Conference (1), pp. 305–324, 2016.
Hashemi, S., J. Kamps, J. Kiseleva, C. Clarke, and E. Voorhees, "Overview of the TREC 2016 Contextual Suggestion Track.", TREC, 2016.
He, H., J. Wieting, K. Gimpel, J. Rao, and J. Lin, "UMD-TTIC-UW at SemEval-2016 Task 1: Attention-Based Multi-Perspective Convolutional Neural Networks for Textual Similarity Measurement.", SemEval@NAACL-HLT, pp. 1103–1108, 2016.
Yan, D., J. Cheng, M. Özsu, F. Yang, Y. Lu, J. Lui, Q. Zhang, and W. Ng, "A General-Purpose Query-Centric Framework for Querying Big Graphs.", PVLDB, vol. 9, no. 7, pp. 564–575, 2016.
Özsu, M., "A survey of RDF data management systems.", Frontiers Comput. Sci., vol. 10, no. 3, pp. 418–432, 2016.
Özsu, M., "A Survey of RDF Data Management Systems.", CoRR, vol. abs/1601.00707, 2016.
Gebaly, K., and J. Lin, "Afterburner: The Case for In-Browser Analytics.", CoRR, vol. abs/1605.04035, 2016.
Clarke, C., J. Culpepper, and A. Moffat, "Assessing efficiency-effectiveness tradeoffs in multi-stage retrieval systems without using relevance judgments.", Inf. Retr. Journal, vol. 19, no. 4, pp. 351–377, 2016.
Zihayat, M., A. An, L. Golab, M. Kargar, and J. Szlichta, "Authority-based Team Discovery in Social Networks.", CoRR, vol. abs/1611.02992, 2016.
Jiang, Y., S. Syed, and L. Golab, "Data Mining of Undergraduate Course Evaluations.", Informatics in Education, vol. 15, no. 1, pp. 85–102, 2016.
Bär, A., P. Casas, A. D’Alconzo, P. Fiadino, L. Golab, M. Mellia, and E. Schikuta, "DBStream: A holistic approach to large-scale network traffic monitoring and analysis.", Comput. Networks, vol. 107, pp. 5–19, 2016.
Abedjan, Z., X. Chu, D. Deng, R. Fernandez, I. Ilyas, M. Ouzzani, P. Papotti, M. Stonebraker, and N. Tang, "Detecting Data Errors: Where are we and what needs to be done?", PVLDB, vol. 9, no. 12, pp. 993–1004, 2016.
Chu, X., I. Ilyas, and P. Koutris, "Distributed Data Deduplication.", PVLDB, vol. 9, no. 11, pp. 864–875, 2016.
Culpepper, J., C. Clarke, and J. Lin, "Dynamic Trade-Off Prediction in Multi-Stage Retrieval Systems.", CoRR, vol. abs/1610.02502, 2016.
Bizer, C., L. Dong, I. Ilyas, and M-E. Vidal, "Editorial: Special Issue on Web Data Quality.", J. Data and Information Quality, vol. 8, no. 1, pp. 1:1-1:3, 2016.
Szlichta, J., P. Godfrey, L. Golab, M. Kargar, and D. Srivastava, "Effective and Complete Discovery of Order Dependencies via Set-based Axiomatization.", CoRR, vol. abs/1608.06169, 2016.
Ilyas, I., "Effective Data Cleaning with Continuous Evaluation.", IEEE Data Eng. Bull., vol. 39, no. 2, pp. 38–46, 2016.
Clarke, C., and E. Yilmaz, "EVIA 2016: The Seventh International Workshop on Evaluating Information Access.", SIGIR Forum, vol. 50, no. 2, pp. 44–46, 2016.
Boncz, P., and K. Salem, "Front Matter.", PVLDB, vol. 10, no. 1, pp. i–vi, 2016.
Sharma, A., J. Jiang, P. Bommannavar, B. Larson, and J. Lin, "GraphJet: Real-Time Content Recommendations at Twitter.", PVLDB, vol. 9, no. 13, pp. 1281–1292, 2016.
Khabsa, M., A. Elmagarmid, I. Ilyas, H. Hammady, and M. Ouzzani, "Learning to identify relevant studies for systematic reviews using random forest and external information.", Machine Learning, vol. 102, no. 3, pp. 465–482, 2016.
Quamar, A., A. Deshpande, and J. Lin, "NScale: neighborhood-centric large-scale graph analytics in the cloud.", VLDB J., vol. 25, no. 2, pp. 125–150, 2016.
Drzadzewski, G., and F. Tompa, "Partial materialization for online analytical processing over multi-tagged document collections.", Knowl. Inf. Syst., vol. 47, no. 3, pp. 697–732, 2016.
Peng, P., L. Zou, M. Özsu, L. Chen, and D. Zhao, "Processing SPARQL queries over distributed RDF graphs.", VLDB J., vol. 25, no. 2, pp. 243–268, 2016.
Chu, X., and I. Ilyas, "Qualitative Data Cleaning.", PVLDB, vol. 9, no. 13, pp. 1605–1608, 2016.
Yan, D., J. Cheng, M. Özsu, F. Yang, Y. Lu, J. Lui, Q. Zhang, and W. Ng, "Quegel: A General-Purpose Query-Centric Framework for Querying Big Graphs.", CoRR, vol. abs/1601.06497, 2016.
El-Roby, A., K. Ammar, A. Aboulnaga, and J. Lin, "Sapphire: Querying RDF Data Made Simple.", PVLDB, vol. 9, no. 13, pp. 1481–1484, 2016.
Lin, J., C. Clarke, and G. Baruah, "Searching from Mars.", IEEE Internet Computing, vol. 20, no. 1, pp. 78–82, 2016.
Clarke, C., G. Cormack, J. Lin, and A. Roegiest, "Ten Blue Links on Mars.", CoRR, vol. abs/1610.06468, 2016.
Tan, L., J. Lin, A. Roegiest, and C. Clarke, "The Effects of Latency Penalties in Evaluating Push Notification Systems.", CoRR, vol. abs/1606.03066, 2016.
Lin, J., and K. Gebaly, "The Future of Big Data Is ... JavaScript?", IEEE Internet Computing, vol. 20, no. 5, pp. 82–88, 2016.

2015

Shen, X., L. Zou, M. Özsu, L. Chen, Y. Li, S. Han, and D. Zhao, "A graph-based RDF triple store.", ICDE, pp. 1508–1511, 2015.
Wu, J., T. Kinash, D. Toman, and G. Weddell, "Absorption for ABoxes and TBoxes with General Value Restrictions.", Australasian Conference on Artificial Intelligence, pp. 609–622, 2015.
Lin, J., and A. Trotman, "Anytime Ranking for Impact-Ordered Indexes.", ICTIR, pp. 301–304, 2015.
Wang, Y., G. Sherman, J. Lin, and M. Efron, "Assessor Differences and User Preferences in Tweet Timeline Generation.", SIGIR, pp. 615–624, 2015.
Liu, X., L. Golab, W. Golab, and I. Ilyas, "Benchmarking Smart Meter Data Analytics.", EDBT, pp. 385–396, 2015.
Khayyat, Z., I. Ilyas, A. Jindal, S. Madden, M. Ouzzani, P. Papotti, J-A. Quiané-Ruiz, N. Tang, and S. Yin, "BigDansing: A System for Big Data Cleansing.", SIGMOD Conference, pp. 1215–1230, 2015.
Lin, J., "Building a Self-Contained Search Engine in the Browser.", ICTIR, pp. 309–312, 2015.
Bär, A., L. Golab, S. Ruehrup, M. Schiavone, and P. Casas, "Cache-oblivious scheduling of shared workloads.", ICDE, pp. 855–866, 2015.
Kiseleva, J., J. Kamps, and C. Clarke, "Contextual Search and Exploration.", RuSSIR, pp. 3–23, 2015.
Kim, J., K. Salem, K. Daudjee, A. Aboulnaga, and X. Pan, "Database high availability using SHADOW systems.", SoCC, pp. 209–221, 2015.
Morcos, J., Z. Abedjan, I. Ilyas, M. Ouzzani, P. Papotti, and M. Stonebraker, "DataXFormer: An Interactive Data Transformation Tool.", SIGMOD Conference, pp. 883–888, 2015.
Abedjan, Z., J. Morcos, M. Gubanov, I. Ilyas, M. Stonebraker, P. Papotti, and M. Ouzzani, "Dataxformer: Leveraging the Web for Semantic Transformations.", CIDR, 2015.
Saxena, H., and K. Salem, "EdgeX: Edge Replication for Web Applications.", CLOUD, pp. 1041–1044, 2015.
Drzadzewski, G., and F. Tompa, "Enhancing Exploration with a Faceted Browser through Summarization.", DocEng, pp. 61–64, 2015.
Baruah, G., M. Smucker, and C. Clarke, "Evaluating Streams of Evolving News Events.", SIGIR, pp. 675–684, 2015.
Aluç, G., M. Özsu, K. Daudjee, and O. Hartig, "Executing queries over schemaless RDF databases.", ICDE, pp. 807–818, 2015.
Bislimovska, B., G. Aluç, M. Özsu, and P. Fraternali, "Graph Search of Software Models Using Multidimensional Scaling.", EDBT/ICDT Workshops, pp. 163–170, 2015.
Petroni, F., L. Querzoni, K. Daudjee, S. Kamali, and G. Iacoboni, "HDRF: Stream-Based Partitioning for Power-Law Graphs.", CIKM, pp. 243–252, 2015.
Nicoara, D., S. Kamali, K. Daudjee, and L. Chen, "Hermes: Dynamic Partitioning for Distributed Social Network Graph Databases.", EDBT, pp. 25–36, 2015.
Lamb, C., D. Brown, and C. Clarke, "Human Competence in Creativity Evaluation.", ICCC, pp. 102–109, 2015.
Weissman, S., S. Ayhan, J. Bradley, and J. Lin, "Identifying Duplicate and Contradictory Information in Wikipedia.", JCDL, pp. 57–60, 2015.
Roegiest, A., G. Cormack, C. Clarke, and M. Grossman, "Impact of Surrogate Assessments on High-Recall Retrieval.", SIGIR, pp. 555–564, 2015.
Ge, C., M. Kaufmann, L. Golab, P. Fischer, and A. Goel, "Indexing bi-temporal windows.", SSDBM, pp. 19:1-19:12, 2015.
Clarke, C., M. Smucker, and E. Yilmaz, "IR Evaluation: Modeling User Behavior for Measuring Effectiveness.", SIGIR, pp. 1117–1120, 2015.
Chu, X., J. Morcos, I. Ilyas, M. Ouzzani, P. Papotti, N. Tang, and Y. Ye, "KATARA: A Data Cleaning System Powered by Knowledge Bases and Crowdsourcing.", SIGMOD Conference, pp. 1247–1261, 2015.
Tan, L., H. Zhang, C. Clarke, and M. Smucker, "Lexical Comparison Between Wikipedia and Twitter Corpora by Using Word Embeddings.", ACL (2), pp. 657–661, 2015.
Cormack, G., and M. Grossman, "Multi-Faceted Recall of Continuous Active Learning for Technology-Assisted Review.", SIGIR, pp. 763–766, 2015.
Hudek, A., D. Toman, and G. Weddell, "On Enumerating Query Plans Using Analytic Tableau.", TABLEAUX, pp. 339–354, 2015.
Toman, D., and G. Weddell, "On the Krom Extension of CFDI^{∀-}_{nc}.", Australasian Conference on Artificial Intelligence, pp. 559–571, 2015.
Hashemi, S., C. Clarke, A. Dean-Hall, J. Kamps, and J. Kiseleva, "On the Reusability of Open Test Collections.", SIGIR, pp. 827–830, 2015.
Toman, D., and G. Weddell, "On the Utility of CFDI.", Description Logics, 2015.
Dean-Hall, A., C. Clarke, J. Kamps, and J. Kiseleva, "Online Evaluation of Point-Of-Interest Recommendation Systems.", SCST@ECIR, 2015.
Dean-Hall, A., C. Clarke, J. Kamps, J. Kiseleva, and E. Voorhees, "Overview of the TREC 2015 Contextual Suggestion Track.", TREC, 2015.
Fillottrani, P., C. Keet, and D. Toman, "Polynomial encoding of ORM conceptual models in CFDI.", Description Logics, 2015.
Baruah, G., A. Roegiest, and M. Smucker, "Pooling for User-Oriented Evaluation Measures.", ICTIR, pp. 341–344, 2015.
Rao, J., J. Lin, and M. Efron, "Reproducible Experiments on Lexical and Temporal Feedback for Tweet Search.", ECIR, pp. 755–767, 2015.
Lin, J., "Scaling Down Distributed Infrastructure on Wimpy Machines for Personal Web Archiving.", WWW (Companion Volume), pp. 1351–1355, 2015.
Arguello, J., F. Diaz, J. Lin, and A. Trotman, "SIGIR 2015 Workshop on Reproducibility, Inexplicability, and Generalizability of Results (RIGOR).", SIGIR, pp. 1147–1148, 2015.
Borgida, A., D. Toman, and G. Weddell, "Singular Referring Expressions in Conjunctive Query Answers: the case for a CFD DL Dialect.", Description Logics, 2015.
Golab, L., F. Korn, F. Li, B. Saha, and D. Srivastava, "Size-Constrained Weighted Set Cover.", ICDE, pp. 879–890, 2015.
Liu, X., L. Golab, and I. Ilyas, "SMAS: A smart meter data analytics system.", ICDE, pp. 1476–1479, 2015.
Wang, Y., and J. Lin, "The Feasibility of Brute Force Scans for Real-Time Tweet Search.", ICTIR, pp. 321–324, 2015.
Dean-Hall, A., and C. Clarke, "The Power of Contextual Suggestion.", ECIR, pp. 352–357, 2015.
Korkmaz, M., A. Karyakin, M. Karsten, and K. Salem, "Towards Dynamic Green-Sizing for Database Servers.", ADMS@VLDB, pp. 25–36, 2015.
Tan, L., A. Roegiest, and C. Clarke, "University of Waterloo at TREC 2015 Microblog Track.", TREC, 2015.
Ghenai, A., E. Khalilov, P. Valov, and C. Clarke, "WaterlooClarke: TREC 2015 Clinical Decision Support Track.", TREC, 2015.
Hoffmann, H., P. Addala, and C. Clarke, "WaterlooClarke: TREC 2015 Contextual Suggestion Track.", TREC, 2015.
Vtyurina, A., A. Dey, B. Sarrafzadeh, and C. Clarke, "WaterlooClarke: TREC 2015 LiveQA Track.", TREC, 2015.
Abualsaud, M., M. Ghaznavi, D. Recoskie, and C. Clarke, "WaterlooClarke: TREC 2015 Microblog Track.", TREC, 2015.
Raza, A., D. Rotondo, and C. Clarke, "WaterlooClarke: TREC 2015 Temporal Summarization Track.", TREC, 2015.
Zhang, H., W. Lin, Y. Wang, C. Clarke, and M. Smucker, "WaterlooClarke: TREC 2015 Total Recall Track.", TREC, 2015.
Agichtein, E., D. Carmel, C. Clarke, P. Paritosh, D. Pelleg, and I. Szpektor, "Web Question Answering: Beyond Factoids: SIGIR 2015 Workshop.", SIGIR, pp. 1143, 2015.
Gao, P., L. Golab, and S. Keshav, "What’s Wrong with my Solar Panels: a Data-Driven Approach.", EDBT/ICDT Workshops, pp. 86–93, 2015.
Kim, J., K. Salem, and K. Daudjee, "Write Amplification: An Analysis of In-Memory Database Durability Techniques.", IMDM@VLDB, pp. 1:1-1:7, 2015.
He, H., K. Gimpel, and J. Lin, "Multi-Perspective Sentence Similarity Modeling with Convolutional Neural Networks.", EMNLP, pp. 1576–1586, 2015.
Roegiest, A., G. Cormack, C. Clarke, and M. Grossman, "TREC 2015 Total Recall Track Overview.", TREC, 2015.
Tan, L., and C. Clarke, "A Family of Rank Similarity Measures Based on Maximized Effectiveness Difference.", IEEE Trans. Knowl. Data Eng., vol. 27, no. 11, pp. 2865–2877, 2015.
Chowdhury, S., A. Roy, M. Shaikh, and K. Daudjee, "A taxonomy of decentralized online social networks.", Peer Peer Netw. Appl., vol. 8, no. 3, pp. 367–383, 2015.
Agrawal, D., A. Abbadi, and K. Salem, "A Taxonomy of Partitioned Replicated Cloud-based Database Systems.", IEEE Data Eng. Bull., vol. 38, no. 1, pp. 4–9, 2015.
Clarke, C., J. Culpepper, and A. Moffat, "Assessing Efficiency-Effectiveness Tradeoffs in Multi-Stage Retrieval Systems Without Using Relevance Judgments.", CoRR, vol. abs/1506.00717, 2015.
Cormack, G., and M. Grossman, "Autonomy and Reliability of Continuous Active Learning for Technology-Assisted Review.", CoRR, vol. abs/1504.06868, 2015.
Aluç, G., M. Özsu, and K. Daudjee, "Clustering RDF Databases Using Tunable-LSH.", CoRR, vol. abs/1504.02523, 2015.
Kargar, M., L. Golab, and J. Szlichta, "Effective Keyword Search in Graphs.", CoRR, vol. abs/1512.06395, 2015.
Hanbury, A., H. Müller, K. Balog, T. Brodt, G. Cormack, I. Eggel, T. Gollub, F. Hopfgartner, J. Kalpathy-Cramer, N. Kando, et al., "Evaluation-as-a-Service: Overview and Outlook.", CoRR, vol. abs/1512.07454, 2015.
He, H., J. Lin, and A. Lopez, "Gappy Pattern Matching on GPUs for On-Demand Extraction of Hierarchical Translation Grammars.", Trans. Assoc. Comput. Linguistics, vol. 3, pp. 87–100, 2015.
Han, M., and K. Daudjee, "Giraph Unchained: Barrierless Asynchronous Parallel Execution in Pregel-like Graph Processing Systems.", PVLDB, vol. 8, no. 9, pp. 950–961, 2015.
Lin, J., "Is Big Data a Transient Problem?", IEEE Internet Comput., vol. 19, no. 5, pp. 86–90, 2015.
Chu, X., M. Ouzzani, J. Morcos, I. Ilyas, P. Papotti, N. Tang, and Y. Ye, "KATARA: Reliable Data Cleaning with Knowledge Bases and Crowdsourcing.", PVLDB, vol. 8, no. 12, pp. 1952–1955, 2015.
Buntain, C., J. Lin, and J. Golbeck, "Learning to Discover Key Moments in Social Media Streams.", CoRR, vol. abs/1508.00488, 2015.
Balkesen, C., J. Teubner, G. Alonso, and M. Özsu, "Main-Memory Hash Joins on Modern Processor Architectures.", IEEE Trans. Knowl. Data Eng., vol. 27, no. 7, pp. 1754–1766, 2015.
Abu-Khzam, F., K. Daudjee, A. Mouawad, and N. Nishimura, "On scalable parallel recursive backtracking.", J. Parallel Distrib. Comput., vol. 84, pp. 65–75, 2015.
Abedjan, Z., L. Golab, and F. Naumann, "Profiling relational data: a survey.", VLDB J., vol. 24, no. 4, pp. 557–581, 2015.
Hopfgartner, F., A. Hanbury, H. Müller, N. Kando, S. Mercer, J. Kalpathy-Cramer, M. Potthast, T. Gollub, A. Krithara, J. Lin, et al., "Report on the Evaluation-as-a-Service (EaaS) Expert Workshop.", SIGIR Forum, vol. 49, no. 1, pp. 57–65, 2015.
Arguello, J., M. Crane, F. Diaz, J. Lin, and A. Trotman, "Report on the SIGIR 2015 Workshop on Reproducibility, Inexplicability, and Generalizability of Results (RIGOR).", SIGIR Forum, vol. 49, no. 2, pp. 107–116, 2015.
Calvanese, D., M. Koubarakis, and D. Toman, "Special issue of the Journal of Web Semantics on ontology-based data access.", J. Web Semant., vol. 33, pp. 1-2, 2015.
Zanibbi, R., K. Davila, A. Kane, and F. Tompa, "The Tangent Search Engine: Improved Similarity Metrics and Scalability for Math Formula Search.", CoRR, vol. abs/1507.06235, 2015.
Ilyas, I., and X. Chu, "Trends in Cleaning Relational Data: Consistency and Deduplication.", Foundations and Trends in Databases, vol. 5, no. 4, pp. 281–393, 2015.

2014

Al-Harbi, A., and M. Smucker, "A qualitative exploration of secondary assessor relevance judging behavior.", IIiX, pp. 195–204, 2014.
Dean-Hall, A., and C. Clarke, "Assessing Contextual Suggestion.", EVIA@NTCIR, 2014.
Mühleisen, H., T. Samar, J. Lin, and A. de Vries, "Column Stores as an IR Prototyping Tool.", ECIR, pp. 789–792, 2014.
Ardakanian, O., N. Koochakzadeh, R. Singh, L. Golab, and S. Keshav, "Computing Electricity Consumption Profiles from Household Smart Meter Data.", EDBT/ICDT Workshops, pp. 140–147, 2014.
Robinson, N., S. McIlraith, and D. Toman, "Cost-Based Query Optimization via AI Planning.", AAAI, pp. 2344–2351, 2014.
Gebremeskel, G., J. He, A. de Vries, and J. Lin, "Cumulative Citation Recommendation: A Feature-Aware Comparison of Approaches.", DEXA Workshops, pp. 193–197, 2014.
Syed, S., Y. Jiang, and L. Golab, "Data mining of undergraduate course evaluations.", EDM, pp. 347–348, 2014.
Golab, L., and T. Johnson, "Data stream warehousing.", ICDE, pp. 1290–1293, 2014.
Bär, A., P. Casas, L. Golab, and A. Finamore, "DBStream: An online aggregation, filtering and processing system for network traffic monitoring.", IWCMC, pp. 611–616, 2014.
Chalamalla, A., I. Ilyas, M. Ouzzani, and P. Papotti, "Descriptive and prescriptive data cleaning.", SIGMOD Conference, pp. 445–456, 2014.
Golab, L., M. Hadjieleftheriou, H. Karloff, and B. Saha, "Distributed data placement to minimize communication costs via graph partitioning.", SSDBM, pp. 20:1-20:12, 2014.
Aluç, G., O. Hartig, M. Özsu, and K. Daudjee, "Diversified Stress Testing of RDF Data Management Systems.", International Semantic Web Conference (1), pp. 197–212, 2014.
Said, A., A. Bellogín, J. Lin, and A. de Vries, "Do recommendations matter?: news recommendation in real life.", CSCW Companion, pp. 237–240, 2014.
Wu, G., and F. Tompa, "Effective and Efficient Bitmaps for Access Control.", DCC, pp. 433, 2014.
Albakour, M-D., C. MacDonald, I. Ounis, C. Clarke, and V. Bicer, "Information Access in Smart Cities (i-ASC).", ECIR, pp. 810–814, 2014.
Myers, S., A. Sharma, P. Gupta, and J. Lin, "Information network or social network?: the structure of the Twitter follow graph.", WWW (Companion Volume), pp. 493–498, 2014.
Lin, J., M. Gholami, and J. Rao, "Infrastructure for supporting exploration and discovery in web archives.", WWW (Companion Volume), pp. 851–856, 2014.
Lin, J., and M. Efron, "Infrastructure support for evaluation as a service.", WWW (Companion Volume), pp. 79–82, 2014.
Carpenter, T., L. Golab, and S. Syed, "Is the grass greener?: mining electric vehicle opinions.", e-Energy, pp. 241–252, 2014.
Bär, A., A. Finamore, P. Casas, L. Golab, and M. Mellia, "Large-scale network traffic monitoring with DBStream, a system for rolling big data analysis.", BigData, pp. 165–170, 2014.
Avram, C-A., K. Salem, and B. Wong, "Latency Amplification: Characterizing the Impact of Web Page Content on Load Times.", SRDS Workshops, pp. 20–25, 2014.
Wang, L., J. Lin, D. Metzler, and J. Han, "Learning to efficiently rank on big data.", WWW (Companion Volume), pp. 209–210, 2014.
Hartig, O., and M. Özsu, "Linked Data query processing.", ICDE, pp. 1286–1289, 2014.
Singh, A., X. Cui, B. Cassell, B. Wong, and K. Daudjee, "MicroFuge: A Middleware Approach to Providing Performance Isolation in Cloud Storage Systems.", ICDCS, pp. 503–513, 2014.
Smucker, M., X. Guo, and A. Toulis, "Mouse movement during relevance judging: implications for determining user attention.", SIGIR, pp. 979–982, 2014.
Elmagarmid, A., I. Ilyas, M. Ouzzani, J-A. Quiané-Ruiz, N. Tang, and S. Yin, "NADEEF/ER: generic and interactive entity resolution.", SIGMOD Conference, pp. 1071–1074, 2014.
Mühleisen, H., T. Samar, J. Lin, and A. de Vries, "Old dogs are great at new tricks: column stores for IR prototyping.", SIGIR, pp. 863–866, 2014.
Toman, D., and G. Weddell, "On Adding Inverse Features to the Description Logic CFD.", PRICAI, pp. 587–599, 2014.
Voorhees, E., J. Lin, and M. Efron, "On run diversity in Evaluation as a Service.", SIGIR, pp. 959–962, 2014.
Daudjee, K., S. Kamali, and A. López-Ortiz, "On the online fault-tolerant server consolidation problem.", SPAA, pp. 12–21, 2014.
Kumar, K., J. Gluck, A. Deshpande, and J. Lin, "Optimization Techniques for “Scaling Down” Hadoop on Multi-Core, Shared-Memory Systems.", EDBT, pp. 13–24, 2014.
Dean-Hall, A., C. Clarke, J. Kamps, P. Thomas, and E. Voorhees, "Overview of the TREC 2014 Contextual Suggestion Track.", TREC, 2014.
Rao, J., J. Lin, and H. Samet, "Partitioning strategies for spatio-textual similarity join.", BigSpatial@SIGSPATIAL, pp. 40–49, 2014.
Jiang, Y., R. Levman, L. Golab, and J. Nathwani, "Predicting peak-demand days in the Ontario peak reduction program for large consumers.", e-Energy, pp. 221–222, 2014.
Toman, D., and G. Weddell, "Pushing the CFDnc Envelope.", Description Logics, pp. 340–351, 2014.
Li, F., M. Özsu, G. Chen, and B. Ooi, "R-Store: A scalable distributed system for supporting real-time analytics.", ICDE, pp. 40–51, 2014.
Hartig, O., and M. Özsu, "Reachable subwebs for traversal-based query execution.", WWW (Companion Volume), pp. 541–546, 2014.
Chu, X., I. Ilyas, P. Papotti, and Y. Ye, "RuleMiner: Data quality rules discovery.", ICDE, pp. 1222–1225, 2014.
Kane, A., and F. Tompa, "Skewed partial bitvectors for list intersection.", SIGIR, pp. 263–272, 2014.
Tan, L., and C. Clarke, "Succinct Queries for Linking and Tracking News in Social Media.", CIKM, pp. 1883–1886, 2014.
Lin, J., K. Kraus, and R. Punzalan, "Supporting “Distant Reading” for Web Archives.", DH, 2014.
Efron, M., J. Lin, J. He, and A. de Vries, "Temporal feedback for tweet search with non-parametric density estimation.", SIGIR, pp. 33–42, 2014.
Baruah, G., A. Roegiest, and M. Smucker, "The effect of expanding relevance judgements with duplicates.", SIGIR, pp. 1159–1162, 2014.
Wang, Y., and J. Lin, "The Impact of Future Term Statistics in Real-Time Tweet Search.", ECIR, pp. 567–572, 2014.
Clarke, C., and M. Smucker, "Time well spent.", IIiX, pp. 205–214, 2014.
Li, L., and M. Smucker, "Tolerance of Effectiveness Measures to Relevance Judging Errors.", ECIR, pp. 148–159, 2014.
Xu, Z., D. Goldwasser, B. Bederson, and J. Lin, "Visual analytics of MOOCs at Maryland.", L@S, pp. 195–196, 2014.
Lin, J., Y. Wang, M. Efron, and G. Sherman, "Overview of the TREC-2014 Microblog Track.", TREC, 2014.
Tan, L., and C. Clarke, "A Family of Rank Similarity Measures based on Maximized Effectiveness Difference.", CoRR, vol. abs/1408.3587, 2014.
Wu, J., A. Hudek, D. Toman, and G. Weddell, "Absorption for ABoxes.", J. Autom. Reasoning, vol. 53, no. 3, pp. 215–243, 2014.
Serafini, M., E. Mansour, A. Aboulnaga, K. Salem, T. Rafiq, and U. Minhas, "Accordion: Elastic Scalability for Database Systems Supporting Distributed Transactions.", PVLDB, vol. 7, no. 12, pp. 1035–1046, 2014.
Han, M., K. Daudjee, K. Ammar, M. Özsu, X. Wang, and T. Jin, "An Experimental Comparison of Pregel-like Graph Processing Systems.", PVLDB, vol. 7, no. 12, pp. 1047–1058, 2014.
Chairunnanda, P., K. Daudjee, and M. Özsu, "ConfluxDB: Multi-Master Replication for Partitioned Snapshot Isolation Databases.", PVLDB, vol. 7, no. 11, pp. 947–958, 2014.
Golab, L., H. Karloff, F. Korn, B. Saha, and D. Srivastava, "Discovering Conservation Rules.", IEEE Trans. Knowl. Data Eng., vol. 26, no. 6, pp. 1332–1348, 2014.
Li, F., B. Ooi, M. Özsu, and S. Wu, "Distributed data management using MapReduce.", ACM Comput. Surv., vol. 46, no. 3, pp. 31:1-31:42, 2014.
Türe, F., and J. Lin, "Exploiting Representations from Statistical Machine Translation for Cross-Language Information Retrieval.", ACM Trans. Inf. Syst., vol. 32, no. 4, pp. 19:1-19:32, 2014.
Zou, L., M. Özsu, L. Chen, X. Shen, R. Huang, and D. Zhao, "gStore: a graph-based SPARQL query engine.", VLDB J., vol. 23, no. 4, pp. 565–590, 2014.
Weissman, S., S. Ayhan, J. Bradley, and J. Lin, "Identifying Duplicate and Contradictory Information in Wikipedia.", CoRR, vol. abs/1406.1143, 2014.
Liu, X., and K. Salem, "Integrating SSD Caching into Database Systems.", IEEE Data Eng. Bull., vol. 37, no. 2, pp. 35–43, 2014.
Gebaly, K., P. Agrawal, L. Golab, F. Korn, and D. Srivastava, "Interpretable and Informative Explanations of Outcomes.", PVLDB, vol. 8, no. 1, pp. 61–72, 2014.
Ashkan, A., and C. Clarke, "Location- and Query-Aware Modeling of Browsing and Click Behavior in Sponsored Search.", ACM TIST, vol. 5, no. 4, pp. 59:1-59:31, 2014.
Quamar, A., A. Deshpande, and J. Lin, "NScale: Neighborhood-centric Analytics on Large Graphs.", PVLDB, vol. 7, no. 13, pp. 1673–1676, 2014.
Quamar, A., A. Deshpande, and J. Lin, "NScale: Neighborhood-centric Large-Scale Graph Analytics in the Cloud.", CoRR, vol. abs/1405.1499, 2014.
Peng, P., L. Zou, M. Özsu, L. Chen, and D. Zhao, "Processing SPARQL Queries Over Linked Data-A Distributed Graph-based Approach.", CoRR, vol. abs/1411.6763, 2014.
Gupta, P., V. Satuluri, A. Grewal, S. Gurumurthy, V. Zhabiuk, Q. Li, and J. Lin, "Real-Time Twitter Recommendation: Online Motif Detection in Large Dynamic Graphs.", PVLDB, vol. 7, no. 13, pp. 1379–1380, 2014.
Albakour, M-D., C. MacDonald, I. Ounis, C. Clarke, and V. Bicer, "Report on the 1st International Workshop on Information Access in Smart Cities (i-ASC 2014).", SIGIR Forum, vol. 48, no. 2, pp. 96–104, 2014.
Balog, K., D. Elsweiler, E. Kanoulas, L. Kelly, and M. Smucker, "Report on the CIKM workshop on living labs for information retrieval evaluation.", SIGIR Forum, vol. 48, no. 1, pp. 21–28, 2014.
Asadi, N., J. Lin, and A. de Vries, "Runtime Optimizations for Tree-Based Machine Learning Models.", IEEE Trans. Knowl. Data Eng., vol. 26, no. 9, pp. 2281–2292, 2014.
Beskales, G., I. Ilyas, L. Golab, and A. Galiullin, "Sampling from repairs of conditional functional dependency violations.", VLDB J., vol. 23, no. 1, pp. 103–128, 2014.
Boykin, P., S. Ritchie, I. O’Connell, and J. Lin, "Summingbird: A Framework for Integrating Batch and Online MapReduce Computations.", PVLDB, vol. 7, no. 13, pp. 1441–1451, 2014.
Dallachiesa, M., T. Palpanas, and I. Ilyas, "Top-k Nearest Neighbor Search In Uncertain Data Series.", PVLDB, vol. 8, no. 1, pp. 13–24, 2014.
Toman, D., and G. Weddell, "Undecidability of Finite Model Reasoning in DLFD.", CoRR, vol. abs/1408.4468, 2014.
Aluç, G., M. Özsu, and K. Daudjee, "Workload Matters: Why RDF Databases Need a New Design.", PVLDB, vol. 7, no. 10, pp. 837–840, 2014.
Ilyas, I., "Data unification at scale: data tamer.", Making Databases Work, pp. 269–277, 2014.
"Distributed and Parallel Database Systems.", Computing Handbook, 3rd ed. (2), pp. 13:1-24, 2014.

2013

Said, A., J. Lin, A. Bellogín, and A. de Vries, "A month in the life of a production news recommender system.", LivingLab@CIKM, pp. 7–10, 2013.
Wu, J., T. Kinash, D. Toman, and G. Weddell, "Absorption for ABoxes with Local Universal Restrictions.", Description Logics, pp. 489–500, 2013.
Balkesen, C., N. Tatbul, and M. Özsu, "Adaptive input admission and management for parallel stream processing.", DEBS, pp. 15–26, 2013.
Deziel, M., D. Olawo, L. Truchon, and L. Golab, "Analyzing the Mental Health of Engineering Students using Classification and Regression.", EDM, pp. 228–231, 2013.
Toman, D., and G. Weddell, "CFDnc: A PTIME Description Logic with Functional Constraints and Disjointness.", Description Logics, pp. 451–463, 2013.
Balog, K., D. Elsweiler, E. Kanoulas, L. Kelly, and M. Smucker, "CIKM 2013 workshop on living labs for information retrieval evaluation.", CIKM, pp. 2557–2558, 2013.
Whissell, J., and C. Clarke, "Classification-Based Clustering Evaluation.", ICDM, pp. 1229–1234, 2013.
Bellogín, A., G. Gebremeskel, J. He, A. Said, T. Samar, A. de Vries, J. Lin, and J. Vuurens, "CWI and TU Delft Notebook TREC 2013: Contextual Suggestion, Federated Web Search, KBA, and Web Tracks.", TREC, 2013.
Stonebraker, M., D. Bruckner, I. Ilyas, G. Beskales, M. Cherniack, S. Zdonik, A. Pagan, and S. Xu, "Data Curation at Scale: The Data Tamer System.", CIDR, 2013.
Lei, B., I. Surya, S. Kamali, and K. Daudjee, "Data Partitioning for Video-on-Demand Services.", NCA, pp. 49–54, 2013.
Golab, L., and T. Johnson, "Data stream warehousing.", SIGMOD Conference, pp. 949–952, 2013.
Asadi, N., J. Lin, and M. Busch, "Dynamic memory allocation policies for postings in real-time Twitter search.", KDD, pp. 1186–1194, 2013.
Whissell, J., and C. Clarke, "Effective measures for inter-document similarity.", CIKM, pp. 1361–1370, 2013.
Dean-Hall, A., C. Clarke, J. Kamps, and P. Thomas, "Evaluating Contextual Suggestion.", EVIA@NTCIR, 2013.
Mishne, G., J. Dalton, Z. Li, A. Sharma, and J. Lin, "Fast data in the era of big data: Twitter’s real-time related query suggestion architecture.", SIGMOD Conference, pp. 1147–1158, 2013.
Konow, R., G. Navarro, C. Clarke, and A. López-Ortiz, "Faster and smaller inverted indices with treaps.", SIGIR, pp. 193–202, 2013.
Chu, X., I. Ilyas, and P. Papotti, "Holistic data cleaning: Putting violations into context.", ICDE, pp. 458–469, 2013.
Balkesen, C., J. Teubner, G. Alonso, and M. Özsu, "Main-memory hash joins on multi-core CPUs: Tuning to the underlying hardware.", ICDE, pp. 362–373, 2013.
Agrawal, D., A. Abbadi, H. Mahmoud, F. Nawab, and K. Salem, "Managing Geo-replicated Data in Multi-datacenters.", DNIS, pp. 23–43, 2013.
Jin, C., R. Liu, and K. Salem, "Materialized views for eventually consistent record stores.", ICDE Workshops, pp. 250–257, 2013.
Eidelman, V., K. Wu, F. Türe, P. Resnik, and J. Lin, "Mr. MIRA: Open-Source Large-Margin Structured Learning on MapReduce.", ACL (Conference System Demonstrations), pp. 199–204, 2013.
Dallachiesa, M., A. Ebaid, A. Eldawy, A. Elmagarmid, I. Ilyas, M. Ouzzani, and N. Tang, "NADEEF: a commodity data cleaning system.", SIGMOD Conference, pp. 541–552, 2013.
Clarke, C., "Nugget-Based Computation of Graded Relevance.", EVIA@NTCIR, 2013.
Beskales, G., I. Ilyas, L. Golab, and A. Galiullin, "On the relative trust between inconsistent data and inaccurate constraints.", ICDE, pp. 541–552, 2013.
Dean-Hall, A., C. Clarke, N. Simone, J. Kamps, P. Thomas, and E. Voorhees, "Overview of the TREC 2013 Contextual Suggestion Track.", TREC, 2013.
Smucker, M., G. Kazai, and M. Lease, "Overview of the TREC 2013 Crowdsourcing Track.", TREC, 2013.
Lin, J., and M. Efron, "Overview of the TREC-2013 Microblog Track.", TREC, 2013.
Northam, L., R. Smits, K. Daudjee, and J. Istead, "Ray tracing in the cloud using MapReduce.", HPCS, pp. 19–26, 2013.
Kamali, S., and F. Tompa, "Retrieving documents with mathematical content.", SIGIR, pp. 353–362, 2013.
Murdock, V., C. Clarke, J. Kamps, and J. Karlgren, "Search and exploration of X-Rated information (SEXI 2013).", WSDM, pp. 795–796, 2013.
Clarke, C., L. Freund, M. Smucker, and E. Yilmaz, "SIGIR 2013 workshop on modeling user behavior for information retrieval evaluation.", SIGIR, pp. 1134, 2013.
Kamali, S., and F. Tompa, "Structural Similarity Search for Mathematics Retrieval.", MKM/Calculemus/DML, pp. 246–262, 2013.
Lutz, C., I. Seylan, D. Toman, and F. Wolter, "The Combined Approach to OBDA: Taming Role Hierarchies Using Filters.", International Semantic Web Conference (1), pp. 314–330, 2013.
Sakai, T., Z. Dou, and C. Clarke, "The impact of intent selection on diversified search evaluation.", SIGIR, pp. 921–924, 2013.
Clarke, C., "Time-Biased Gain.", NTCIR, 2013.
Asadi, N., and J. Lin, "Training Efficient Tree-Based Models for Document Ranking.", ECIR, pp. 146–157, 2013.
Forsyth, S., and K. Daudjee, "Update Management in Decentralized Social Networks.", ICDCS Workshops, pp. 196–201, 2013.
Rios, M., and J. Lin, "Visualizing the “Pulse” of World Cities on Twitter.", ICWSM, 2013.
DeWitt, D., I. Ilyas, J. Naughton, and M. Stonebraker, "We are drowning in a sea of least publishable units (LPUs).", SIGMOD Conference, pp. 921–922, 2013.
Ammar, K., and M. Özsu, "WGB: Towards a Universal Graph Benchmark.", WBDB, pp. 58–72, 2013.
Gupta, P., A. Goel, J. Lin, A. Sharma, D. Wang, and R. Zadeh, "WTF: the who to follow service at Twitter.", WWW, pp. 505–514, 2013.
Mehdad, Y., G. Carenini, F. Tompa, and R. Ng, "Abstractive Meeting Summarization with Entailment and Fusion.", ENLG, pp. 136–146, 2013.
Eidelman, V., K. Wu, F. Türe, P. Resnik, and J. Lin, "Towards Efficient Large-Scale Feature-Rich Statistical Machine Translation.", WMT@ACL, pp. 128–133, 2013.
Özsu, M., "ACM books to launch.", Commun. ACM, vol. 56, no. 12, pp. 5, 2013.
Abu-Khzam, F., K. Daudjee, A. Mouawad, and N. Nishimura, "An Easy-to-use Scalable Framework for Parallel Recursive Backtracking.", CoRR, vol. abs/1312.7626, 2013.
Liu, R., A. Aboulnaga, and K. Salem, "DAX: A Widely Distributed Multi-tenant Storage Service for DBMS Hosting.", PVLDB, vol. 6, no. 4, pp. 253–264, 2013.
Chu, X., I. Ilyas, and P. Papotti, "Discovering Denial Constraints.", PVLDB, vol. 6, no. 13, pp. 1498–1509, 2013.
Golab, L., M. Hadjieleftheriou, H. Karloff, and B. Saha, "Distributed Data Placement via Graph Partitioning.", CoRR, vol. abs/1312.0285, 2013.
Asadi, N., and J. Lin, "Document vector representations for feature extraction in multi-stage document ranking.", Inf. Retr., vol. 16, no. 6, pp. 747–768, 2013.
Asadi, N., J. Lin, and M. Busch, "Dynamic Memory Allocation Policies for Postings in Real-Time Twitter Search.", CoRR, vol. abs/1302.5302, 2013.
Lin, J., and M. Efron, "Evaluation as a service for information retrieval.", SIGIR Forum, vol. 47, no. 2, pp. 8–14, 2013.
Akinyemi, J., and C. Clarke, "Fast and effective soft links.", Softw., Pract. Exper., vol. 43, no. 5, pp. 577–593, 2013.
Asadi, N., and J. Lin, "Fast candidate generation for real-time tweet search with Bloom filter chains.", ACM Trans. Inf. Syst., vol. 31, no. 3, pp. 13, 2013.
Asadi, N., and J. Lin, "Fast, Incremental Inverted Indexing in Main Memory for Web-Scale Collections.", CoRR, vol. abs/1305.0699, 2013.
Capra, R., L. Freund, C. Smith, M. Smucker, and R. White, "HCIR 2013: the seventh international symposium on human-computer interaction and information retrieval.", SIGIR Forum, vol. 47, no. 2, pp. 33–40, 2013.
Kumar, K., J. Gluck, A. Deshpande, and J. Lin, "Hone: “Scaling Down” Hadoop on Shared-Memory Systems.", PVLDB, vol. 6, no. 12, pp. 1354–1357, 2013.
Liu, X., and K. Salem, "Hybrid Storage Management for Database Systems.", PVLDB, vol. 6, no. 8, pp. 541–552, 2013.
Ashkan, A., and C. Clarke, "Impact of query intent and search context on clickthrough behavior in sponsored search.", Knowl. Inf. Syst., vol. 34, no. 2, pp. 425–452, 2013.
Golbus, P., J. Aslam, and C. Clarke, "Increasing evaluation sensitivity to diversity.", Inf. Retr., vol. 16, no. 4, pp. 530–555, 2013.
Balkesen, C., G. Alonso, J. Teubner, and M. Özsu, "Multi-Core, Main-Memory Joins: Sort vs. Hash Revisited.", PVLDB, vol. 7, no. 1, pp. 85–96, 2013.
Ebaid, A., A. Elmagarmid, I. Ilyas, M. Ouzzani, J-A. Quiané-Ruiz, N. Tang, and S. Yin, "NADEEF: A Generalized Data Cleaning System.", PVLDB, vol. 6, no. 12, pp. 1218–1221, 2013.
Chen, T., L. Chen, M. Özsu, and N. Xiao, "Optimizing Multi-Top-k Queries over Uncertain Data Streams.", IEEE Trans. Knowl. Data Eng., vol. 25, no. 8, pp. 1814–1829, 2013.
Chen, L., I. Ilyas, C. Ré, and X. Zhou, "Probabilistic Web Data Management.", World Wide Web, vol. 16, no. 3, pp. 271–272, 2013.
Minhas, U., S. Rajagopalan, B. Cully, A. Aboulnaga, K. Salem, and A. Warfield, "RemusDB: transparent high availability for database systems.", VLDB J., vol. 22, no. 1, pp. 29–45, 2013.
Clarke, C., L. Freund, M. Smucker, and E. Yilmaz, "Report on the SIGIR 2013 workshop on modeling user behavior for information retrieval evaluation (MUBE 2013).", SIGIR Forum, vol. 47, no. 2, pp. 84–95, 2013.
Murdock, V., C. Clarke, J. Kamps, and J. Karlgren, "Report on the workshop on search and exploration of x-rated information (SEXI 2013).", SIGIR Forum, vol. 47, no. 1, pp. 31–37, 2013.
Golab, L., "Data Warehouse Quality: Summary and Outlook.", Handbook of Data Quality, pp. 121–140, 2013.
Ng, R., P. Arocena, D. Barbosa, G. Carenini, L. Gomes, Jr., S. Jou, R. Leung, E. Milios, R. Miller, J. Mylopoulos, et al., "Perspectives on Business Intelligence.", Perspectives on Business Intelligence, pp. 1–163, 2013.

2012

MacDonald, C., J. Wang, and C. Clarke, "2nd international workshop on diversity in document retrieval (DDR 2012).", WSDM, pp. 769–770, 2012.
Golab, L., T. Johnson, S. Sen, and J. Yates, "A Sequence-Oriented Stream Warehouse Paradigm for Network Monitoring Applications.", PAM, pp. 53–63, 2012.
Wu, J., A. Hudek, D. Toman, and G. Weddell, "Absorption for ABoxes.", Description Logics, 2012.
Wu, J., A. Hudek, D. Toman, and G. Weddell, "Assertion Absorption in Object Queries over Knowledge Bases.", KR, 2012.
Türe, F., J. Lin, and D. Oard, "Combining Statistical Translation Techniques for Cross-Language Information Retrieval.", COLING, pp. 2685–2702, 2012.
Golab, L., H. Karloff, F. Korn, B. Saha, and D. Srivastava, "Discovering Conservation Rules.", ICDE, pp. 738–749, 2012.
Busch, M., K. Gade, B. Larson, P. Lok, S. Luckenbill, and J. Lin, "Earlybird: Real-Time Search at Twitter.", ICDE, pp. 1360–1369, 2012.
Minhas, U., R. Liu, A. Aboulnaga, K. Salem, J. Ng, and S. Robertson, "Elastic Scale-Out for Partition-Based Database Systems.", ICDE Workshops, pp. 281–288, 2012.
McCullough, D., J. Lin, C. MacDonald, I. Ounis, and R. McCreadie, "Evaluating Real-Time Search over Tweets.", ICWSM, 2012.
Drzadzewski, G., and F. Tompa, "Exploring and analyzing documents with OLAP.", PIKM, pp. 33–40, 2012.
Chairunnanda, P., S. Forsyth, and K. Daudjee, "Graph data partition models for online social networks.", HT, pp. 175–180, 2012.
Smucker, M., J. Allan, and B. Dachev, "Human question answering performance using an interactive document retrieval system.", IIiX, pp. 35–44, 2012.
Pound, J., A. Hudek, I. Ilyas, and G. Weddell, "Interpreting keyword queries over web knowledge bases.", CIKM, pp. 305–314, 2012.
El-Helw, A., M. Farid, and I. Ilyas, "Just-in-time information extraction using extraction views.", SIGMOD Conference, pp. 613–616, 2012.
Lin, J., and A. Kolcz, "Large-scale machine learning at Twitter.", SIGMOD Conference, pp. 793–804, 2012.
Raveendran, G., and C. Clarke, "Lightweight contrastive summarization for news comment mining.", SIGIR, pp. 1103–1104, 2012.
Türe, F., J. Lin, and D. Oard, "Looking inside the box: context-sensitive translation for cross-language information retrieval.", SIGIR, pp. 1105–1106, 2012.
Ashkan, A., and C. Clarke, "Modeling browsing behavior for click analysis in sponsored search.", CIKM, pp. 2015–2019, 2012.
Smucker, M., and C. Clarke, "Modeling user variance in time-biased gain.", HCIR, pp. 3, 2012.
McCreadie, R., I. Soboroff, J. Lin, C. MacDonald, I. Ounis, and D. McCullough, "On building a reusable Twitter corpus.", SIGIR, pp. 1113–1114, 2012.
Dean-Hall, A., C. Clarke, J. Kamps, P. Thomas, and E. Voorhees, "Overview of the TREC 2012 Contextual Suggestion Track.", TREC, 2012.
Smucker, M., G. Kazai, and M. Lease, "Overview of the TREC 2012 Crowdsourcing Track.", TREC, 2012.
Clarke, C., N. Craswell, and E. Voorhees, "Overview of the TREC 2012 Web Track.", TREC, 2012.
Soboroff, I., I. Ounis, C. MacDonald, and J. Lin, "Overview of the TREC-2012 Microblog Track.", TREC, 2012.
Smucker, M., and C. Clarke, "Stochastic simulation of time-biased gain.", CIKM, pp. 2040–2044, 2012.
Lutz, C., I. Seylan, D. Toman, and F. Wolter, "The Combined Approach to OBDA: Taming Role Hierarchies using Filters.", SSWS+HPCSW@ISWC, pp. 16–31, 2012.
Smucker, M., and C. Jethani, "Time to judge relevance as an indicator of assessor error.", SIGIR, pp. 1153–1154, 2012.
Smucker, M., and C. Clarke, "Time-based calibration of effectiveness measures.", SIGIR, pp. 95–104, 2012.
Bär, A., and L. Golab, "Towards benchmarking stream data warehouses.", DOLAP, pp. 105–112, 2012.
Mishne, G., and J. Lin, "Twanchor text: a preliminary study of the value of tweets as anchor text.", SIGIR, pp. 1159–1160, 2012.
Lin, J., and G. Mishne, "A Study of “Churn” in Tweets and Real-Time Search Queries (Extended Version).", CoRR, vol. abs/1205.6855, 2012.
Zou, L., L. Chen, M. Özsu, and D. Zhao, "Answering pattern match queries in large graph databases via graph embedding.", VLDB J., vol. 21, no. 1, pp. 97–120, 2012.
Mishne, G., J. Dalton, Z. Li, A. Sharma, and J. Lin, "Fast Data in the Era of Big Data: Twitter’s Real-Time Related Query Suggestion Architecture.", CoRR, vol. abs/1210.7350, 2012.
Beskales, G., I. Ilyas, L. Golab, and A. Galiullin, "On the Relative Trust between Inconsistent Data and Inaccurate Constraints.", CoRR, vol. abs/1207.5226, 2012.
Trotman, A., C. Clarke, I. Ounis, J. Culpepper, M-A. Cartright, and S. Geva, "Open source information retrieval: a report on the SIGIR 2012 workshop.", SIGIR Forum, vol. 46, no. 2, pp. 95–101, 2012.
Asadi, N., J. Lin, and A. de Vries, "Runtime Optimizations for Prediction with Tree-Based Models.", CoRR, vol. abs/1212.2287, 2012.
Golab, L., T. Johnson, and V. Shkapenyuk, "Scalable Scheduling of Updates in Streaming Data Warehouses.", IEEE Trans. Knowl. Data Eng., vol. 24, no. 6, pp. 1092–1105, 2012.
Lin, J., and D. Ryaboy, "Scaling big data mining infrastructure: the Twitter experience.", SIGKDD Explorations, vol. 14, no. 2, pp. 6–19, 2012.
Beskales, G., G. Das, A. Elmagarmid, I. Ilyas, F. Naumann, M. Ouzzani, P. Papotti, J-A. Quiané-Ruiz, and N. Tang, "The data analytics group at the Qatar Computing Research Institute.", SIGMOD Record, vol. 41, no. 4, pp. 33–38, 2012.
Lee, G., J. Lin, C. Liu, A. Lorek, and D. Ryaboy, "The Unified Logging Infrastructure for Data Analytics at Twitter.", CoRR, vol. abs/1208.4171, 2012.
Lee, G., J. Lin, C. Liu, A. Lorek, and D. Ryaboy, "The Unified Logging Infrastructure for Data Analytics at Twitter.", PVLDB, vol. 5, no. 12, pp. 1771–1780, 2012.

2011

Wang, L., J. Lin, and D. Metzler, "A cascade ranking model for efficient ranked retrieval.", SIGIR, pp. 105–114, 2011.
Clarke, C., N. Craswell, I. Soboroff, and A. Ashkan, "A comparative analysis of cascade measures for novelty and diversity.", WSDM, pp. 75–84, 2011.
Pound, J., D. Toman, G. Weddell, and J. Wu, "An Assertion Retrieval Algebra for Object Queries over Knowledge Bases.", IJCAI, pp. 1051–1056, 2011.
Leibert, F., J. Mannix, J. Lin, and B. Hamadani, "Automatic management of partitioned, replicated search services.", SoCC, pp. 27, 2011.
Whissell, J., and C. Clarke, "Clustering for semi-supervised spam filtering.", CEAS, pp. 125–134, 2011.
Golab, L., and T. Johnson, "Consistency in a Stream Warehouse.", CIDR, pp. 114–122, 2011.
Asadi, N., D. Metzler, and J. Lin, "Cross-corpus relevance projection.", SIGIR, pp. 1163–1164, 2011.
"Distributed data management in 2020?", ICDE, pp. 1360, 2011.
Akinyemi, J., and C. Clarke, "Do Subtopic Judgments Reflect Diversity?", ICTIR, pp. 309–312, 2011.
Kamali, S., P. Ghodsnia, and K. Daudjee, "Dynamic data allocation with replication in distributed systems.", IPCCC, pp. 1–8, 2011.
Cheng, J., Y. Ke, S. Chu, and M. Özsu, "Efficient core decomposition in massive networks.", ICDE, pp. 51–62, 2011.
Franconi, E., and D. Toman, "Fixpoints in Temporal Description Logics.", IJCAI, pp. 875–880, 2011.
Kamali, S., and F. Tompa, "Grammar Inference for Web Documents.", WebDB, 2011.
Smucker, M., and C. Jethani, "Measuring assessor accuracy: a comparison of NIST assessors and user study participants.", SIGIR, pp. 1231–1232, 2011.
Türe, F., T. Elsayed, and J. Lin, "No free lunch: brute force vs. locality-sensitive hashing for cross-lingual pairwise similarity.", SIGIR, pp. 943–952, 2011.
Miller, R., F. Tompa, S. McIlraith, J. Slonim, and E. Yu, "NSERC business intelligence network: selected topics.", CASCON, pp. 313–315, 2011.
Ashkan, A., and C. Clarke, "