2024
"Adapting Standard Retrieval Benchmarks to Evaluate Generated Answers", European Conference on Information Retrieval (ECIR), 2024.
,
"Analysis of Open Government Datasets From a Data Design and Integration Perspective", International Conference on Extending Database Technology (EDBT), 2024.
,
"Construction of Paired Knowledge Graph - Text Datasets Informed By Cyclic Evaluation", International Conference on Computational Linguistics (COLING), 2024.
,
"Fréchet Distance for Offline Evaluation of Information Retrieval Systems With Sparse Labels", Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2024.
,
"Jointly Modeling Spatio-Temporal Features of Tactile Signals for Action Classification", AAAI Conference on Artificial Intelligence (AAAI), 2024.
,
"KnowFIRES: A Knowledge-Graph Framework for Interpreting Retrieved Entities From Search", European Conference on Information Retrieval (ECIR), 2024.
,
"Multi-Modal Discussion Transformer: Integrating Text, Images and Graph Transformers to Detect Hate Speech on Social Media", AAAI Conference on Artificial Intelligence (AAAI), 2024.
,
"On Integrating the Data-Science and Machine-Learning Pipelines For Responsible AI", Workshop in Governance, Understanding and Integration of Data for Effective and Responsible AI (GUIDE-AI), 2024.
,
"Optimizing Differential Computation for Large-Scale Graph Processing", International Workshop on Graph Data Management Experiences and Systems (GRADES), 2024.
,
"Practical Hardware Transactional vEB Trees", ACM Symposium on Principles & Practice of Parallel Programming (PPoPP), 2024.
,
"The Future of Graph Analytics", ACM International Conference on Management of Data (SIGMOD), 2024.
,
"The Search Futures Workshop", European Conference on Information Retrieval (ECIR), 2024.
,
"Towards Automated End-to-End Health Misinformation Free Search With A Large Language Model", European Conference on Information Retrieval (ECIR), 2024.
,
"Vector Search With OpenAI Embeddings: Lucene Is All You Need", Web Search and Data Mining (WSDM), 2024.
,
"A Comparison of Methods for Evaluating Generative IR", ArXiv, vol. abs/2404.04044, 2024.
,
"Adapting Standard Retrieval Benchmarks to Evaluate Generated Answers", ArXiv, vol. abs/2401.04842, 2024.
,
"Assessing and Verifying Task Utility in LLM-Powered Applications", ArXiv, vol. abs/2405.02178, 2024.
,
"Explaining Expert Search and Team Formation Systems With ExES", ArXiv, vol. abs/2405.12881, 2024.
,
"FLAME: Factuality-Aware Alignment for Large Language Models", ArXiv, vol. abs/2405.01525, 2024.
,
"Fréchet Distance for Offline Evaluation of Information Retrieval Systems With Sparse Labels", ArXiv, vol. abs/2401.17543, 2024.
,
"Generative Information Retrieval Evaluation", ArXiv, vol. abs/2404.08137, 2024.
,
"Jointly Modeling Spatio-Temporal Features of Tactile Signals for Action Classification", ArXiv, vol. abs/2404.15279, 2024.
,
"LLMs Can Patch Up Missing Relevance Judgments in Evaluation", ArXiv, vol. abs/2405.04727, 2024.
,
"Nearest Neighbor Speculative Decoding for LLM Generation and Attribution", ArXiv, vol. abs/2405.19325, 2024.
,
"PromptReps: Prompting Large Language Models to Generate Dense And Sparse Representations for Zero-Shot Document Retrieval", ArXiv, vol. abs/2404.18424, 2024.
,
"RAGE Against the Machine: Retrieval-Augmented LLM Explanations", ArXiv, vol. abs/2405.13000, 2024.
,
"Rumour Evaluation With Very Large Language Models", ArXiv, vol. abs/2404.16859, 2024.
,
"Technical Perspective: Synthetic Data Needs a Reproducibility Benchmark", SIGMOD Record, vol. 53, issue 1, pp. 64, 2024.
,
"Toward Best Practices for Training Multilingual Dense Retrieval Models", ACM Transactions on Information Systems (TOIS), vol. 42, issue 2, pp. 39:1--39:33, 2024.
,
"Towards Better Human-Agent Alignment: Assessing Task Utility in LLM-Powered Applications", ArXiv, vol. abs/2402.09015, 2024.
,
"UniRAG: Universal Retrieval Augmentation for Multi-Modal Large Language Models", ArXiv, vol. abs/2405.10311, 2024.
,
"Who Determines What Is Relevant? Humans or AI? Why Not Both?", Communications of the ACM, vol. 67, issue 4, pp. 31--34, 2024.
,
2023
""Low-Resource" Text Classification: A Parameter-Free Classification Method With Compressors", Association for Computational Linguistics (ACL), 2023.
,
"A Is for Adele: An Offline Evaluation Metric for Instant Search", International Conference on the Theory of Information Retrieval (ICTIR), 2023.
,
"A Preference Judgment Tool for Authoritative Assessment", International Conference on Research and Development in Information Retrieval (SIGIR), 2023.
,
"An Experimental Analysis of Quantile Sketches Over Data Streams", International Conference on Extending Database Technology (EDBT), 2023.
,
"An Overview of Reachability Indexes on Graphs", ACM International Conference on Management of Data (SIGMOD), 2023.
,
"Anserini Gets Dense Retrieval: Integration of Lucene's HNSW Indexes", International Conference on Information and Knowledge Management (CIKM), 2023.
,
"Answer Retrieval for Math Questions Using Structural and Dense Retrieval", Conference and Labs of the Evaluation Forum (CLEF), 2023.
,
"AToMiC: An Image/Text Retrieval Test Collection to Support Multimedia Content Creation", International Conference on Research and Development in Information Retrieval (SIGIR), 2023.
,
"Better Quality Pre-Training Data and T5 Models for African Languages", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023.
,
"CIRAL at FIRE 2023: Cross-Lingual Information Retrieval for African Languages", Forum for Information Retrieval Evaluation (FIRE), 2023.
,
"CITADEL: Conditional Token Interaction via Dynamic Lexical Routing For Efficient and Effective Multi-Vector Retrieval", Association for Computational Linguistics (ACL), 2023.
,
"CREDENCE: Counterfactual Explanations for Document Ranking", IEEE International Conference on Data Engineering (ICDE), 2023.
,
"dLSM: An LSM-Based Index for Memory Disaggregation", IEEE International Conference on Data Engineering (ICDE), 2023.
,
"EAGER: Explainable Question Answering Using Knowledge Graphs", International Workshop on Graph Data Management Experiences and Systems (GRADES), 2023.
,
"Enhancing Sparse Retrieval via Unsupervised Learning", ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region (SIGIR-AP), 2023.
,
"Evaluating Embedding APIs for Information Retrieval", Association for Computational Linguistics (ACL), 2023.
,
"Evaluating Open-Domain Question Answering in the Era of Large Language Models", Association for Computational Linguistics (ACL), 2023.
,
"FedFormer: Contextual Federation With Attention in Reinforcement Learning", International Joint Conference on Autonomous Agents & Multiagent Systems (AAMAS), 2023.
,
"FLEEK: Factual Error Detection and Correction With Evidence Retrieved From External Knowledge", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023.
,
"GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training Data Exploration", Association for Computational Linguistics (ACL), 2023.
,
"GAMMA: A Graph Pattern Mining Framework for Large Graphs on GPU", IEEE International Conference on Data Engineering (ICDE), 2023.
,
"gFOV: A Full-Stack SPARQL Query Optimizer & Plan Visualizer", International Conference on Information and Knowledge Management (CIKM), 2023.
,
"Governor: Turning Open Government Data Portals Into Interactive Databases", ACM Conference on Human Factors in Computing Systems (CHI), 2023.
,
"Growing and Serving Large Open-Domain Knowledge Graphs", ACM International Conference on Management of Data (SIGMOD), 2023.
,
"How Does Generative Retrieval Scale to Millions of Passages?", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023.
,
"How to Train Your Dragon: Diverse Augmentation Towards Generalizable Dense Retrieval", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023.
,
"Increasing Coverage and Precision of Textual Information in Multilingual Knowledge Graphs", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023.
,
"InfoMoD: Information-Theoretic Model Diagnostics", International Conference on Statistical and Scientific Database Management (SSDBM), 2023.
,
"iORDER: Mining Implicit Domain Orders", IEEE International Conference on Data Engineering (ICDE), 2023.
,
"KÙZU Graph Database Management System", Conference on Innovative Data Systems Research (CIDR), 2023.
,
"Limitations of Open-Domain Question Answering Benchmarks for Document-Level Reasoning", International Conference on Research and Development in Information Retrieval (SIGIR), 2023.
,
"Made to Measure: A Workshop on Human-Centred Metrics for Information Seeking", Conference on Human Information Interaction and Retrieval (CHIIR), 2023.
,
"mAggretriever: A Simple Yet Effective Approach to Zero-Shot Multilingual Dense Retrieval", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023.
,
"MMEAD: MS MARCO Entity Annotations and Disambiguations", International Conference on Research and Development in Information Retrieval (SIGIR), 2023.
,
"On the Data Quality of Remotely Sensed Forest Maps", Very Large Data Bases Conference (VLDB), 2023.
,
"One Blade for One Purpose: Advancing Math Information Retrieval Using Hybrid Search", International Conference on Research and Development in Information Retrieval (SIGIR), 2023.
,
"Operator Selection and Ordering in a Pipeline Approach to Efficiency Optimizations for Transformers", Association for Computational Linguistics (ACL), 2023.
,
"Overview of the CIRAL Track at FIRE 2023: Cross-Lingual Information Retrieval for African Languages", Forum for Information Retrieval Evaluation (FIRE), 2023.
,
"Path Description Dependencies in Feature-Based DLs", International Workshop on Description Logics (DL), 2023.
,
"Perspectives on Large Language Models for Relevance Judgment", International Conference on the Theory of Information Retrieval (ICTIR), 2023.
,
"Pre-Processing Matters! Improved Wikipedia Corpora for Open-Domain Question Answering", European Conference on Information Retrieval (ECIR), 2023.
,
"Precise Zero-Shot Dense Retrieval Without Relevance Labels", Association for Computational Linguistics (ACL), 2023.
,
"Preface QDB", Very Large Data Bases Conference (VLDB), 2023.
,
"Preface SDA", Very Large Data Bases Conference (VLDB), 2023.
,
"Preference-Based Offline Evaluation", Web Search and Data Mining (WSDM), 2023.
,
"PyGaggle: A Gaggle of Resources for Open-Domain Question Answering", European Conference on Information Retrieval (ECIR), 2023.
,
"Real-Time LSM-Trees for HTAP Workloads", IEEE International Conference on Data Engineering (ICDE), 2023.
,
"Retrieving Supporting Evidence for Generative Question Answering", ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region (SIGIR-AP), 2023.
,
"SLIM: Sparsified Late Interaction for Multi-Vector Retrieval With Inverted Indexes", International Conference on Research and Development in Information Retrieval (SIGIR), 2023.
,
"Spacerini: Plug-and-Play Search Engines With Pyserini and Hugging Face", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023.
,
"SPRINT: A Unified Toolkit for Evaluating and Demystifying Zero-Shot Neural Sparse Retrieval", International Conference on Research and Development in Information Retrieval (SIGIR), 2023.
,
"Technology-Assisted Review for Spreadsheets and Noisy Text", ACM Symposium on Document Engineering (DocEng), 2023.
,
"Tevatron: An Efficient and Flexible Toolkit for Neural Retrieval", International Conference on Research and Development in Information Retrieval (SIGIR), 2023.
,
"To Join or Not to Join: An Analysis on the Usefulness of Joining Tables In Open Government Data Portals", Very Large Data Bases Conference (VLDB), 2023.
,
"What the DAAM: Interpreting Stable Diffusion Using Cross Attention", Association for Computational Linguistics (ACL), 2023.
,
"A Dense Representation Framework for Lexical and Semantic Matching", ACM Transactions on Information Systems (TOIS), vol. 41, issue 4, pp. 110:1--110:29, 2023.
,
"Accurate Summary-Based Cardinality Estimation Through the Lens Of Cardinality Estimation Graphs", SIGMOD Record, vol. 52, issue 1, pp. 94--102, 2023.
,
"Aggretriever: A Simple Approach to Aggregate Textual Representations For Robust Dense Passage Retrieval", Transactions of the Association for Computational Linguistics, vol. 11, pp. 436--452, 2023.
,
"Anserini Gets Dense Retrieval: Integration of Lucene's HNSW Indexes", ArXiv, vol. abs/2304.12139, 2023.
,
"Approximating Human-Like Few-Shot Learning With GPT-based Compression", ArXiv, vol. abs/2308.06942, 2023.
,
"AToMiC: An Image/Text Retrieval Test Collection to Support Multimedia Content Creation", ArXiv, vol. abs/2304.01961, 2023.
,
"Autonomously Computable Information Extraction", Proceedings of the VLDB Endowment (PVLDB), vol. 16, issue 10, pp. 2431--2443, 2023.
,
"Caerus: Low-Latency Distributed Transactions for Geo-Replicated Systems", Proceedings of the VLDB Endowment (PVLDB), vol. 17, issue 3, pp. 469--482, 2023.
,
"Construction of Paired Knowledge Graph-Text Datasets Informed by Cyclic Evaluation", ArXiv, vol. abs/2309.11669, 2023.
,
"CREDENCE: Counterfactual Explanations for Document Ranking", ArXiv, vol. abs/2302.04983, 2023.
,
"Data Science - A Systematic Treatment", Communications of the ACM, vol. 66, issue 7, pp. 106--116, 2023.
,
"Data Science: A Systematic Treatment", ArXiv, vol. abs/2301.13761, 2023.
,
"Differentially Private Data Generation With Missing Data", ArXiv, vol. abs/2310.11548, 2023.
,
"DProvDB: Differentially Private Query Processing With Multi-Analyst Provenance", ArXiv, vol. abs/2309.10240, 2023.
,
"Efficient Document-at-a-Time and Score-at-a-Time Query Evaluation For Learned Sparse Representations", ACM Transactions on Information Systems (TOIS), vol. 41, issue 4, pp. 96:1--96:28, 2023.
,
"Efficient Execution of SPARQL Queries With OPTIONAL and UNION Expressions", ArXiv, vol. abs/2303.13844, 2023.
,
"End-to-End Retrieval With Learned Dense and Sparse Representations Using Lucene", ArXiv, vol. abs/2311.18503, 2023.
,
"Evaluating Embedding APIs for Information Retrieval", ArXiv, vol. abs/2305.06300, 2023.
,
"Evaluating Open-Domain Question Answering in the Era of Large Language Models", ArXiv, vol. abs/2305.06984, 2023.
,
"Fact Ranking Over Large-Scale Knowledge Graphs With Reasoning Embedding Models", IEEE Data Engineering Bulletin, vol. 46, issue 2, pp. 126--139, 2023.
,
"Fine-Tuning LLaMA for Multi-Stage Text Retrieval", ArXiv, vol. abs/2310.08319, 2023.
,
"FLEEK: Factual Error Detection and Correction With Evidence Retrieved From External Knowledge", ArXiv, vol. abs/2310.17119, 2023.
,
"Found in the Middle: Permutation Self-Consistency Improves Listwise Ranking in Large Language Models", ArXiv, vol. abs/2310.07712, 2023.
,
"GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training Data Exploration", ArXiv, vol. abs/2306.01481, 2023.
,
"Generate, Filter, and Fuse: Query Expansion via Multi-Step Keyword Generation for Zero-Shot Neural Rankers", ArXiv, vol. abs/2311.09175, 2023.
,
"Growing and Serving Large Open-Domain Knowledge Graphs", ArXiv, vol. abs/2305.09464, 2023.
,
"HAGRID: A Human-LLM Collaborative Dataset for Generative Information-Seeking With Attribution", ArXiv, vol. abs/2307.16883, 2023.
,
"High-Throughput Vector Similarity Search in Knowledge Graphs", ArXiv, vol. abs/2304.01926, 2023.
,
"High-Throughput Vector Similarity Search in Knowledge Graphs", Proceedings of the ACM on Management of Data, vol. 1, issue 2, pp. 197:1--197:25, 2023.
,
"How Does Generative Retrieval Scale to Millions of Passages?", ArXiv, vol. abs/2305.11841, 2023.
,
"How to Train Your DRAGON: Diverse Augmentation Towards Generalizable Dense Retrieval", ArXiv, vol. abs/2302.07452, 2023.
,
"Improving Out-of-Distribution Generalization of Neural Rerankers With Contextualized Late Interaction", ArXiv, vol. abs/2302.06589, 2023.
,
"Increasing Coverage and Precision of Textual Information in Multilingual Knowledge Graphs", ArXiv, vol. abs/2311.15781, 2023.
,
"Indexing Techniques for Graph Reachability Queries", ArXiv, vol. abs/2311.03542, 2023.
,
"Kùzu: A Database Management System for "Beyond Relational" Workloads", SIGMOD Record, vol. 52, issue 3, pp. 39--40, 2023.
,
"Leveraging LLMs for Synthesizing Training Data Across Many Languages In Multilingual Dense Retrieval", ArXiv, vol. abs/2311.05800, 2023.
,
"MIRACL: A Multilingual Retrieval Dataset Covering 18 Diverse Languages", Transactions of the Association for Computational Linguistics, vol. 11, pp. 1114--1131, 2023.
,
"MMEAD: MS MARCO Entity Annotations and Disambiguations", ArXiv, vol. abs/2309.07574, 2023.
,
"Multi-Modal Discussion Transformer: Integrating Text, Images and Graph Transformers to Detect Hate Speech on Social Media", ArXiv, vol. abs/2307.09312, 2023.
,
"NoMIRACL: Knowing When You Don't Know for Robust Multilingual Retrieval-Augmented Generation", ArXiv, vol. abs/2312.11361, 2023.
,
"Open Domain Knowledge Extraction for Knowledge Graphs", ArXiv, vol. abs/2312.09424, 2023.
,
"Perspectives on Large Language Models for Relevance Judgment", ArXiv, vol. abs/2304.09161, 2023.
,
"POEM: Pattern-Oriented Explanations of Convolutional Neural Networks", Proceedings of the VLDB Endowment (PVLDB), vol. 16, issue 11, pp. 3192--3200, 2023.
,
"Predicting Hateful Discussions on Reddit Using Graph Transformer Networks And Communal Context", ArXiv, vol. abs/2301.04248, 2023.
,
"Qualitative Analysis of a Graph Transformer Approach to Addressing Hate Speech: Adapting to Dynamically Changing Content", ArXiv, vol. abs/2301.10871, 2023.
,
"Rank-Without-Gpt: Building GPT-Independent Listwise Rerankers on Open-Source Large Language Models", ArXiv, vol. abs/2312.02969, 2023.
,
"RankVicuna: Zero-Shot Listwise Document Reranking With Open-Source Large Language Models", ArXiv, vol. abs/2309.15088, 2023.
,
"RankZephyr: Effective and Robust Zero-Shot Listwise Reranking Is A Breeze!", ArXiv, vol. abs/2312.02724, 2023.
,
"Regex-Augmented Domain Transfer Topic Classification Based on a Pre-Trained Language Model: An Application in Financial Domain", ArXiv, vol. abs/2305.18324, 2023.
,
"Report on the Dagstuhl Seminar on Frontiers of Information Access Experimentation for Research and Education", SIGIR Forum, vol. 57, issue 1, pp. 7:1--7:28, 2023.
,
"Resources for Brewing BEIR: Reproducible Reference Models and An Official Leaderboard", ArXiv, vol. abs/2306.07471, 2023.
,
"Retrieving Supporting Evidence for Generative Question Answering", ArXiv, vol. abs/2309.11392, 2023.
,
"Retrieving Supporting Evidence for LLMs Generated Answers", ArXiv, vol. abs/2306.13781, 2023.
,
"Scaling Down, LiTting Up: Efficient Zero-Shot Listwise Reranking With Seq2seq Encoder-Decoder Models", ArXiv, vol. abs/2312.16098, 2023.
,
"Searching Dense Representations With Inverted Indexes", ArXiv, vol. abs/2312.01556, 2023.
,
"sGrow: Explaining the Scale-Invariant Strength Assortativity of Streaming Butterflies", ACM Transactions on the Web, vol. 17, issue 3, pp. 24:1--24:46, 2023.
,
"SGSI - A Scalable GPU-Friendly Subgraph Isomorphism Algorithm", IEEE Transactions on Knowledge and Data Engineering (TKDE), vol. 35, issue 11, pp. 11899--11916, 2023.
,
"Simple Yet Effective Neural Ranking and Reranking Baselines for Cross-Lingual Information Retrieval", ArXiv, vol. abs/2304.01019, 2023.
,
"SLIM: Sparsified Late Interaction for Multi-Vector Retrieval With Inverted Indexes", ArXiv, vol. abs/2302.06587, 2023.
,
"SmartProbe: A Virtual Moderator for Market Research Surveys", ArXiv, vol. abs/2305.08271, 2023.
,
"Spacerini: Plug-and-Play Search Engines With Pyserini and Hugging Face", ArXiv, vol. abs/2302.14534, 2023.
,
"SPRINT: A Unified Toolkit for Evaluating and Demystifying Zero-Shot Neural Sparse Retrieval", ArXiv, vol. abs/2307.10488, 2023.
,
"TECHNICAL PERSPECTIVE: Ad Hoc Transactions: What They Are And Why We Should Care", SIGMOD Record, vol. 52, issue 1, pp. 6, 2023.
,
"Unsupervised Chunking With Hierarchical RNN", ArXiv, vol. abs/2309.04919, 2023.
,
"Vector Search With OpenAI Embeddings: Lucene Is All You Need", ArXiv, vol. abs/2308.14963, 2023.
,
"What Do Llamas Really Think? Revealing Preference Biases in Language Model Representations", ArXiv, vol. abs/2311.18812, 2023.
,
"Which Model Shall I Choose? Cost/Quality Trade-Offs for Text Classification Tasks", ArXiv, vol. abs/2301.07006, 2023.
,
"Zero-Shot Cross-Lingual Reranking With Large Language Models for Low-Resource Languages", ArXiv, vol. abs/2312.16159, 2023.
,
"Zero-Shot Listwise Document Reranking With a Large Language Model", ArXiv, vol. abs/2305.02156, 2023.
,
2022
"A Common Framework for Exploring Document-at-a-Time and Score-at-a-Time Retrieval Methods", International Conference on Research and Development in Information Retrieval (SIGIR), 2022.
,
"Accessing Document Data Sources Using Referring Expression Types", International Workshop on Description Logics (DL), 2022.
,
"AfriCLIRMatrix: Enabling Cross-Lingual Information Retrieval for African Languages", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.
,
"Aligning the Research and Practice of Building Search Applications: Elasticsearch and Pyserini", Web Search and Data Mining (WSDM), 2022.
,
"Analyzing Climate Change Discussions on Reddit", International Conference on Computational Science and Computational Intelligence (CSCI), 2022.
,
"Another Look at DPR: Reproduction of Training and Replication Of Retrieval", European Conference on Information Retrieval (ECIR), 2022.
,
"Another Look at Information Retrieval as Statistical Translation", International Conference on Research and Development in Information Retrieval (SIGIR), 2022.
,
"Applying Structural and Dense Semantic Matching for the ARQMath Lab 2022, Clef", Conference and Labs of the Evaluation Forum (CLEF), 2022.
,
"Certified Error Control of Candidate Set Pruning for Two-Stage Relevance Ranking", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.
,
"Creating a User Model to Support User-Specific Explanations of AI Systems", User Modeling, Adaptation, and Personalization (UMAP), 2022.
,
"Cross-Lingual Text-to-SQL Semantic Parsing With Representation Mixup", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.
,
"Discovering Domain Orders via Order Dependencies", IEEE International Conference on Data Engineering (ICDE), 2022.
,
"Document Expansion Baselines and Learned Sparse Lexical Representations For MS MARCO V1 and V2", International Conference on Research and Development in Information Retrieval (SIGIR), 2022.
,
"Dowsing for Answers to Math Questions: Doing Better With Less", Conference and Labs of the Evaluation Forum (CLEF), 2022.
,
"Early Stage Sparse Retrieval With Entity Linking", International Conference on Information and Knowledge Management (CIKM), 2022.
,
"Evaluating Complex Queries on Streaming Graphs", IEEE International Conference on Data Engineering (ICDE), 2022.
,
"Evaluating Token-Level and Passage-Level Dense Retrieval Models For Math Information Retrieval", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.
,
"Exploiting Hierarchical Parallelism and Reusability in Tensor Kernel Processing on Heterogeneous HPC Systems", IEEE International Conference on Data Engineering (ICDE), 2022.
,
"Few-Shot Non-Parametric Learning With Deep Latent Variable Model", Conference on Neural Information Processing Systems (NeurIPS), 2022.
,
"Fine-Tuning Dependencies With Parameters", International Conference on Extending Database Technology (EDBT), 2022.
,
"First Order Rewritability in Ontology-Mediated Querying in Horn Description Logics", AAAI Conference on Artificial Intelligence (AAAI), 2022.
,
"Flipping the Script: Inverse Information Seeking Dialogues for Market Research", International Conference on Research and Development in Information Retrieval (SIGIR), 2022.
,
"Fostering Coopetition While Plugging Leaks: The Design and Implementation Of the MS MARCO Leaderboards", International Conference on Research and Development in Information Retrieval (SIGIR), 2022.
,
"Gender Differences in Early Career Performance Reviews: A Text Mining Study", International Conference on Extending Database Technology (EDBT), 2022.
,
"GRADES-NDA'22: 5th International Workshop on Graph Data Management Experiences and Systems (GRADES) and Network Data Analytics (NDA)", ACM International Conference on Management of Data (SIGMOD), 2022.
,
"GRainDB: A Relational-Core Graph-Relational DBMS", Conference on Innovative Data Systems Research (CIDR), 2022.
,
"GRS: Combining Generation and Revision in Unsupervised Sentence Simplification", Association for Computational Linguistics (ACL), 2022.
,
"Human Preferences as Dueling Bandits", International Conference on Research and Development in Information Retrieval (SIGIR), 2022.
,
"Hydrozoa: Dynamic Hybrid-Parallel DNN Training on Serverless Containers", Conference on Machine Learning and Systems (MLSys), 2022.
,
"Improving Precancerous Case Characterization via Transformer-Based Ensemble Learning", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.
,
"Improving Query Representations for Dense Retrieval With Pseudo Relevance Feedback: A Reproducibility Study", European Conference on Information Retrieval (ECIR), 2022.
,
"Integration of Text and Geospatial Search for Hydrographic Datasets Using the Lucene Search Library", ACM/IEEE Joint Conference on Digital Libraries (JCDL), 2022.
,
"Learning Trustworthy Web Sources to Derive Correct Answers and Reduce Health Misinformation in Search", International Conference on Research and Development in Information Retrieval (SIGIR), 2022.
,
"Magic Sets in Interpolation-Based Rule Driven Query Optimization", International Web Rule Symposium (RuleML), 2022.
,
"MPC: Minimum Property-Cut RDF Graph Partitioning", IEEE International Conference on Data Engineering (ICDE), 2022.
,
"Neural Query Synthesis and Domain-Specific Ranking Templates for Multi-Stage Clinical Trial Matching", International Conference on Research and Development in Information Retrieval (SIGIR), 2022.
,
"Overview of the TREC 2022 Deep Learning Track", Text Retrieval Conference (TREC), 2022.
,
"Predicting Hateful Discussions on Reddit Using Graph Transformer Networks And Communal Context", IEEE/WIC/ACM International Conference on Web Intelligence (WI), 2022.
,
"Proteus: Autonomous Adaptive Storage for Mixed Workloads", ACM International Conference on Management of Data (SIGMOD), 2022.
,
"Pseudo-Relevance Feedback With Dense Retrievers in Pyserini", Australasian Document Computing Symposium (ADCS), 2022.
,
"REBL: Entity Linking at Scale (Prototype)", Conference on Design of Experimental Search & Information Retrieval Systems (DESIRES), 2022.
,
"Saga: A Platform for Continuous Construction and Serving of Knowledge At Scale", ACM International Conference on Management of Data (SIGMOD), 2022.
,
"Simple Yet Effective Neural Ranking and Reranking Baselines for Cross-Lingual Information Retrieval", Text Retrieval Conference (TREC), 2022.
,
"SpeechNet: Weakly Supervised, End-to-End Speech Recognition at Industrial Scale", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.
,
"Squeezing Water From a Stone: A Bag of Tricks for Further Improving Cross-Encoder Effectiveness for Reranking", European Conference on Information Retrieval (ECIR), 2022.
,
"Temporal Early Exiting for Streaming Speech Commands Recognition", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022.
,
"The Dark Side of Relevance: The Effect of Non-Relevant Results On Search Behavior", Conference on Human Information Interaction and Retrieval (CHIIR), 2022.
,
"The Role of Adaptive Optimizers for Honest Private Hyperparameter Selection", AAAI Conference on Artificial Intelligence (AAAI), 2022.
,
"To Interpolate or Not to Interpolate: PRF, Dense and Sparse Retrievers", International Conference on Research and Development in Information Retrieval (SIGIR), 2022.
,
"Too Many Relevants: Whither Cranfield Test Collections?", International Conference on Research and Development in Information Retrieval (SIGIR), 2022.
,
"Translating Human Mobility Forecasting Through Natural Language Generation", Web Search and Data Mining (WSDM), 2022.
,
"Understanding Document Data Sources Using Ontologies With Referring Expressions", Australian Joint Conference on Artificial Intelligence (AUS-AI), 2022.
,
"Unsupervised Question Clarity Prediction Through Retrieved Item Coherency", International Conference on Information and Knowledge Management (CIKM), 2022.
,
"UWaterlooMDS at the TREC 2022 Health Misinformation Track", Text Retrieval Conference (TREC), 2022.
,
"VoxelCache: Accelerating Online Mapping in Robotics and 3D Reconstruction Tasks", International Conference on Parallel Architectures and Compilation Techniques (PACT), 2022.
,
"WaterlooClarke at the TREC 2022 Conversational Assistant Track", Text Retrieval Conference (TREC), 2022.
,
"XRICL: Cross-Lingual Retrieval-Augmented in-Context Learning For Cross-Lingual Text-to-SQL Semantic Parsing", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.
,
"A Dense Representation Framework for Lexical and Semantic Matching", ArXiv, vol. abs/2206.09912, 2022.
,
"Accurate Summary-Based Cardinality Estimation Through the Lens Of Cardinality Estimation Graphs", Proceedings of the VLDB Endowment (PVLDB), vol. 15, issue 8, pp. 1533--1545, 2022.
,
"Aggretriever: A Simple Approach to Aggregate Textual Representation For Robust Dense Passage Retrieval", ArXiv, vol. abs/2208.00511, 2022.
,
"Better Than Whitespace: Information Retrieval for Languages Without Custom Tokenizers", ArXiv, vol. abs/2210.05481, 2022.
,
"Building a Culture of Reproducibility in Academic Research", ArXiv, vol. abs/2212.13534, 2022.
,
"Building an Efficiency Pipeline: Commutativity and Cumulativeness Of Efficiency Operators for Transformers", ArXiv, vol. abs/2208.00483, 2022.
,
"Cache Me if You Can: Accuracy-Aware Inference Engine for Differentially Private Data Exploration", ArXiv, vol. abs/2211.15732, 2022.
,
"Cache Me if You Can: Accuracy-Aware Inference Engine for Differentially Private Data Exploration", Proceedings of the VLDB Endowment (PVLDB), vol. 16, issue 4, pp. 574--586, 2022.
,
"Can Old TREC Collections Reliably Evaluate Modern Neural Retrieval Models?", ArXiv, vol. abs/2201.11086, 2022.
,
"Certified Error Control of Candidate Set Pruning for Two-Stage Relevance Ranking", ArXiv, vol. abs/2205.09638, 2022.
,
"CITADEL: Conditional Token Interaction via Dynamic Lexical Routing For Efficient and Effective Multi-Vector Retrieval", ArXiv, vol. abs/2211.10411, 2022.
,
"Computer-Assisted Cohort Identification in Practice", ACM Transactions on Computing for Healthcare, vol. 3, issue 2, pp. 17:1--17:28, 2022.
,
"Contextual Data Cleaning With Ontology Functional Dependencies", Journal of Data and Information Quality, vol. 14, issue 3, pp. 20:1--20:26, 2022.
,
"Continuous Active Learning Using Pretrained Transformers", ArXiv, vol. abs/2208.06955, 2022.
,
"Data Errors: Symptoms, Causes and Origins", IEEE Data Engineering Bulletin, vol. 45, issue 1, pp. 4--9, 2022.
,
"Domain Adaptation for Memory-Efficient Dense Retrieval", ArXiv, vol. abs/2205.11498, 2022.
,
"Don't Be a Tattle-Tale: Preventing Leakages Through Data Dependencies On Access Control Protected Data", Proceedings of the VLDB Endowment (PVLDB), vol. 15, issue 11, pp. 2437--2449, 2022.
,
"Don't Be a Tattle-Tale: Preventing Leakages Through Data Dependencies On Access Control Protected Data", ArXiv, vol. abs/2207.08757, 2022.
,
"Early Stage Sparse Retrieval With Entity Linking", ArXiv, vol. abs/2208.04887, 2022.
,
"Editorial", Information Systems, vol. 109, pp. 102088, 2022.
,
"Effective Keyword Search Over Weighted Graphs", IEEE Transactions on Knowledge and Data Engineering (TKDE), vol. 34, issue 2, pp. 601--616, 2022.
,
"Evaluating Token-Level and Passage-Level Dense Retrieval Models For Math Information Retrieval", ArXiv, vol. abs/2203.11163, 2022.
,
"Exploring Data Using Patterns: A Survey", Information Systems, vol. 108, pp. 101985, 2022.
,
"FedFormer: Contextual Federation With Attention in Reinforcement Learning", ArXiv, vol. abs/2205.13697, 2022.
,
"Few-Shot Non-Parametric Learning With Deep Latent Variable Model", ArXiv, vol. abs/2206.11573, 2022.
,
"G-Thinker: A General Distributed Framework for Finding Qualified Subgraphs In a Big Graph With Load Balancing", The VLDB Journal, vol. 31, issue 2, pp. 287--320, 2022.
,
"GRS: Combining Generation and Revision in Unsupervised Sentence Simplification", ArXiv, vol. abs/2203.09742, 2022.
,
"Human Preferences as Dueling Bandits", ArXiv, vol. abs/2204.10362, 2022.
,
"Improving Precancerous Case Characterization via Transformer-Based Ensemble Learning", ArXiv, vol. abs/2212.05150, 2022.
,
"Introduction to the special issue on self‑managing and hardware‑optimized database systems 2020", Distributed and Parallel Databases, vol. 40, issue 1, pp. 1--3, 2022.
,
"Introduction to the Special Section on Edge/Fog Computing for Infectious Disease Intelligence", ACM Transactions on Internet Technology (TOIT), vol. 22, issue 3, pp. 63e:1--63e:2, 2022.
,
"Less Is More: Parameter-Free Text Classification With Gzip", ArXiv, vol. abs/2212.09410, 2022.
,
"Machine Learning and Data Cleaning: Which Serves the Other?", Journal of Data and Information Quality, vol. 14, issue 3, pp. 13:1--13:11, 2022.
,
"Making a MIRACL: Multilingual Information Retrieval Across a Continuum Of Languages", ArXiv, vol. abs/2210.09984, 2022.
,
"Making RDBMSs Efficient on Graph Workloads Through Predefined Joins", Proceedings of the VLDB Endowment (PVLDB), vol. 15, issue 5, pp. 1011--1023, 2022.
,
"MIDE: Accuracy Aware Minimally Invasive Data Exploration for Decision Support", Proceedings of the VLDB Endowment (PVLDB), vol. 15, issue 11, pp. 2653--2665, 2022.
,
"Modern Techniques for Querying Graph-Structured Relations: Foundations, System Implementations, and Open Challenges", Proceedings of the VLDB Endowment (PVLDB), vol. 15, issue 12, pp. 3762--3765, 2022.
,
"On the Interaction Between Differential Privacy and Gradient Compression In Deep Learning", ArXiv, vol. abs/2211.00734, 2022.
,
"Optimizing Differentially-Maintained Recursive Queries on Dynamic Graphs", ArXiv, vol. abs/2208.00273, 2022.
,
"Optimizing Differentially-Maintained Recursive Queries on Dynamic Graphs", Proceedings of the VLDB Endowment (PVLDB), vol. 15, issue 11, pp. 3186--3198, 2022.
,
"POEM: Pattern-Oriented Explanations of CNN Models", Proceedings of the VLDB Endowment (PVLDB), vol. 15, issue 12, pp. 3618--3621, 2022.
,
"Precise Zero-Shot Dense Retrieval Without Relevance Labels", ArXiv, vol. abs/2212.10496, 2022.
,
"Query Expansion Using Contextual Clue Sampling With Language Models", ArXiv, vol. abs/2210.07093, 2022.
,
"Reminiscences on Influential Papers", SIGMOD Record, vol. 51, issue 2, pp. 44--46, 2022.
,
"Report on the 16th Round of NII Testbeds and Community for Information Access Research (NTCIR-16)", SIGIR Forum, vol. 56, issue 2, pp. 7:1--7:8, 2022.
,
"Saga: A Platform for Continuous Construction and Serving of Knowledge At Scale", ArXiv, vol. abs/2204.07309, 2022.
,
"sGrapp: Butterfly Approximation in Streaming Graphs", ACM Transactions on Knowledge Discovery from Data, vol. 16, issue 4, pp. 76:1--76:43, 2022.
,
"Shallow Pooling for Sparse Labels", Information Retrieval Journal, vol. 25, issue 4, pp. 365--385, 2022.
,
"Space-Efficient Subgraph Search Over Streaming Graph With Timing Order Constraint", IEEE Transactions on Knowledge and Data Engineering (TKDE), vol. 34, issue 9, pp. 4453--4467, 2022.
,
"SpeechNet: Weakly Supervised, End-to-End Speech Recognition at Industrial Scale", ArXiv, vol. abs/2211.11740, 2022.
,
"Tevatron: An Efficient and Flexible Toolkit for Dense Retrieval", ArXiv, vol. abs/2203.05765, 2022.
,
"The Case for Distributed Shared-Memory Databases With RDMA-Enabled Memory Disaggregation", ArXiv, vol. abs/2207.03027, 2022.
,
"The Case for Distributed Shared-Memory Databases With RDMA-Enabled Memory Disaggregation", Proceedings of the VLDB Endowment (PVLDB), vol. 16, issue 1, pp. 15--22, 2022.
,
"Tiresias: Enabling Predictive Autonomous Storage and Indexing", Proceedings of the VLDB Endowment (PVLDB), vol. 15, issue 11, pp. 3126--3136, 2022.
,
"To Interpolate or Not to Interpolate: PRF, Dense and Sparse Retrievers", ArXiv, vol. abs/2205.00235, 2022.
,
"Towards Best Practices for Training Multilingual Dense Retrieval Models", ArXiv, vol. abs/2204.02363, 2022.
,
"Unsupervised Question Clarity Prediction Through Retrieved Item Coherency", ArXiv, vol. abs/2208.04882, 2022.
,
"Visualizing Privacy-Utility Trade-Offs in Differentially Private Data Releases", ArXiv, vol. abs/2201.05964, 2022.
,
"Visualizing Privacy-Utility Trade-Offs in Differentially Private Data Releases", Proceedings on Privacy Enhancing Technologies (PoPETs), vol. 2022, issue 2, pp. 601--618, 2022.
,
"VoxelCache: Accelerating Online Mapping in Robotics and 3D Reconstruction Tasks", ArXiv, vol. abs/2210.08729, 2022.
,
"What the DAAM: Interpreting Stable Diffusion Using Cross Attention", ArXiv, vol. abs/2210.04885, 2022.
,
"XRICL: Cross-Lingual Retrieval-Augmented in-Context Learning For Cross-Lingual Text-to-SQL Semantic Parsing", ArXiv, vol. abs/2210.13693, 2022.
,
2021
Pretrained Transformers for Text Ranking: BERT and Beyond: Morgan & Claypool, 2021.
,
"A+ Indexes: Tunable and Space-Efficient Adjacency Lists in Graph Database Management Systems", IEEE International Conference on Data Engineering (ICDE), 2021.
,
"Academic Integrity in Online Education During the COVID-19 Pandemic: A Social Media Mining Study", Educational Data Mining (EDM), 2021.
,
"Analyzing Ranking Strategies to Characterize Competition for Co-Operative Work Placements", Educational Data Mining (EDM), 2021.
,
"Approach Zero and Anserini at the CLEF-2021 ARQMath Track: Applying Substructure Search and BM25 on Operator Tree Path Tokens", Conference and Labs of the Evaluation Forum (CLEF), 2021.
,
"Are Machine Learning Corpora "Fair Dealing" Under Canadian Law?", International Conference on Computational Creativity (ICCC), 2021.
,
"BERxiT: Early Exiting for BERT With Better Fine-Tuning and Extension To Regression", Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2021.
,
"Box Covers and Domain Orderings for Beyond Worst-Case Join Processing", International Conference on Database Theory (ICDT), 2021.
,
"Chatty Goose: A Python Framework for Conversational Search", International Conference on Research and Development in Information Retrieval (SIGIR), 2021.
,
"Comparing Score Aggregation Approaches for Document Retrieval With Pretrained Transformers", European Conference on Information Retrieval (ECIR), 2021.
,
"Contextualized Query Embeddings for Conversational Search", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.
,
"Dendrite: Bolt-on Adaptivity for Data Systems", ACM International Conference on Management of Data (SIGMOD), 2021.
,
"Don't Change Me! User-Controllable Selective Paraphrase Generation", Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2021.
,
"Dowsing for Answers to Math Questions: Ongoing Viability of Traditional MathIR", Conference and Labs of the Evaluation Forum (CLEF), 2021.
,
"Dowsing for Math Answers", Conference and Labs of the Evaluation Forum (CLEF), 2021.
,
"DPGraph: A Benchmark Platform for Differentially Private Graph Analysis", ACM International Conference on Management of Data (SIGMOD), 2021.
,
"Effective Keyword Search in Weighted Graphs (Extended Abstract)", IEEE International Conference on Data Engineering (ICDE), 2021.
,
"Efficient Discovery of Approximate Order Dependencies", International Conference on Extending Database Technology (EDBT), 2021.
,
"Efficiently Teaching an Effective Dense Retriever With Balanced Topic Aware Sampling", International Conference on Research and Development in Information Retrieval (SIGIR), 2021.
,
"Evaluation Measures Based on Preference Graphs", International Conference on Research and Development in Information Retrieval (SIGIR), 2021.
,
"Exploring Data Using Pa Erns: A Survey and Open Problems", International Workshop on Data Warehousing and OLAP (DOLAP), 2021.
,
"Exploring Listwise Evidence Reasoning With T5 for Fact Verification", Association for Computational Linguistics (ACL), 2021.
,
"Federated Deep Learning Architecture for Personalized Healthcare", Medical Informatics Europe (MIE), 2021.
,
"FO Rewritability for OMQ Using Beth Definability and Interpolation", International Workshop on Description Logics (DL), 2021.
,
"Graphsurge: Graph Analytics on View Collections Using Differential Computation", ACM International Conference on Management of Data (SIGMOD), 2021.
,
"How Does BERT Rerank Passages? An Attribution Analysis With Information Bottlenecks", Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2021.
,
"In-Batch Negatives for Knowledge Distillation With Tightly-Coupled Teachers for Dense Retrieval", Workshop on Representation Learning for NLP (RepL4NLP), 2021.
,
"Klink: Progress-Aware Scheduling for Streaming Data Systems", ACM International Conference on Management of Data (SIGMOD), 2021.
,
"KTabulator: Interactive Ad Hoc Table Creation Using Knowledge Graphs", ACM Conference on Human Factors in Computing Systems (CHI), 2021.
,
"Learning to Rank in the Age of Muppets: Effectiveness-Efficiency Tradeoffs In Multi-Stage Ranking", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.
,
"MS MARCO: Benchmarking Ranking Models in the Large-Data Regime", International Conference on Research and Development in Information Retrieval (SIGIR), 2021.
,
"Multi-Task Dense Retrieval via Model Uncertainty Fusion for Open-Domain Question Answering", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.
,
"NIR-Tree: A Non-Intersecting R-Tree", International Conference on Statistical and Scientific Database Management (SSDBM), 2021.
,
"On the Separation of Logical and Physical Ranking Models for Text Retrieval Applications", Conference on Design of Experimental Search & Information Retrieval Systems (DESIRES), 2021.
,
"Overview of the TREC 2021 Deep Learning Track", Text Retrieval Conference (TREC), 2021.
,
"Overview of the TREC 2021 Health Misinformation Track", Text Retrieval Conference (TREC), 2021.
,
"PCOR: Private Contextual Outlier Release via Differentially Private Search", ACM International Conference on Management of Data (SIGMOD), 2021.
,
"Practical Security and Privacy for Database Systems", ACM International Conference on Management of Data (SIGMOD), 2021.
,
"Predicting Efficiency/Effectiveness Trade-Offs for Dense vs. Sparse Retrieval Strategy Selection", International Conference on Information and Knowledge Management (CIKM), 2021.
,
"Pretrained Transformers for Text Ranking: BERT and Beyond", International Conference on Research and Development in Information Retrieval (SIGIR), 2021.
,
"Pretrained Transformers for Text Ranking: BERT and Beyond", Web Search and Data Mining (WSDM), 2021.
,
"Projective Beth Definability and Craig Interpolation for Relational Query Optimization (Material to Accompany Invited Talk)", International Conference on Principles of Knowledge Representation and Reasoning (KR), 2021.
,
"Properties of Inconsistency Measures for Databases", ACM International Conference on Management of Data (SIGMOD), 2021.
,
"PYA0: A Python Toolkit for Accessible Math-Aware Search", International Conference on Research and Development in Information Retrieval (SIGIR), 2021.
,
"Pyserini: A Python Toolkit for Reproducible Information Retrieval Research With Sparse and Dense Representations", International Conference on Research and Development in Information Retrieval (SIGIR), 2021.
,
"R2GSync and Edge Views: Practical RDBMS to GDBMS Synchronization", ACM International Conference on Management of Data (SIGMOD), 2021.
,
"Rescuing Historical Climate Observations to Support Hydrological Research: A Case Study of Solar Radiation Data", ACM Symposium on Document Engineering (DocEng), 2021.
,
"RW-Team: Robust Team Formation Using Random Walk", International Conference on Information and Knowledge Management (CIKM), 2021.
,
"Scientific Claim Verification With VerT5erini", International Workshop on Health Text Mining and Information Analysis (Louhi), 2021.
,
"Segatron: Segment-Aware Transformer for Language Modeling and Understanding", AAAI Conference on Artificial Intelligence (AAAI), 2021.
,
"Semantics of the Unwritten: The Effect of End of Paragraph and Sequence Tokens on Text Generation With GPT2", Association for Computational Linguistics (ACL), 2021.
,
"Serverless BM25 Search and BERT Reranking", Conference on Design of Experimental Search & Information Retrieval Systems (DESIRES), 2021.
,
"Significant Improvements Over the State of the Art? A Case Study Of the MS MARCO Document Ranking Leaderboard", International Conference on Research and Development in Information Retrieval (SIGIR), 2021.
,
"Simple and Effective Unsupervised Redundancy Elimination to Compress Dense Vectors for Passage Retrieval", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.
,
"The Art of Abstention: Selective Prediction and Error Regularization For Natural Language Processing", Association for Computational Linguistics (ACL), 2021.
,
"The Simplest Thing That Can Possibly Work: (Pseudo-)Relevance Feedback Via Text Classification", International Conference on the Theory of Information Retrieval (ICTIR), 2021.
,
"TimeFabric: Trusted Time for Permissioned Blockchains", International Symposium on Foundations and Applications of Blockchain (FAB) , 2021.
,
"Unsupervised Chunking as Syntactic Structure Induction With a Knowledge-Transfer Approach", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.
,
"UWaterlooMDS at the TREC 2021 Health Misinformation Track", Text Retrieval Conference (TREC), 2021.
,
"Vera: Prediction Techniques for Reducing Harmful Misinformation In Consumer Health Search", International Conference on Research and Development in Information Retrieval (SIGIR), 2021.
,
"Visualizing Searcher Gaze Patterns", Conference on Human Information Interaction and Retrieval (CHIIR), 2021.
,
"Voice Query Auto Completion", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.
,
"WaterlooClarke at the TREC 2021 Conversational Assistant Track", Text Retrieval Conference (TREC), 2021.
,
"A Few Brief Notes on DeepImpact, COIL, and a Conceptual Framework For Information Retrieval Techniques", ArXiv, vol. abs/2106.14807, 2021.
,
"A Proposed Conceptual Framework for a Representational Approach To Information Retrieval", ArXiv, vol. abs/2110.01529, 2021.
,
"A Proposed Conceptual Framework for a Representational Approach To Information Retrieval", SIGIR Forum, vol. 55, issue 2, pp. 4:1--4:29, 2021.
,
"A Replication Study of Dense Passage Retriever", ArXiv, vol. abs/2104.05740, 2021.
,
"Accurate Summary-Based Cardinality Estimation Through the Lens Of Cardinality Estimation Graphs", ArXiv, vol. abs/2105.08878, 2021.
,
"Assessing Top- Preferences", ACM Transactions on Information Systems (TOIS), vol. 39, issue 3, pp. 33:1--33:21, 2021.
,
"Catch a Blowfish Alive: A Demonstration of Policy-Aware Differential Privacy for Interactive Data Exploration", Proceedings of the VLDB Endowment (PVLDB), vol. 14, issue 12, pp. 2859--2862, 2021.
,
"Climate Action During COVID-19 Recovery and Beyond: A Twitter Text Mining Study", ArXiv, vol. abs/2105.12190, 2021.
,
"Columnar Storage and List-Based Processing for Graph Database Management Systems", Proceedings of the VLDB Endowment (PVLDB), vol. 14, issue 11, pp. 2491--2504, 2021.
,
"Contextualized Query Embeddings for Conversational Search", ArXiv, vol. abs/2104.08707, 2021.
,
"Cross-Lingual Training With Dense Retrieval for Document Retrieval", ArXiv, vol. abs/2109.01628, 2021.
,
"Densifying Sparse Representations for Passage Retrieval by Representational Slicing", ArXiv, vol. abs/2112.04666, 2021.
,
"Differential Privacy for Databases", Foundations and Trends in Databases, vol. 11, issue 2, pp. 109--225, 2021.
,
"Discovery and Contextual Data Cleaning With Ontology Functional Dependencies", ArXiv, vol. abs/2105.08105, 2021.
,
"Distributed Database Systems: The Case for NewSQL", Transactions on Large-Scale Data- and Knowledge-Centered Systems, vol. 48, pp. 1--15, 2021.
,
"DP-cryptography: Marrying Differential Privacy and Cryptography In Emerging Applications", Communications of the ACM, vol. 64, issue 2, pp. 84--93, 2021.
,
"Efficient Discovery of Approximate Order Dependencies", ArXiv, vol. abs/2101.02174, 2021.
,
"Efficiently Teaching an Effective Dense Retriever With Balanced Topic Aware Sampling", ArXiv, vol. abs/2104.06967, 2021.
,
"Ember: No-Code Context Enrichment via Similarity-Based Keyless Joins", ArXiv, vol. abs/2106.01501, 2021.
,
"Ember: No-Code Context Enrichment via Similarity-Based Keyless Joins", Proceedings of the VLDB Endowment (PVLDB), vol. 15, issue 3, pp. 699--712, 2021.
,
"Encoder Adaptation of Dense Passage Retrieval for Open-Domain Question Answering", ArXiv, vol. abs/2110.01599, 2021.
,
"Evaluating Complex Queries on Streaming Graphs", ArXiv, vol. abs/2101.12305, 2021.
,
"Fostering Community Engagement Through Datathon Events: The Archives Unleashed Experience", Digital Humanities Quarterly, vol. 15, issue 1, 2021.
,
"GSmart: An Efficient SPARQL Query Engine Using Sparse Matrix Algebra - Full Version", ArXiv, vol. abs/2106.14038, 2021.
,
"Improving Query Representations for Dense Retrieval With Pseudo Relevance Feedback: A Reproducibility Study", ArXiv, vol. abs/2112.06400, 2021.
,
"Integrating Column-Oriented Storage and Query Processing Techniques Into Graph Database Management Systems", ArXiv, vol. abs/2103.02284, 2021.
,
"Investigating the Limitations of the Transformers With Simple Arithmetic Tasks", ArXiv, vol. abs/2102.13019, 2021.
,
"Kamino: Constraint-Aware Differentially Private Data Synthesis", Proceedings of the VLDB Endowment (PVLDB), vol. 14, issue 10, pp. 1886--1899, 2021.
,
"Making RDBMSs Efficient on Graph Workloads Through Predefined Joins", ArXiv, vol. abs/2108.10540, 2021.
,
"Mr. TyDi: A Multi-Lingual Benchmark for Dense Retrieval", ArXiv, vol. abs/2108.08787, 2021.
,
"MS MARCO: Benchmarking Ranking Models in the Large-Data Regime", ArXiv, vol. abs/2105.04021, 2021.
,
"Multi-Stage Conversational Passage Retrieval: An Approach to Fusing Term Importance Estimation and Neural Query Rewriting", ACM Transactions on Information Systems (TOIS), vol. 39, issue 4, pp. 48:1--48:29, 2021.
,
"Optimizing Multi-Query Evaluation in Federated RDF Systems", IEEE Transactions on Knowledge and Data Engineering (TKDE), vol. 33, issue 4, pp. 1692--1707, 2021.
,
"Optimizing One-Time and Continuous Subgraph Queries Using Worst-Case Optimal Joins", ACM Transactions on Database Systems (TODS), vol. 46, issue 2, pp. 6:1--6:45, 2021.
,
"PCOR: Private Contextual Outlier Release via Differentially Private Search", ArXiv, vol. abs/2103.05173, 2021.
,
"Predicting Efficiency/Effectiveness Trade-Offs for Dense vs. Sparse Retrieval Strategy Selection", ArXiv, vol. abs/2109.10739, 2021.
,
"Pyserini: An Easy-to-Use Python Toolkit to Support Replicable IR Research With Sparse and Dense Representations", ArXiv, vol. abs/2102.10073, 2021.
,
"Real-Time LSM-Trees for HTAP Workloads", ArXiv, vol. abs/2101.06801, 2021.
,
"Report on the 15th Round of NII Testbeds and Community for Information Access Research (NTCIR-15)", SIGIR Forum, vol. 55, issue 2, pp. 21:1--21:6, 2021.
,
"Scale-Invariant Strength Assortativity of Streaming Butterflies", ArXiv, vol. abs/2111.12217, 2021.
,
"sGrapp: Butterfly Approximation in Streaming Graphs", ArXiv, vol. abs/2101.12334, 2021.
,
"Shallow Pooling for Sparse Labels", ArXiv, vol. abs/2109.00062, 2021.
,
"Significant Improvements Over the State of the Art? A Case Study Of the MS MARCO Document Ranking Leaderboard", ArXiv, vol. abs/2102.12887, 2021.
,
"Sparsifying Sparse Representations for Passage Retrieval by Top-K Masking", ArXiv, vol. abs/2112.09628, 2021.
,
"The eDiscovery Medicine Show", ArXiv, vol. abs/2109.13908, 2021.
,
"The Expando-Mono-Duo Design Pattern for Text Ranking With Pretrained Sequence-to-Sequence Models", ArXiv, vol. abs/2101.05667, 2021.
,
"The Future Is Big Graphs: A Community View on Graph Processing Systems", Communications of the ACM, vol. 64, issue 9, pp. 62--71, 2021.
,
"The Proper Care and Feeding of CAMELS: How Limited Training Data Affects Streamflow Prediction", Environmental Modelling and Software, vol. 135, pp. 104926, 2021.
,
"The Role of Adaptive Optimizers for Honest Private Hyperparameter Selection", ArXiv, vol. abs/2111.04906, 2021.
,
"Translating Human Mobility Forecasting Through Natural Language Generation", ArXiv, vol. abs/2112.11481, 2021.
,
"Unbiased Statistical Estimation and Valid Confidence Intervals Under Differential Privacy", ArXiv, vol. abs/2110.14465, 2021.
,
"Wacky Weights in Learned Sparse Representations and the Revenge Of Score-at-a-Time Query Evaluation", ArXiv, vol. abs/2110.11540, 2021.
,
2020
Principles of Distributed Database Systems, 4th Edition: Springer, 2020.
,
"A Framework for Extracted View Maintenance", ACM Symposium on Document Engineering (DocEng), 2020.
,
"A Lightweight Environment for Learning Experimental IR Research Practices", International Conference on Research and Development in Information Retrieval (SIGIR), 2020.
,
"A Little Bit Is Worse Than None: Ranking With Limited Training Data", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020.
,
"A Mixed-Method Analysis of Text and Audio Search Interfaces With Varying Task Complexity", International Conference on the Theory of Information Retrieval (ICTIR), 2020.
,
"A Think-Aloud Study to Understand Factors Affecting Online Health Search", Conference on Human Information Interaction and Retrieval (CHIIR), 2020.
,
"An Open-Source Interface to the Canadian Surface Prediction Archive", ACM/IEEE Joint Conference on Digital Libraries (JCDL), 2020.
,
"Approximate Nearest Neighbor Search and Lightweight Dense Vector Reranking In Multi-Stage Retrieval Architectures", International Conference on the Theory of Information Retrieval (ICTIR), 2020.
,
"Attention-Based Learning for Missing Data Imputation in HoloClean", Conference on Machine Learning and Systems (MLSys), 2020.
,
"Capreolus: A Toolkit for End-to-End Neural Ad Hoc Retrieval", Web Search and Data Mining (WSDM), 2020.
,
"ChronoCache: Predictive and Adaptive Mid-Tier Query Result Caching", ACM International Conference on Management of Data (SIGMOD), 2020.
,
"Computing Local Sensitivities of Counting Queries With Joins", ACM International Conference on Management of Data (SIGMOD), 2020.
,
"Consentio: Managing Consent to Data Access Using Permissioned Blockchains", IEEE International Conference on Blockchain and Cryptocurrency (ICBC), 2020.
,
"Content-Based Exploration of Archival Images Using Neural Networks", ACM/IEEE Joint Conference on Digital Libraries (JCDL), 2020.
,
"Covidex: Neural Ranking Models and Keyword Search Infrastructure For The COVID-19 Open Research Dataset", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020.
,
"Cross-Lingual Training of Neural Models for Document Ranking", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020.
,
"Crypt?: Crypto-Assisted Differential Privacy on Untrusted Servers", ACM International Conference on Management of Data (SIGMOD), 2020.
,
"Cydex: Neural Search Infrastructure for the Scholarly Literature", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020.
,
"DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference", Association for Computational Linguistics (ACL), 2020.
,
"Designing Templates for Eliciting Commonsense Knowledge From Pretrained Sequence-to-Sequence Models", International Conference on Computational Linguistics (COLING), 2020.
,
"Distant Supervision for Multi-Stage Fine-Tuning in Retrieval-Based Question Answering", The Web Conference (WWW), 2020.
,
"Document Ranking With a Pretrained Sequence-to-Sequence Model", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020.
,
"Dowsing for Math Answers With Tangent-L", Conference and Labs of the Evaluation Forum (CLEF), 2020.
,
"DynaMast: Adaptive Dynamic Mastering for Replicated Systems", IEEE International Conference on Data Engineering (ICDE), 2020.
,
"Early Exiting BERT for Efficient Document Ranking", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020.
,
"ELite: Cost-Effective Approximation of Exploration-Based Graph Analysis", ACM International Conference on Management of Data (SIGMOD), 2020.
,
"Erratum for Discovering Order Dependencies Through Order Compatibility (Edbt 2019)", International Conference on Extending Database Technology (EDBT), 2020.
,
"Evaluating Pretrained Transformer Models for Citation Recommendation", International Workshop on Bibliometric-enhanced Information Retrieval (BIR), 2020.
,
"Exploring the Limits of Simple Learners in Knowledge Distillation For Document Classification With DocBERT", Workshop on Representation Learning for NLP (RepL4NLP), 2020.
,
"First Order Rewritability for Ontology Mediated Querying in Horn-DLFD", International Workshop on Description Logics (DL), 2020.
,
"Flexible IR Pipelines With Capreolus", International Conference on Information and Knowledge Management (CIKM), 2020.
,
"From MAXSCORE to Block-Max Wand: The Story of How Lucene Significantly Improved Query Evaluation Performance", European Conference on Information Retrieval (ECIR), 2020.
,
"G-Thinker: A Distributed Framework for Mining Subgraphs in a Big Graph", IEEE International Conference on Data Engineering (ICDE), 2020.
,
"Generalized and Scalable Optimal Sparse Decision Trees", International Conference on Machine Learning (ICML), 2020.
,
"GSI: GPU-friendly Subgraph Isomorphism", IEEE International Conference on Data Engineering (ICDE), 2020.
,
"H2oloo at TREC 2020: When All You Got Is a Hammer... Deep Learning, Health Misinformation, and Precision Medicine", Text Retrieval Conference (TREC), 2020.
,
"Inserting Information Bottleneck for Attribution in Transformers", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020.
,
"Iterative Edit-Based Unsupervised Sentence Simplification", Association for Computational Linguistics (ACL), 2020.
,
"Leaving Stragglers at the Window: Low-Latency Stream Sampling With Accuracy Guarantees", Distributed Event-Based Systems (DEBS), 2020.
,
"Linear and Range Counting Under Metric-Based Local Differential Privacy", International Symposium on Information Theory (ISIT), 2020.
,
"Locating Influential Agents in Social Networks: Budget-Constrained Seed Set Selection", Canadian Conference on Artificial Intelligence (AI), 2020.
,
"Made to Measure: A Workshop on Human-Centred Metrics for Information Seeking", Conference on Human Information Interaction and Retrieval (CHIIR), 2020.
,
"Message From the General Chairs of DSC 2020", International Conference on Data Science in Cyberspace (DSC), 2020.
,
"MRG_UWaterloo Participation in the TREC 2020 Precision Medicine Track", Text Retrieval Conference (TREC), 2020.
,
"Offline Evaluation by Maximum Similarity to an Ideal Ranking", International Conference on Information and Knowledge Management (CIKM), 2020.
,
"Offline Evaluation Without Gain", International Conference on the Theory of Information Retrieval (ICTIR), 2020.
,
"Overview of the TREC 2020 Health Misinformation Track", Text Retrieval Conference (TREC), 2020.
,
"Parallel Scheduling of Data-Intensive Tasks", European Conference on Parallel Processing (Euro-Par), 2020.
,
"Reddit Mining to Understand Gendered Movements", International Conference on Extending Database Technology (EDBT), 2020.
,
"Reddit Mining to Understand Women's Issues in STEM", International Conference on Extending Database Technology (EDBT), 2020.
,
"Regular Path Query Evaluation on Streaming Graphs", ACM International Conference on Management of Data (SIGMOD), 2020.
,
"Reproducibility Is a Process, Not an Achievement: The Replicability Of IR Reproducibility Experiments", European Conference on Information Retrieval (ECIR), 2020.
,
"Research Challenges in Deep Reinforcement Learning-Based Join Query Optimization", ACM International Conference on Management of Data (SIGMOD), 2020.
,
"ReSpark: Automatic Caching for Iterative Applications in Apache Spark", IEEE International Conference on Big Data (IEEE BigData), 2020.
,
"Sentinel: Understanding Data Systems", ACM International Conference on Management of Data (SIGMOD), 2020.
,
"Showing Your Work Doesn't Always Work", Association for Computational Linguistics (ACL), 2020.
,
"SimClusters: Community-Based Representations for Heterogeneous Recommendations At Twitter", ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2020.
,
"Social Media Mining to Understand the Impact of Co-Operative Education On Mental Health", Educational Data Mining (EDM), 2020.
,
"Streaming Graph Processing and Analytics", Distributed Event-Based Systems (DEBS), 2020.
,
"Supporting Interoperability Between Open-Source Search Engines With The Common Index File Format", International Conference on Research and Development in Information Retrieval (SIGIR), 2020.
,
"Text Mining of COVID-19 Discussions on Reddit", IEEE/WIC/ACM International Conference on Web Intelligence (WI), 2020.
,
"The Archives Unleashed Project: Technology, Process, and Community To Improve Scholarly Access to Web Archives", ACM/IEEE Joint Conference on Digital Libraries (JCDL), 2020.
,
"TREC 2020 Notebook: CAsT Track", Text Retrieval Conference (TREC), 2020.
,
"Two Birds, One Stone: A Simple, Unified Model for Text Generation From Structured and Unstructured Data", Association for Computational Linguistics (ACL), 2020.
,
"Update Delivery Mechanisms for Prospective Information Needs: A Reproducibility Study", Conference on Human Information Interaction and Retrieval (CHIIR), 2020.
,
"WaterlooClarke at the Trec 2020 Conversational Assistant Track", Text Retrieval Conference (TREC), 2020.
,
"We Could, but Should We?: Ethical Considerations for Providing Access To GeoCities and Other Historical Digital Collections", Conference on Human Information Interaction and Retrieval (CHIIR), 2020.
,
"Which BM25 Do You Mean? A Large-Scale Reproducibility Study Of Scoring Variants", European Conference on Information Retrieval (ECIR), 2020.
,
"XOX Fabric: A Hybrid Approach to Blockchain Transaction Execution", IEEE International Conference on Blockchain and Cryptocurrency (ICBC), 2020.
,
"A Data Scientist's Guide to Streamflow Prediction", ArXiv, vol. abs/2006.12975, 2020.
,
"A Prototype of Serverless Lucene", ArXiv, vol. abs/2002.01447, 2020.
,
"A Systematic View of Data Science", IEEE Data Engineering Bulletin, vol. 43, issue 3, pp. 3--11, 2020.
,
"A+ Indexes: Lightweight and Highly Flexible Adjacency Lists For Graph Database Management Systems", ArXiv, vol. abs/2004.00130, 2020.
,
"aeSpTV: An Adaptive and Efficient Framework for Sparse Tensor-Vector Product Kernel on a High-Performance Computing Platform", IEEE Transactions on Parallel and Distributed Systems (TPDS), vol. 31, issue 10, pp. 2329--2345, 2020.
,
"Approximate Denial Constraints", Proceedings of the VLDB Endowment (PVLDB), vol. 13, issue 10, pp. 1682--1695, 2020.
,
"Approximate Denial Constraints", ArXiv, vol. abs/2005.08540, 2020.
,
"Assessing Top-K Preferences", ArXiv, vol. abs/2007.11682, 2020.
,
"Batchwise Probabilistic Incremental Data Cleaning", ArXiv, vol. abs/2011.04730, 2020.
,
"Building Community at Distance: A Datathon During COVID-19", Digital Library Perspectives, vol. 36, issue 4, pp. 415--428, 2020.
,
"Compact Group Discovery in Attributed Graphs and Social Networks", Information Processing and Management, vol. 57, issue 2, pp. 102054, 2020.
,
"Computing Local Sensitivities of Counting Queries With Joins", ArXiv, vol. abs/2004.04656, 2020.
,
"Conversational Question Reformulation via Sequence-to-Sequence Architectures And Pretrained Language Models", ArXiv, vol. abs/2004.01909, 2020.
,
"Covidex: Neural Ranking Models and Keyword Search Infrastructure For The COVID-19 Open Research Dataset", ArXiv, vol. abs/2007.07846, 2020.
,
"DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference", ArXiv, vol. abs/2004.12993, 2020.
,
"Detecting Opportunities for Differential Maintenance of Extracted Views", ArXiv, vol. abs/2007.01973, 2020.
,
"Discovering Domain Orders Through Order Dependencies", ArXiv, vol. abs/2005.14068, 2020.
,
"Distilling Dense Representations for Ranking Using Tightly-Coupled Teachers", ArXiv, vol. abs/2010.11386, 2020.
,
"Document Ranking With a Pretrained Sequence-to-Sequence Model", ArXiv, vol. abs/2003.06713, 2020.
,
"DP-Cryptography: Marrying Differential Privacy and Cryptography In Emerging Applications", ArXiv, vol. abs/2004.08887, 2020.
,
"Evaluating Sentence-Level Relevance Feedback for High-Recall Information Retrieval", Information Retrieval Journal, vol. 23, issue 1, pp. 1--26, 2020.
,
"FastFabric: Scaling Hyperledger Fabric to 20 000 Transactions Per Second", International Journal of Network Management, vol. 30, issue 5, 2020.
,
"Generalized Optimal Sparse Decision Trees", ArXiv, vol. abs/2006.08690, 2020.
,
"Graphsurge: Graph Analytics on View Collections Using Differential Computation", ArXiv, vol. abs/2004.05297, 2020.
,
"Howl: A Deployed, Open-Source Wake Word Detection System", ArXiv, vol. abs/2008.09606, 2020.
,
"Inserting Information Bottlenecks for Attribution in Transformers", ArXiv, vol. abs/2012.13838, 2020.
,
"Introduction to the Special Issue on Self-Managing and Hardware-Optimized Database Systems 2019", Distributed and Parallel Databases, vol. 38, issue 4, pp. 767--769, 2020.
,
"Iterative Edit-Based Unsupervised Sentence Simplification", ArXiv, vol. abs/2006.09639, 2020.
,
"Kamino: Constraint-Aware Differentially Private Data Synthesis", ArXiv, vol. abs/2012.15713, 2020.
,
"Latte-Mix: Measuring Sentence Semantic Similarity With Latent Categorical Mixtures", ArXiv, vol. abs/2010.11351, 2020.
,
"Micro-Journal Mining to Understand Mood Triggers", Computing, vol. 102, issue 5, pp. 1227--1244, 2020.
,
"MorphoSys: Automatic Physical Design Metamorphosis for Distributed Database Systems", Proceedings of the VLDB Endowment (PVLDB), vol. 13, issue 13, pp. 3573--3587, 2020.
,
"Navigation-Based Candidate Expansion and Pretrained Language Models For Citation Recommendation", Scientometrics, vol. 125, issue 3, pp. 3001--3016, 2020.
,
"Navigation-Based Candidate Expansion and Pretrained Language Models For Citation Recommendation", ArXiv, vol. abs/2001.08687, 2020.
,
"On Sampling From Data With Duplicate Records", ArXiv, vol. abs/2008.10549, 2020.
,
"Participation in TREC 2020 COVID Track Using Continuous Active Learning", ArXiv, vol. abs/2011.01453, 2020.
,
"Pretrained Transformers for Text Ranking: BERT and Beyond", ArXiv, vol. abs/2010.06467, 2020.
,
"Query Reformulation Using Query History for Passage Retrieval in Conversational Search", ArXiv, vol. abs/2005.02230, 2020.
,
"Rainfall-Runoff Prediction at Multiple Timescales With a Single Long Short-Term Memory Network", ArXiv, vol. abs/2010.07921, 2020.
,
"Rapid Adaptation of BERT for Information Extraction on Domain-Specific Business Documents", ArXiv, vol. abs/2002.01861, 2020.
,
"Rapidly Bootstrapping a Question Answering Dataset for COVID-19", ArXiv, vol. abs/2004.11339, 2020.
,
"Rapidly Deploying a Neural Search Engine for the COVID-19 Open Research Dataset: Preliminary Thoughts and Lessons Learned", ArXiv, vol. abs/2004.05125, 2020.
,
"Record Fusion: A Learning Approach", ArXiv, vol. abs/2006.10208, 2020.
,
"Regular Path Query Evaluation on Streaming Graphs", ArXiv, vol. abs/2004.02012, 2020.
,
"Robust Keyword Search in Large Attributed Graphs", Information Retrieval Journal, vol. 23, issue 5, pp. 502--524, 2020.
,
"SAQE: Practical Privacy-Preserving Approximate Query Processing For Data Federations", Proceedings of the VLDB Endowment (PVLDB), vol. 13, issue 11, pp. 2691--2705, 2020.
,
"Scalable Mining of Maximal Quasi-Cliques: An Algorithm-System Codesign Approach", Proceedings of the VLDB Endowment (PVLDB), vol. 14, issue 4, pp. 573--585, 2020.
,
"Scalable Mining of Maximal Quasi-Cliques: An Algorithm-System Codesign Approach", ArXiv, vol. abs/2005.00081, 2020.
,
"Scientific Claim Verification With VERT5ERINI", ArXiv, vol. abs/2010.11930, 2020.
,
"SegaBERT: Pre-Training of Segment-Aware BERT for Language Understanding", ArXiv, vol. abs/2004.14996, 2020.
,
"Semantics of the Unwritten", ArXiv, vol. abs/2004.02251, 2020.
,
"Sentinel: Universal Analysis and Insight for Data Systems", Proceedings of the VLDB Endowment (PVLDB), vol. 13, issue 11, pp. 2720--2733, 2020.
,
"Showing Your Work Doesn't Always Work", ArXiv, vol. abs/2004.13705, 2020.
,
"Special Issue on Best Papers of DaMoN 2018", The VLDB Journal, vol. 29, issue 2-3, pp. 755, 2020.
,
"Special Issue on Best Papers of VLDB 2017", The VLDB Journal, vol. 29, issue 1, pp. 483--484, 2020.
,
"Supporting Interoperability Between Open-Source Search Engines With The Common Index File Format", ArXiv, vol. abs/2003.08276, 2020.
,
"The Archives Unleashed Project: Technology, Process, and Community To Improve Scholarly Access to Web Archives", ArXiv, vol. abs/2001.05399, 2020.
,
"The Future Is Big Graphs! A Community View on Graph Processing Systems", ArXiv, vol. abs/2012.06171, 2020.
,
"The Ubiquity of Large Graphs and Surprising Challenges of Graph Processing: Extended Survey", The VLDB Journal, vol. 29, issue 2-3, pp. 595--618, 2020.
,
"To Paraphrase or Not to Paraphrase: User-Controllable Selective Paraphrase Generation", ArXiv, vol. abs/2008.09290, 2020.
,
"TTTTTackling WinoGrande Schemas", ArXiv, vol. abs/2003.08380, 2020.
,
"Using Feature-Based Description Logics to Avoid Duplicate Elimination In Object-Relational Query Languages", German Journal of Artificial Intelligence (KI), vol. 34, issue 3, pp. 355--363, 2020.
,
2019
Data Cleaning: ACM, 2019.
,
"Data Unification at Scale: Data Tamer", Making Databases Work: the Pragmatic Wisdom of Michael Stonebraker: ACM / Morgan & Claypool, 2019.
,
"Graph Query Processing", Encyclopedia of Big Data Technologies: Springer, 2019.
,
"Types of Stream Processing Algorithms", Encyclopedia of Big Data Technologies: Springer, 2019.
,
"A Formal Framework for Probabilistic Unclean Databases", International Conference on Database Theory (ICDT), 2019.
,
"A Semi-Supervised Framework of Clustering Selection for De-Duplication", IEEE International Conference on Data Engineering (ICDE), 2019.
,
"Aligning Cross-Lingual Entities With Multi-Aspect Information", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019.
,
"APEx: Accuracy-Aware Differentially Private Data Exploration", ACM International Conference on Management of Data (SIGMOD), 2019.
,
"Applying BERT to Document Retrieval With Birch", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019.
,
"Approximate Inference in Structured Instances With Noisy Categorical Observations", Conference on Uncertainty in Artificial Intelligence (UAI), 2019.
,
"Bridging the Gap Between Relevance Matching and Semantic Matching For Short Text Similarity Modeling", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019.
,
"Bring Order to Data", Alberto Mendelzon International Workshop on Foundations of Data Management (AMW), 2019.
,
"Building Community and Tools for Analyzing Web Archives Through Datathons", ACM/IEEE Joint Conference on Digital Libraries (JCDL), 2019.
,
"Building Scalable Machine Learning Solutions for Data Cleaning", Datenbanksysteme für Business, Technologie und Web(BTW), 2019.
,
"Challenges and Opportunities in Understanding Spoken Queries Directed At Modern Entertainment Platforms", International Conference on Research and Development in Information Retrieval (SIGIR), 2019.
,
"Critically Examining the "Neural Hype": Weak Baselines and the Additivity Of Effectiveness Gains From Neural Ranking Models", International Conference on Research and Development in Information Retrieval (SIGIR), 2019.
,
"Cross-Domain Modeling of Sentence-Level Evidence for Document Retrieval", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019.
,
"DaMoN 19: The 15th International Workshop on Data Management on New Hardware", ACM International Conference on Management of Data (SIGMOD), 2019.
,
"Detecting Customer Complaint Escalation With Recurrent Neural Networks And Manually-Engineered Features", North American Chapter of the Association for Computational Linguistics (NAACL), 2019.
,
"Distributed Discovery of Functional Dependencies", IEEE International Conference on Data Engineering (ICDE), 2019.
,
"DPI: The Data Processing Interface for Modern Networks", Conference on Innovative Data Systems Research (CIDR), 2019.
,
"Dynamic Sampling Meets Pooling", International Conference on Research and Development in Information Retrieval (SIGIR), 2019.
,
"End-to-End Open-Domain Question Answering With BERTserini", North American Chapter of the Association for Computational Linguistics (NAACL), 2019.
,
"Exhaustive Query Answering via Referring Expressions", International Workshop on Description Logics (DL), 2019.
,
"Experimental Analysis of Streaming Algorithms for Graph Partitioning", ACM International Conference on Management of Data (SIGMOD), 2019.
,
"ExplIQuE: Interactive Databases Exploration With SQL", International Conference on Information and Knowledge Management (CIKM), 2019.
,
"FastFabric: Scaling Hyperledger Fabric to 20, 000 Transactions Per Second", IEEE International Conference on Blockchain and Cryptocurrency (ICBC), 2019.
,
"Finding ALL Answers to OBDA Queries Using Referring Expressions", Australian Joint Conference on Artificial Intelligence (AUS-AI), 2019.
,
"FunDL - A Family of Feature-Based Description Logics, With Applications In Querying Structured Data Sources", Description Logic, Theory Combination, and All That - Essays Dedicated to Franz Baader, 2019.
,
"Gender Differences in Science and Engineering: A Data Mining Approach", International Conference on Extending Database Technology (EDBT), 2019.
,
"Gender Differences in Work-Integrated Learning Assessments", Educational Data Mining (EDM), 2019.
,
"GraphWrangler: An Interactive Graph View on Relational Data", ACM International Conference on Management of Data (SIGMOD), 2019.
,
"HoloDetect: Few-Shot Learning for Error Detection", ACM International Conference on Management of Data (SIGMOD), 2019.
,
"Honkling: In-Browser Personalization for Ubiquitous Keyword Spotting", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019.
,
"Identification and Ranking of Biomedical Informatics Researcher Citation Statistics Through a Google Scholar Scraper", American Medical Informatics Association Annual Symposium (AMIA), 2019.
,
"Identity Resolution in Ontology Based Data Access to Structured Data Sources", Pacific Rim International Conference on Artificial Intelligence (PRICAI), 2019.
,
"Incorporating Contextual and Syntactic Structures Improves Semantic Similarity Modeling", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019.
,
"Information Retrieval Meets Scalable Text Analytics: Solr Integration With Spark", International Conference on Research and Development in Information Retrieval (SIGIR), 2019.
,
"Informative Summarization of Numeric Data", International Conference on Statistical and Scientific Database Management (SSDBM), 2019.
,
"Length Normalization in the Era of Neural Rankers", International Workshop on Evaluating Information Access (EVIA), 2019.
,
"Mitigating Trust Issues in Electric Vehicle Charging Using a Blockchain", Energy-Efficient Computing and Networking (e-Energy), 2019.
,
"Multi-Perspective Relevance Matching With Hierarchical ConvNets For Social Media Search", AAAI Conference on Artificial Intelligence (AAAI), 2019.
,
"Natural Language Generation for Effective Knowledge Distillation", Workshop on Deep Learning Approaches for Low-Resource Natural Language Processing (DeepLo), 2019.
,
"On Limited Conjunctions and Partial Features in Parameter-Tractable Feature Logics", AAAI Conference on Artificial Intelligence (AAAI), 2019.
,
"On Special Description Logics for Processes and Plans", International Workshop on Description Logics (DL), 2019.
,
"Online Abuse Detection: The Value of Preprocessing and Neural Attention Models", Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (WASSA), 2019.
,
"Overview of the 2019 Open-Source IR Replicability Challenge (OSIRRC 2019)", International Conference on Research and Development in Information Retrieval (SIGIR), 2019.
,
"Patterns of Search Result Examination: Query to First Action", International Conference on Information and Knowledge Management (CIKM), 2019.
,
"Predictable and Consistent Information Extraction", ACM Symposium on Document Engineering (DocEng), 2019.
,
"Privacy Changes Everything", Very Large Data Bases Conference (VLDB), 2019.
,
"Quantifying Bias and Variance of System Rankings", International Conference on Research and Development in Information Retrieval (SIGIR), 2019.
,
"Query and Answer Expansion From Conversation History", Text Retrieval Conference (TREC), 2019.
,
"Reproducing and Generalizing Semantic Term Matching in Axiomatic Information Retrieval", European Conference on Information Retrieval (ECIR), 2019.
,
"Rethinking Complex Neural Network Architectures for Document Classification", North American Chapter of the Association for Computational Linguistics (NAACL), 2019.
,
"Scalable Content-Based Analysis of Images in Web Archives With TensorFlow And the Archives Unleashed Toolkit", ACM/IEEE Joint Conference on Digital Libraries (JCDL), 2019.
,
"Semi-Supervised Clustering for De-Duplication", International Conference on Artificial Intelligence and Statistics (AISTATS), 2019.
,
"Sift: Resource-Efficient Consensus With RDMA", Conference on Emerging Network Experiment and Technology (CoNEXT), 2019.
,
"Simple Attention-Based Representation Learning for Ranking Short Social Media Posts", North American Chapter of the Association for Computational Linguistics (NAACL), 2019.
,
"Simple Techniques for Cross-Collection Relevance Feedback", European Conference on Information Retrieval (ECIR), 2019.
,
"Solr Integration in the Anserini Information Retrieval Toolkit", International Conference on Research and Development in Information Retrieval (SIGIR), 2019.
,
"T-Thinker: A Task-Centric Distributed Framework for Compute-Intensive Divide-and-Conquer Algorithms", ACM Symposium on Principles & Practice of Parallel Programming (PPoPP), 2019.
,
"The Archives Unleashed Notebook: Madlibs for Jumpstarting Scholarly Exploration of Web Archives", ACM/IEEE Joint Conference on Digital Libraries (JCDL), 2019.
,
"The Cost of a WARC: Analyzing Web Archives in the Cloud", ACM/IEEE Joint Conference on Digital Libraries (JCDL), 2019.
,
"The Impact of Score Ties on Repeatability in Document Ranking", International Conference on Research and Development in Information Retrieval (SIGIR), 2019.
,
"The SIGIR 2019 Open-Source IR Replicability Challenge (OSIRRC 2019)", International Conference on Research and Development in Information Retrieval (SIGIR), 2019.
,
"Time Constrained Continuous Subgraph Search Over Streaming Graphs", IEEE International Conference on Data Engineering (ICDE), 2019.
,
"Time-Limits and Summaries for Faster Relevance Assessing", International Conference on Research and Development in Information Retrieval (SIGIR), 2019.
,
"Unbiased Low-Variance Estimators for Precision and Related Information Retrieval Effectiveness Measures", International Conference on Research and Development in Information Retrieval (SIGIR), 2019.
,
"Universal Voice-Enabled User Interfaces Using JavaScript", International Conference on Intelligent User Interfaces (IUI), 2019.
,
"University of Waterloo Docker Images for OSIRRC at SIGIR 2019", International Conference on Research and Development in Information Retrieval (SIGIR), 2019.
,
"Unsupervised String Transformation Learning for Entity Consolidation", IEEE International Conference on Data Engineering (ICDE), 2019.
,
"UWaterlooMDS at the TREC 2019 Decision Track", Text Retrieval Conference (TREC), 2019.
,
"Warclight: A Rails Engine for Web Archive Discovery", ACM/IEEE Joint Conference on Digital Libraries (JCDL), 2019.
,
"WatDFS: A Project for Understanding Distributed Systems in the Undergraduate Curriculum", Technical Symposium on Computer Science Education (SIGCSE), 2019.
,
"WaterlooClarke at the TREC 2019 Conversational Assistant Track", Text Retrieval Conference (TREC), 2019.
,
"What Part of the Neural Network Does This? Understanding LSTMs By Measuring and Dissecting Neurons", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019.
,
"Yelling at Your TV: An Analysis of Speech Recognition Errors And Subsequent User Behavior on Entertainment Systems", International Conference on Research and Development in Information Retrieval (SIGIR), 2019.
,
"Aligning Cross-Lingual Entities With Multi-Aspect Information", ArXiv, vol. abs/1910.06575, 2019.
,
"Approximate Inference in Structured Instances With Noisy Categorical Observations", ArXiv, vol. abs/1907.00141, 2019.
,
"Attentive Student Meets Multi-Task Teacher: Improved Knowledge Distillation For Pretrained Models", ArXiv, vol. abs/1911.03588, 2019.
,
"Box Covers and Domain Orderings for Beyond Worst-Case Join Processing", ArXiv, vol. abs/1909.12102, 2019.
,
"Building Self-Clustering RDF Databases Using Tunable-LSH", The VLDB Journal, vol. 28, issue 2, pp. 173--195, 2019.
,
"Consentio: Managing Consent to Data Access Using Permissioned Blockchains", ArXiv, vol. abs/1910.07110, 2019.
,
"Correlation Constraint Shortest Path Over Large Multi-Relation Graphs", Proceedings of the VLDB Endowment (PVLDB), vol. 12, issue 5, pp. 488--501, 2019.
,
"Critically Examining the "Neural Hype": Weak Baselines and the Additivity Of Effectiveness Gains From Neural Ranking Models", ArXiv, vol. abs/1904.09171, 2019.
,
"Cross-Lingual Relevance Transfer for Document Retrieval", ArXiv, vol. abs/1911.02989, 2019.
,
"Cross-Lingual Text Alignment for Fine-Grained Plagiarism Detection", Journal of Information Science, vol. 45, issue 4, 2019.
,
"Data Augmentation for BERT Fine-Tuning in Open-Domain Question Answering", ArXiv, vol. abs/1904.06652, 2019.
,
"Design of Algorithms Under Policy-Aware Local Differential Privacy: Utility-Privacy Trade-Offs", ArXiv, vol. abs/1909.11778, 2019.
,
"DimmStore: Memory Power Optimization for Database Systems", Proceedings of the VLDB Endowment (PVLDB), vol. 12, issue 11, pp. 1499--1512, 2019.
,
"Distilling Task-Specific Knowledge From BERT Into Simple Neural Networks", ArXiv, vol. abs/1903.12136, 2019.
,
"Distributed Dependency Discovery", ArXiv, vol. abs/1903.05228, 2019.
,
"Distributed Implementations of Dependency Discovery Algorithms", Proceedings of the VLDB Endowment (PVLDB), vol. 12, issue 11, pp. 1624--1636, 2019.
,
"DocBERT: BERT for Document Classification", ArXiv, vol. abs/1904.08398, 2019.
,
"Document Expansion by Query Prediction", ArXiv, vol. abs/1904.08375, 2019.
,
"End-to-End Open-Domain Question Answering With BERTserini", ArXiv, vol. abs/1902.01718, 2019.
,
"Errata Note: Discovering Order Dependencies Through Order Compatibility", ArXiv, vol. abs/1905.02010, 2019.
,
"Explicit Pairwise Word Interaction Modeling Improves Pretrained Transformers For English Semantic Similarity Tasks", ArXiv, vol. abs/1911.02847, 2019.
,
"Exploiting Token and Path-Based Representations of Code for Identifying Security-Relevant Commits", ArXiv, vol. abs/1911.07620, 2019.
,
"FastFabric: Scaling Hyperledger Fabric to 20, 000 Transactions Per Second", ArXiv, vol. abs/1901.00910, 2019.
,
"GSI: GPU-friendly Subgraph Isomorphism", ArXiv, vol. abs/1906.03420, 2019.
,
"HoloDetect: Few-Shot Learning for Error Detection", ArXiv, vol. abs/1904.02285, 2019.
,
"Investigating Statistical Privacy Frameworks From the Perspective Of Hypothesis Testing", Proceedings on Privacy Enhancing Technologies (PoPETs), vol. 2019, issue 3, pp. 233--254, 2019.
,
"Lucene for Approximate Nearest-Neighbors Search on Arbitrary Dense Vectors", ArXiv, vol. abs/1910.10208, 2019.
,
"Matching Entities Across Different Knowledge Graphs With Graph Embeddings", ArXiv, vol. abs/1903.06607, 2019.
,
"Multi-Stage Document Ranking With BERT", ArXiv, vol. abs/1910.14424, 2019.
,
"Optimizing Subgraph Queries by Combining Binary and Worst-Case Optimal Joins", Proceedings of the VLDB Endowment (PVLDB), vol. 12, issue 11, pp. 1692--1704, 2019.
,
"Optimizing Subgraph Queries by Combining Binary and Worst-Case Optimal Joins", ArXiv, vol. abs/1903.02076, 2019.
,
"Outis: Crypto-Assisted Differential Privacy on Untrusted Servers", ArXiv, vol. abs/1902.07756, 2019.
,
"Principles of Progress Indicators for Database Repairing", ArXiv, vol. abs/1904.06492, 2019.
,
"PrivateSQL: A Differentially Private SQL Query Engine", Proceedings of the VLDB Endowment (PVLDB), vol. 12, issue 11, pp. 1371--1384, 2019.
,
"Secure Multi-Party Functional Dependency Discovery", Proceedings of the VLDB Endowment (PVLDB), vol. 13, issue 2, pp. 184--196, 2019.
,
"Simple Applications of BERT for Ad Hoc Document Retrieval", ArXiv, vol. abs/1903.10972, 2019.
,
"Simple BERT Models for Relation Extraction and Semantic Role Labeling", ArXiv, vol. abs/1904.05255, 2019.
,
"Technical Report: Optimizing Human Involvement for Entity Matching And Consolidation", ArXiv, vol. abs/1906.06574, 2019.
,
"The Neural Hype, Justified!: A Recantation", SIGIR Forum, vol. 53, issue 2, pp. 88--93, 2019.
,
"The Performance Envelope of Inverted Indexing on Modern Hardware", ArXiv, vol. abs/1910.11028, 2019.
,
"The Proper Care and Feeding of CAMELS: How Limited Training Data Affects Streamflow Prediction", ArXiv, vol. abs/1911.07249, 2019.
,
"The Simplest Thing That Can Possibly Work: Pseudo-Relevance Feedback Using Text Classification", ArXiv, vol. abs/1904.08861, 2019.
,
"Two Birds, One Stone: A Simple, Unified Model for Text Generation From Structured and Unstructured Data", ArXiv, vol. abs/1909.10158, 2019.
,
"What Would Elsa Do? Freezing Layers During Transformer Fine-Tuning", ArXiv, vol. abs/1911.03090, 2019.
,
"XOX Fabric: A Hybrid Approach to Transaction Execution", ArXiv, vol. abs/1906.11229, 2019.
,
2018
Data Profiling: Morgan & Claypool, 2018.
,
Encyclopedia of Database Systems, Second Edition: Springer, 2018.
,
"Abstract Versus Concrete Temporal Query Languages", Encyclopedia of Database Systems: Springer, 2018.
,
"Analyzing Your Location Data With Provable Privacy Guarantees", Springer Handbooks: Springer, 2018.
,
"Client-Server Architecture", Encyclopedia of Database Systems: Springer, 2018.
,
"Data Manipulation Language (DML)", Encyclopedia of Database Systems: Springer, 2018.
,
"Data Stream", Encyclopedia of Database Systems: Springer, 2018.
,
"Database", Encyclopedia of Database Systems: Springer, 2018.
,
"Database Administrator (DBA)", Encyclopedia of Database Systems: Springer, 2018.
,
"Document Databases", Encyclopedia of Database Systems: Springer, 2018.
,
"Enterprise Content Management", Encyclopedia of Database Systems: Springer, 2018.
,
"Hypertexts", Encyclopedia of Database Systems: Springer, 2018.
,
"Point-Stamped Temporal Models", Encyclopedia of Database Systems: Springer, 2018.
,
"Rank-Aware Query Processing", Encyclopedia of Database Systems: Springer, 2018.
,
"Rank-Join", Encyclopedia of Database Systems: Springer, 2018.
,
"Sagas", Encyclopedia of Database Systems: Springer, 2018.
,
"Stream Models", Encyclopedia of Database Systems: Springer, 2018.
,
"Summarization", Encyclopedia of Database Systems: Springer, 2018.
,
"Temporal Logic in Database Query Languages", Encyclopedia of Database Systems: Springer, 2018.
,
"Temporal Relational Calculus", Encyclopedia of Database Systems: Springer, 2018.
,
"Temporal Vacuuming", Encyclopedia of Database Systems: Springer, 2018.
,
"Top-K Queries", Encyclopedia of Database Systems: Springer, 2018.
,
"Web Question Answering", Encyclopedia of Database Systems: Springer, 2018.
,
"A Study of Immediate Requery Behavior in Search", Conference on Human Information Interaction and Retrieval (CHIIR), 2018.
,
"A System for Efficient High-Recall Retrieval", International Conference on Research and Development in Information Retrieval (SIGIR), 2018.
,
"Algorithmic Aspects of Parallel Query Processing", ACM International Conference on Management of Data (SIGMOD), 2018.
,
"An Experimental Analysis of the Power Consumption of Convolutional Neural Networks for Keyword Spotting", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2018.
,
"Apollo: Learning Query Correlations for Predictive Caching in Geo-Distributed Systems", International Conference on Extending Database Technology (EDBT), 2018.
,
"Beyond Pooling", International Conference on Research and Development in Information Retrieval (SIGIR), 2018.
,
"Building Data Civilizer Pipelines With an Advanced Workflow Engine", IEEE International Conference on Data Engineering (ICDE), 2018.
,
"Carousel: Low-Latency Transaction Processing for Globally-Distributed Data", ACM International Conference on Management of Data (SIGMOD), 2018.
,
"Choosing Math Features for BM25 Ranking With Tangent-L", ACM Symposium on Document Engineering (DocEng), 2018.
,
"CNNs for NLP in the Browser: Client-Side Deployment and Visualization Opportunities", North American Chapter of the Association for Computational Linguistics (NAACL), 2018.
,
"Computing Without Servers, V8, Rocket Ships, and Other Batsh*t Crazy Ideas in Data Systems", Conference on Design of Experimental Search & Information Retrieval Systems (DESIRES), 2018.
,
"Contextual Data Cleaning", IEEE International Conference on Data Engineering (ICDE), 2018.
,
"Data Analytics to Improve Co-Operative Education", International Conference on Extending Database Technology (EDBT), 2018.
,
"Deep Residual Learning for Small-Footprint Keyword Spotting", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2018.
,
"Distribution-Aware Stream Partitioning for Distributed Stream Processing Systems", ACM International Conference on Management of Data (SIGMOD), 2018.
,
"EC-Store: Bridging the Gap Between Storage and Latency in Distributed Erasure Coded Systems", IEEE International Conference on Distributed Computing Systems (ICDCS), 2018.
,
"Effective Team Formation in Expert Networks", Alberto Mendelzon International Workshop on Foundations of Data Management (AMW), 2018.
,
"Effective User Interaction for High-Recall Retrieval: Less Is More", International Conference on Information and Knowledge Management (CIKM), 2018.
,
"Farewell Freebase: Migrating the SimpleQuestions Dataset to DBpedia", International Conference on Computational Linguistics (COLING), 2018.
,
"Fashioning a Search Engine to Support Humanities Research", ACM Symposium on Document Engineering (DocEng), 2018.
,
"FASTOD: Bringing Order to Data", IEEE International Conference on Data Engineering (ICDE), 2018.
,
"FastOFD: Contextual Data Cleaning With Ontology Functional Dependencies", International Conference on Extending Database Technology (EDBT), 2018.
,
"Gender Differences in Undergraduate Engineering Applicants: A Text Mining Approach", Educational Data Mining (EDM), 2018.
,
"H2oloo at TREC 2018: Cross-Collection Relevance Transfer for The Common Core Track", Text Retrieval Conference (TREC), 2018.
,
"Identity Resolution in Conjunctive Querying Over DL-Based Knowledge Bases", International Workshop on Description Logics (DL), 2018.
,
"Job Description Mining to Understand Work-Integrated Learning", Educational Data Mining (EDM), 2018.
,
"MRG_UWaterloo Participation in the TREC 2018 Common Core Track", Text Retrieval Conference (TREC), 2018.
,
"Multi-Query Optimization in Federated RDF Systems", International Conference on Database Systems for Advanced Applications (DASFAA), 2018.
,
"Multi-Task Learning With Neural Networks for Voice Query Understanding On an Entertainment Platform", ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2018.
,
"On Limited Conjunctions in Polynomial Feature Logics, With Applications In OBDA", International Conference on Principles of Knowledge Representation and Reasoning (KR), 2018.
,
"Overview of the TREC 2018 Real-Time Summarization Track", Text Retrieval Conference (TREC), 2018.
,
"Pay-Per-Request Deployment of Neural Network Models Using Serverless Architectures", North American Chapter of the Association for Computational Linguistics (NAACL), 2018.
,
"Query Driven Algorithm Selection in Early Stage Retrieval", Web Search and Data Mining (WSDM), 2018.
,
"RaMP: A Lightweight RDMA Abstraction for Loosely Coupled Applications", USENIX Workshop on Hot Topics in Cloud Computing (HotCloud), 2018.
,
"RecService: Distributed Real-Time Graph Processing at Twitter", USENIX Workshop on Hot Topics in Cloud Computing (HotCloud), 2018.
,
"Refresh Strategies in Continuous Active Learning", International Conference on Research and Development in Information Retrieval (SIGIR), 2018.
,
"Renormalization of NoSQL Database Schemas", International Conference on Conceptual Modeling (ER), 2018.
,
"Robust, Scalable, Real-Time Event Time Series Aggregation at Twitter", ACM International Conference on Management of Data (SIGMOD), 2018.
,
"Seeping Semantics: Linking Datasets Using Word Embeddings for Data Discovery", IEEE International Conference on Data Engineering (ICDE), 2018.
,
"Serverless Data Analytics With Flint", IEEE International Conference on Cloud Computing (CLOUD), 2018.
,
"Spectral Measures of Distortion for Change Detection in Dynamic Graphs", International Workshop on Complex Networks & Their Applications, 2018.
,
"Split-Lists and Initial Thresholds for WAND-based Search", International Conference on Research and Development in Information Retrieval (SIGIR), 2018.
,
"Stream WatDiv: A Streaming RDF Benchmark", ACM International Conference on Management of Data (SIGMOD), 2018.
,
"Strong Baselines for Simple Question Answering Over Knowledge Graphs With and Without Neural Networks", North American Chapter of the Association for Computational Linguistics (NAACL), 2018.
,
"Technology-Assisted Review in Empirical Medicine: Waterloo Participation In CLEF eHealth 2018", Conference and Labs of the Evaluation Forum (CLEF), 2018.
,
"The Evolution of Content Analysis for Personalized Recommendations At Twitter", International Conference on Research and Development in Information Retrieval (SIGIR), 2018.
,
"The Quest for Total Recall", ACM Symposium on Document Engineering (DocEng), 2018.
,
"The Utility of the Abstract Relational Model and Attribute Paths In SQL", International Conference Knowledge Engineering and Knowledge Management (EKAW), 2018.
,
"Tutorial: Adaptive Replication and Partitioning in Data Systems", International Middleware Conference (Middleware), 2018.
,
"Update Delivery Mechanisms for Prospective Information Needs: An Analysis Of Attention in Mobile Users", International Conference on Research and Development in Information Retrieval (SIGIR), 2018.
,
"UWaterlooMDS at the TREC 2018 Common Core Track", Text Retrieval Conference (TREC), 2018.
,
"What Do Viewers Say to Their TVs?: An Analysis of Voice Queries To Entertainment Systems", International Conference on Research and Development in Information Retrieval (SIGIR), 2018.
,
"Workload-Aware CPU Performance Scaling for Transactional Database Systems", ACM International Conference on Management of Data (SIGMOD), 2018.
,
"A Formal Framework for Probabilistic Unclean Databases", ArXiv, vol. abs/1801.06750, 2018.
,
"A Location-Query-Browse Graph for Contextual Recommendation", IEEE Transactions on Knowledge and Data Engineering (TKDE), vol. 30, issue 2, pp. 204--218, 2018.
,
"Adaptive Pruning of Neural Language Models for Mobile Devices", ArXiv, vol. abs/1809.10282, 2018.
,
"Algorithmic Aspects of Parallel Data Processing", Foundations and Trends in Databases, vol. 8, issue 4, pp. 239--370, 2018.
,
"Anserini: Reproducible Ranking Baselines Using Lucene", Journal of Data and Information Quality, vol. 10, issue 4, pp. 16:1--16:20, 2018.
,
"Bikeshare Pool Sizing for Bike-and-Ride Multimodal Transit", IEEE Transactions on Intelligent Transportation Systems, vol. 19, issue 7, pp. 2279--2289, 2018.
,
"Data Integration: The Current Status and the Way Forward", IEEE Data Engineering Bulletin, vol. 41, issue 2, pp. 3--9, 2018.
,
"Distributed Evaluation of Subgraph Queries Using Worst-Case Optimal And Low-Memory Dataflows", Proceedings of the VLDB Endowment (PVLDB), vol. 11, issue 6, pp. 691--704, 2018.
,
"Distributed Evaluation of Subgraph Queries Using Worst-Case Optimal Low-Memory Dataflows", ArXiv, vol. abs/1802.03760, 2018.
,
"Effective and Complete Discovery of Bidirectional Order Dependencies Via Set-Based Axioms", The VLDB Journal, vol. 27, issue 4, pp. 573--591, 2018.
,
"Evaluating Computational Creativity: An Interdisciplinary Tutorial", ACM Computing Surveys, vol. 51, issue 2, pp. 28:1--28:34, 2018.
,
"Evaluating Sentence-Level Relevance Feedback for High-Recall Information Retrieval", ArXiv, vol. abs/1803.08988, 2018.
,
"Evaluation-as-a-Service for the Computational Sciences: Overview And Outlook", Journal of Data and Information Quality, vol. 10, issue 4, pp. 15:1--15:32, 2018.
,
"Experimental Analysis of Distributed Graph Systems", Proceedings of the VLDB Endowment (PVLDB), vol. 11, issue 10, pp. 1151--1164, 2018.
,
"Experimental Analysis of Distributed Graph Systems", ArXiv, vol. abs/1806.08082, 2018.
,
"Explanation Tables", IEEE Data Engineering Bulletin, vol. 41, issue 3, pp. 43--51, 2018.
,
"FLOPs as a Direct Optimization Objective for Learning Sparse Neural Networks", ArXiv, vol. abs/1811.03060, 2018.
,
"In-Browser Split-Execution Support for Interactive Analytics in The Cloud", ArXiv, vol. abs/1804.08822, 2018.
,
"JavaScript Convolutional Neural Networks for Keyword Spotting in The Browser: An Experimental Analysis", ArXiv, vol. abs/1810.12859, 2018.
,
"Multi-Perspective Relevance Matching With Hierarchical ConvNets For Social Media Search", ArXiv, vol. abs/1805.08159, 2018.
,
"Progress and Tradeoffs in Neural Language Models", ArXiv, vol. abs/1811.00942, 2018.
,
"Repeatability Corner Cases in Document Ranking: The Impact of Score Ties", ArXiv, vol. abs/1807.05798, 2018.
,
"Report on NTCIR-13: The Thirteenth Round of NII Testbeds and Community For Information Access Research", SIGIR Forum, vol. 52, issue 1, pp. 102--110, 2018.
,
"Research Frontiers in Information Retrieval: Report From the Third Strategic Workshop on Information Retrieval in Lorne (SWIRL 2018)", SIGIR Forum, vol. 52, issue 1, pp. 34--90, 2018.
,
"Response to "Scale Up or Scale Out for Graph Processing"", IEEE Internet Computing, vol. 22, issue 5, pp. 18--24, 2018.
,
"Sapphire: Querying RDF Data Made Simple", ArXiv, vol. abs/1805.11728, 2018.
,
"Scale Up or Scale Out for Graph Processing?", IEEE Internet Computing, vol. 22, issue 3, pp. 72--78, 2018.
,
"Semi-Supervised Clustering for De-Duplication", ArXiv, vol. abs/1810.04361, 2018.
,
"Serverless Data Analytics With Flint", ArXiv, vol. abs/1803.06354, 2018.
,
"Shrinkwrap: Differentially-Private Query Processing in Private Data Federations", ArXiv, vol. abs/1810.01816, 2018.
,
"ShrinkWrap: Efficient SQL Query Processing in Differentially Private Data Federations", Proceedings of the VLDB Endowment (PVLDB), vol. 12, issue 3, pp. 307--320, 2018.
,
"Simple Attention-Based Representation Learning for Ranking Short Social Media Posts", ArXiv, vol. abs/1811.01013, 2018.
,
"Streaming Voice Query Recognition Using Causal Convolutional Recurrent Neural Networks", ArXiv, vol. abs/1812.07754, 2018.
,
"The Neural Hype and Comparisons Against Weak Baselines", SIGIR Forum, vol. 52, issue 2, pp. 40--51, 2018.
,
"Time Constrained Continuous Subgraph Search Over Streaming Graphs", ArXiv, vol. abs/1801.09240, 2018.
,
"Policy Driven Data Sharing With Provable Privacy Guarantees", Duke University, Durham, NC, USA, 2018.
,
2017
"Comparative Assessment of Alignment Algorithms for NGS Data: Features, Considerations, Implementations, and Future", Algorithms for Next-Generation Sequencing Data, Techniques, Approaches, and Applications: Springer, 2017.
,
"A Comparison of Document-at-a-Time and Score-at-a-Time Query Evaluation", Web Search and Data Mining (WSDM), 2017.
,
"A Comparison of Nuggets and Clusters for Evaluating Timeline Summaries", International Conference on Information and Knowledge Management (CIKM), 2017.
,
"A Demo of the Data Civilizer System", ACM International Conference on Management of Data (SIGMOD), 2017.
,
"An Analysis of Memory Power Consumption in Database Systems", International Workshop on Data Management on New Hardware (DaMoN), 2017.
,
"An Exploration of Serverless Architectures for Information Retrieval", International Conference on the Theory of Information Retrieval (ICTIR), 2017.
,
"An Insight Extraction System on BioMedical Literature With Deep Neural Networks", Conference on Empirical Methods in Natural Language Processing (EMNLP), 2017.
,
"An Interpolation-Based Compiler and Optimizer for Relational Queries (System Design Report)", International Conference on Logic Programming and Automated Reasoning (LPAR), 2017.
,
"Anserini: Enabling the Use of Lucene for Information Retrieval Research", International Conference on Research and Development in Information Retrieval (SIGIR), 2017.
,
"Authority-Based Team Discovery in Social Networks", International Conference on Extending Database Technology (EDBT), 2017.
,
"Automatic and Semi-Automatic Document Selection for Technology-Assisted Review", International Conference on Research and Development in Information Retrieval (SIGIR), 2017.
,
"Automatically Extracting High-Quality Negative Examples for Answer Selection in Question Answering", International Conference on Research and Development in Information Retrieval (SIGIR), 2017.
,
"Composing Differential Privacy and Secure Computation: A Case Study On Scaling Private Record Linkage", Conference on Computer and Communications Security (CCS), 2017.
,
"Concerning Referring Expressions in Query Answers", International Joint Conference on Artificial Intelligence (IJCAI), 2017.
,
"Data Profiling: A Tutorial", ACM International Conference on Management of Data (SIGMOD), 2017.
,
"Deep Learning-Based Assessment of Tumor-Associated Stroma for Diagnosing Breast Cancer in Histopathology Images", IEEE International Symposium on Biomedical Imaging (ISBI), 2017.
,
"Differential Privacy in the Wild: A Tutorial on Current Practices & Open Challenges", ACM International Conference on Management of Data (SIGMOD), 2017.
,
"Do We Need Specialized Graph Databases?: Benchmarking Real-Time Social Networking Applications", International Workshop on Graph Data Management Experiences and Systems (GRADES), 2017.
,
"Efficient Discovery of Ontology Functional Dependencies", International Conference on Information and Knowledge Management (CIKM), 2017.
,
"Event Detection on Curated Tweet Streams", International Conference on Research and Development in Information Retrieval (SIGIR), 2017.
,
"Experiments With Convolutional Neural Network Models for Answer Selection", International Conference on Research and Development in Information Retrieval (SIGIR), 2017.
,
"Exploring Conversational Search With Humans, Assistants, and Wizards", ACM Conference on Human Factors in Computing Systems (CHI), 2017.
,
"Finally, a Downloadable Test Collection of Tweets", International Conference on Research and Development in Information Retrieval (SIGIR), 2017.
,
"Graph Mining to Characterize Competition for Employment", ACM International Conference on Management of Data (SIGMOD), 2017.
,
"Graphflow: An Active Graph Database", ACM International Conference on Management of Data (SIGMOD), 2017.
,
"GYM: A Multiround Distributed Join Algorithm", International Conference on Database Theory (ICDT), 2017.
,
"How Similar Is the Usage of Electric Cars and Electric Bicycles?", Energy-Efficient Computing and Networking (e-Energy), 2017.
,
"In-Browser Interactive SQL Analytics With Afterburner", ACM International Conference on Management of Data (SIGMOD), 2017.
,
"Incorporating Novelty, Meaning, Reaction and Craft Into Computational Poetry: A Negative Experimental Result", International Conference on Computational Creativity (ICCC), 2017.
,
"Managing Sensor Data Streams: Lessons Learned From the WeBike Project", International Conference on Statistical and Scientific Database Management (SSDBM), 2017.
,
"Mining the Temporal Statistics of Query Terms for Searching Social Media Posts", International Conference on the Theory of Information Retrieval (ICTIR), 2017.
,
"MRG_UWaterloo and WaterlooCormack Participation in the TREC 2017 Common Core Track", Text Retrieval Conference (TREC), 2017.
,
"Navigating Imprecision in Relevance Assessments on the Road to Total Recall: Roger and Me", International Conference on Research and Development in Information Retrieval (SIGIR), 2017.
,
"Netstore: Leveraging Network Optimizations to Improve Distributed Transaction Processing Performance", International Middleware Conference (Middleware), 2017.
,
"On Partial Features in the DLF Dialects of Description Logic With Inverse Features", International Workshop on Description Logics (DL), 2017.
,
"On the Reusability of "Living Labs" Test Collections: A Case Study Of Real-Time Summarization", International Conference on Research and Development in Information Retrieval (SIGIR), 2017.
,
"Online in-Situ Interleaved Evaluation of Real-Time Push Notification Systems", International Conference on Research and Development in Information Retrieval (SIGIR), 2017.
,
"Optimal Reducer Placement to Minimize Data Transfer in MapReduce-style Processing", IEEE International Conference on Big Data (IEEE BigData), 2017.
,
"Overview of the TREC 2017 Real-Time Summarization Track", Text Retrieval Conference (TREC), 2017.
,