Lin, S.-C., Gao, L., Oguz, B., Xiong, W., Lin, J., Yih, W.-t., & Chen, X. (2024). FLAME: Factuality-Aware Alignment for Large Language Models. ArXiv, abs/2405.01525. https://doi.org/10.48550/ARXIV.2405.01525
References
Bonifati, A., Ozsu, T., Tian, Y., Voigt, H., Yu, W., & Zhang, W. (2024). The Future of Graph Analytics [Conference paper]. https://doi.org/10.1145/3626246.3658369
Arabzadeh, N., Bigdeli, A., & Clarke, C. (2024). Adapting Standard Retrieval Benchmarks to Evaluate Generated Answers [Conference paper]. https://doi.org/10.1007/978-3-031-56060-6_26
Arabzadeh, N., & Clarke, C. (2024). A Comparison of Methods for Evaluating Generative IR. ArXiv, abs/2404.04044. https://doi.org/10.48550/ARXIV.2404.04044
Lin, J., Li, J., Gao, J., Ma, W., & Liu, Y. (2024). Jointly Modeling Spatio-Temporal Features of Tactile Signals for Action Classification [Conference paper]. https://doi.org/10.1609/AAAI.V38I12.29288
Faggioli, G., Dietz, L., Clarke, C., Demartini, G., Hagen, M., Hauff, C., … Wachsmuth, H. (2024). Who Determines What Is Relevant? Humans or AI? Why Not Both? Communications of the ACM, 67, 31-34. https://doi.org/10.1145/3624730
Arabzadeh, N., & Clarke, C. (2024). Fréchet Distance for Offline Evaluation of Information Retrieval Systems With Sparse Labels. ArXiv, abs/2401.17543. https://doi.org/10.48550/ARXIV.2401.17543
Arabzadeh, N., & Clarke, C. (2024). Fréchet Distance for Offline Evaluation of Information Retrieval Systems With Sparse Labels [Conference paper]. Retrieved from https://aclanthology.org/2024.eacl-long.26
Arabzadeh, N., Huo, S., Mehta, N., Wu, Q., Wang, C., Awadallah, A., … Kiseleva, J. (2024). Assessing and Verifying Task Utility in LLM-Powered Applications. ArXiv, abs/2405.02178. https://doi.org/10.48550/ARXIV.2405.02178