Nowaczyk, Slawomir; Biecek, Przemyslaw; Chung, Neo Christopher; Vallati, Mauro; Skruch, Pawel; Jaworek-Korjakowska, Joanna; Parkinson, Simon; Nikitas, Alexandros; Atzmüller, Martin; Kliegr, Tomás; Schmid, Ute; Bobek, Szymon; Lavrac, Nada; Peeters, Marieke; van Dierendonck, Roland; Robben, Saskia; Mercier-Laurent, Eunika; Kayakutlu, Gülgün; Owoc, Mieczyslaw Lech; Mason, Karl; Wahid, Abdul; Bruno, Pierangela; Calimeri, Francesco; Cauteruccio, Francesco; Terracina, Giorgio; Wolter, Diedrich; Leidner, Jochen L.; Kohlhase, Michael; Dimitrova, Vania (2023)
Communications in Computer and Information Science (CCIS) 1948.
DOI: 10.1007/978-3-031-56066-8_22
Nugent, Timothy; Leidner, Jochen L.; Gkotsis, George (2023)
Advances in Information Retrieval: Proceedings of the 45th European Conference on Information Retrieval (ECIR 2023), Dublin, Ireland, April 2-6, 2023 2, 3-15.
DOI: 10.1007/978-3-031-28238-6_1
To date, automatic summarization methods have been mostly developed for (and applied to) general news articles, whereas other document types have been neglected. In this paper, we introduce the task of summarizing financial earnings call transcripts, and we present a method for summarizing this text type essential for the financial industry. Earnings calls are briefing events common for public companies in many countries, typically in the form of conference calls held between company executives and analysts that consist of a spoken monologue part followed by moderated questions and answers.
We show that traditional methods work less well in this domain, and we present a method suitable for summarizing earnings calls. Our large-scale evaluation on a new human-annotated corpus of summary-worthy sentences shows that this method outperforms a set of strong baselines, including a new one that we propose specifically for earnings calls. To the best of our knowledge, this is the first application of summarization to financial earnings call transcripts, a primary source of information for financial professionals.
Menzner, T.; Mittag, Florian; Leidner, Jochen L. (2023)
Advances in Information Retrieval: Proceedings of the 45th European Conference on Information Retrieval (ECIR 2023), Dublin, Ireland, April 2-6, 2023 3, 275–280.
DOI: 10.1007/978-3-031-28241-6_26
In this demonstration, we present Country Guesser, a live system that guesses the country in which a photo was taken. In particular, given a Google Street View image, our federated ranking model uses a combination of computer vision, machine learning and text retrieval methods to compute a ranking of likely countries for the location shown in the image. Interestingly, using text-based features to probe large pre-trained language models can help provide cross-modal supervision. We are not aware of previous country guessing systems informed by both visual and textual features.
Bartel, Holger; Kraft, Mirko; Leidner, Jochen L. (2023)
Zeitschrift für die gesamte Versicherungswissenschaft 112 (1), 1-30.
Mathur, Puneet; Goyal, Mihir; Sawhney, Ramit; Mathur, Ritik; Leidner, Jochen L.; Dernoncourt, Franck; Manocha, Dinesh (2022)
Findings of the Association for Computational Linguistics: EMNLP 2022 (Empirical Methods in Natural Language Processing), December 2022, Abu Dhabi, United Arab Emirates, 1933-1940.
Financial prediction is complex due to the stochastic nature of the stock market. Semi-structured financial documents present comprehensive financial data in tabular formats, such as earnings, profit-loss statements, and balance sheets, and can often contain rich technical analysis along with a textual discussion of corporate history, management analysis, compliance, and risks. Existing research focuses on the textual and audio modalities of financial disclosures from company conference calls to forecast stock volatility and price movement, but ignores the rich tabular data available in financial reports. Moreover, the economic realm is still plagued by a severe under-representation of various communities spanning diverse demographics, gender, and native speakers. In this work, we show that combining tabular data from financial semi-structured documents with text transcripts and audio recordings not only improves stock volatility and price movement prediction by 5-12% but also reduces the gender bias caused by audio-based neural networks by over 30%.
Leidner, Jochen L. (2022)
Invited Talk, Joint Webinar of the CFA organization (French Chapter) and the London Stock Exchange Group (LSEG), London/Paris/online, November 18, 2022.
McDonald, Thomas; Dong, Ziqing; Zhang, Yingji; Hampson, Rebekah; Young, James; Cao, Qianyu; Leidner, Jochen L.; Stevenson, Mark (2022)
Cross Language Evaluation Forum (CLEF) Working Notes 2020: Working Notes of CLEF 2020 - Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, September 22-25, 2020. 2696, 162.
Leidner, Jochen L. (2022)
Proceedings of the 27th International Conference on Applications of Natural Language to Information Systems (NLDB 2022), Valencia, Spain, June 15-17, 2022, 517-523.
While some methodologies have already been developed for data mining projects, for example in the context of e-commerce (e.g. CRISP-DM, SEMMA, KDD), these do not account for (1) early evaluation in order to de-risk a project, (2) dealing with text corpora (“unstructured” data) and associated natural language processing processes, and (3) non-technical considerations (e.g. legal, ethical, project management aspects). To address these three shortcomings, a new methodology, called “Data to Value”, is introduced. It is guided by a detailed catalog of questions in order to avoid a disconnect between large-scale NLP project teams and the topic when facing the rather abstract box-and-arrow diagrams commonly associated with methodologies.
Holtorf, Christian; Leidner, Jochen L. (2022)
Part of the “Coburg liest” literature festival (Literaturtage), 2022.
Nugent, Tim; Stelea, Nicole; Leidner, Jochen L. (2021)
Proceedings of the 14th International Conference on Flexible Query Answering Systems (FQAS 2021), Bratislava, Slovakia, September 19–24, 2021, 157-169.
DOI: 10.1007/978-3-030-86967-0_12
Despite recent advances in deep learning-based language modelling, many natural language processing (NLP) tasks in the financial domain remain challenging due to the paucity of appropriately labelled data. Other issues that can limit task performance are differences in word distribution between the general corpora – typically used to pre-train language models – and financial corpora, which often exhibit specialized language and symbology. Here, we investigate two approaches that can help to mitigate these issues. Firstly, we experiment with further language model pre-training using large amounts of in-domain data from business and financial news. We then apply augmentation approaches to increase the size of our data-set for model fine-tuning. We report our findings on an Environmental, Social and Governance (ESG) controversies data-set and demonstrate that both approaches are beneficial to accuracy in classification tasks.
Leidner, Jochen L. (2021)
Handbook of Big Geospatial Data, 429–457.
DOI: 10.1007/978-3-030-55462-0_16
Leidner, Jochen L.; Martins, Bruno; McDonough, Katherine; Purves, Ross S. (2020)
Proceedings of the 42nd European Conference on Information Retrieval Research (ECIR 2020), Lisbon, Portugal, April 14–17, 2020 II, 669-673.
DOI: 10.1007/978-3-030-45442-5_89
In this half-day tutorial, we will review the basic concepts of, methods for, and applications of geographic information retrieval, also showing some possible applications in fields such as the digital humanities. The tutorial is organized in four parts. First, we introduce some basic ideas about geography and demonstrate why text is a powerful way of exploring relevant questions. We then introduce a basic end-to-end pipeline, discussing geographic information in documents, spatial and multi-dimensional indexing [19], and spatial retrieval and spatial filtering. After showing a range of possible applications, we conclude with suggestions for future work in the area.
Fakultät Wirtschaftswissenschaften (FW)
T +49 9561 317 422 Jochen.Leidner[at]hs-coburg.de