Leidner, Jochen L.; Reiche, Michael (2024)
Development Methodologies for Big Data Analytics Systems.
A number of machine learning process models (SEMMA, KDD, CRISP-DM, CRISP-ML, Data-to-Value, etc.) have recently been proposed to facilitate the development of machine learning models in their organizational context. While the existing proposals vary in complexity and in suitability for particular tasks, it would be desirable to have software tools that embody or support these process models and make it easier for project teams to capture, share (among team members and stakeholders), and preserve the relevant project information for the various process stages. In particular, recorded statistics from past projects could be used to predict the duration of stages or the overall project effort.
Presently, to the best of our knowledge, no requirements analysis exists that specifies these needs in detail. To this end, we present a first collection and analysis of requirements for software tooling for machine learning process models. We describe the functional and non-functional requirements of a Computer-Aided Machine Learning Modeling (CAMLM) tool, the soft-computing world’s counterpart to a CASE (Computer-Aided Software Engineering) tool.
Various software tools cover sub-areas such as team and communication management (Confluence, Jira, Slack, Zoom, ...), project management (CRISP-DM, Scrum, Kanban boards, ...), or data and information management (model management [Weber, Christian; Hirmer, Pascal; Reimann, Peter; Schwarz, Holger (2019): A New Process Model for the Comprehensive Management of Machine Learning Models. In: Proceedings of the 21st International Conference on Enterprise Information Systems: SCITEPRESS - Science and Technology Publications.]). What is not available, to our knowledge, is software that covers all of these sub-areas and the entire life cycle of machine learning projects in detail.
Kohlhase, Michael; Leidner, Jochen L.; Schmid, Ute; Wolter, Diedrich (2024)
held at KI 2024: 47. Deutsche Jahrestagung für Künstliche Intelligenz, Würzburg, 23.09. - 27.09.2024.
Blümlein, Markus; Leidner, Jochen L. (2024)
Poster, presented at Networking for Research – German Universities of Applied Sciences and Researchers from Scotland (UDIF-HAW), DFG (German Research Foundation) and SULSA (Scottish Universities Life Sciences Alliance), online, 2024-04-24.
Leidner, Jochen L.; Jung, Luca (2024)
Proceedings of the 21st International Symposium on Web and Wireless Geographical Information Systems (W2GIS 2024), June 17-18, 2024, Yverdon-les-Bains, Switzerland, pp. 95-104.
DOI: 10.1007/978-3-031-60796-7_7
Menzner, T.; Leidner, Jochen L. (2024)
Proceedings of the 46th European Conference on Information Retrieval (ECIR 2024), Glasgow, Scotland, UK, March 24-28, 2024, Part IV, pp. 270-284.
DOI: 10.1007/978-3-031-56066-8_22
Leidner, Jochen L. (2024)
Invited Keynote, Seventh International Workshop on Narrative Extraction from Texts (Text2Story 2024) held in conjunction with the 46th European Conference on Information Retrieval (ECIR 2024), Glasgow, Scotland, UK, 23 March 2024.
Rüdel, Thomas; Leidner, Jochen L. (2023)
Technical Report, ArXiv Pre-Print Server.
DOI: 10.48550/arXiv.2311.11701
Nowaczyk, Slawomir; Biecek, Przemyslaw; Chung, Neo Christopher; Vallati, Mauro; Skruch, Pawel; Jaworek-Korjakowska, Joanna; Parkinson, Simon; Nikitas, Alexandros; Atzmüller, Martin; Kliegr, Tomáš; Schmid, Ute; Bobek, Szymon; Lavrac, Nada; Peeters, Marieke; van Dierendonck, Roland; Robben, Saskia; Mercier-Laurent, Eunika; Kayakutlu, Gülgün; Owoc, Mieczyslaw Lech; Mason, Karl; Wahid, Abdul; Bruno, Pierangela; Calimeri, Francesco; Cauteruccio, Francesco; Terracina, Giorgio; Wolter, Diedrich; Leidner, Jochen L.; Kohlhase, Michael; Dimitrova, Vania (2023)
Communications in Computer and Information Science (CCIS) 1947.
Leidner, Jochen L.; Reiche, Michael (2023)
Workshop on AI for AI Learning held at ECAI 2023, Kraków, Poland, September 30, 2023.
Reiche, Michael; Leidner, Jochen L. (2023)
Workshop on AI for AI Learning held at ECAI 2023, Kraków, Poland, September 30, 2023.
Dimitsas, Markos; Leidner, Jochen L. (2023)
Workshop on AI for AI Learning held at ECAI 2023, Kraków, Poland, September 30, 2023.
Nowaczyk, Slawomir; Biecek, Przemyslaw; Chung, Neo Christopher; Vallati, Mauro; Skruch, Pawel; Jaworek-Korjakowska, Joanna; Parkinson, Simon; Nikitas, Alexandros; Atzmüller, Martin; Kliegr, Tomáš; Schmid, Ute; Bobek, Szymon; Lavrac, Nada; Peeters, Marieke; van Dierendonck, Roland; Robben, Saskia; Mercier-Laurent, Eunika; Kayakutlu, Gülgün; Owoc, Mieczyslaw Lech; Mason, Karl; Wahid, Abdul; Bruno, Pierangela; Calimeri, Francesco; Cauteruccio, Francesco; Terracina, Giorgio; Wolter, Diedrich; Leidner, Jochen L.; Kohlhase, Michael; Dimitrova, Vania (2023)
Communications in Computer and Information Science (CCIS) 1948.
DOI: 10.1007/978-3-031-56066-8_22
Nugent, Timothy; Leidner, Jochen L.; Gkotsis, George (2023)
Advances in Information Retrieval: Proceedings of the 45th European Conference on Information Retrieval (ECIR 2023), Dublin, Ireland, April 2-6, 2023, Part II, pp. 3-15.
DOI: 10.1007/978-3-031-28238-6_1
To date, automatic summarization methods have been mostly developed for (and applied to) general news articles, whereas other document types have been neglected. In this paper, we introduce the task of summarizing financial earnings call transcripts, and we present a method for summarizing this text type essential for the financial industry. Earnings calls are briefing events common for public companies in many countries, typically in the form of conference calls held between company executives and analysts that consist of a spoken monologue part followed by moderated questions and answers.
We show that traditional methods work less well in this domain, and we present a method suitable for summarizing earnings calls. Our large-scale evaluation on a new human-annotated corpus of summary-worthy sentences shows that this method outperforms a set of strong baselines, including a new one that we propose specifically for earnings calls. To the best of our knowledge, this is the first application of summarization to financial earnings call transcripts, a primary source of information for financial professionals.
Menzner, T.; Mittag, Florian; Leidner, Jochen L. (2023)
Advances in Information Retrieval: Proceedings of the 45th European Conference on Information Retrieval (ECIR 2023), Dublin, Ireland, April 2-6, 2023, Part III, pp. 275-280.
DOI: 10.1007/978-3-031-28241-6_26
In this demonstration, we present Country Guesser, a live system that guesses the country in which a photo was taken. In particular, given a Google Street View image, our federated ranking model uses a combination of computer vision, machine learning, and text retrieval methods to compute a ranking of likely countries for the location shown in the image. Interestingly, using text-based features to probe large pre-trained language models can help provide cross-modal supervision. We are not aware of previous country guessing systems informed by both visual and textual features.
Bartel, Holger; Kraft, Mirko; Leidner, Jochen L. (2023)
Zeitschrift für die gesamte Versicherungswissenschaft 112 (1), pp. 1-30.
Mathur, Puneet; Goyal, Mihir; Sawhney, Ramit; Mathur, Ritik; Leidner, Jochen L.; Dernoncourt, Franck; Manocha, Dinesh (2022)
Findings of the Association for Computational Linguistics: EMNLP 2022 (Empirical Methods in Natural Language Processing), December 2022, Abu Dhabi, United Arab Emirates, pp. 1933-1940.
Financial prediction is complex due to the stochastic nature of the stock market. Semi-structured financial documents present comprehensive financial data in tabular formats, such as earnings, profit-loss statements, and balance sheets, and can often contain rich technical analysis along with a textual discussion of corporate history, management analysis, compliance, and risks. Existing research focuses on the textual and audio modalities of financial disclosures from company conference calls to forecast stock volatility and price movement, but ignores the rich tabular data available in financial reports. Moreover, the economic realm is still plagued by a severe under-representation of various communities spanning diverse demographics, gender, and native speakers. In this work, we show that combining tabular data from semi-structured financial documents with text transcripts and audio recordings not only improves stock volatility and price movement prediction by 5-12% but also reduces the gender bias caused by audio-based neural networks by over 30%.
Leidner, Jochen L. (2022)
Invited Talk, Joint Webinar of the CFA organization (French Chapter) and the London Stock Exchange Group (LSEG), London/Paris/online, November 18, 2022.
McDonald, Thomas; Dong, Ziqing; Zhang, Yingji; Hampson, Rebekah; Young, James; Cao, Qianyu; Leidner, Jochen L.; Stevenson, Mark (2022)
Cross Language Evaluation Forum (CLEF) Working Notes 2020: Working Notes of CLEF 2020 - Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, September 22-25, 2020. 2696, 162.
Leidner, Jochen L. (2022)
Proceedings of the 27th International Conference on Applications of Natural Language to Information Systems (NLDB 2022), Valencia, Spain, June 15-17, 2022, pp. 517-523.
While some methodologies have already been developed for data mining projects (for example in the context of e-commerce), such as CRISP-DM, SEMMA, and KDD, these do not account for (1) early evaluation in order to de-risk a project, (2) dealing with text corpora (“unstructured” data) and the associated natural language processing, and (3) non-technical considerations (e.g. legal, ethical, and project management aspects). To address these three shortcomings, a new methodology called “Data to Value” is introduced. It is guided by a detailed catalog of questions, so that large-scale NLP project teams do not lose touch with the topic when facing the rather abstract box-and-arrow diagrams commonly associated with methodologies.
Holtorf, Christian; Leidner, Jochen L. (2022)
Part of the “Coburg liest” literature days 2022.