DEFENSE committee: Eric Hans Messias da Silva

A MASTER'S DEFENSE committee has been registered by the program.
STUDENT : Eric Hans Messias da Silva
DATE: 13/07/2023
TIME: 16:00
LOCATION: Teams
TITLE:

Abstractive Summarization of Long Documents Used in Inspections and Procedural Instructions


KEY WORDS:

Natural Language Processing, Abstractive Summarization, Long Documents, Legal Documents


PAGES: 100
MAJOR AREA: Exact and Earth Sciences
AREA: Computer Science
SUBAREA: Computing Methodology and Techniques
SUMMARY:

The Brazilian Federal Court of Accounts organizes its work into processes, each of which typically accumulates tens to hundreds of legal documents over its life cycle, and each document easily runs to a few dozen pages. The number of processes and documents tends to grow over time, producing a large body of material that is rich in content but hard to consume, since reading a single process takes considerable time. Processes are usually read to check whether they contain content relevant to an inspection or procedural instruction in progress. Besides the high cost of reading a process, part of its content is discarded by the auditor because it is unrelated to their current work, so time is wasted on this activity. To make this activity more efficient, this work proposes an automatic text summarization solution based on machine learning applied to natural language processing. The solution will use abstractive summarization applied to long legal documents, using state-of-the-art models for the task that are based on transformers with linear attention mechanisms. It will be made available as a web application with a microservice, for better integration with the applications that make up the auditor’s work process. The summaries generated by the models will be evaluated mainly with metrics that focus on the semantics of the generated text and, as a result, will adhere more closely to the desired content. Users will provide feedback on the generated summaries, and this feedback will later be fed back into the model.


COMMITTEE MEMBERS:
Chair - 402520 - MARCELO LADEIRA
Internal - 1821656 - THIAGO DE PAULO FALEIROS
External to the Institution - ANDREI LIMA QUEIROZ - UnB
External to the Institution - THIAGO ALEXANDRE SALGUEIRO PARDO - USP
Notice registered on: 13/07/2023 09:00