Classification of initial petitions in the National Council of the Public Ministry
Machine Learning
Natural Language Processing
Text Classification
BERT
Initial Petition
This academic work proposes applying Natural Language Processing techniques, based on the BERT language model, to improve the efficiency and accuracy with which the National Council of the Public Ministry classifies initial petitions. The proposal addresses challenges the organization faces: the delay and cost of analyzing diverse documents, the imbalance in the number of cases across procedural classes, and the low quality of the textual data. The approach is to implement text preprocessing, compare techniques for reducing text sequences, and compare approaches to handling data imbalance; to evaluate different machine learning algorithms on that basis and measure their performance; and to use the best model in a prototype with a web interface for interacting with the classification system. The ultimate goal is to assess the model from the user's perspective, continuously monitor its performance, incorporate adjustments based on results and feedback, and demonstrate, over the tool's usage period, its effectiveness in optimizing the organization's analysis of initial petitions.
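As an illustration of two of the steps named above, the following minimal Python sketch shows one possible sequence-reduction technique (keeping the head and the tail of a long token sequence) and one possible way of handling class imbalance (inverse-frequency class weights). Both are assumptions chosen for illustration; the abstract does not state which techniques were compared or selected. The function names `head_tail_truncate` and `balanced_class_weights` and the `head_len` default are hypothetical. The default budget of 510 tokens assumes a BERT model with a 512-token input limit, two positions of which are reserved for the special [CLS] and [SEP] tokens.

```python
from collections import Counter


def head_tail_truncate(tokens, max_len=510, head_len=128):
    """Reduce a long token sequence by keeping its head and its tail.

    Illustrative assumption, not necessarily the technique used in the work:
    keep the first `head_len` tokens and the last `max_len - head_len` tokens.
    """
    if not 0 <= head_len <= max_len:
        raise ValueError("head_len must be between 0 and max_len")
    tokens = list(tokens)
    if len(tokens) <= max_len:
        return tokens  # already short enough, nothing to cut
    tail_len = max_len - head_len
    # Slicing from an explicit start index stays correct when tail_len is 0.
    return tokens[:head_len] + tokens[len(tokens) - tail_len:]


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare procedural classes receive larger weights, so a weighted loss
    penalizes mistakes on them more heavily.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


if __name__ == "__main__":
    # Toy data: a 1000-token document and a skewed label distribution.
    document = [f"tok{i}" for i in range(1000)]
    reduced = head_tail_truncate(document)
    print(len(reduced), reduced[0], reduced[-1])

    labels = ["class_a"] * 8 + ["class_b"] * 2
    print(balanced_class_weights(labels))
```

In practice the weights would typically be passed to a weighted loss function during fine-tuning, and the truncation would be applied to the tokenizer's output rather than to whitespace-split words.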