Endow Bonapen (2024) IMPLEMENTASI METODE CONVOLUTIONAL NEURAL NETWORK UNTUK PENGENALAN EMOSI BERDASARKAN SUARA MANUSIA MENGGUNAKAN REPRESENTASI CITRA SPECTROGRAM [Implementation of a Convolutional Neural Network Method for Emotion Recognition from Human Speech Using a Spectrogram Image Representation]. Skripsi (undergraduate thesis), Universitas Pembangunan Nasional Veteran Jakarta.
Files:

- ABSTRAK.pdf (233kB)
- AWAL.pdf (893kB)
- BAB 1.pdf (249kB, restricted to Repository UPNVJ only)
- BAB 2.pdf (714kB, restricted to Repository UPNVJ only)
- BAB 3.pdf (354kB, restricted to Repository UPNVJ only)
- BAB 4.pdf (1MB, restricted to Repository UPNVJ only)
- BAB 5.pdf (238kB)
- DAFTAR PUSTAKA.pdf (242kB)
- RIWAYAT HIDUP.pdf (264kB, restricted to Repository UPNVJ only)
- LAMPIRAN.pdf (519kB, restricted to Repository UPNVJ only)
- HASIL PLAGIARISME.pdf (7MB, restricted to repository staff only)
- ARTIKEL KI.pdf (732kB, restricted to repository staff only)
Abstract
Speech Emotion Recognition (SER) is the identification of human emotions from the voice produced in a particular state or situation. In everyday life, the range of emotions expressed through the voice varies among individuals; the question is how a computer can distinguish the variety of emotions contained in a speech signal the way humans can. Addressing this issue, this research builds a model for recognizing human emotions from the voice that can be used to differentiate and identify various emotions. The study applies the Mel Frequency Cepstral Coefficient (MFCC) spectrogram as the feature-extraction method, and the extracted features are classified with a deep-learning model, a two-dimensional Convolutional Neural Network (2D CNN). The emotions considered are Angry, Disgust, Fear, Happy, Sad, Neutral, and Surprise. A total of 4665 voice samples were collected, followed by pre-processing, splitting the data by ratio, training with the training and validation sets, and testing with the test set. The resulting CNN-based emotion-recognition model achieved a highest accuracy of 88%, with 90% precision, 87% recall, and an 88% F1-score. The per-emotion accuracies are: Angry 96.80%, Disgust 97.86%, Fear 96.16%, Happy 95.52%, Neutral 96.37%, Sad 96.80%, and Surprise 95.52%.
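The full thesis text is access-restricted, so its exact pipeline is not reproduced here. As an illustration of the MFCC feature-extraction step the abstract describes, the following is a minimal NumPy sketch of the standard computation (framing, windowing, power spectrum, mel filterbank, log compression, DCT-II). All parameter values (sample rate, FFT size, hop length, filter and coefficient counts) are generic assumptions, not the settings used in the thesis.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, sr=16000, n_fft=512, hop=256, n_mels=40, n_mfcc=13):
    # Slice the signal into overlapping frames and apply a Hann window.
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop : i * hop + n_fft] for i in range(n_frames)])
    frames = frames * np.hanning(n_fft)
    # Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2
    # Triangular mel filterbank, equally spaced on the mel scale.
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fb[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fb[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    # Log mel spectrogram (small epsilon avoids log(0)).
    logmel = np.log(power @ fb.T + 1e-10)
    # DCT-II over the mel axis yields the cepstral coefficients.
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_mfcc), 2 * n + 1) / (2 * n_mels))
    return logmel @ dct.T  # shape: (n_frames, n_mfcc)

# One second of a 440 Hz sine as a stand-in for a speech sample.
sig = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
feat = mfcc(sig)
print(feat.shape)  # (61, 13): 61 frames, 13 coefficients per frame
```

In practice a library routine such as `librosa.feature.mfcc` would replace this hand-rolled version; the resulting time-by-coefficient matrix is the 2D "image" a 2D CNN classifies.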
| Item Type: | Thesis (Skripsi) |
|---|---|
| Additional Information: | [Call number: 2010511010] [Supervisor 1: Widya Cholil] [Supervisor 2: Neny Rosmawarni] [Examiner 1: Yuni Widiastiwi] [Examiner 2: Muhammad Panji Muslim] |
| Uncontrolled Keywords: | Speech Emotion Recognition, Spectrogram, 2D CNN, MFCC |
| Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science; T Technology > T Technology (General) |
| Divisions: | Fakultas Ilmu Komputer > Program Studi Informatika (S1) |
| Depositing User: | Endow Bonapen |
| Date Deposited: | 13 Mar 2024 07:30 |
| Last Modified: | 13 Mar 2024 07:30 |
| URI: | http://repository.upnvj.ac.id/id/eprint/29215 |