IMPLEMENTATION OF THE CONVOLUTIONAL NEURAL NETWORK METHOD FOR EMOTION RECOGNITION FROM HUMAN SPEECH USING SPECTROGRAM IMAGE REPRESENTATIONS

Endow Bonapen (2024) IMPLEMENTATION OF THE CONVOLUTIONAL NEURAL NETWORK METHOD FOR EMOTION RECOGNITION FROM HUMAN SPEECH USING SPECTROGRAM IMAGE REPRESENTATIONS. Skripsi thesis, Universitas Pembangunan Nasional Veteran Jakarta.

Files:
ABSTRAK.pdf (233kB)
AWAL.pdf (893kB)
BAB 1.pdf (249kB, Restricted to Repository UPNVJ Only)
BAB 2.pdf (714kB, Restricted to Repository UPNVJ Only)
BAB 3.pdf (354kB, Restricted to Repository UPNVJ Only)
BAB 4.pdf (1MB, Restricted to Repository UPNVJ Only)
BAB 5.pdf (238kB)
DAFTAR PUSTAKA.pdf (242kB)
RIWAYAT HIDUP.pdf (264kB, Restricted to Repository UPNVJ Only)
LAMPIRAN.pdf (519kB, Restricted to Repository UPNVJ Only)
HASIL PLAGIARISME.pdf (7MB, Restricted to Repository staff only)
ARTIKEL KI.pdf (732kB, Restricted to Repository staff only)

Abstract

Speech Emotion Recognition (SER) is the task of identifying human emotions from the voice produced in a particular state or situation. In everyday life, the emotions expressed through the voice vary widely between individuals, which raises the question of how a computer can distinguish the emotions carried in a speech signal as reliably as a human listener can. To address this problem, this research builds a model for recognizing human emotions from speech that can differentiate and identify a variety of emotions. The study applies the Mel Frequency Cepstral Coefficients (MFCC) spectrogram as the feature-extraction method, and the extracted features are classified with a deep learning model, specifically a 2D Convolutional Neural Network (CNN). The emotions considered are Angry, Disgust, Fear, Happy, Sad, Neutral, and Surprise. A total of 4,665 voice samples were collected, then pre-processed, split into training, validation, and test sets, trained using the training and validation data, and evaluated on the test data. The resulting CNN-based emotion recognition model achieved a best accuracy of 88%, with 90% precision, 87% recall, and an 88% F1-score. The per-emotion accuracies are: Angry 96.80%, Disgust 97.86%, Fear 96.16%, Happy 95.52%, Neutral 96.37%, Sad 96.80%, and Surprise 95.52%.
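The abstract describes a pipeline of MFCC-based feature extraction followed by a 2D CNN classifier. The sketch below illustrates one way such a pipeline could look in Python using librosa and Keras; the fixed feature-map size, layer widths, dropout rate, and training settings are illustrative assumptions and are not taken from the thesis.

# Minimal sketch (not the thesis's exact pipeline): extract a fixed-size MFCC
# feature map from a speech clip and classify it with a small 2D CNN.
import numpy as np
import librosa
import tensorflow as tf
from tensorflow.keras import layers, models

EMOTIONS = ["angry", "disgust", "fear", "happy", "neutral", "sad", "surprise"]
N_MFCC, MAX_FRAMES = 40, 174  # assumed feature-map size, chosen for illustration

def wav_to_mfcc(path, sr=22050):
    """Load one clip and return an (N_MFCC, MAX_FRAMES, 1) MFCC map."""
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=N_MFCC)
    # Pad or truncate along the time axis so every sample has the same shape.
    if mfcc.shape[1] < MAX_FRAMES:
        mfcc = np.pad(mfcc, ((0, 0), (0, MAX_FRAMES - mfcc.shape[1])))
    else:
        mfcc = mfcc[:, :MAX_FRAMES]
    return mfcc[..., np.newaxis]

def build_cnn(input_shape=(N_MFCC, MAX_FRAMES, 1), n_classes=len(EMOTIONS)):
    """Small 2D CNN: two conv/pool stages, then a dense softmax head."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.3),
        layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Usage sketch: X_* are stacks of MFCC maps, y_* hold integer emotion labels.
# model = build_cnn()
# model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=30)
# model.evaluate(X_test, y_test)

In this kind of setup, padding or truncating the MFCC time axis is what makes a 2D CNN applicable: every clip becomes a same-sized "image", so the convolutional layers can learn local time-frequency patterns associated with each emotion class.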

Item Type: Thesis (Skripsi)
Additional Information: [Call Number: 2010511010] [Advisor 1: Widya Cholil] [Advisor 2: Neny Rosmawarni] [Examiner 1: Yuni Widiastiwi] [Examiner 2: Muhammad Panji Muslim]
Uncontrolled Keywords: Speech Emotion Recognition, Spectrogram, CNN 2D, MFCC.
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
T Technology > T Technology (General)
Divisions: Fakultas Ilmu Komputer > Program Studi Informatika (S1)
Depositing User: Endow Bonapen
Date Deposited: 13 Mar 2024 07:30
Last Modified: 13 Mar 2024 07:30
URI: http://repository.upnvj.ac.id/id/eprint/29215
