Music Generation With NLP-6

The document discusses advancements in music generation using Generative Adversarial Networks (GANs). It highlights the potential of GANs in creating melodies and improving training processes through the integration of various models. The authors propose a framework for future research in music generation leveraging deep learning techniques.
2021 International Conference on Computer Engineering and Application (ICCEA)

Implement Music Generation with GAN: A Systematic Review
2021 International Conference on Computer Engineering and Application (ICCEA) | 978-1-6654-2616-9/21/$31.00 ©2021 IEEE | DOI: 10.1109/ICCEA53728.2021.00075


Haohang Zhang †
School of Mathematical Sciences
Fudan University
Shanghai, China
@fudan.edu.cn

Letian Xie †
Department of Computing
Xi'an Jiaotong-Liverpool University
Suzhou, Jiangsu, China
LetianXie@student.xjtlu.edu.cn

Kaiyi Qi †
College of Science and Technology
Wenzhou-Kean University
Wenzhou, Jiangsu, China
kaiyiq@kean.edu

† These authors contributed equally.

Abstract- Music generation has a long history and can serve as a tool to reduce human intervention in the composition process. Recently, generating mellifluous music with the generative adversarial network (GAN), an unsupervised deep learning model, has become widespread. One advantage of GAN is that a generative model and a discriminative model learn from each other, which yields more realistic output and higher accuracy. In this review, we give an overview of what has been achieved in music generation with GAN. Specifically, the definition of GAN and GAN methods are introduced first. Subsequently, the applications in music generation and the corresponding drawbacks are discussed. These results offer a guideline for future research in music generation with machine learning techniques.

Keywords- GAN; Music Generation

I. INTRODUCTION

Music generation is defined as a composition process that relies on the least possible human intervention. The earliest recorded instance of music generation is the Musikalisches Würfelspiel (Dice Music) by Mozart [1], which was implemented by randomly selecting tones he had written himself (by throwing two dice). Later, Iannis Xenakis created random music based on concepts from statistics and probability, defining music as a series of accidental elements and describing it via stochastic theory.

Contemporarily, deep learning has become a popular technique for music generation because of its superior ability to fit big data. Among the various models, the generative adversarial network (GAN) is widely adopted because of its numerous advantages, e.g., generative training. GAN is a deep learning framework designed and implemented by Goodfellow and his team [2]. In practice, GAN has proved to handle images well. A GAN model can be divided into a generator and a discriminator. The generator's mission is to create fake images that are close to real images. The discriminator receives both real images and the generator's fake images and identifies whether each image is authentic. The generator and the discriminator learn from each other and update their parameters during training.

The generator can only learn through the discriminator to update itself, while the discriminator obtains data from both the fake samples produced by the generator and the real samples. After identifying whether the data comes from real images or from the generator, the discriminator sends a signal to the generator. The generator uses the newest feedback from the discriminator to train, i.e., it learns to generate samples that are harder for the discriminator to distinguish.

In the basic GAN process, the first step is to have an initial generator produce images. The discriminator can be characterized as a function mapping image data to the probability that the data comes from the real data distribution [3]. The discriminator's outputs vary from 1 (real) to 0 (fake): an output is identified as real data if it is close to 1 and as fake data if it is close to 0. Once the discriminator identifies the data well, it is frozen. The generator then continues training to decrease the accuracy of the discriminator, which leads to updated fake images. The process thus alternates between the discriminator increasing its accuracy and sending its outputs to the generator, and the generator training to confuse the discriminator. The model is considered fitted once the discriminator can no longer classify real and fake data accurately.

Given this definition of the generator and the discriminator, GAN can also process other types of data, which makes it suitable for music generation. Compared with traditional models, GAN uses two adversarially trained networks to generate clearer and more authentic samples. Many researchers have adopted GAN in music generation to produce music that is more natural and harmonious.

II. TERMINOLOGY

A. LSTM

LSTM is the abbreviation of long short-term memory, an artificial recurrent neural network architecture used in deep learning.

B. Cross-entropy

Cross-entropy is an important concept in information theory, used mainly to measure the difference in information between two probability distributions.

C. ReLU & LeakyReLU

They are the abbreviations of Rectified Linear Unit and Leaky Rectified Linear Unit, which are activation functions used in artificial neural networks.

D. Max Pooling & Average Pooling

Max pooling takes the maximum value in the local receptive field, while average pooling takes the average of all values in the local receptive field.
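The terms defined in B to D can be illustrated with a few lines of plain Python. This is only a sketch for intuition: the function names, the leak slope of 0.01 and the use of non-overlapping one-dimensional windows are assumptions made for the example and do not come from the reviewed papers.

```python
import math


def cross_entropy(p, q, eps=1e-12):
    """Cross-entropy H(p, q) = -sum_i p_i * log(q_i) of two discrete distributions."""
    return -sum(pi * math.log(qi + eps) for pi, qi in zip(p, q))


def relu(x):
    """Rectified Linear Unit: negative inputs are set to zero."""
    return max(0.0, x)


def leaky_relu(x, slope=0.01):
    """Leaky ReLU: negative inputs are scaled by a small slope instead of zeroed."""
    return x if x > 0 else slope * x


def pool1d(values, size, mode="max"):
    """Pool non-overlapping windows of `size` values with max or average."""
    windows = [values[i:i + size] for i in range(0, len(values) - size + 1, size)]
    if mode == "max":
        return [max(w) for w in windows]
    return [sum(w) / len(w) for w in windows]
```

Note that `relu` gives exactly zero for every negative input, whereas `leaky_relu` keeps a small non-zero response; likewise max pooling keeps one value per window while average pooling uses all of them.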
III. GAN ARCHITECTURE

GAN has been widely applied in the field of music generation. According to the type of generation, GAN-based music generation models can be divided into three categories: generation of symbols [4], generation of arrangements [6] and style migration [7].

A music generation model based on a generative adversarial network was proposed early on [4]. Different from the traditional GAN structure, this work proposed two recurrent neural network models of different depths, trained adversarially. Nash equilibrium is reached when the generator produces data that the discriminator cannot accurately identify. Each unit in the generator is fed a random vector combined with the output of the previous unit. The discriminator, in contrast, is a bi-directional LSTM model, which can effectively mitigate the vanishing gradient problem. The signal is modeled by four scalars: tone length, frequency, intensity and duration. The model achieves continuous music generation in a symbolic manner through adversarial training of recurrent neural networks. This is a significant step in the field of music generation.

Symbolic music generation [4] has indeed made considerable achievements. However, certain problems remain. One is that the music generated by the model does not yet meet the criteria by which humans judge music. In addition to the continuity of notes, music has many other features. One important feature is self-repetition, as noted by Harsh Jhamtani and Taylor Berg-Kirkpatrick [5]. The authors took the self-repetition of music as an important indicator and proposed the self-similarity-matrix generative adversarial network (SSMGAN), as shown in Figure 1 [5]. The generator in SSMGAN is based on the Variational Auto-Encoder (VAE), and the notes are encoded as low-dimensional metric embeddings. The embeddings are then decoded by a note-level decoder. The discriminator is divided into a self-similarity-matrix discriminator (DS) and a sequence-of-measures discriminator (DL). DS is trained after encoding the self-similarity matrix with a multilayer convolutional encoder, while DL is based on an LSTM and discriminates multiple consecutive music measures. The test results show that including the self-similarity-matrix discriminator greatly improves the authenticity of the music produced by the generator.

Figure 1. A sketch of SSMGAN Model [5]

Both of the above works achieved a significant breakthrough in music generation with GAN models, but these models belong to symbolic music generation. In addition to symbols, GAN models are also capable of generating arrangements. Since music is usually composed of multiple instruments with their own temporal dynamics, symbolic music generation alone is limited. Dong et al. [6] proposed a GAN model for arrangement generation that can be divided into three sub-models: the jamming model, the composer model and the hybrid model. Five types of tracks (e.g., bass and drums) are generated; the model is also called the multi-track sequential GAN (MuseGAN). The three sub-models correspond to three cases of correlation among the five tracks: uncorrelated, fully correlated and mixed. The correlation among tracks is controlled by choosing whether or not to share the noise input and the generator weights. In addition, a time-series model is also proposed in MuseGAN. The bar is used as the time unit, and the noise is first mapped by the temporal structure generator to hidden vectors that correspond to different positions in the time series. The bar generator then uses the hidden vectors to generate multiple consecutive bars of music. The music generated by MuseGAN is impressively realistic. Although it is not as good as human musicians' compositions, it still exceeds expectations and comes closer to fully automatic composition.

Style migration is the last category of music generation with GAN. The idea of this task is to generate music in a new style based on existing music. The paper by Brunner et al. [7] is a great example, where CycleGAN is applied to implement music style migration. In addition, two extra discriminators are added to the model to improve fidelity. They are trained to distinguish fake data as well as data from music domains other than the target one. This approach helps constrain the generator, i.e., the generated music is more credible.

IV. GAN TRAININGS

In fact, many network structures can be used for music generation, e.g., WaveNet or LSTM models. The special feature of GAN is the idea of adversarial training, which is also known as a two-player game.
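The two-player game can be made concrete with a toy example. The following is a minimal sketch, not taken from any of the reviewed papers: it assumes one-dimensional data, a linear generator G(z) = a*z + b and a logistic discriminator D(x) = sigmoid(w*x + c), all of which are illustrative choices, as are the name `train_toy_gan` and its default values. It follows the alternating procedure described in this section (batches of n samples, m discriminator updates per generator update) and uses the log D form of the generator objective.

```python
import math
import random


def sigmoid(t):
    """Numerically safe logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_toy_gan(steps=200, n=32, m=2, lr=0.05, seed=0):
    """Alternate m discriminator updates with one generator update.

    Toy assumptions: real data ~ N(3, 0.5), G(z) = a*z + b,
    D(x) = sigmoid(w*x + c), plain gradient descent.
    """
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator parameters
    w, c = 0.1, 0.0  # discriminator parameters
    for _ in range(steps):
        # Discriminator phase: the generator is fixed.
        for _ in range(m):
            real = [rng.gauss(3.0, 0.5) for _ in range(n)]
            fake = [a * rng.gauss(0.0, 1.0) + b for _ in range(n)]
            grad_w = grad_c = 0.0
            # Gradient of -[log D(real) + log(1 - D(fake))].
            for x in real:
                d = sigmoid(w * x + c)
                grad_w += -(1.0 - d) * x
                grad_c += -(1.0 - d)
            for g in fake:
                d = sigmoid(w * g + c)
                grad_w += d * g
                grad_c += d
            w -= lr * grad_w / n
            c -= lr * grad_c / n
        # Generator phase: the discriminator is fixed.
        grad_a = grad_b = 0.0
        for _ in range(n):
            z = rng.gauss(0.0, 1.0)
            d = sigmoid(w * (a * z + b) + c)
            # Gradient of -log D(G(z)), i.e. maximise log D(G(z)).
            grad_a += -(1.0 - d) * w * z
            grad_b += -(1.0 - d) * w
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b, w, c
```

Calling `train_toy_gan(steps=50)` returns the four parameters. In this toy setting the generator offset b moves toward the real data as long as the discriminator can still separate the two sets of samples; whether and when such a loop settles depends heavily on the learning rate and on m.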

The generator and the discriminator can be viewed as two separate networks that improve each other's performance adversarially. The input of the generator is computer-generated random noise. The purpose of the generator is to produce data that is as realistic as possible from this random noise and pass it to the discriminator. The discriminator, in turn, is responsible for determining whether the data it receives is real or not, and the discriminator's error is used to update the generator. In the actual training process, n samples are drawn from the dataset, while the generator also generates n fake samples from random noise. At this point, the generator is fixed and the two sets of samples are combined to train the discriminator. The discriminator is trained m times, and then the generator is updated once using the discriminator's error. This process is repeated until it reaches Nash equilibrium, at which point the generator can imitate the real data as desired and training is stopped.

To sum up, there are many techniques for training GANs, and a comprehensive understanding of them makes it easier to achieve this goal.

A. Achieving a modified loss function

Maximizing log(D) is a better way to optimize the generator than minimizing log(1 − D) as in the original GAN, where D denotes the discriminator's output on a generated sample. As GANs have developed, cross-entropy loss has been used less frequently because it is unstable and inefficient. Many newer GAN models, e.g., DCGAN, WGAN and Least Squares GAN, introduce new methods to optimize GAN models.

B. Avoid sparse gradients

During GAN training, introducing sparse gradients (e.g., through ReLU and MaxPool) has a great impact on the stability of the GAN. Therefore, one can use LeakyReLU as the activation function, and average pooling or Conv2d + stride for downsampling. In practice, it is important to avoid information loss during GAN training, i.e., pooling should also be avoided as much as possible.

C. Awareness of training failure in advance

It is important to recognize training failure early to avoid wasting time. The warning signs are a discriminator loss that rises steadily and becomes very large, or one that falls steadily and becomes too small. In this case, the performance of the generator can no longer be improved and the discriminator can easily determine the authenticity of the image. Therefore, it is better to stop and modify the GAN as soon as possible.

V. APPLICATIONS OF MUSIC GENERATING GANS

The music-generating GAN has been put into extensive use. In the following parts, we focus on several representative examples of applications in different fields that have appeared in the literature. They have subsequently been refined in the process of application.

A. Music Composing

In the previous sections, we explained symbolic music generation and arrangement generation in detail. Both belong to music composing, which applies machine learning techniques to generate never-before-seen tunes from scratch. With the improvement of deep learning techniques, the tunes are getting closer to real ones. As stated above, SSMGAN [5] introduces a self-similarity-matrix discriminator to train the score to be repetitive. MuseGAN [6] discussed how to relate different instruments to each other. It matches timbre and intensity properly by sharing generators between different tracks.

The attraction of automatic music composition is the low royalties of machine-generated music. The music is based on public audio sources and has never appeared before. AIVA, an electronic composer recognized by SACEM, is an instance already in use. It enables users to use the music in advertising, movies or video games.

B. Music Genre Transfer

Like style transfer in image processing, the transformation of music style is also a hot topic in the field of music generation. Unlike music composing, music genre transfer gives the score to the model in advance. The output of the model shares the same melody with the input, while the style is different. Figure 2 [8] illustrates the analogy from image to audio without the use of GAN.

Figure 2. Drawing analogy from example-based image stylization

Nevertheless, it is hard to obtain paired music for training. CycleGAN [9] uses cycle-consistent methods to overcome this problem and has achieved great success in unpaired image-to-image translation. Inspired by this model, a team at ETHZ realized symbolic music genre transfer with a modified CycleGAN [10]. They have verified the feasibility of music genre transfer with GAN.

C. Audio Generating

Besides the score, audio generation is also an important issue in music. Training a model to play the instruments more realistically is an excellent way to bring mechanically generated music to life.

WaveGAN [11] tried to use convolution flexibly to capture the periodicity of the sound signal. It also introduced phase shuffle in the discriminator to avoid playing instruments monotonously.
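As a rough illustration of the phase shuffle idea mentioned above, the sketch below shifts a one-dimensional signal by a random number of samples and fills the gap by mirroring the signal at its edge. It is a simplified stand-in written for this review's discussion, not the WaveGAN implementation: the function names, the plain-list representation and the reflection rule are assumptions of the example.

```python
import random


def reflect_index(j, n):
    """Map an out-of-range index back into [0, n) by mirroring at the ends."""
    if j < 0:
        j = -j
    if j >= n:
        j = 2 * (n - 1) - j
    return j


def shift_with_reflection(signal, k):
    """Shift `signal` by k samples (k > 0 delays it), mirroring at the edges."""
    n = len(signal)
    if abs(k) >= n:
        raise ValueError("shift must be smaller than the signal length")
    return [signal[reflect_index(i - k, n)] for i in range(n)]


def phase_shuffle(signal, max_shift, rng=random):
    """Apply a random shift drawn uniformly from -max_shift..max_shift."""
    k = rng.randint(-max_shift, max_shift)
    return shift_with_reflection(signal, k)
```

For example, `shift_with_reflection([1, 2, 3, 4, 5], 2)` gives `[3, 2, 1, 2, 3]`. Because the shift changes on every call, a discriminator that sees shuffled activations cannot rely on the exact phase of a periodic pattern.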

However, only local audio structures can be modeled in WaveGAN. GANSynth [12] solves this problem by interpolating latent and pitch vectors. To generate more coherent waveforms, they also replace strided convolutions with log-magnitude spectrograms and phases.

VI. DISCUSSION

At present, most music-generating models lack basic music theory, which leads to inappropriate pauses in the resulting music. It is still an open question whether knowledge of music theory would make models compose music better. Moreover, how to incorporate such knowledge also needs to be investigated.

Another similar problem is recognizing the function of instruments. High-standard applications require music with multiple tracks. However, different tracks play different roles in the arrangement, so how to let GAN capture this feature needs to be discussed.

Music is known to be self-repetitive. Models like SSMGAN are able to generate self-repeating music. Nevertheless, the repetition becomes monotonous as the length of the music increases. Balancing repetition and variety in music generation is a challenge. In other words, one needs to balance long-term structure and short-term structure.

VII. CONCLUSION

In summary, music generation benefits from the rapid development of GAN models, where great breakthroughs have been made in all aspects. In this paper, we have introduced several achievements made with GAN and discussed the details and issues. These results pave a path to understanding recent progress in music generation. However, owing to length limits, this perspective presents just the tip of the iceberg. There are still many potential opportunities to apply GAN in certain ways and improve the performance of existing works.

REFERENCES

[1] Briot, J., "From artificial neural networks to deep learning for music generation: history, concepts and trends," Neural Computing and Applications.
[2] Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A.C., & Bengio, Y., "Generative Adversarial Nets," NIPS.
[3] Creswell, A., White, T., Dumoulin, V., Arulkumaran, K., Sengupta, B., & Bharath, A., "Generative Adversarial Networks: An Overview," IEEE Signal Processing Magazine.
[4] Mogren, O., "C-RNN-GAN: Continuous recurrent neural networks with adversarial training," unpublished.
[5] Jhamtani, H., & Berg-Kirkpatrick, T., "Modeling Self-Repetition in Music Generation using Generative Adversarial Networks," in press.
[6] Dong, H., Hsiao, W., Yang, L., & Yang, Y., "MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment," AAAI.
[7] Brunner, G., Wang, Y., Wattenhofer, R., & Zhao, S., "Symbolic Music Genre Transfer with CycleGAN," IEEE International Conference on Tools with Artificial Intelligence (ICTAI).
[8] Grinstein, E., Duong, N.Q., Ozerov, A., & Pérez, P., "Audio Style Transfer," IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Zhu, J., Park, T., Isola, P., & Efros, A.A., "Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks," IEEE International Conference on Computer Vision (ICCV).
[10] Brunner, G., Wang, Y., Wattenhofer, R., & Zhao, S., "Symbolic Music Genre Transfer with CycleGAN," IEEE International Conference on Tools with Artificial Intelligence (ICTAI).
[11] Donahue, C., McAuley, J., & Puckette, M., "Adversarial Audio Synthesis," ICLR.
[12] Engel, J., Agrawal, K.K., Chen, S., Gulrajani, I., Donahue, C., & Roberts, A., "GANSynth: Adversarial Neural Audio Synthesis," ICLR.

