An efficient, parallelized algorithm for optimal conditional entropy-based feature selection

Full metadata record
DC Field | Value | Language
dc.contributor | LCC - Laboratório de Ciclo Celular | pt_BR
dc.contributor | Centro de Toxinas, Resposta-imune e Sinalização Celular (CeTICS) | pt_BR
dc.contributor.author | Estrela, Gustavo | pt_BR
dc.contributor.author | Gubitoso, Marco Dimas | pt_BR
dc.contributor.author | Ferreira, Carlos Eduardo | pt_BR
dc.contributor.author | Barrera, Junior | pt_BR
dc.contributor.author | Reis, Marcelo da Silva | pt_BR
dc.date.accessioned | 2020-07-09T21:28:08Z | -
dc.date.available | 2020-07-09T21:28:08Z | -
dc.date.issued | 2020 | pt_BR
dc.identifier.citation | Estrela G, Gubitoso MD, Ferreira CE, Barrera J, Reis MS. An efficient, parallelized algorithm for optimal conditional entropy-based feature selection. Entropy. 2020 Apr;22(4):492. doi:10.3390/e22040492. | pt_BR
dc.identifier.uri | https://repositorio.butantan.gov.br/handle/butantan/3069 | -
dc.description.abstract | In Machine Learning, feature selection is an important step in classifier design. It consists of finding a subset of features that is optimal for a given cost function. One way to solve the feature selection problem is to organize all possible feature subsets into a Boolean lattice and to exploit the fact that the costs of chains in that lattice describe U-shaped curves. Minimizing such a cost function is known as the U-curve problem. Recently, a study proposed U-Curve Search (UCS), an optimal algorithm for that problem, which was successfully used for feature selection. However, despite the algorithm's optimality, the time UCS required in computational assays was exponential in the number of features. Here, we report that this scalability issue arises because the U-curve problem is NP-hard. We then introduce the Parallel U-Curve Search (PUCS), a new algorithm for the U-curve problem. In PUCS, we present a novel way to partition the search space into smaller Boolean lattices, thus rendering the algorithm highly parallelizable. We also provide computational assays with both synthetic data and Machine Learning datasets, in which the performance of PUCS was assessed against UCS and other gold-standard feature selection algorithms. | pt_BR
dc.description.sponsorship | Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) | pt_BR
dc.description.sponsorship | Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP) | pt_BR
dc.format.extent | 492 | pt_BR
dc.language.iso | English | pt_BR
dc.relation.ispartof | Entropy | pt_BR
dc.rights | Open Access | pt_BR
dc.title | An efficient, parallelized algorithm for optimal conditional entropy-based feature selection | pt_BR
dc.type | Article | pt_BR
dc.identifier.doi | 10.3390/e22040492 | pt_BR
dc.identifier.url | https://doi.org/10.3390/e22040492 | pt_BR
dc.contributor.external | Universidade de São Paulo (USP)¦¦Brazil | pt_BR
dc.identifier.citationvolume | 22 | pt_BR
dc.identifier.citationissue | 4 | pt_BR
dc.subject.keyword | machine learning | pt_BR
dc.subject.keyword | supervised learning | pt_BR
dc.subject.keyword | information theory | pt_BR
dc.subject.keyword | mean conditional entropy | pt_BR
dc.subject.keyword | feature selection | pt_BR
dc.subject.keyword | classifier design | pt_BR
dc.subject.keyword | Support-Vector Machine | pt_BR
dc.subject.keyword | U-curve problem | pt_BR
dc.subject.keyword | Boolean lattice | pt_BR
dc.relation.ispartofabbreviated | Entropy | pt_BR
dc.identifier.citationabnt | v. 22, n. 4, 492, abr. 2020 | pt_BR
dc.identifier.citationvancouver | 2020 Apr;22(4):492 | pt_BR
dc.contributor.butantan | Estrela, Gustavo|:|:LCC - Laboratório de Ciclo Celular|:First author | pt_BR
dc.contributor.butantan | Reis, Marcelo da Silva|:Researcher|:LCC - Laboratório de Ciclo Celular:Centro de Toxinas, Resposta-imune e Sinalização Celular (CeTICS)|:Corresponding author | pt_BR
dc.sponsorship.butantan | Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)¦¦ | pt_BR
dc.sponsorship.butantan | Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)¦¦2013/07467-1 | pt_BR
dc.sponsorship.butantan | Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)¦¦2015/01587-0 | pt_BR
dc.sponsorship.butantan | Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)¦¦2016/25959-7. | pt_BR
dc.identifier.bvscc | BR78.1 | pt_BR
dc.identifier.bvsdb | IBProd | pt_BR
dc.description.dbindexed | Yes | pt_BR
item.openairetype | Article | -
item.fulltext | With full text | -
item.grantfulltext | open | -
item.languageiso639-1 | English | -
crisitem.author.orcid | 0000-0002-3754-9115 | -
Appears in Collections: Journal articles


Files in This Item:

10.3390e22040492.pdf
Size: 1.07 MB
Format: Adobe PDF

Access to the publications deposited in this repository respects the licenses of the journals and publishers.